Language Families


Introduction to the more important language families, including Indo-European, Uralic, Altaic, Afro-Asiatic, and others.
List of the 20 most spoken languages in the world.


What are Language Families?

It appears that the use of language came about independently in a number of places. All languages change with time. If two groups of people speaking the same language are separated, then their languages will change along different paths. First they develop different accents; next some of the vocabulary will change (either due to influences of other languages or by natural processes). When this happens a different dialect is created; the two groups can still understand each other. If the dialects continue to diverge there will come a time when they are mutually unintelligible (in other words, the people are speaking different languages). In time, with enough migrations, a single language can evolve into an entire family of languages.

Each language family listed below is a group of related languages with a common ancestor. Languages in the same branch are sister languages that diverged within the last 1000 to 2000 years (Latin, for example, gave rise to the Latin Branch languages in the Indo-European Family).

Languages in different branches of the same family can be referred to as cousin languages. For most families these languages would have diverged more than 2000 years ago.
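
The sister and cousin relationships described above can be sketched as a simple lookup. This is only an illustration: FAMILY_TREE, locate and relationship are hypothetical names, and the tree holds just a handful of the languages and branches named on this page.

```python
# Illustrative only: a tiny slice of the families and branches on this page.
FAMILY_TREE = {
    "Indo-European": {
        "Latin": ["French", "Spanish", "Italian"],
        "Germanic": ["English", "German", "Dutch"],
        "Celtic": ["Welsh", "Breton"],
    },
    "Afro-Asiatic": {
        "Semitic": ["Arabic", "Hebrew", "Maltese"],
    },
}


def locate(language):
    """Return (family, branch) for a language, or None if it is not listed."""
    for family, branches in FAMILY_TREE.items():
        for branch, languages in branches.items():
            if language in languages:
                return family, branch
    return None


def relationship(a, b):
    """Classify two listed languages as sister, cousin, or unrelated."""
    loc_a, loc_b = locate(a), locate(b)
    if loc_a is None or loc_b is None:
        raise KeyError("language not in the illustrative tree")
    if loc_a == loc_b:
        return "sister"      # same family and same branch
    if loc_a[0] == loc_b[0]:
        return "cousin"      # same family, different branch
    return "unrelated"       # different families


print(relationship("French", "Spanish"))   # sister
print(relationship("French", "English"))   # cousin
print(relationship("English", "Arabic"))   # unrelated
```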

Languages in the same family share many common grammatical features, and many of the key words, especially older words, show their common origin. I'll show that with the word month in several Indo-European languages:

English month
Welsh mis
Gaelic mí
French mois
Spanish mes
Portuguese mês
Italian mese
German Monat
Dutch maand
Swedish månad
Polish miesiąc
Russian myesyats
Greek minas
Albanian muaj
Lithuanian mėnuo
Farsi mâh
Hindi mahina

Compare that with the word for month in languages that are not Indo-European.

Arabic shahr
Finnish kuukausi
Basque hilabete
Turkish ay
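
One rough way to see the pattern in the two lists is to compare initial letters. The sketch below is a deliberate simplification, not a method a historical linguist would rely on: it only looks at the first letter of a selection of the words as spelled above, and the names INDO_EUROPEAN, OTHERS and initial_letters are made up for the illustration.

```python
# A selection of the words for "month" listed above.
INDO_EUROPEAN = {
    "English": "month", "Welsh": "mis", "French": "mois",
    "Spanish": "mes", "Portuguese": "mês", "Italian": "mese",
    "German": "Monat", "Dutch": "maand", "Swedish": "månad",
    "Russian": "myesyats", "Greek": "minas", "Albanian": "muaj",
    "Farsi": "mâh", "Hindi": "mahina",
}
OTHERS = {"Arabic": "shahr", "Finnish": "kuukausi", "Turkish": "ay"}


def initial_letters(words):
    """Collect the lowercase first letter of every word in the mapping."""
    return {word[0].lower() for word in words.values()}


ie_initials = initial_letters(INDO_EUROPEAN)
other_initials = initial_letters(OTHERS)

print(ie_initials)              # {'m'}
print("m" in other_initials)    # False
```

A shared first letter is weak evidence on its own. It is the regular pattern across many old words that points to a common origin.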

With all this information, we can now look at a selection of language families.


The Indo-European Family

The most widely studied language family in the world is the Indo-European. There are a number of reasons for this:

The Indo-European languages tend to be inflected (ie verbs and nouns have different endings depending on their part in a sentence). Some languages (eg English) have lost many of the inflections during their evolution. These languages stretch from the Americas through Europe all the way to North India. Twelve branches are described below.

The Celtic Branch

This is now the smallest branch. The languages originated in Central Europe and once dominated that continent. The people migrated across to the British Isles over 2000 years ago. Later, when the Germanic speaking Anglo Saxons arrived, the Celtic speakers were pushed into Wales (Welsh), Ireland and Scotland (Gaelic).

One group of Celts moved back to France. Their language became Breton, spoken in Brittany. Breton is closer to Welsh than to French. Other Celtic languages (Cornish, Gaulish, Cumbric, and Manx) have become extinct.

The Germanic Branch

These languages originate from Old Norse and Anglo Saxon. They include English (the second most spoken language in the world, the most widespread, the language of technology, and the language with the largest vocabulary), a useful language to have as your mother tongue.

Dutch and German are the closest major languages related to English. An even closer relative is Frisian. Flemish and Afrikaans are varieties of Dutch while Yiddish is a variety of German. Yiddish is written using the Hebrew script.

Three of the four Scandinavian languages belong to this branch (Danish, Norwegian, and Swedish). Swedish has tones, unusual in European languages. The fourth Scandinavian language, Finnish, belongs to a different family.

Icelandic is the least changed of the Germanic Languages - being close to Old Norse. Another old language is Faroese. Gothic and Frankish are extinct languages from this branch.

The Latin Branch

Also called the Italic or Romance Languages.

These languages are all derived from Latin. Latin is one of the most important classical languages. Its alphabet (derived from the Greek alphabet) is used by many languages of the world.

Italian and Spanish are the closest to Latin. French has moved farthest from Latin in pronunciation; only its spelling gives a clue to its origins. French has many Celtic influences. Romanian has picked up Slavic influences because it is a Latin Language surrounded by a sea of Slavs. Portuguese has been separate from Spanish for over 1000 years. The most important of these languages is Spanish, spoken in most of Latin America (apart from Brazil).

Romansh is a minority language in Switzerland. Ladino was the language spoken by Spain's Jewish population when they were expelled in 1492. Most of them now live in Turkey and Israel. Provençal and Catalan are closely related languages spoken in the south of France and the north of Spain. Note that Basque is not an Indo-European language - in fact it is totally unrelated to any other language of the world. Galician is a Portuguese dialect with Celtic influences spoken in the north west of Spain. Finally, Moldavian is a dialect of Romanian spoken in Moldova. Under the Soviets the Moldavians had to use the Cyrillic alphabet. Now they have reverted to the Latin alphabet.

Apart from Latin, other extinct languages include Dalmatian, Oscan and Umbrian.

The Slavic Branch

These languages are confined to Eastern Europe. The Catholic peoples use the Latin alphabet while the Orthodox use the Cyrillic alphabet, which is derived from the Greek. Indeed some of the languages are very similar, differing only in the script used (Croatian and Serbian are virtually the same language). One of the oldest of these languages is Bulgarian. The most important is Russian. Others include Polish, Kashubian (spoken in parts of Poland), Sorbian (spoken in parts of eastern Germany), Czech, Slovak, Slovene, Macedonian, Bosnian, Ukrainian and Byelorussian.

The Baltic Branch

Three Baltic states but only two Baltic Languages (Estonian is related to Finnish). Lithuanian is actually the oldest of the Indo-European languages. Its study is important in determining the origins and evolution of the family. Lithuanian and Latvian both use the Latin script. Prussian is an extinct language from this branch.

The Hellenic Branch

Ancient Greek gave rise to one extant language - modern Greek. The other offshoots of Ancient Greek have all become extinct. Greek is one of the important classical languages. The language has its own script, derived from Phoenician with the addition of symbols for vowels. It is one of the oldest scripts in the world and has led to the Latin and Cyrillic alphabets.

The Illyric Branch

Another single language branch. Only Albanian (strongly influenced by the Slavic languages) belongs to this branch. It is written in the Latin alphabet. It is spoken in Albania and Kosovo.

The Anatolian Branch

This branch includes the language of the Hittite civilisation which once ruled central Anatolia, fought the Ancient Egyptians and was mentioned in the Bible's Old Testament. Other languages were Lydian (whose speakers ruled the south coast of Anatolia), Lycian (a Hellenic culture along the eastern coastal regions), Luwian and Palaic. All languages in this branch are extinct.

The Armenian Branch

This is represented by Armenian (influenced by Iranian). It has its own script.

The Iranian Branch

These languages are descended from Ancient Persian, the literary language of the Persian Empire and one of the great classical languages. Avestan is the extinct language of the Zoroastrian religion.

The main language of this branch is Farsi (also called Iranian), the language of Iran and much of Afghanistan. Kurdish is a close relation. The Kurds are spread between Turkey, Syria, Iran and Iraq.

Pashto is spoken in Afghanistan and parts of north west Pakistan. Baluchi is spoken in the desert regions between Iran and Pakistan. These languages are written in the Nastaliq script, a derivative of Arabic writing. It is interesting that you cannot tell which family a language belongs to by the way it is written.

Ossetian is found in the Caucasus mountains, north of Georgia. Tadzhik is a close relative of Farsi, written in Cyrillic and spoken in Tadzhikistan (of the former USSR).

The Indic Branch

This branch has the most languages. Most are found in North India. They are derived from Sanskrit (the classical language of Hinduism) and later, Pali (the classical language of Buddhism).

Hindi and Urdu are very similar but differ in the script. The Hindi speakers are Hindus and use the Sanskrit writing system called Devanagari (writing of the Gods). Urdu is spoken by the Muslims so uses the Arabic Nastaliq script. These two languages are found in north and central India and Pakistan. Nepali is closely related to Hindi.

In India most of the states have their own language. These languages use the Devanagari script (or a derivation of it) if the people are Hindus, or the Arabic Nastaliq script if the people are Muslims. They include Bengali (West Bengal as well as Bangladesh), Oriya (in Orissa), Marathi (in Maharashtra), Assamese (in Assam), Punjabi (from the Punjab), Kashmiri (Kashmir), Sindhi (the Pakistan province of Sindh - written in Nastaliq), Gujerati (Gujerat in western India), Konkani (in Goa, an ex Portuguese colony, uses the Latin script), Sinhalese (Sri Lanka - uses its own script derived from Pali), and Maldivian (Maldives - with its own script).

The most surprising language in this branch is Romany, the language of the Gypsies. It appears that the Gypsies migrated to Europe from India.

The fascinating point about India is that the south Indian languages (like Tamil) are not Indo-European. In other words, Hindi is related to English, Greek and French but is totally unrelated to Tamil. North Indians visiting Madras are as baffled by Tamil as I am!

The Tokharian Branch

Turfanian and Kuchean are recently identified extinct languages once spoken in north west China. Very little is known about this branch as only a few manuscripts are in existence.


Apart from the Indo-European Family, there are others. A brief description of a few of these other families follows.


The Uralic Family

Not all European languages are Indo-European. There are three European languages that are members of the Uralic Family. The family is named from the Ural mountains. The people speaking these languages originated from the Siberian side of the Urals. Over 1500 years ago they migrated to Europe and have become entirely Europeanised. Their languages tell the story of their migrations.

Finnish and Estonian are closely related, while Hungarian is very different. The other languages in this family are spoken in Siberia, apart from Sámi which is spoken in Lapland (northern Scandinavia).

The Uralic Languages have many suffixes. Finnish, for example, behaves as if it had fifteen noun cases. Country names in Finnish are difficult to recognise. Finland, for example, is Suomi.

The Altaic Family

The Altaic Family is named after the Altai Mountains, in Central Asia. These people were nomadic horsemen living in the plains. One group migrated towards Europe, the other group migrated towards the Korean Peninsula and the islands of Japan.

Turkish is the most westerly member of this family as well as the most spoken. Many of the others are spoken in former USSR republics: Azeri (in Azerbaijan), Turkmen (in Turkmenia), Kazakh (in Kazakhstan), Kirghiz (in Kyrgyzstan), and Uzbek (in Uzbekistan, land of Genghis Khan). Uigur is spoken in Western China, across the Pamir Mountains (get the Atlas out!).

Mongolian is found in Mongolia and has a script that is written vertically rather than horizontally. Korean and Japanese are the most easterly Altaic languages.

The scripts used by these languages depend on historical or political factors: some use Latin, while the ex-Soviet ones use Cyrillic. Mongolian and Korean both have their own peculiar scripts. The Korean script evolved separately from all the other scripts in the world, having been invented a few hundred years ago. The language used to be written in Chinese characters. Japanese is still written with Chinese characters but there are two other syllabic scripts (hiragana and katakana).

The Altaic languages have lots of suffixes and a property called vowel harmony. This means that the vowels are divided into two groups. Words will either have one type of vowels or the other. All the suffixes have two forms, one for each type of vowel. In Turkish, the plural is formed by the addition of LER or LAR. The suffixes themselves can be glued on one after the other. For example, EV is house, EV-LER is houses, EVLER-IMIZ is our houses, EVLERIMIZ-E is to our houses, etc. Turkish is one of the most regular languages in the world. It has one irregular noun (water) and one irregular verb (to be).
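
The way vowel harmony picks a suffix form, and the way suffixes are glued on in turn, can be sketched in a few lines. This is a simplified illustration and not a full account of Turkish: it expects lowercase nouns ending in a consonant, ignores loanword exceptions and consonant changes, and the function names are invented here. The second test word, kitap (book), is not from the text above and is included only as an assumed back-vowel example.

```python
FRONT_VOWELS = set("eiöü")
BACK_VOWELS = set("aıou")


def last_vowel(word):
    """Return the final vowel of a lowercase word."""
    for ch in reversed(word):
        if ch in FRONT_VOWELS or ch in BACK_VOWELS:
            return ch
    raise ValueError(f"no vowel found in {word!r}")


def harmonise(word, front_form, back_form):
    """Append whichever suffix form matches the word's last vowel."""
    if last_vowel(word) in FRONT_VOWELS:
        return word + front_form
    return word + back_form


def plural(noun):
    """ev -> evler, kitap -> kitaplar."""
    return harmonise(noun, "ler", "lar")


def to_our_plural(noun):
    """Build 'to our <noun>s' by gluing three suffixes on one after another."""
    word = plural(noun)                        # ev -> evler
    word = harmonise(word, "imiz", "ımız")     # evler -> evlerimiz
    return harmonise(word, "e", "a")           # evlerimiz -> evlerimize


print(plural("ev"))             # evler
print(to_our_plural("ev"))      # evlerimize
print(to_our_plural("kitap"))   # kitaplarımıza
```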

All languages are influenced by languages they are in contact with. At the two extremes of the Altaic family, Turkish has many Arabic words while Korean and Japanese have many from Chinese.

Some linguists do not include Korean and Japanese in this family. Others link the Uralic and Altaic families together.

The Sino-Tibetan Family

The Sino-Tibetan Family is an important Asian family of languages. It contains the world's most spoken language, Mandarin.

The languages in this family are monosyllabic tonal languages. This means that words are made up of single syllables, for example in Mandarin (GUO - country, MEN - gate, WO - I, REN - person, AN - peace). The syllables themselves have tones. This means that the voice can be high, low, rising, falling, etc, just like singing. It is like the way many people raise the voice at the end of a question. As an example, the syllable MEN can mean gate or we depending on tone. Mandarin has four tones, Thai has five (MAI can mean not, burn, wood or no depending on tone), but Cantonese has nine.

The languages in the Sinitic Branch are the various languages of China. They are all written in Chinese characters. Each syllable has a different character so that the writing is not alphabetic. There are over 50,000 characters, 6000 of which are needed to read a newspaper. Even though the different languages have different pronunciations, the meanings of characters are the same.

The languages in the Tibeto-Burman Branch are spoken in Burma (Burmese, Karen, Shan), Thailand, Laos (Lao), Southern China, Tibet (Tibetan) and Nepal (Sherpa, Newari). They are written in scripts derived from the curly scripts of south India.

The Tai and Myao Branches are spoken around northern Thailand and Southern China.

Some linguists consider the Tai Languages to be a separate family.

The Malayo-Polynesian Family

Also known as Austronesian, the Malayo-Polynesian Family is made up of languages with fairly simple grammar. Malay, for example, has no inflections for tense or case. Plurals are made by doubling the word (ANAK - child, ANAK ANAK - children). The possessive pronouns (my / our) have differing forms depending on the item possessed.
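
The doubling rule for plurals is simple enough to write down directly. This is a minimal sketch with an invented function name. The separator is left as a parameter because the text above writes the doubled form with a space, whereas modern Malay spelling usually joins the two halves with a hyphen.

```python
def malay_plural(noun, separator=" "):
    """Form a plural by repeating the noun, as in ANAK -> ANAK ANAK."""
    return f"{noun}{separator}{noun}"


print(malay_plural("ANAK"))        # ANAK ANAK
print(malay_plural("anak", "-"))   # anak-anak
```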

They include the many languages of Indonesia (Javanese, Sundanese, Batak, Balinese), the Philippines (Tagalog, Ilocano, Visayan), the non-Chinese languages of Taiwan, some languages in Indo-China (like Cham in Vietnam), and languages of the Pacific (like Maori from New Zealand, Fijian, Tahitian, Rapa Nui spoken on Easter Island, and Hawaiian).

An interesting exception is Malagasy, which is spoken in Madagascar. Over 1000 years ago, Malay people migrated in boats across the Indian Ocean to Madagascar and picked up African culture, but their language gives away their origins.

The Afro-Asiatic Family

The Afro-Asiatic Family is dominated by Arabic, an important modern and classical language.

These languages have complex grammars based on three-consonant roots. For example, in Arabic itself, the letters KTB have to do with writing. KiTaB is book. Plurals are all irregularly formed and the usual way is to change the vowels. KuTuB is books. Other words with the KTB root have something to do with writing: KaTaBa - to write, KaTtaBa - to make someone write (ie to teach), maKTaB - office, KaaTiB - clerk, maKTaBa - library, miKTaB - typewriter, KuTuBii - bookseller, maKTuuB - letter. The Arabic alphabet mainly uses consonants because the reader can supply the correct vowels from the context.
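
The idea of slotting a consonant root into vowel patterns can be sketched as follows. The slot notation (1, 2, 3 standing for the three root consonants) and the names apply_pattern and PATTERNS are inventions for this illustration. The patterns and meanings are taken from the KTB examples above, and applying them to other roots will not always produce a real word.

```python
# Patterns from the KTB examples above; digits mark the root consonant slots.
PATTERNS = {
    "book": "1i2a3",
    "books": "1u2u3",
    "to write": "1a2a3a",
    "office": "ma12a3",
    "clerk": "1aa2i3",
    "letter": "ma12uu3",
}


def apply_pattern(root, pattern):
    """Fill slots 1, 2 and 3 of a pattern with the root's consonants."""
    if len(root) != 3:
        raise ValueError("this sketch handles three-consonant roots only")
    return "".join(
        root[int(ch) - 1] if ch in "123" else ch
        for ch in pattern
    )


for meaning, pattern in PATTERNS.items():
    print(f"{apply_pattern('KTB', pattern)} - {meaning}")
```

Running the loop prints the forms given above, starting with KiTaB - book and KuTuB - books.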

The other languages in the Semitic Branch of this family are Maltese (written in the Latin script because the Maltese are Catholic), Hebrew (with its own script, another important classical language), Amharic (the language of Ethiopia with its own script), and Tigrinya (spoken in the Horn of Africa).

The Berber Branch is spoken in the hills of North Africa. The Cushitic Branch is spoken by people in Ethiopia, Sudan and Somalia. Hausa, the most important member of the Chadic Branch, is the main language of Nigeria.


Other Language Families

There are over 100 language families in the world.

The Niger Congo Family features the many languages of Africa south of the Sahara. These include Swahili of Tanzania and Kenya, Ewe of Ghana and Benin, Yoruba of Nigeria, Wolof of Senegal, and Xhosa and Zulu of South Africa.

The Chari-Nile Family includes languages of North East Africa like Nubian of Southern Egypt and Sudan.

In southern Africa there is a small group of languages called the Khoisan Family. Two of its languages are Hottentot and Bushmen. These contain clicking consonants borrowed by neighbouring languages.

The Dravidian Family comprises the very complex languages of South India (like Tamil and Telugu).

Another Asian group is the Mon-Khmer Family. This includes Khmer (Cambodia), Mon and Palong (Burmese hills), and Nicobarese (Indian Ocean).

The Algonkian Family of languages is found in North America and includes Ojibwa, Cree, Blackfoot, Micmac, Cheyenne, and Delaware.

Another North American group is the Athapascan Family which includes Navajo and Apache.

Again in North America there is the Iroquoian Family. Cherokee and Mohawk are examples.

Covering North and Central America is the Uto-Aztecan Family with languages like Hopi and Comanche from the USA as well as the language of the Aztecs (Nahuatl) and the cave-dwelling people of the Copper Canyon in Mexico (Tarahumara).

The Mayan Family of languages is spoken by the descendants of the Mayas in southern Mexico and Guatemala (Quiche, Mam, Tzotzil, Cakchiquel, Yucatec).

A smaller group in Central America is the Macro-Chibchan Family. This includes Miskito (Honduran and Nicaraguan Caribbean coast) and Cuna (Panama).

The Andean-Equatorial Family covers large areas of South America. It includes Quechua (the language of the Incas in Peru), Aymara (Bolivia), Guarani (Paraguay), Tupi (Brazil) and Arawak (Caribbean Coasts).

Some languages are totally unrelated to other languages; these are called Independent or Language Isolates. These include Ainu (Japan), Basque (in the Pyrenees between France and Spain), Vietnamese, and Burushaski (spoken in a single valley in Kashmir and without a writing system).

The total number of languages in the world runs into thousands. Mexico has 52. The old USSR had 100. Nigeria has over 400. The island of Papua New Guinea has nearly 700, virtually a different one in each valley. India has over 800 languages in several families (Indo-European, Dravidian, Sino-Tibetan, Mon-Khmer, Munda).

Unfortunately, with the onset of mass communications, many smaller languages are in danger of extinction.


The 20 Most Spoken Languages of the World

Position  Language            Family             Script              Speakers (millions)
1         Mandarin            Sino-Tibetan       Chinese             900
2         English             Indo-European      Latin               430
3         Hindi               Indo-European      Devanagari          320
4         Spanish             Indo-European      Latin               310
5         Russian             Indo-European      Cyrillic            280
6         Arabic              Afro-Asiatic       Arabic              185
7         Bengali             Indo-European      Bengali             180
8         Portuguese          Indo-European      Latin               175
9         Malay / Indonesian  Malayo-Polynesian  Latin               140
10        Japanese            Altaic             Chinese / Japanese  125
11        German              Indo-European      Latin               120
12        French              Indo-European      Latin               115
13        Urdu                Indo-European      Nastaliq            88
14        Punjabi             Indo-European      Gurmukhi            75
15        Korean              Altaic             Hangul              68
16        Telugu              Dravidian          Telugu              64
17        Italian             Indo-European      Latin               63
18        Tamil               Dravidian          Tamil               62
19        Marathi             Indo-European      Devanagari          61
20        Cantonese           Sino-Tibetan       Chinese             60
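
The table can also be read by family rather than by language. The sketch below simply adds up the speaker figures from the table for each family. The names SPEAKERS and totals_by_family are invented for the illustration, and the totals are only as reliable as the 1997 figures they are built from.

```python
from collections import defaultdict

# (language, family, speakers in millions), copied from the table above.
SPEAKERS = [
    ("Mandarin", "Sino-Tibetan", 900), ("English", "Indo-European", 430),
    ("Hindi", "Indo-European", 320), ("Spanish", "Indo-European", 310),
    ("Russian", "Indo-European", 280), ("Arabic", "Afro-Asiatic", 185),
    ("Bengali", "Indo-European", 180), ("Portuguese", "Indo-European", 175),
    ("Malay / Indonesian", "Malayo-Polynesian", 140),
    ("Japanese", "Altaic", 125), ("German", "Indo-European", 120),
    ("French", "Indo-European", 115), ("Urdu", "Indo-European", 88),
    ("Punjabi", "Indo-European", 75), ("Korean", "Altaic", 68),
    ("Telugu", "Dravidian", 64), ("Italian", "Indo-European", 63),
    ("Tamil", "Dravidian", 62), ("Marathi", "Indo-European", 61),
    ("Cantonese", "Sino-Tibetan", 60),
]


def totals_by_family(rows):
    """Sum the speaker counts (in millions) for each language family."""
    totals = defaultdict(int)
    for _language, family, millions in rows:
        totals[family] += millions
    return dict(totals)


totals = totals_by_family(SPEAKERS)
for family, millions in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{family}: {millions} million")
```

On these figures the twelve Indo-European languages in the list account for 2217 million speakers, more than half of the 3821 million covered by the table.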

© 1997 Kryss Katsiavriades

