Native Speakers: 6.86 million
Spoken Natively in: South Africa, Namibia
Official Language in: South Africa
Afrikaans is a West Germanic language, spoken natively in South Africa, Namibia and to a lesser extent in Botswana and Zimbabwe. It originates from 17th century Dutch dialects spoken by the mainly-Dutch settlers of what is now South Africa, where it began to develop independently. Hence, historically, it is a daughter language of Dutch, and was previously referred to as "Cape Dutch" (a term also used to refer collectively to the early Cape settlers) or 'kitchen Dutch' (a crude or derogatory term Afrikaans was called in its earlier days). Although Afrikaans adopted words from languages such as Malay, Portuguese, the Bantu languages, and the Khoisan languages, an estimated 90 to 95 percent of Afrikaans vocabulary is ultimately of Dutch origin. Therefore, differences with Dutch often lie in a more regular morphology, grammar, and spelling of Afrikaans. There is a large degree of mutual intelligibility between the two languages—especially in written form. With about 7 million native speakers in South Africa, or 13.5 percent of the population, it is the third most spoken mother tongue in the country. It has the widest geographical and racial distribution of all the official languages of South Africa, and is widely spoken and understood as a second or third language. It is the majority language of the western half of South Africa—the provinces of the Northern Cape and Western Cape—and the primary language of the coloured and white communities. In neighbouring Namibia, Afrikaans is widely spoken as a second language and used as lingua franca, while as a native language it is spoken in 11 percent of households, mainly concentrated in the capital Windhoek and the southern regions of Hardap and Karas. Estimates of the total number of Afrikaans-speakers range between 15 and 23 million.
Native Speakers: ca. 7.3 million (1989–2007)
Spoken Natively in: Primarily in Southeastern Europe and by the Albanian diaspora worldwide
Official Language in: Albania Kosovo and recognised as a minority language in: Macedonia, Italy, Montenegro, Serbia,Romania,Turkey
Albanian (gjuha shqipe, pronounced [ˈɟuha ˈʃcipɛ], or shqip Albanian pronunciation: [ʃcip]) is an Indo-European language spoken by approximately 7.3 million people all over the world, primarily in Albania and Kosovo but also in other areas of the Balkans in which there is an Albanian population, including western Republic of Macedonia, southern Montenegro, southern Serbia and Greece. Albanian is also spoken in centuries-old Albanian-based dialect speaking communities scattered in southern Greece, southern Italy, Sicily, and Ukraine. Additionally, speakers of Albanian can be found elsewhere throughout the latter two countries resulting from a modern diaspora, originating from the Balkans, that also includes Scandinavia, Switzerland, Germany, United Kingdom, Turkey, Australia, New Zealand, Netherlands, Singapore, Brazil, Canada, and the United States.
Native Speakers: 25 million
Spoken Natively in: Ethiopia
Official Language in: Ethiopia and the following specific regions: Addis Ababa City Council, Amhara Region, Benishangul-Gumuz Region, Dire Dawa Administrative council, Gambela Region, SNNPR
Amharic (Amharic: አማርኛ?, Amarəñña, IPA: [amarɨɲːa] is a Semitic language spoken in Ethiopia. It is the second most-spoken Semitic language in the world, after Arabic, and the official working language of the Federal Democratic Republic of Ethiopia. Thus, it has official status and is used nationwide. Amharic is also the official or working language of several of the states within the federal system. It has been the working language of government, the military, and of the Ethiopian Orthodox Tewahedo Church throughout medieval and modern times. Outside Ethiopia, Amharic is the language of some 2.7 million emigrants. It is written using Amharic Fidel, ፊደል, which grew out of the Ge'ez abugida—called, in Ethiopian Semitic languages, ፊደል fidel ("alphabet", "letter", or "character") and አቡጊዳ abugida (from the first four Ethiopic letters, which gave rise to the modern linguistic term abugida).
Native Speakers: 295 million
Spoken Natively in:
Official Language in: Algeria, Bahrain, Comoros, Chad, Djibouti, Egypt, Eritrea, Iraq, Israel, Jordan, Kuwait, Lebanon, Libya, Mauritania, Morocco, Oman, Palestine, Qatar, Saudi Arabia, Somalia, Sudan, Syria, Tunisia, United Arab Emirates, Western Sahara, Yemen, African Union, Arab League, OIC, United Nations
Arabic (العربية al-ʻarabīyah or عربي/عربى ʻarabī ) (About this sound [al ʕarabijja] (help•info) or (About this sound [ʕarabi] (help•info)) is a name applied to the descendants of the Classical Arabic language of the 6th century AD. This includes both the literary language and the spoken Arabic varieties.
The literary language is called Modern Standard Arabic or Literary Arabic. It is currently the only official form of Arabic, used in most written documents as well as in formal spoken occasions, such as lectures and news broadcasts.
This however, varies from one country to the other. In 1912, Moroccan Arabic was official in Morocco for some time, before Morocco joined the Arab League.
The spoken Arabic varieties are spoken in a wide arc of territory stretching across the Middle East and North Africa.
Arabic languages are Central Semitic languages, most closely related to Hebrew, Aramaic, Ugaritic and Phoenician. The standardized written Arabic is distinct from and more conservative than all of the spoken varieties, and the two exist in a state known as diglossia, used side-by-side for different societal functions.
Some of the spoken varieties are mutually unintelligible, both written and orally, and the varieties as a whole constitute a sociolinguistic language. This means that on purely linguistic grounds they would likely be considered to constitute more than one language, but are commonly grouped together as a single language for political and/or ethnic reasons, (look below). If considered multiple languages, it is unclear how many languages there would be, as the spoken varieties form a dialect chain with no clear boundaries. If Arabic is considered a single language, it may be spoken by as many as 280 million first language speakers, making it one of the half dozen most populous languages in the world. If considered separate languages, the most-spoken variety would most likely be Egyptian Arabic, with 54 million native speakers—still greater than any other Semitic language.
The modern written language (Modern Standard Arabic) is derived from the language of the Quran (known as Classical Arabic or Quranic Arabic). It is widely taught in schools, universities, and used to varying degrees in workplaces, government and the media. The two formal varieties are grouped together as Literary Arabic, which is the official language of 26 states and the liturgical language of Islam. Modern Standard Arabic largely follows the grammatical standards of Quranic Arabic and uses much of the same vocabulary. However, it has discarded some grammatical constructions and vocabulary that no longer have any counterpoint in the spoken varieties, and adopted certain new constructions and vocabulary from the spoken varieties. Much of the new vocabulary is used to denote concepts that have arisen in the post-Quranic era, especially in modern times.
Arabic is the only surviving member of the Old North Arabian dialect group[which?] attested in Pre-Islamic Arabic inscriptions dating back to the 4th century. Arabic is written with the Arabic alphabet, which is an abjad script, and is written from right-to-left. Although, the spoken varieties are sometimes written in ASCII Latin with no standardized forms.
Arabic has lent many words to other languages of the Islamic world, like Persian, Turkish, Bosnian, Kazakh, Bengali, Urdu, Hindi, Malay and Hausa. During the Middle Ages, Literary Arabic was a major vehicle of culture in Europe, especially in science, mathematics and philosophy. As a result, many European languages have also borrowed many words from it. Arabic influence, both in vocabulary and grammar, is seen in Romance languages, particularly Spanish, Portuguese, Catalan and Sicilian, owing to both the proximity of European and Arab civilizations and 700 years of Muslim (Moorish) rule in some parts of the Iberian Peninsula referred to as Al-Andalus.
Arabic has also borrowed words from many languages, including Hebrew, Greek, Persian and Syriac in early centuries, Turkish in medieval times and contemporary European languages in modern times, mostly from English and French.
Native Speakers: 6,700,000
Spoken Natively in: Armenia, Nagorno-Karabakh (not recognized internationally), Georgia
Official Language in: Armenia, Nagorno-Karabakh (not recognized internationally) Minority language: Iran, Cyprus, Poland, Romania, Russia, Georgia, Bulgaria
Armenian language (հայերէն in TAO or հայերեն in RAO, Armenian pronunciation: [hɑjɛˈɾɛn]—hayeren) is an Indo-European language spoken by the Armenian people. It is the official language of the Republic of Armenia as well as in the region of Nagorno-Karabakh. The language is also widely spoken by Armenian communities in the Armenian diaspora. It has its own script, the Armenian alphabet, and is of interest to linguists for its distinctive phonological developments within Indo-European. Linguists classify Armenian as an independent branch of the Indo-European language family. Armenian shares a number of major innovations with Greek, and some linguists group these two languages together with Phrygian and the Indo-Iranian family into a higher-level subgroup of Indo-European which is defined by such shared innovations as the augment. More recently, others have proposed a Balkan grouping including Greek, Armenian, Phrygian and Albanian. Armenian has a long literary history, with a fifth-century Bible translation as its oldest surviving text. Its vocabulary has been heavily influenced by Western Middle Iranian languages, particularly Parthian, and to a lesser extent by Greek, Latin, Old French, Persian, Arabic, Turkish, and other languages throughout its history. There are two standardized modern literary forms, Eastern Armenian and Western Armenian, with which most contemporary dialects are mutually intelligible. The divergent and almost extinct Lomavren language is a Romani-influenced dialect with an Armenian grammar and a largely Romani-derived vocabulary, including Romani numbers.
Native Speakers: 15 million (2007)
Spoken Natively in: India, Bhutan and Bangladesh
Official Language in: India (Assam)
Assamese or Asamiya (অসমীয়া Ôxômiya) (IPA: [ɔxɔmija]) is an Eastern Indo-Aryan language used mainly in the state of Assam in North-East India. It is the official language of Assam. It is also spoken in parts of Arunachal Pradesh and other northeast Indian states. Nagamese, an Assamese-based Creole language is widely used in Nagaland and parts of Assam. Small pockets of Assamese speakers can be found in Bhutan and Bangladesh. The easternmost of Indo-European languages; it is spoken by over 13 million native speakers.
Along with other Eastern Indo-Aryan languages, Assamese evolved circa 1000–1200 AD from the Magadhi Prakrit, which developed from a dialect or group of dialects that were close to, but different from, Vedic and Classical Sanskrit. Its sister languages include Bengali, Chittagonian, Sylheti (Cilôţi), Oriya, the Bihari languages. It is written with the Assamese script. Assamese is written from left to right and top to bottom, in the same manner as English. A large number of ligatures are possible since potentially all the consonants can combine with one another. Vowels can either be independent or dependent upon a consonant or a consonant cluster.
The English word "Assamese" is built on the same principle as "Sinhalese", "Japanese" etc. It is based on the name "Assam" by which the tract consisting of the Brahmaputra Valley was known. The people call their state Ôxôm and their language Ôxômiya.
Native Speakers: 1.8 million in Bolivia (1987) 660,000 in Peru (2000–2006)
Spoken Natively in: Bolivia, Peru, and Chile.
Official Language in: Bolivia, Peru
Aymara (Aymar aru) is an Aymaran language spoken by the Aymara people of the Andes. It is one of only a handful of Native American languages with over three million speakers. Aymara, along with Quechua and Spanish, is an official language of Bolivia. It is also spoken around the Lake Titicaca region of southern Peru and, to a much lesser extent, by some communities in northern Chile and in Northwest Argentina. Some linguists have claimed that Aymara is related to its more widely spoken neighbour, Quechua. This claim, however, is disputed — although there are indeed similarities such as the nearly identical phonologies, the majority position among linguists today is that these similarities are better explained as areal features resulting from prolonged interaction between the two languages, and that they are not demonstrably related. The Aymara language is an agglutinating and to a certain extent polysynthetic language, and has a subject–object–verb word order.
Native Speakers:23 million (2007)
Spoken Natively in: Iran, Azerbaijan, Turkey, Armenia, Georgia, Russia, Afghanistan, Iraq, Syria
Official Language in: Azerbaijan (North Azerbaijani) Russia - One of the official languages of Dagestan.
Azerbaijani or Azeri (Azərbaycanca, Azərbaycan dili) is a language belonging to the Turkic language family, spoken in southwestern Asia by the Azerbaijani people, primarily in Azerbaijan and northwestern Iran. Azerbaijani is member of the Oghuz branch of the Turkic languages and is closely related to Turkish, Qashqai, Turkmen and Crimean Tatar. Turkish and Azerbaijani are known to closely resemble each other, and the native speaker of one language is able to understand the other, though it is easier for a speaker of Azerbaijani to understand Turkish than the other way around.
Native Speakers:715,000 (2012)
Spoken Natively in: Spain, France
Official Language in: Basque Country Navarre
Basque (endonym: Euskara, IPA: [eus̺ˈkaɾa]) is the ancestral language of the Basque people, who inhabit the Basque Country, a region spanning an area in northeastern Spain and southwestern France. It is spoken by 27% of Basques in all territories (714,136 out of 2,648,998). Of these, 663,035 live in the Spanish part of the Basque country and the remaining 51,100 live in the French part. In academic discussions of the distribution of Basque in Spain and France, it is customary to refer to three ancient provinces in France and four Spanish provinces. Native speakers are concentrated in a contiguous area including parts of the Spanish Autonomous Communities of the Basque Country and Navarre and in the western half of the French Département of Pyrénées-Atlantiques. The Basque Autonomous Community is an administrative entity within the binational ethnographic Basque Country incorporating the traditional Spanish provinces of Biscay, Gipuzkoa, and Álava, which retain their existence as politico-administrative divisions.
These provinces and many areas of Navarre are heavily populated by ethnic Basques, but the Euskara language had, at least until the 1990s, all but disappeared from most of Álava, western parts of Biscay and central and southern areas of Navarre. In southwestern France, the ancient Basque-populated provinces were Labourd, Lower Navarre, and Soule. They and other regions were consolidated into a single département in 1790 under the name Basses-Pyrénées, a name which persisted until 1970. A standardized form of the Basque language, called Euskara Batua, was developed by the Basque Language Academy in the late 1960s. Euskara Batua was created so that Basque language could be used—and easily understood by all Basque speakers—in formal situations (education, mass media, literature), and this is its main use nowadays. The role of this standard Basque language depends on the linguistic educational model of each region and each school. In most areas of the Basque Country, the educational Model D, where all subjects are taught in Basque, except "Spanish language and literature" (which is taught in Spanish) is now the predominant model. In France, the Basque language school Seaska and the association for a bilingual (Basque and French) schooling Ikasbi meet a wide range of Basque language educational needs up to the Sixth Form, while often struggling to surmount financial and administrative constraints. Apart from this standardized version, there are five main Basque dialects: Bizkaian, Gipuzkoan, and Upper Navarrese in Spain, and Navarrese-Lapurdian and Zuberoan (in France). Although they take their names from the mentioned historic provinces, the dialect boundaries are not congruent with province boundaries.
Native Speakers:7.6 million (2007)
Spoken Natively in: Belarus, Poland, in 14 other countries
Official Language in: Belarus, Poland (in Gmina Orla, Gmina Narewka, Gmina Czyże, Gmina Hajnówka and town of Hajnówka)
Belarusian language (беларуская мова, BGN/PCGN: byelaruskaya mova, Scientific: bielaruskaja mova, łac.: biełaruskaja mova), sometimes referred to as White Ruthenian, is the language of the Belarusian people. It is an official language of Belarus, along with Russian, and is spoken abroad, chiefly in Russia, Ukraine, and Poland. Prior to Belarus gaining its independence from the Soviet Union in 1991, the language was known in English as Byelorussian or Belorussian, transliterating the Russian name, белорусский язык, or alternatively as White Ruthenian or White Russian. Following independence, it became known also as Belarusian. Belarusian is one of the East Slavic languages, and shares many grammatical and lexical features with other members of the group. To some extent, Russian, Ukrainian and Belarusian are mutually intelligible. Its predecessor stage is known as Old Belarusian (14th to 17th centuries), in turn descended from Old East Slavic (10th to 13th centuries). According to the 1999 Belarus Census, the Belarusian language is declared as a "language spoken at home" by about 3,686,000 Belarusian citizens (36.7% of the population) as of 1999. About 6,984,000 (85.6%) of Belarusians declared it their "mother tongue". Other sources put the "population of the language" as 6,715,000 in Belarus and 9,081,102 in all countries. According to a study done by the Belarusian government in 2009, 72% of Belarusians speak Russian at home, while Belarusian is used by only 11.9% of Belarusians. 29.4% of Belarusians can write, speak and read Belarusian, while only 52.5% can read and speak it. According to the research, one out of ten Belarusians does not understand Belarusian.
Native Speakers:205 million (2010)
Spoken Natively in:Bangladesh, India (mainly in West Bengal); significant communities in United Kingdom, United States, Pakistan, Saudi Arabia, Malaysia, Sierra Leone, Singapore, United Arab Emirates, Australia, Burma, Canada
Official Language in: Bangladesh, India (West Bengal, Tripura and Barak Valley) (comprising districts of south Assam- Cachar, Karimganj and Hailakandi)
Bengali (Bengali: বাংলা Bangla [ˈbaŋla]) is an eastern Indo-Aryan language. It is native to the region of eastern South Asia known as Bengal, which comprises present day Bangladesh, the Indian state of West Bengal, and parts of the Indian states of Tripura and Assam. It is written using the Bengali script. With about 193 million native and about 230 million total speakers, Bengali is one of the most spoken languages (ranked sixth) in the world. The National song and the national anthem of India, and the national anthem of Bangladesh were composed in Bengali. Along with other Eastern Indo-Aryan languages, Bengali evolved circa 1000–1200 AD from the Magadhi Prakrit, which developed from a dialect or group of dialects that were close to, but different from, Vedic and Classical Sanskrit. It is now the primary language spoken in Bangladesh and is the second most commonly spoken language in India. With a rich literary tradition arising from the Bengali Renaissance, Bengali binds together a culturally diverse region and is an important contributor to Bengali nationalism. In former East Bengal (today Bangladesh), the strong linguistic consciousness led to the Bengali Language Movement, during which on 21 February 1952, several people were killed during protests to gain its recognition as a state language of the then Dominion of Pakistan and to maintain its writing in the Bengali script. The day has since been observed as Language Movement Day in Bangladesh, and was proclaimed the International Mother Language Day by UNESCO on 17 November 1999.
Native Speakers:6,000 (2001) 200,000 L2
Spoken Natively in:
Official Language in: Vanuatu
Bislama is a creole language, one of the official languages of Vanuatu. It is the first language of many of the "Urban ni-Vanuatu" (those who live in Port Vila and Luganville), and the second language of much of the rest of the country's residents. "Yumi, Yumi, Yumi", the Vanuatu national anthem, is in Bislama.
More than 95% of Bislama words are of English origin; the remainder combines a few dozen words from French, as well as some vocabulary inherited from various languages of Vanuatu, essentially limited to flora and fauna terminology. While the influence of these vernacular languages is low on the vocabulary side, it is very high in the morphosyntax. Bislama can be basically described as a language with an English vocabulary and an Oceanic grammar.
Native Speakers:2.5 - 3.5 million (2008)
Spoken Natively in: Bosnia, Serbia, Montenegro, Kosovo, Croatia
Official Language in: Boznia and Herzegovina, Montenegro Recognize minority language: Albania, Serbia, Macedonia, Kosovo
Bosnian (bosanski / босански [bɔ̌sanskiː]) is a standardized register of the Serbo-Croatian language, a South Slavic language, spoken by Bosniaks. As a standardized form of the Shtokavian dialect, it is one of the three official languages of Bosnia and Herzegovina. The same subdialect of Shtokavian is also the basis of standard Croatian and Serbian, as well as Montenegrin, so all are mutually intelligible. Until the dissolution of SFR Yugoslavia, they were treated as a unitary Serbo-Croatian language, and that term is still used in English to subsume the common base (vocabulary, grammar and syntax) of what are today officially four national standards, although the term is no longer used by native speakers. The Bosnian standard uses both Latin and Cyrillic alphabets. Bosnian is notable amongst the varieties of Serbo-Croatian for having an eclectic assortment of Arabic, Turkish and Persian loanwords, largely due to the language's interaction with those cultures through Islamic ties. This is historically corroborated by the introduction and use of Arebica as a successor script for the Bosnian language, replacing Bosančica upon the introduction of Islam; first amongst the elite, then amongst the public. The Bosnian language also contains a number of Germanisms not often heard in the Croatian or Serbian languages that have been in use since the Austro-Hungarian Empire. The first official dictionary in the Bosnian language was printed in the early 1630s, while the first dictionary in Serbian was printed only in the mid-19th century. Written evidence and records point to the Bosnian language being the official language of the country since at least the Kingdom of Bosnia, as further corroborated by the declaration of the Charter of Ban Kulin, one of the oldest written state documents in the Balkans and one of the oldest to be written in Bosančica.
Native Speakers:9.1 million (1986)
Spoken Natively in: Bulgaria, Turkey, Serbia, Greece, Ukraine, Moldova, Romania, Albania, Kosovo, Republic of Macedonia and among emigrant communities worldwide
Official Language in: Bulgaria European Union Mount Athos
Bulgarian (български език, pronounced: [ˈbɤ̞ɫɡɐrski ɛˈzik]) is an Indo-European language, a member of the Southern branch of the Slavic language family. Bulgarian, along with the closely related Macedonian language (collectively forming the East South Slavic languages), has several characteristics that set it apart from all other Slavic languages: changes include the elimination of case declension, the development of a suffixed definite article (see Balkan language area) and the lack of a verb infinitive; but it retains and has further developed the Proto-Slavic verb system. Various evidential verb forms exist to express unwitnessed, retold, and doubtful action. Estimates of the number of people around the world who speak Bulgarian fluently range from about 9 million to 12 million.
Native Speakers:33 million (2007) Second language: 10 million
Spoken Natively in:
Official Language in: Myanmar
Burmese language (Burmese: မြန်မာဘာသာ; pronounced: [mjəmà bàðà]; MLCTS: myanma bhasa) is the official language of Burma. Burmese is the native language of the Bamar and related sub-ethnic groups of the Bamar, as well as that of some ethnic minorities in Burma like the Mon. Burmese is spoken by 32 million as a first language and as a second language by 10 million, particularly ethnic minorities in Burma and those in neighboring countries. (Although the constitution officially recognizes the English name of the language as the Myanmar language, most English speakers continue to refer to the language as Burmese.)Burmese is a tonal, pitch-register, and syllable-timed language, largely monosyllabic and analytic language, with a subject–object–verb word order. It is a member of the Tibeto-Burman language family, which is a subfamily of the Sino-Tibetan family of languages. The language uses the Burmese script, derived from the Old Mon script and ultimately from the Brāhmī script.
Native Speakers:11.5 million (2006)
Spoken Natively in: Andorra, France, Italy, Spain
Official Language in: Andorra Spain: Catalonia, Valencian Community, Balearic Islands. Italy: Alghero (Sardinia) Latin Union
Catalan (kætəˈlæn/, /ˈkætəlæn/, or /ˈkætələn/; autonym: català [kətəˈɫa] or [kataˈɫa]) is a Romance language named for its origins in the historical region of Catalonia in the northeastern part of the Iberian Peninsula and adjoining parts of what is now France. It is the national and only official language of Andorra, a European microstate, and a co-official language of the Spanish autonomous communities of Catalonia, the Balearic Islands, and the Valencian Community, where it is known as Valencian. It also has semi-official status in the city of Alghero (where the Algherese dialect is spoken) on the Italian island of Sardinia. It is also spoken with no official recognition in the autonomous communities of Aragon (in La Franja) and Murcia (in Carche) in Spain, and in the historic Roussillon region of southern France, roughly equivalent to the current French department of Pyrénées-Orientales (Northern Catalonia). Although recognized as a regional language of the Pyrénées-Orientales department since 2007, Catalan has no official recognition in France, as French is the only official language of that country, according to the French Constitution of 1958.
Spoken Natively in: China, overseas communities
Official Language in: Hong Kong Macau (de facto official spoken form of Chinese in the Hong Kong government)
Chinese Cantonese, or Standard Cantonese, is a language that originated in the vicinity of Canton (i.e., Guangzhou) in southern China, and is often regarded as the prestige dialect of Yue Chinese. Inside mainland China, it is a lingua franca in Guangdong Province and some neighbouring areas, such as the eastern part of Guangxi Province. Outside mainland China, it is spoken by the majority population of Hong Kong and Macau in everyday life. It is also spoken by overseas Chinese communities in Southeast Asia (like Malaysia, Christmas Island), Canada, Brazil, Peru, Cuba, Panama, Australia, New Zealand, Europe, and the United States where it is the third most common language in the country. While the term Cantonese refers narrowly to the prestige dialect described in this article, it is often used in a broader sense for the entire Yue branch of Chinese, including related dialects such as Taishanese.
The Cantonese language is also viewed as part of the cultural identity for the native speakers across large swathes of southern China, Hong Kong and Macau. Although Cantonese shares much vocabulary with Mandarin Chinese, the two languages are not mutually intelligible largely because of pronunciation and grammatical differences. Sentence structure, in particular the placement of the verb, sometimes differs between the two languages. The use of vocabulary in Cantonese also tends to have more historic roots. The most notable difference between Cantonese and Mandarin is how the spoken word is written; with Mandarin the spoken word is written as such, where with Cantonese there may not be a direct written word matching what was said. This results in the situation in which a Mandarin and Cantonese text almost look the same, but both are pronounced differently.
Native Speakers:955 million (2010)
Spoken Natively in:
Official Language in:
Chinese Mandarin, In Chinese linguistics, Mandarin (simplified Chinese: 官话; traditional Chinese: 官話; pinyin: Guānhuà; literally "speech of officials") is a group of related varieties or dialects spoken across most of northern and southwestern China. Because most Mandarin dialects are found in the north, the group is also referred to, particularly among Chinese speakers, as the "northern dialect(s)" (simplified Chinese: 北方话; traditional Chinese: 北方 話; pinyin: Běifānghuà). When the Mandarin group is taken as one language, as is often done in academic literature, it has more native speakers (nearly a billion) than any other language. A northeastern-dialect speaker and a southwestern-dialect speaker can hardly communicate except through the standard language, mainly because of the differences in tone. Nonetheless, the variation within Mandarin is less significant than the much greater variation found within several other varieties of Chinese; this is thought to be due to a relatively recent spread of Mandarin across China, combined with a greater ease of travel and communication compared to the more mountainous south of China. For most of Chinese history, the capital has been within the Mandarin area, making these dialects very influential. Since the 14th century, some form of Mandarin has served as a national lingua franca. In the early 20th century, a standard form based on the Beijing dialect, with elements from other Mandarin dialects, was adopted as the national language. Standard Chinese, which is also referred to as "Mandarin", Pǔtōnghuà (simplified Chinese: 普通话; traditional Chinese: 普通話; literally "common speech") or Guóyǔ (Chinese: 國語; literally "national language"), is the official language of the People's Republic of China and the Republic of China, and one of the four official languages of Singapore. It is also one of the most frequently used varieties of Chinese among Chinese diaspora communities internationally.
Native Speakers:5.55 million (2001)
Spoken Natively in: Croatia, Bosnia and Herzegovina, Serbia (Vojvodina), Montenegro, Romania (Caraş-Severin County), Slovenia, and diaspora
Official Language in: Croatia Bosnia and Herzegovina
Croatian (hrvatski jezik) is the Serbo-Croatian language as spoken by Croats, principally in Croatia, Bosnia and Herzegovina, the Serbian province of Vojvodina and other neighbouring countries. Standard Croatian is based on the most widespread dialect, Shtokavian (Štokavian), more specifically on Eastern Herzegovinian, which is also the basis of standard Serbian, Bosnian, and Montenegrin. The other dialects spoken by Croats are Chakavian (Čakavian), Kajkavian, and Torlakian (by the Krashovani). These four dialects, and the four national standards, are usually subsumed under the term "Serbo-Croatian" in English, though this term is controversial for native speakers and paraphrases such as "Bosnian-Croatian-Montenegrin-Serbian" are therefore sometimes used instead, especially in diplomatic circles. Vernacular texts in the Chakavian dialect first appeared in the 13th century, and Shtokavian texts appeared a century later. Standardization began in the period sometimes called "Baroque Slavism" in the first half of the 17th century, while some authors date it back to the end of 15th century. The modern Neo-Shtokavian standard that appeared in the mid 18th century was the first unified Croatian literary language. Croatian is written in Gaj's Latin alphabet.
Native Speakers:10 million (2007)
Spoken Natively in: Czech Republic Vojvodina, Serbia Banat, Romania
Official Language in: Czech Republic European Union Slovakia (partially)
Czech (ˈtʃɛk/; čeština Czech pronunciation: [ˈt͡ʃɛʃcɪna]) is a West Slavic language with about 12 million native speakers; it is the majority language in the Czech Republic and spoken by Czechs worldwide. The language was known as Bohemian in English until the late 19th century. Czech is similar to and mutually intelligible with Slovak and, to a lesser extent, with Polish and Sorbian.
Native Speakers:5.6 million (2007)
Spoken Natively in: Denmark, Faroe Islands, Greenland, Schleswig-Holstein (Germany)
Official Language in: Denmark Faroe Islands, European Union, Nordic Council Minority language: Germany Greenland Iceland
Danish (dansk, pronounced [d̥anˀsɡ̊] ( listen); dansk sprog, [ˈd̥anˀsɡ̊ ˈsb̥ʁɔʊ̯ˀ]) is a North Germanic language spoken by around six million people, principally in the country of Denmark. It is also spoken by 50,000 Germans of Danish ethnicity in the northern parts of Schleswig-Holstein, Germany, where it holds minority language status. Danish is a mandatory subject in school in the Danish crown territories of the Faroe Islands (where it is also an official language after Faroese) and Greenland (where, however, the only official language since 2009 is Kalaallisut and the language is now spoken as lingua franca), as well as the former crown holding of Iceland. There are also Danish language communities in Argentina, the United States and Canada. Danish is mutually intelligible with Norwegian and Swedish .
Native Speakers:371,000 (2007)
Spoken Natively in: The Maldives, India (Minicoy Island)
Official Language in: Maldives
Dhivehi or Maldivian (Maldivian: ދިވެހި Divehi) is an Indo-Aryan language predominantly spoken by about 350,000 people in the Maldives where it is the national language. It is also the first language of nearly 10,000 people in the island of Minicoy in the Union territory of Lakshadweep, India where the Mahl dialect of the Maldivian language is spoken.The major dialects of Maldivian are Malé, Huvadhu, Mulaku, Addu, Haddhunmathee and Maliku. The standard form of Maldivian is Malé, which is spoken in the Maldivian capital of the same name. The Maliku dialect spoken in Minicoy is officially referred as Mahl by the Lakshadweep administration. This has been adopted by many authors when referring to Maldivian spoken in Minicoy. Maldivian is closely related to the Sinhala language. Many languages have influenced the development of the Maldivian language through the ages, most importantly Arabic. Others include French, Persian, Portuguese, Urdu and English. The English words atoll (a ring of coral islands or reefs) and doni (a vessel for inter-atoll navigation) are anglicized forms of the Maldivian words Atoḷu and Dōni.
Native Speakers:23.5 million (2006) Total: 28 million(not including speakers of closely related Afrikaans)
Spoken Natively in: mainly the Netherlands, Belgium, and Suriname, also in Aruba, Curaçao, and Sint Maarten, as well as Australia, Canada, France (French Flanders), Germany, Indonesia, South Africa, United States.
Official Language in: Aruba, Belgium, Curaçao, Netherlands, Sint Maarten, Suriname, Benelux , European Union, Union of South American Nations
Dutch is a West Germanic language and the native language of most of the population of the Netherlands, and about sixty percent of the populations of Belgium and Suriname, the three member states of the Dutch Language Union. Most speakers live in the European Union, where it is a first language for about 23 million and a second language for another 5 million people. It also holds official status in the Caribbean island nations of Aruba, Curaçao, and Sint Maarten, while historical minorities remain in parts of France and Germany, and to a lesser extent, in Indonesia, and up to half a million native Dutch speakers may be living in the United States, Canada, and Australia. The Cape Dutch dialects of Southern Africa have been standardised into Afrikaans, a mutually intelligible daughter language of Dutch[n 3] which today is spoken to some degree by an estimated total of 15 to 23 million people in South Africa and Namibia. Dutch is closely related to German and English and is said to be between them. Apart from not having undergone the High German consonant shift, Dutch—like English—has mostly abandoned the grammatical case system, is relatively unaffected by the Germanic umlaut, and has levelled much of its morphology. Dutch historically has three grammatical genders, but this distinction has far fewer grammatical consequences than in German. Dutch shares with German the use of subject–verb–object word order in main clauses and subject–object–verb in subordinate clauses. Dutch vocabulary is mostly Germanic and contains the same Germanic core as German and English, while incorporating more Romance loans than German and fewer than English.
Native Speakers:170,000 (2006) Second Language: 470,000
Spoken Natively in:
Official Language in: Bhutan
Bhutan (རྫོང་ཁ་; Wylie: rdzong-kha, Jong-kă), occasionally Ngalopkha, is the national language of Bhutan. The word "dzongkha" means the language (kha) spoken in the dzong, – dzong being the fortress-like monasteries established throughout Bhutan by Shabdrung Ngawang Namgyal in the 17th century. "Bhutani" is not another name for Dzongkha, but the name of a Balochi language. The two are sometimes confused, even in some published ISO 639 codelists.
Native Speakers:360 million (2010) L2: 375 million and 750 million EFL
Spoken Natively in:
Official Language in:54 countries 27 non-sovereign entities United Nations, European Union Commonwealth of Nations Council of Europe, IOC NATO, NAFTA, OAS, OECD, OIC, PIF, UKUSA Agreement
English is a West Germanic language that was first spoken in early medieval England and is now the most widely used language in the world. It is spoken as a first language by the majority populations of several sovereign states, including the United Kingdom, the United States, Canada, Australia, Ireland, New Zealand and a number of Caribbean nations. It is the third-most-common native language in the world, after Mandarin Chinese and Spanish. It is widely learned as a second language and is an official language of the European Union, many Commonwealth countries and the United Nations, as well as in many world organisations. English arose in the Anglo-Saxon kingdoms of England and what is now southeast Scotland. Following the extensive influence of Great Britain and the United Kingdom from the 17th century to the mid-20th century, through the British Empire, and also of the United States since the mid-20th century, it has been widely propagated around the world, becoming the leading language of international discourse and the lingua franca in many regions. Historically, English originated from the fusion of closely related dialects, now collectively termed Old English, which were brought to the eastern coast of Great Britain by Germanic settlers (Anglo-Saxons) by the 5th century – with the word English being derived from the name of the Angles, and ultimately from their ancestral region of Angeln (in what is now Schleswig-Holstein). A significant number of English words are constructed based on roots from Latin, because Latin in some form was the lingua franca of the Christian Church and of European intellectual life. The language was further influenced by the Old Norse language because of Viking invasions in the 8th and 9th centuries. The Norman conquest of England in the 11th century gave rise to heavy borrowings from Norman-French, and vocabulary and spelling conventions began to give the appearance of a close relationship with Romance languages to what had then become Middle English. The Great Vowel Shift that began in the south of England in the 15th century is one of the historical events that mark the emergence of Modern English from Middle English. Owing to the assimilation of words from many other languages throughout history, modern English contains a very large vocabulary, with complex and irregular spelling, particularly of vowels. Modern English has not only assimilated words from other European languages, but from all over the world. The Oxford English Dictionary lists over 250,000 distinct words, not including many technical, scientific, and slang terms.
Native Speakers:1.05 million (1989)
Spoken Natively in: Estonia
Official Language in: Estonia European Union
Estonian (eesti keel; pronounced [ˈeːsti ˈkeːl]) is the official language of Estonia, spoken natively by about 1.1 million people in Estonia and tens of thousands in various migrant communities. It belongs to the Finnic branch of the Uralic language family. One distinctive feature that has caused a great amount of interest among linguists is what is traditionally seen as three degrees of phoneme length: short, long, and "overlong", such that /toto/, /toˑto/ and /toːto/ are distinct. In actuality, the distinction is not purely in the phoneme length, and the underlying phonological mechanism is still disputed.
Native Speakers:340,000 (1996 census) 320,000 second-language users (1991)
Spoken Natively in: Fiji
Official Language in: Fiji
Fijian is an Austronesian language of the Malayo-Polynesian family spoken in Fiji. It has 450,000 first-language speakers, which is more than half the population of Fiji, but another 200,000 speak it as a second language. The 1997 Constitution established Fijian as an official language of Fiji, along with English and Hindustani, and there is discussion about establishing it as the "national language", though English and Hindustani would remain official. Fijian is a VOS language. It has prepositions. Standard Fijian is based on the language of Bau, which is an East Fijian language.
Native Speakers:28 million (2007) 96% of the Philippines can speak Tagalog (2000)
Spoken Natively in: Philippines
Official Language in: Philippines
Filipino is a prestige register of the Tagalog language, and the name under which Tagalog is designated the national language of the Philippines, as well as an official language alongside English. Tagalog is a first language of about one-third of the Philippine population; it is centred around Manila but is spoken to varying degrees nationwide.
Native Speakers: c. 5 million (2011)
Spoken Natively in: Finland, Estonia, Ingria, Karelia, Norway, Sweden
Official Language in: Finland, European Union, recognised as minority language in: Sweden, Russian Federation:Republic of Karelia
Finnish (or suomen kieli) is the language spoken by the majority of the population in Finland (92% as of 2006) and by ethnic Finns outside Finland. It is one of the two official languages of Finland and an official minority language in Sweden. In Sweden, both standard Finnish and Meänkieli, a Finnish dialect, are spoken. The Kven language, a Finnish dialect, is spoken in Northern Norway. Finnish is the eponymous member of the Finnic language family and is typologically between fusional and agglutinative languages. It modifies and inflects nouns, adjectives, pronouns, numerals and verbs, depending on their roles in the sentence.
Native Speakers:74 million (2007) 220 million L1 and L2 speakers (2010)
Spoken Natively in:Europe: Legal Status in France, Switzerland, Belgium, Monaco and Andorra, Luxembourg, Italy, The United Kingdom and Channel Islands, North and South America: Canada, Haiti, French overseas departments and territories in the Americas, United States, Brazil, Africa: Algeria, Egypt, French overseas departments and territories in the Africa, Asia: Southeast Asia, Lebanon, Syria and Israel, Indiana, Oceania and Australasia
Official Language in: Belgium, Benin, Burkina Faso, Burundi, Cameroon, Canada, Central African Republic, Chad, Comoros, Côte d'Ivoire Democratic Republic of the Congo, Djibouti, Equatorial Guinea , France, Gabon, Guinea, Haiti, Luxembourg, Madagascar, Mali, Monaco, Niger, Republic of the Congo, Rwanda, Senegal, Seychelles, Switzerland, Togo, Vanuatu Administrative/cultural: Algeria, Lebanon, Morocco, Mauritius, Mauritania, Tunisia 14 dependent entities: Aosta Valley, France Clipperton, French Southern and Antarctic Lands, French Polynesia, Guernsey, Jersey, Louisiana USA, Mayotte, New Caledonia, India Pondicherry, Saint-Barthélemy, Saint-Martin, Saint-Pierre and Miquelon, Wallis and Futuna
French (le français [lə fʁɑ̃sɛ] ( listen) or la langue française [la lɑ̃ɡ fʁɑ̃sɛz]) is a Romance language spoken as a first language in France, the Romandy region in Switzerland, Wallonia and Brussels in Belgium, Monaco, the province of Quebec, many African countries, and the Acadia region in Canada, the north of the U.S. state of Maine, the Acadiana region of the U.S. state of Louisiana, and by various communities elsewhere. Other speakers of French, who often speak it as a second language, are distributed throughout many parts of the world, the largest numbers of whom reside in Francophone Africa. In Africa, French is most commonly spoken in Gabon (where 80% report fluency) Mauritius (78%), Algeria (75%), Senegal and Côte d'Ivoire (70%). French is estimated as having 110 million native speakers and 190 million more second language speakers. French is a descendant of the spoken Latin language of the Roman Empire, as are languages such as Italian, Portuguese, Spanish; Romanian, Lombard, Catalan, Sicilian and Sardinian. Its closest relatives are the other langues d'oïl—languages historically spoken in northern France and Belgium, which French has largely supplanted. French was also influenced by native Celtic languages of Roman Gaul, and by the (Germanic) Frankish language of the post-Roman Frankish invaders. Today, owing to France's past overseas expansion, there are numerous French-based creole languages, most notably Haitian.
It is an official language in 29 countries, most of which form la francophonie (in French), the community of French-speaking countries. It is an official language of all United Nations agencies and a large number of international organizations. According to the European Union, 129 million, or twenty-six percent of the Union's total population, can speak French, of whom 72 million are native speakers (65 million in France, 4.5 million in Belgium, plus 2.5 million in Switzerland, which is not part of the EU) and 69 million are second-language or foreign language speakers, thus making French the third language in the European Union that people state they are most able to speak, after English and German. Twenty percent of non-Francophone Europeans know how to speak French, totaling roughly 145.6 million people in Europe alone. As a result of extensive colonial ambitions of France and Belgium (at that time governed by a French-speaking elite), between the 17th and 20th centuries, French was introduced to the Americas, Africa, Polynesia, the Levant, Southeast Asia, and the Caribbean. According to a demographic projection led by the Université Laval and the Réseau Démographie de l'Agence universitaire de la francophonie, French speakers will number approximately 500 million people in 2025 and 650 million people, or approximately 7% of the world's population by 2050.
Native Speakers: 467,000 (2001)
Spoken Natively in: Netherlands
Official Language in: Province of Friesland
Frisian (West) (Frysk, Dutch: Westerlauwers Fries [ˈʋɛstərˌlʌu̯ərs ˈfris]) is a language spoken mostly in the province of Friesland (Fryslân) in the north of the Netherlands. West Frisian is the name by which this language is usually known outside the Netherlands, to distinguish it from the closely related Frisian languages of Saterland Frisian and North Frisian, which are spoken in Germany. Within the Netherlands however, the West Frisian language is the language of the province of Friesland and is almost always called simply "Frisian": Fries in Dutch, and Frysk in Frisian; Westfries (literally: West Frisian) is the Dutch name of the West Frisian dialect of the Dutch language, spoken in West Friesland, a region in the province of North Holland. The 'official' name used by linguists in the Netherlands to indicate the West Frisian language is Westerlauwers Fries (West Lauwers Frisian), the Lauwers being a border stream which separates the Dutch provinces of Friesland and Groningen.
Native Speakers: 160,000 (2000)
Spoken Natively in: Moldova, Ukraine, Russia, Turkey
Official Language in: Gagauzia
Gagauz language (Gagauz dili) is a Turkic language, spoken mainly by the Gagauz people and the official language of the autonomous Moldovan region of Gagauzia. Gagauz has two dialects: Bulgar Gagauzi and Maritime Gagauzi. Gagauz is classified as a distinct language from Balkan Gagauz Turkish.
Native Speakers:3.2 million (1986)
Spoken Natively in:
Official Language in: Galicia
Galician /ɡəˈlɪʃən/ (galego IPA: [ɡaˈleɣo]) is a language of the Western Ibero-Romance branch. It is spoken by some 3 million people, mainly in Galicia, an autonomous community located in northwestern Spain, where it is official along with Spanish language. The language is also spoken in some border zones of the neighbouring Spanish regions of Asturias and Castile and León, as well as by Galician migrant communities in the rest of Spain, in Latin America, the United States, and elsewhere in Europe.Galician is part of the same family of languages as the Portuguese language, and both share the common origins. Galician-Portuguese lyric (13th-14th centuries) was among the most remarkable universal literature produced in the Middle Ages. Galician language was the first iberian tonge to be consolidated from latin in the Medieval Ages in Galicia. The language spread southwards with the conquests of the Galician warlords since the VIII century, who became counts and later kings of Portugal. The standards of portuguese and galician dialects started to slowly slip away since the independence of Portugal in the 12th century. The lexicon of the language is predominantly of Latin extraction, although it also contains an important number of words of Germanic and Celtic origin, among other substrates and adstrates, having also received, mainly through Spanish and Portuguese, a sizeable number of nouns from the Arabs who in the Middle Ages governed southern Iberia. It is unofficially regulated in Galicia by the Royal Galician Academy, yet independent organisations such as the Galician Association of Language and the Galician Academy of the Portuguese Language include galician as part of the Galician-Portuguese language.
Native Speakers:6–7 million (1998)
Spoken Natively in: Georgia (Including Abkhazia and South Ossetia)Russia, United States, Israel, Ukraine, Turkey, Iran, Azerbaijan
Official Language in: Georgia
Georgian (ქართული ენა, pronounced [kʰartʰuli ɛna]) is the native language of the Georgians and the official language of Georgia, a country in the Caucasus. Georgian is the primary language of about 4 million people in Georgia itself, and of another 500,000 abroad. It is the literary language for all regional subgroups of the Georgian ethnos, including those who speak other Kartvelian (South Caucasian) languages: Svans, Mingrelians, and the Laz. Judaeo-Georgian is spoken by an additional 20,000 in Georgia and 65,000 elsewhere (primarily 60,000 in Israel).
Native Speakers: Standard German: 90 million (2010) all German: 120 million (1990–2005) L2 speakers: 80 million (2006)
Spoken Natively in: Primarily in German-speaking Europe, as a minority language and amongst the German diaspora worldwide
Official Language in: European Union (official and working language) Germany, Austria, Switzerland, South Tyrol (Italy), Liechtenstein, Luxembourg, Belgium (German-speaking Community of Belgium)
German (Deutsch [ˈdɔʏtʃ] ( listen)) is a West Germanic language related to and classified alongside English and Dutch. With an estimated 90 – 98 million native speakers, German is one of the world's major languages and is the most widely-spoken first language in the European Union. Most German vocabulary is derived from the Germanic branch of the Indo-European language family. A number of words are derived from Latin and Greek, and fewer from French and English. German is written using the Latin alphabet. In addition to the 26 standard letters, German has three vowels with umlauts (Ä/ä, Ö/ö, and Ü/ü) and the letter ß.
Native Speakers:12 million (2007)
Spoken Natively in: Greece, Cyprus, Italy, Turkey, Abkhazia, Albania, Egypt, Romania, France, Ukraine and in the Greek diaspora
Official Language in: Greece , Cyprus European Union
Greek (ελληνικά IPA [eliniˈka] ellīnika or ελληνική γλώσσα IPA [eliniˈci ˈɣlosa]) ellīnikī glōssa is an independent branch of the Indo-European family of languages. Native to the southern Balkans, Western Asia Minor and the Aegean, it has the longest documented history of any Indo-European language, spanning 34 centuries of written records. Its writing system has been the Greek alphabet for the majority of its history; other systems, such as Linear B and the Cypriot syllabary, were previously used. The alphabet arose from the Phoenician script, and was in turn the basis of the Latin, Cyrillic, Coptic, and many other writing systems. The Greek language holds an important place in the histories of Europe, the more loosely defined Western world, and Christianity; the canon of ancient Greek literature includes works of monumental importance and influence for the future Western canon, such as the epic poems Iliad and Odyssey. Greek was also the language in which many of the foundational texts of Western philosophy, such as the Platonic dialogues and the works of Aristotle, were composed; the New Testament of the Christian Bible was written in Koiné Greek. Together with the Latin texts and traditions of the Roman world, the study of the Greek texts and society of antiquity constitutes the discipline of classics. Greek was a widely spoken lingua franca in the Mediterranean world and beyond during classical antiquity, and would eventually become the official parlance of the Byzantine Empire. In its modern form, it is the official language of Greece and Cyprus and one of the 23 official languages of the European Union. The language is spoken by at least 13 million people today in Greece, Cyprus, and diaspora communities in numerous parts of the world. Greek roots are often used to coin new words for other languages, especially in the sciences and medicine; Greek and Latin are the predominant sources of the international scientific vocabulary. Over fifty thousand English words are derived from the Greek language.
Native Speakers:4.85 million (1995)
Spoken Natively in: Argentina, Brazil, Paraguay, Bolivia
Official Language in: Paraguay, Corrientes (Argentina), Mercosur Bolivia
Guaraní, specifically the primary variety known as Paraguayan Guaraní (/ɡwɑrəˈniː/; endonym avañe'ẽ [aʋãɲẽˈʔẽ] 'Ava language'), is an indigenous language of South America that belongs to the Tupí–Guaraní subfamily of the Tupian languages. It is one of the official languages of Paraguay (along with Spanish), where it is spoken by the majority of the population, and where half of the rural population is monolingual. It is spoken by communities in neighbouring countries, including parts of northeastern Argentina, southeastern Bolivia and southwestern Brazil, and is a second official language of the Argentine province of Corrientes since 2004; it is also an official language of Mercosur. Guaraní is the only indigenous language of the Americas whose speakers include a large proportion of non-indigenous people. This is an anomaly in the Americas where language shift towards European colonial languages (in this case, the other official language of Spanish) has otherwise been a nearly universal cultural and identity marker of mestizos (people of mixed Spanish and Amerindian ancestry), and also of culturally assimilated, upwardly mobile Amerindian people. Jesuit priest Antonio Ruiz de Montoya, who in 1639 published a book called Tesoro de la lengua guaraní ("The Treasure of the Guaraní Language"), described it as a language "so copious and elegant that it can compete with the most famous [of languages]." The name "Guarani" is generally used for the official language of Paraguay. However, this is part of a dialect chain, most of whose components are also often called Guaraní. See Guaraní dialects.
Native Speakers:49 million (2007)
Spoken Natively in: India
Official Language in: Gujarat (India) Daman and Diu (India) Dadra and Nagar Haveli (India)
Gujarati (Gujarati: ગુજરાતી Gujarātī) is an Indo-Aryan language, and part of the greater Indo-European language family. It is derived from a language called Old Gujarati (1100–1500 AD) which is the ancestor language of the modern Gujarati and Rajasthani languages. It is native to the Indian state of Gujarat, where it is the chief language, and to the adjacent union territories of Daman and Diu and Dadra and Nagar Haveli. There are about 65.5 million speakers of Gujarati worldwide, making it the 26th most spoken native language in the world. Along with Romany and Sindhi, it is among the most western of Indo-Aryan languages. Gujarati was the first language of Mohandas Karamchand Gandhi, the "Father of the Nation of India", and Sardar Vallabhbhai Patel, the "Iron Man of India." Other prominent personalities whose first language is or was Gujarati include Swami Dayananda Saraswati, Morarji Desai, Narsinh Mehta, Dhirubhai Ambani, and J. R. D. Tata. & Muhammad Ali Jinnah the "Father of the Nation of Pakistan"
Native Speakers:9.6 million (2007)
Spoken Natively in: Haiti and Dominican Republic
Official Language in: Haiti
Haitian Creole (Kreyòl ayisyen; pronounced: [kɣejɔl ajisjɛ̃]), often called simply Creole or Kreyòl, is a language spoken in Haiti by about twelve million people, which includes all Haitians in Haiti and via emigration, by about two to three million speakers residing in the Bahamas, Belize, Canada, Cayman Islands, Cuba, Dominican Republic, France, French Guiana, Guadeloupe, Ivory Coast, Martinique, Puerto Rico, Trinidad and Tobago, the United States, and Venezuela. Haitian Creole is one of Haiti's two official languages, along with French. It is a creole based largely on 18th Century French, some African languages, as well as Arabic, Arawak, English, Spanish, and Taíno. Partly due to efforts of Félix Morisseau-Leroy, since 1961 Haitian Creole has been recognized as an official language along with French, which had been the sole literary language of the country since its independence in 1804. Its orthography was standardized in 1979. The official status was maintained under the country's 1987 constitution. The use of Haitian Creole in literature has been small but is increasing. Morisseau was one of the first and most influential authors to write in Haitian Creole. Since the 1980s, many educators, writers and activists have written literature in Haitian Creole. Today numerous newspapers, as well as radio and television programs, are produced in Haitian Creole. As required by the Joseph C. Bernard (Secrétaire d'État de l'éducation nationale) law of 18 September 1979, the Institut Pédagogique National established an official orthography for Kreyòl, and slight modifications were made over the next two decades. For example, the hyphen (-) is no longer used, nor is the apostrophe. The only accent accepted is the grave accent (à, è, or ò).
Native Speakers:5.3 million(1998)
Spoken Natively in: Israel, Jewish settlements in the West Bank; used globally as a liturgical language for Judaism
Official Language in: Israel
Hebrew (/ˈhiːbruː/) (עִבְרִית ʿIvrit, [ʔivˈʁit] or [ʕivˈɾit]) is a West Semitic language of the Afroasiatic language family. Historically, it is regarded as the language of the Israelites and their ancestors. Other Jewish languages had originated among diaspora Jews. The Hebrew language was also used by non-Jewish groups, such as the ethnically related Samaritans. Hebrew had ceased to be an everyday spoken language around 200 CE, and survived into the medieval period only as the language of Jewish liturgy and rabbinical literature. Then, in the 19th century it was revived as a spoken and literary language and, according to Ethnologue is now the language of 5.3 million people worldwide, mainly in Israel. Modern Hebrew is one of the two official languages of Israel (the other being Arabic), while Classical Hebrew is used for prayer or study in Jewish communities around the world. The earliest examples of written Hebrew date from the 10th century BCE to the late Second Temple period, after which the language developed into Mishnaic Hebrew. Ancient Hebrew is also the liturgical tongue of the Samaritans, while modern Hebrew or Arabic is their vernacular, though today only about 700 Samaritans remain. As a foreign language it is studied mostly by Jews and students of Judaism and Israel, archaeologists and linguists specializing in the Middle East and its civilizations, by theologians, and in Christian seminaries.
The core of the Torah (the first five books of the Hebrew Bible), and most of the rest of the Hebrew Bible, is written in Classical Hebrew, and much of its present form is specifically the dialect of Biblical Hebrew that scholars believe flourished around the 6th century BCE, around the time of the Babylonian exile. For this reason, Hebrew has been referred to by Jews as Leshon HaKodesh (לשון הקודש), "The Holy Language", since ancient times.
Native Speakers:180 million (1991) Total, including Urdu: 490 million
Spoken Natively in: India Significant communities in South Africa, Nepal
Official Language in: India
Hindi, or more precisely Modern Standard Hindi (also known as Manak Hindi, High Hindi, Nagari Hindi, and Literary Hindi), is a standardised and sanskritised register of the Hindi-Urdu language. It is the native language of people living in Delhi, Haryana, Western Uttar Pradesh, northeastern Madhya Pradesh, and parts of eastern Rajasthan, and is one of the official languages of the Republic of India. But many non-native speakers from other parts of India, too, understand it easily because it is close to their native languages that, just like Hindi, originated from Sanskrit. These languages have common roots and the native speakers of several regional Indian languages find it easier to understand the more Sanskritised form of Hindi. Colloquial Hindi is mutually intelligible with another register of Hindi-Urdu called (Modern Standard) Urdu. Mutual intelligibility decreases in literary and specialized contexts which rely on educated vocabulary. The number of native speakers of Standard Hindi is unclear. According to the 2001 Indian census, 258 million people in India reported their native language to be "Hindi". However, this includes large numbers of speakers of Hindi languages other than Standard Hindi; as of 2009, the best figure Ethnologue could find for Khariboli dialect (the basis of Hindustani) was a 1991 citation of 180 million. This places Hindi in a three-way tie with Bengali and Portuguese for the fifth-largest language in the world.
Native Speakers: Very few (1992) 120,000 L2 speakers (1989); use declining since 1965
Spoken Natively in:
Official Language in: Papua New Guinea
Hiri Motu, (also known as Police Motu or Pidgin Motu) is an official language of Papua New Guinea. It is a simplified version of Motu and although it is strictly neither a pidgin nor a creole it possesses some features of both language types. Phonological and grammatical differences mean not only that Hiri Motu speakers cannot understand Motu, but also that Motu speakers not exposed to Hiri Motu have similar difficulties, though the languages are lexically very similar, and retain a common Austronesian syntactical basis. Unlike Tok Pisin its use has been in decline for many years.
Native Speakers:14–15 million (2005)
Spoken Natively in: Hungary and areas of Austria, Croatia, Romania, Serbia, Slovakia, Slovenia, Ukraine
Official Language in: Hungary, European Union, Slovakia (regional language), Slovenia (regional language), Serbia (regional language), Austria (regional language), some official rights in Romania, Ukraine and Croatia
Hungarian (Hungarian: magyar nyelv ) is a Uralic language, one of the Ugric branch, spoken by the Hungarians. It is the most widely spoken non-Indo-European language in Europe, based on the number of native speakers. Hungarian is the official language of Hungary and is also spoken by Hungarian communities in the seven neighboring countries and by diaspora communities worldwide. The Hungarian name for the language is magyar [ˈmɒɟɒr], which is also occasionally used as an English word to refer to the Hungarian people as an ethnic group.
Native Speakers:320,000 (2011)
Spoken Natively in: Iceland, Denmark
Official Language in: Iceland
Icelandic ( íslenska ) is a North Germanic language, the main language of Iceland. Its closest relative is Faroese. Icelandic is an Indo-European language belonging to the North Germanic or Nordic branch of the Germanic languages. Historically, it was the westernmost of the Indo-European languages prior to the colonisation of the Americas. Icelandic, Faroese, Norn, and Norwegian formerly comprised West Nordic; Danish and Swedish comprised East Nordic. The Nordic languages are now divided into Insular Nordic and mainland Scandinavian languages. Norwegian is now grouped with Danish and Swedish because of its mutual intelligibility with those languages due to its heavy influence from them over the last millennium, particularly from Danish. Most Western European languages have greatly reduced levels of inflection, particularly noun declension. In contrast, Icelandic retains a four-case synthetic grammar comparable to, but considerably more conservative and synthetic than, German. It is inappropriate to compare the grammar of Icelandic to that of the more conservative Baltic, Slavic, and Indic languages of the Indo-European family, many of which retain six or more cases, except to note that Icelandic utilises a wide assortment of irregular declensions. Icelandic also possesses many instances of oblique cases without any governing word, as does Latin. For example, many of the various Latin ablatives have a corresponding Icelandic dative. However, despite its arguable baggage, the remarkable conservatism of the Icelandic language and its resultant near-isomorphism to Old Norse (which is equivalently termed Old Icelandic by linguists) means that, to their delight, modern Icelanders can easily read the Eddas, sagas, and other classic Old Norse literary works created in the tenth through thirteenth centuries. The vast majority of Icelandic speakers—about 320,000—live in Iceland. There are about 8,165 speakers of Icelandic living in Denmark, of whom approximately 3,000 are students. The language is also spoken by 5,112 people in the USA and by 2,170 in Canada (Notably in Gimli, Manitoba), indeed, the word 'Gimli' is itself the Icelandic for 'heaven'. 97% of the population of Iceland consider Icelandic their mother tongue, but in some communities outside Iceland the use of the language is declining. Icelandic speakers outside Iceland represent recent emigration in almost all cases except Gimli, which was settled from the 1880s onwards. The Icelandic constitution does not mention the language as the official language of the country. Though Iceland is a member of the Nordic Council, the Council uses only Danish, Norwegian and Swedish as its working languages. The council does, though, publish material in Icelandic. Under the Nordic Language Convention, since 1987, citizens of Iceland have the opportunity to use Icelandic when interacting with official bodies in other Nordic countries without being liable for any interpretation or translation costs. The Convention covers visits to hospitals, job centres, the police and social security offices; however, the Convention is not very well known and is mostly irrelevant as many Icelanders born after the 1940s have an excellent command of English. The countries have committed themselves to providing services in various languages, but citizens have no absolute rights except for criminal and court matters.
The state-funded Árni Magnússon Institute for Icelandic Studies serves as a centre for preserving the medieval Icelandic manuscripts and studying the language and its literature. The Icelandic Language Council, comprising representatives of universities, the arts, journalists, teachers, and the Ministry of Culture, Science and Education, advises the authorities on language policy. The Icelandic Language Fund supports activities intended to promote the Icelandic language. Since 1995, on November 16 each year, the birthday of 19th century poet Jónas Hallgrímsson is celebrated as Icelandic Language Day.
Native Speakers:23 million (2000) Over 140 million L2 speakers
Spoken Natively in: Indonesia East Timor (as a "working language")
Official Language in: Indonesia
Indonesian (Bahasa Indonesia) is the official language of Indonesia. It is a standardized register of Malay, an Austronesian language which has been used as a lingua franca in the Indonesian archipelago for centuries. Indonesia is the fourth most populous nation in the world. Of its large population, the number of people who speak Indonesian fluently is fast approaching 100%, making Indonesian, and thus Malay, one of the most widely spoken languages in the world.
Most Indonesians, aside from speaking the national language, are often fluent in another regional language (examples include Javanese, Sundanese and Madurese) which are commonly used at home and within the local community. Most formal education, as well as nearly all national media and other forms of communication, are conducted in Indonesian. In East Timor, which was an Indonesian province from 1975 to 1999, Indonesian is recognised by the constitution as one of the two working languages (the other being English), alongside the official languages of Tetum and Portuguese. The Indonesian name for the language is Bahasa Indonesia (literally "the language of Indonesia"). This term is occasionally found in English. Indonesian is sometimes called "Bahasa" by English speakers, though this literally just means "language".
Native Speakers:575 (2006)
Spoken Natively in: Canada (Nunavut and Northwest Territories)
Official Language in: Nunavut and Northwest Territories (Canada)
Inuinnaqtun (meaning Like the real human beings/peoples), is an indigenous Inuit language of Canada and a dialect of Inuvialuktun. It is related very closely to Inuktitut, and some scholars, such as Richard Condon, believe that Inuinnaqtun is more appropriately classified as a dialect of Inuktitut. The governments of the Northwest Territories and Nunavut recognise Inuinnaqtun as an official language in addition to Inuktitut. The Nunavut Official Languages Act, passed by the Senate of Canada on June 11, 2009, recognized Inuinnaqtun as one of the official languages of Nunavut. Inuinnaqtun is used primarily in the communities of Cambridge Bay and Kugluktuk in the western Kitikmeot Region of Nunavut. Outside of Nunavut, it is spoken in the hamlet of Ulukhaktok, where it is called Kangiryuarmiutun. It is written using the Latin script.
Native Speakers:14,000 (1991) 36,000 together with Inuvialuk (2006)
Spoken Natively in: Canada (Nunavut, Quebec (Nunavik), Northwest Territories, Newfoundland and Labrador (Nunatsiavut)
Official Language in: Nunavut, Nunavik, Northwest Territories, Nunatsiavut (Canada)
Inuktitut (Inuktitut syllabics: ᐃᓄᒃᑎᑐᑦ) [i.nuk.ti.'tut] or Eastern Canadian Inuktitut, Eastern Canadian Inuit language is the name of some of the Inuit languages spoken in Canada. It is spoken in all areas north of the tree line, including parts of the provinces of Newfoundland and Labrador, Quebec, to some extent in northeastern Manitoba as well as the territories of Nunavut, the Northwest Territories, and traditionally on the Arctic Ocean coast of Yukon. It is recognised as an official language in Nunavut and the Northwest Territories. It also has legal recognition in Nunavik—a part of Québec—thanks in part to the James Bay and Northern Québec Agreement, and is recognised in the Charter of the French Language as the official language of instruction for Inuit school districts there. It also has some recognition in Nunatsiavut—the Inuit area in Labrador—following the ratification of its agreement with the government of Canada and the province of Newfoundland and Labrador. The Canadian census reports that there are roughly 35,000 Inuktitut speakers in Canada, including roughly 200 who live regularly outside of traditionally Inuit lands.
Native Speakers: approx. 133,000 native (2011) L2 1.77 million (Native & L2) in the Republic. 200,000 in Northern Ireland. 30,000 in the United States of America. 7,500 in Canada. 1,895 in Australia.
Spoken Natively in: Ireland, United Kingdom, United States, Canada, Australia, Argentina
Official Language in: Ireland, EU
Irish (Gaeilge), also known as Irish Gaelic, is a Goidelic language of the Indo-European language family, originating in Ireland and historically spoken by the Irish people. Irish is now spoken as a first language by a minority of Irish people, as well as being a second language of a larger proportion of the population. Irish enjoys constitutional status as the national and first official language of the Republic of Ireland. It is an official language of the European Union and an officially recognised minority language in Northern Ireland. Irish was the predominant language of the Irish people for most of their recorded history, and they brought their Gaelic speech with them to other countries, notably Scotland and the Isle of Man, where it gave rise to Scottish Gaelic and Manx. It has the oldest vernacular literature in Western Europe. In the Elizabethan era the Gaelic language was viewed as something barbarian and as a threat to all things English in Ireland. There was then enacted a systematic campaign to destroy all things Irish, including the Irish language. Irish forms of dress were banned and no form of Irish names was recognised (something which still exists to this day in Northern Ireland). Those English who had married Irish women were forbidden from speaking Irish or maintaining Irish customs. Consequently, it began to decline under English and British rule after the seventeenth century. The nineteenth century saw a dramatic decrease in the number of speakers especially after the Great Famine of 1845–1852 (where Ireland lost 20–25% of its population either to emigration or death). Irish-speaking areas were especially hit hard. By the end of "British" rule, the language was spoken by less than 15% of the national population. Since then, Irish speakers have been in the minority except in areas collectively known as the Gaeltacht. Ongoing efforts have been made to preserve, promote and revive the language, particularly the Gaelic Revival. Significantly the language hung on in at least one area on the east coast of Ireland - far away from the usual west coast Gaeltacht areas - this was in the area of 'Oirghialla' - the remnant of a vast Gaelic territory that once encompassed Down, Armagh, Tyrone, Meath and Louth - but now just the few parishes of Mullaghbane (An Mullach Bán), Dromintee (Droim an Tí) and Killeavy (Cill Shléibhe) in South Armagh, and the contiguous area of Omeath (Ó Méith) in County Louth. The language was spoken in this area up to the 1920s and the last native speakers died in the 1950s. A vibrant revival has seen the language take off in the area with pre-school playgroups and primary schools and the language is probably more widely spoken now in the area than at any time in the last 50 years. Around the turn of the 20th century, estimates of native speakers ranged from 20,000 to 80,000 people. In the 2006 census for the Republic, 85,000 people reported using Irish as a daily language outside of the education system, and 1.2 million reported using it at least occasionally in or out of school. In the 2011 Census, these numbers had increased to 94,000 and 1.3 million, respectively. There are also thousands of Irish speakers in Northern Ireland, and viable communities of native speakers in the United States and Canada. Historically the island of Newfoundland had a dialect of Irish Gaelic, called Newfoundland Irish.
Native Speakers:59 million Italian proper, native and native bilingual (2007) 85 million all varieties
Spoken Natively in: Italy, San Marino, Malta, Switzerland, Vatican City, Slovenia (Slovenian Istria), Croatia (Istria County), Argentina, Brazil
Official Language in: European Union, Italy, Switzerland, San Marino, Vatican City, Croatia (Istria County), Slovenia (Slovenian Istria)
Italian (italiano or lingua italiana) is a Romance language spoken mainly in Europe: Italy, Switzerland, San Marino, Vatican City, by minorities in Malta, Monaco, Croatia, Slovenia, France, Libya, Eritrea, and Somalia, and by immigrant communities in the Americas and Australia. Many speakers are native bilinguals of both standardised Italian and other regional languages. According to the Bologna statistics of the European Union, Italian is spoken as a mother tongue by 65 million people in the EU (13% of the EU population), mainly in Italy, and as a second language by 14 million (3%). Including the Italian speakers in non-EU European countries (such as Switzerland and Albania) and on other continents, the total number of speakers is more than 85 million. In Switzerland, Italian is one of the four official languages; it is studied and learned in all the confederation schools and spoken, as mother tongue, in the Swiss cantons of Ticino and Grigioni and by the Italian immigrants that are present in large numbers in German- and French-speaking cantons. It is also the official language of San Marino, as well as the primary language of Vatican City. It is co-official in Slovenian Istria and in Istria County in Croatia. The Italian language adopted by the state after the unification of Italy is based on the Tuscan, which beforehand was a language spoken mostly by the upper class of Florentine society. Its development was also influenced by other Italian languages and by the Germanic languages of the post-Roman invaders. Italian derives from Latin. Unlike most other Romance languages, Italian retains Latin's contrast between short and long consonants. As in most Romance languages, stress is distinctive. In particular, among the Romance languages, Italian is the closest to Latin in terms of vocabulary.
Native Speakers:125 million (2010)
Spoken Natively in: Japan
Official Language in: Japan
Japanese (日本語 Nihongo?, /ni.ho.ɴ.go/, [nihõ̞ŋgo̞], [nihõ̞ŋŋo̞]) is an East Asian language spoken by about 125 million speakers, primarily in Japan, where it is the national language. It is a member of the Japonic (or Japanese-Ryukyuan) language family, whose relation to other language groups is debated, particularly to Korean and the Altaic language family. Little is known of the language's prehistory, or when it first appeared in Japan. 3rd century Chinese documents recorded a few Japanese words, but substantial texts did not appear until the 8th century. During the Heian period (794–1185), Chinese had a considerable influence on the vocabulary and phonology of Old Japanese. Late Middle Japanese (1185–1600) saw changes in features that brought it closer to the modern language, as well the first appearance of European loanwords. The standard dialect moved from the Kansai region to the Edo (modern Tokyo) region in the Early Modern Japanese period (early 17th century–mid-19th century). Following the end in 1853 of Japan's self-imposed isolation, the flow of loanwords from European languages has increased significantly. English loanwords in particular have become frequent, and Japanese words from English roots have proliferated. Japanese is an agglutinative, mora-timed language with simple phonotactics, a pure vowel system, phonemic vowel and consonant length, and a lexically significant pitch-accent. Word order is normally subject–object–verb with particles marking the grammatical function of words, and sentence structure is topic–comment. Sentence-final particles are used to add emotional or emphatic impact, or make questions. Nouns have no grammatical number, gender or article aspect. Verbs are conjugated, primarily tense and voice, but not person. Japanese equivalents of adjectives are also conjugated. Japanese has a complex system of honorifics with verb forms and vocabulary to indicate the relative status of the speaker, the listener, and persons mentioned.
Japanese has no genealogical relationship with Chinese, but makes extensive use of Chinese characters, or kanji (漢字?), in its writing system and a large portion of its vocabulary is borrowed from Chinese. Along with kanji, the Japanese writing system primarily uses two syllabic (or moraic) scripts, hiragana (ひらがな or 平仮名?) and katakana (カタカナ or 片仮名?). Latin script is used in a limited way, often in the form of rōmaji, and the numeral system uses mostly Arabic alongside traditional Chinese numerals. Japanese was little studied by non-Japanese before the Japanese economic bubble of the 1980s. Since then, along with the spread of Japanese popular culture, the number of students of Japanese has reached the millions.
Native Speakers:38 million (2007) 11.4 million as a second language
Spoken Natively in: India - Karnataka, Kasaragod, Kerala, Maharashtra, Andhra pradesh, Goa, Tamil nadu and significant communities in USA, Australia, Singapore, UK, Mauritius, United Arab Emirates.
Official Language in: Karnataka
Kannada (ಕನ್ನಡ) (/'kʌnnəɖɑː/), or Kanarese /kænəˈriːz/, is a language spoken in India predominantly in the state of Karnataka. Kannada, whose native speakers are called Kannadigas (Kannaḍigaru) and number roughly 70 million, is one of the 40 most spoken languages in the world. It is one of the scheduled languages of India and the official and administrative language of the state of Karnataka. The Kannada language is written using the Kannada script, which evolved from the 5th century Kadamba script. Kannada is attested epigraphically from about one and a half millennia, and literary Old Kannada flourished in the 6th century Ganga dynasty and during 9th century Rashtrakuta Dynasty. With an unbroken literary history of over a thousand years, the excellence of Kannada literature continues into the present day. Works of Kannada literature have received eight Jnanpith awards and fifty-six Sahitya Akademi awards. Based on the recommendations of the Committee of Linguistic Experts, appointed by the Ministry of Culture, the Government of India officially recognised Kannada as a classical language. In July 2011, a centre for the study of classical Kannada was established under the aegis of Central Institute of Indian Languages (CIIL) at Mysore to facilitate research related to the language.
Native Speakers: Over 5.5 million (2001)
Spoken Natively in: Jammu and Kashmir (India) Azad Jammu and Kashmir (Pakistan)
Official Language in: India
Kashmiri (कॉशुर, کأشُر Koshur) is an Indo-Aryan language and it is spoken primarily in the Kashmir Valley, in Jammu and Kashmir. There are approximately 5,527,698 speakers throughout India, according to the Census of 2001. Most of the 105,000 speakers or so in Pakistan are émigrés from the Kashmir Valley after the partition of India. They include a few speakers residing in border villages in Neelum District. The Kashmiri language is one of the 22 scheduled languages of India, and is a part of the Sixth Schedule in the constitution of the Jammu and Kashmir. Along with other regional languages mentioned in the Sixth Schedule, as well as Hindi and Urdu, the Kashmiri language is to be developed in the state. Some Kashmiri speakers frequently use Hindi as a second language, though the most frequently used second language is Urdu. Since November 2008, the Kashmiri language has been made a compulsory subject in all schools in the Valley up to the secondary level.
Native Speakers:11.0 million (2007)
Spoken Natively in: Kazakhstan, China, Mongolia, Afghanistan, Tajikistan, Turkey, Turkmenistan, Ukraine, Uzbekistan, Russia, Iran
Official Language in: Kazakhstan Russia: Altai Republic
Kazakh (also Qazaq and variants, natively Қазақ тілі, Qazaq tili, قازاق ٴتىلى; pronounced [qɑˈzɑq tɘˈlɘ]) is a Turkic language which belongs to the Kipchak (or Western Turkic) branch of the Turkic languages, closely related to Nogai and Karakalpak . Kazakh is an agglutinative language, and it employs vowel harmony.
Native Speakers:16 million (2007) 1 million L2 speakers
Spoken Natively in: Cambodia, Vietnam, Thailand, USA, France, Australia
Official Language in: Cambodia
Khmer (ភាសាខ្មែរ, IPA: [pʰiːəsaː kʰmaːe]; or more formally, ខេមរភាសា, IPA: [kʰeɛmaʔraʔ pʰiːəsaː]), or Cambodian, is the language of the Khmer people and the official language of Cambodia. It is the second most widely spoken Austroasiatic language (after Vietnamese), with speakers in the tens of millions. Khmer has been considerably influenced by Sanskrit and Pali, especially in the royal and religious registers, through the vehicles of Hinduism and Buddhism. It is also the earliest recorded and earliest written language of the Mon–Khmer family, predating Mon and by a significant margin Vietnamese. The Khmer language has influenced, and also been influenced by, Thai, Lao, Vietnamese and Cham, all of which, due to geographical proximity and long-term cultural contact, form a sprachbund in peninsular Southeast Asia. Khmer is primarily an isolating language. There are no inflections, conjugations or case endings. Instead, particles and auxiliary words are used to indicate grammatical relationships. General word order is subject–verb–object. Most words conform to the typical Mon-Khmer pattern, having a "main" syllable preceded by a minor syllable. The Khmer language is written with an abugida known in Khmer as អក្សរខ្មែរ (IPA: [aʔksɑː kʰmaːe]). Khmer differs from neighboring languages such as Thai, Lao and Vietnamese in that it is not a tonal language.
Native Speakers:76 million (2007)
Spoken Natively in: South Korea, North Korea, People's Republic of China
Official Language in: South Korea, North Korea, China Yanbian, People's Republic of China
Korean is the official language of South Korea and North Korea as well as one of the two official languages in China's Yanbian Korean Autonomous Prefecture. Approximately 78 million people speak Korean worldwide. For over a millennium, Korean was written with adapted Chinese characters called hanja, complemented by phonetic systems like hyangchal, gugyeol, and idu. In the 15th century, a national writing system called hangul was commissioned by Sejong the Great, but it only came into widespread use in the 20th century, because of the yangban aristocracy's preference for hanja. Most historical linguists classify Korean as a language isolate while a few consider it to be in the controversial Altaic language family. The Korean language is agglutinative in its morphology and SOV in its syntax.
Native Speakers:21 million (2007)
Spoken Natively in: Turkey, Iran, Iraq, Syria, Azerbaijan, Armenia, Lebanon, Georgia, Germany, France, United Kingdom, Sweden, Netherlands, Switzerland, Austria, Greece, Belgium, Denmark, Norway, Italy, Finland, USA, Canada, Australia
Official Language in: Iraq: status as official language alongside Arabic. Iran: constitutional status as a regional language Armenia: minority language
Kurdish (Kurdish: Kurdî or کوردی) is a dialect continuum spoken by the Kurds in western Asia. It is part of the Iranian branch of the Indo-Iranian group of Indo-European languages. The Kurdish language is spoken by some 20 million people. According to the CIA World Factbook, 10% of the total population of Iran speaks Kurdish. The CIA factbook estimates the number to be between 25–30 million. Kurdish is not a unified standard language but a discursive construct of languages spoken by ethnic Kurds, referring to a group of speech varieties that are not necessarily mutually intelligible unless there has been considerable prior contact between their speakers. The second official language of Iraq, referred to only as 'Kurdish' in political documents, is in fact an academic and standardized version of the Sorani dialect of a branch of languages spoken by Kurds.
The written literary output in Kurdic languages was confined mostly to poetry until the early 20th century, when a general written literature began to be developed. In its written form today "Kurdish" has two regional standards, namely Kurmanji in the northern parts of the geographical region of Kurdistan, and Sorani further east and south. Another distinct language group called Zaza–Gorani is also spoken by several million ethnic Kurds today and is generally also described and referred to as Kurdish, or as Kurdic languages, because of the ethnic association of the communities speaking the languages and dialects. Hewrami, a variation of Gorani, was an important literary language used by the Kurds but was steadily replaced by Sorani in the twentieth century.
Native Speakers:2.9 million (1993)
Spoken Natively in:
Official Language in: Kyrgyzstan
Kyrgyz or Kirgiz, also Kirghiz, Kyrghiz, Qyrghiz (Кыргызча or Кыргыз тили, قىرعىز تىلى, Kırgızça or Kırgız tili) is a language of the Turkic language family and one of the two official languages of Kyrgyzstan, the other being Russian. It is a member of the Kazakh-Nogai subgroup of the Kypchak languages, and modern day language convergence has resulted in an increasing degree of mutual intelligibility between Kyrgyz and Kazakh. Kyrgyz is spoken by about 4 million people in Kyrgyzstan, China, Afghanistan, Kazakhstan, Tajikistan, Turkey, Uzbekistan, Pakistan and Russia. Kyrgyz was originally written in the Turkic runes, gradually replaced by an Arabic alphabet (in use until 1928 in USSR, still in use in China). Between 1928 and 1940, the Latin-based Uniform Turkic Alphabet was used. In 1940 due to general Soviet policy, a Cyrillic alphabet eventually became common and has remained so to this day, though some Kyrgyz still use the Arabic alphabet. When Kyrgyzstan became independent following the Soviet Union's collapse in 1991, there was a popular idea among some Kyrgyz people to make transition to the Latin alphabet (taking in mind a version closer to the Turkish alphabet, not the original alphabet of 1928–1940), but the plan was never implemented.
Native Speakers: 5.22 million (2006) (20 million if Isan speakers are included)
Spoken Natively in: Laos, Thailand, U.S., France, Canada, China, Australia, New Zealand, Argentina (Misiones Province).
Official Language in: Laos
Lao or Laotian (ພາສາລາວ, BGN/PCGN: phasa lao, IPA: [pʰáːsǎː láːw]) is a tonal language of the Tai–Kadai language family. It is the official language of Laos, and also spoken in the northeast of Thailand, where it is usually referred to as the Isan language. Being the primary language of the Lao people, Lao is also an important second language for the multitude of ethnic groups in Laos and in Isan. Lao, like many languages in Laos, is written in the Lao script, which is an abugida script. Although there is no official standard, the Vientiane dialect has become the de facto standard.
Spoken Natively in: Latium, Roman Monarchy, Roman Republic, Roman Empire, Medieval and Early modern Europe, Armenian Kingdom of Cilicia (as lingua franca), Vatican City
Official Language in: Holy See
Latin (Listeni/ˈlætən/; Latin: lingua latīna; IPA: [ˈlɪŋɡʷa laˈtiːna]) is an Italic language originally spoken in Latium and Ancient Rome. Along with most European languages, it is a descendant of the ancient Proto-Indo-European language. It originated in the Italian peninsula. Although it is considered a dead language, many students, scholars, and members of the Christian clergy speak it fluently, and it is still taught in some primary and secondary and many post-secondary educational institutions around the world. Latin is still used in the creation of new words in modern languages of many different families, including English, and in biological taxonomy. Latin and its daughter Romance languages are the only surviving languages of the Italic language family. Other languages of the Italic branch are attested in the inscriptions of early Italy, but were assimilated to Latin during the Roman Republic. The extensive use of elements from vernacular speech by the earliest authors and inscriptions of the Roman Republic make it clear that the original, unwritten language of the Roman Monarchy was an only partially deducible colloquial form, the predecessor to Vulgar Latin. By the late Roman Republic, a standard, literate form had arisen from the speech of the educated, now referred to as Classical Latin. Vulgar Latin, by contrast, is the name given to the more rapidly changing colloquial language spoken throughout the empire. With the Roman conquest, Latin spread to many Mediterranean regions, and the dialects spoken in these areas, mixed to various degrees with the autochthonous languages, developed into the modern Romance tongues. Classical Latin slowly changed with the Decline of the Roman Empire, as education and wealth became ever scarcer. The consequent Medieval Latin, influenced by various Germanic and proto-Romance languages until expurgated by Renaissance scholars, was used as the language of international communication, scholarship and science until well into the 18th century, when it began to be supplanted by vernacular languages. Latin is a highly inflected language, with three distinct genders, seven noun cases, four verb conjugations, six tenses, three persons, three moods, two voices, two aspects and two numbers. A dual number ("a pair of") is present in Archaic Latin. One of the rarer of the seven cases is the locative, only marked in proper place names and a few common nouns. Otherwise the locative function ("place where") has merged with the ablative. The vocative, a case of direct address, is marked by an ending only in words of the second declension. Otherwise the vocative has merged with the nominative, except that the particle O typically precedes any vocative, marked or not. There are only five fully productive cases; that is, in the few instances of the formation of a distinct locative or vocative, the endings are specific to those words, and cannot be placed on other stems of the declension to produce a locative or vocative. In contrast, the plural nominative ending of the first declension may be used to form any first declension plural. As a result of this case ambiguity, different authors list different numbers of cases: 5, 6 or 7, which may be confusing. Adjectives and adverbs are compared, and the former are inflected according to case, gender, and number. In view of the fact that adjectives are often used for nouns, the two are termed substantives. Although Classical Latin has demonstrative pronouns indicating different degrees of proximity ("this one here", "that one there"), it does not have articles. Later Romance language articles developed from the demonstrative pronouns; e.g., le and la from ille and illa, su and sa from ipse and ipsa.
Native Speakers: 1.5 million
Spoken Natively in: Latvia
Official Language in: Latvia ,European Union
Latvian (latviešu valoda) is the official state language of Latvia. It is also sometimes referred to as Lettish. There are about 1.39 million native Latvian speakers in Latvia and about 115 thousand abroad. About 1.9 million or 79% of the population of Latvia speak Latvian. Of those about 1,165,000 use it as the primary language at home. The use of the Latvian language in various areas of social life in Latvia is increasing. Latvian is a Baltic language and is most closely related to Lithuanian, although the two are not mutually intelligible. Latvian first appeared in Western print in the mid-16th century with the reproduction of the Lord's Prayer in Latvian in Sebastian Münster's Cosmographia Universalis, in Latin script.
Native Speakers: 3.2 million (1998)
Spoken Natively in: Lithuania
Official Language in: Lithuania, European Union
Lithuanian (lietuvių kalba) is the official state language of Lithuania and is recognized as one of the official languages of the European Union. There are about 2.9 million native Lithuanian speakers in Lithuania and about 200,000 abroad. Lithuanian is a Baltic language, closely related to Latvian, although they are not mutually intelligible. It is written in a Latin alphabet. The Lithuanian language is believed to be the most conservative living Indo-European language, retaining many features of Proto-Indo-European now lost in other Indo-European languages.
Native Speakers: 320,000 (1998)
Spoken Natively in: Luxembourg, Belgium, France, Germany
Official Language in: Luxembourg
Luxembourgish (from Lëtzebuergesch; French: Luxembourgeois, German: Luxemburgisch, Dutch: Luxemburgs, Walloon: Lussimbordjwès) is a High German language that is spoken mainly in Luxembourg. About 390,000 people worldwide speak Luxembourgish.
Native Speakers: 1.6– 3.0 million (1985–1998)
Spoken Natively in: Macedonia, Albania, Bulgaria, Greece, Serbia, Macedonian diaspora
Official Language in: Republic of Macedonia
Macedonian (македонски јазик, makedonski jazik, pronounced [maˈkɛdɔnski ˈjazik]) is a South Slavic language, spoken as a first language by approximately 2–3 million people principally in the region of Macedonia and the Macedonian diaspora. It is the official language of the Republic of Macedonia and an official minority language in parts of Albania, Romania and Serbia. Standard Macedonian was implemented as the official language of the Socialist Republic of Macedonia in 1945 and has since developed a thriving literary tradition. Most of the codification was formalized during the same period. Macedonian dialects form a continuum with Bulgarian dialects; together in turn they form a broader continuum with Serbo-Croatian through the transitional Torlakian dialects. The name of the Macedonian language is a matter of political controversy in Greece as is its distinctiveness in Bulgaria.
Native Speakers: 18 million (2007)
Spoken Natively in: Madagascar, Comoros, Mayotte
Official Language in: Madagascar
Malagasy [ˌmalaˈɡasʲ] is the national language of Madagascar. It is a member of the Austronesian family of languages. Most people in Madagascar speak it as a first language as do some people of Malagasy descent elsewhere.
Native Speakers: 77 million (2007) Total: more than 215 million
Spoken Natively in: Indonesia (as Indonesian), Malaysia (as Malaysian) Brunei, Singapore, Thailand (as Bahasa Jawi), East Timor (as Indonesian), Christmas Island Cocos (Keeling) Islands (de jure)
Official Language in: Indonesia, Malaysia, Brunei, Singapore, Cocos (Keeling) Islands(de jure)
Malay (məˈleɪ/; Bahasa Melayu; Jawi script: بهاس ملايو ) is a major language of the Austronesian family. It is the national language of Indonesia (as Indonesian), Malaysia (also known as Malaysian), and Brunei, and it is one of four official languages of Singapore. It is spoken natively by 40 million people across the Malacca Strait, including the coasts of the Malay Peninsula of Malaysia and the eastern coast of Sumatra in Indonesia, and has been established as a native language of part of western coastal Sarawak and West Kalimantan in Borneo. The total number of speakers of the language is more than 215 million. As the Bahasa Kebangsaan or Bahasa Nasional (National Language) of several states, Standard Malay has various official names. In Singapore and Brunei it is called Bahasa Melayu (Malay language); in Malaysia, Bahasa Malaysia (Malaysian language); and in Indonesia, Bahasa Indonesia (Indonesian language) and is designated the Bahasa Persatuan/Pemersatu ("unifying language/lingua franca"). However, in areas of central to southern Sumatra where the language is indigenous, Indonesians refer to it as Bahasa Melayu and consider it one of their regional languages. Standard Malay, also called Court Malay, was the literary standard of the pre-colonial Malacca and Johor Sultanates, and so the language is sometimes called Malacca, Johor, or Riau Malay (or various combinations of those names) to distinguish it from the various other Malayan languages, though it has no connection to the Malay dialect of the Riau Islands. According to Ethnologue 16, several of the Malayan varieties they currently list as separate languages, including the Orang Asli varieties of Peninsular Malay, are so closely related to standard Malay that they may prove to be dialects. (These are listed with question marks in the table at right.) There are also several Malay-based creole languages which are based on a lingua franca derived from Classical Malay, as well as Makassar Malay, which appears to be a mixed language.
Native Speakers: 38 million (2007)
Spoken Natively in: Primarily in the Indian state of Kerala
Official Language in: Indian states:Kerala (State), Lakshadweep (Territory), Pondicherry (Territory)
Malayalam (English pronunciation: /mæləˈjɑːləm/;മലയാളം, malayāḷam ?, IPA: [mɐləjaːɭəm]), is a language spoken in India predominantly in the state of Kerala. It is one of the 22 scheduled languages of India with official language status in the state of Kerala and the union territories of Lakshadweep and Pondicherry. It belongs to the Dravidian family of languages, and was spoken approximately by 33 million people according to the 2001 census. Malayalam is also spoken in the neighboring states of Tamil Nadu and Karnataka; the Nilgiris, Kanyakumari and Coimbatore districts of Tamil Nadu, and the Dakshina Kannada, Mangalore and Kodagu districts of Karnataka. Malayalam most likely originated from Middle Tamil (Sen-Tamil-Malayalam) in the 6th century. An alternative theory proposes a split in even more ancient times. Malayalam incorporated many elements from Sanskrit through the ages and today over eighty percent of the vocabulary of Malayalam in scholarly usage is from Sanskrit. Before Malayalam came into being, Old Tamil was used in literature and courts of a region called Tamilakam, including present day Kerala state, a famous example being Silappatikaram. Silappatikaram was written by Chera prince Ilango Adigal from Cochin, and is considered a classic in Sangam literature. Modern Malayalam still preserves many words from the ancient Tamil vocabulary of Sangam literature. The earliest script used to write Malayalam was the Vatteluttu script, and later the Kolezhuttu, which derived from it. As Malayalam began to freely borrow words as well as the rules of grammar from Sanskrit, Grantha script was adopted for writing and came to be known as Arya Ezhuttu. This developed into the modern Malayalam script. Many medieval liturgical texts were written in an admixture of Sanskrit and early Malayalam, called Manipravalam. The oldest literary work in Malayalam, distinct from the Tamil tradition, is dated from between the 9th and 11th centuries. Due to its lineage deriving from both Sanskrit and Tamil, the Malayalam alphabet has the largest number of letters among the Indian languages. Malayalam script includes letters capable of representing all the sounds of Sanskrit and all Dravidian languages.
Native Speakers: (400,000 cited 1975)
Spoken Natively in: Malta
Official Language in: Malta, European Union
Maltese (Malti) is the national language of Malta, and a co-official language of the country alongside English, while also serving as an official language of the European Union, the only Semitic language so distinguished. Maltese is descended from Siculo-Arabic (the Arabic dialect that developed in Sicily, and later in Malta, between the end of the ninth century and the end of the thirteenth century). About half of the vocabulary is borrowed from standard Italian and Sicilian; English words make up between 6% and 20% of the Maltese vocabulary, according to different estimates (see below). It is the only Semitic language written in the Latin script, in its standard form, as well as the only Semitic language written and read from left to right.
Native Speakers: Extinct as a first language in 1974; subsequently revived and now with about a hundred competent speakers, including a small number of children who are new native speakers, and 1,823 people (2.27% de facto population) in the Isle of Man professing some knowledge of the language (2011)
Spoken Natively in: Isle of Man
Official Language in: Isle of Man
Manx Gaelic (native name Gaelg or Gailck, pronounced [ɡilk] or [ɡilɡ]), also known as Manx Gaelic, and as the Manks language, is a Goidelic language of the Indo-European language family, historically spoken by the Manx people. Only a small minority of the Isle of Man's population is fluent in the language, but a larger minority has some knowledge of it. It is widely considered to be an important part of the island's culture and heritage. The last native speaker, Ned Maddrell, died in 1974. However in recent years the language has been the subject of revival efforts. Mooinjer Veggey [muɲdʒer veɣə], a Manx medium playgroup, was succeeded by the Bunscoill Ghaelgagh [bʊn-skolʲ ɣɪlɡax], a primary school for 4- to 11-year-olds in St John's. In recent years, despite the small number of speakers, the language has become more visible on the island, with increased signage and radio broadcasts. The revival of Manx has been aided by the fact that the language was well recorded: for example the Bible was translated into Manx, and a number of audio recordings were made of native speakers.
Native Speakers: 60,000 (1991) 157,000 New Zealand residents claim they can converse in Māori about everyday things (2006 census)
Spoken Natively in: New Zealand
Official Language in: New Zealand
Māori or Te Reo Māori (pronounced [ˈmaːɔɾi, tɛ ˈɾɛɔ ˈmaːɔɾi]), commonly Te Reo ("The language"), is the language of the indigenous population of New Zealand, the Māori. It has the status of an official language in New Zealand. Linguists classify it within the Eastern Polynesian languages as being closely related to Cook Islands Māori, Tuamotuan, Rapanui and Tahitian; somewhat less closely to Hawaiian and Marquesan; and more distantly to the languages of Western Polynesia, including Samoan, Tokelauan, Niuean and Tongan. According to a 2001 survey on the health of the Māori Language, the number of very fluent adult speakers was about 9% of the Māori population, or 29,000 adults.
Native Speakers: 73 million (2007)
Spoken Natively in: India
Official Language in: India (State Maharashtra, Union territories of Daman-Diu) and Dadra Nagar Haveli
Marathi (मराठी Marāṭhī [məˈɾaʈʰi]) is a southern Indo-Aryan language spoken by the Marathi people. It is the official language of Maharashtra and Goa and is one of the 23 official languages of India. It is the 19th most spoken language in the world. There were 72 million speakers in 2001. Marathi has the fourth largest number of native speakers in India Marathi has some of the oldest literature of all modern Indo-European, Indic languages, dating from about 1000 AD. The major dialects of Marathi are called Standard Marathi and Warhadi Marathi. There are a few other sub-dialects like Ahirani, Dangi, Vadvali, Samavedi, Khandeshi, and Malwani. Standard Marathi is the official language of the State of Maharashtra.
Spoken Natively in:
Official Language in:
Mayan languages (alternatively: the languages of the Mayas) form a language family spoken in Mesoamerica and northern Central America. Mayan languages are spoken by at least 6 million indigenous Maya, primarily in Guatemala, Mexico, Belize and Honduras. In 1996, Guatemala formally recognized 21 Mayan languages by name, and Mexico recognizes eight more. The Mayan language family is one of the best documented and most studied in the Americas. Modern Mayan languages descend from Proto-Mayan, a language thought to have been spoken at least 5,000 years ago; it has been partially reconstructed using the comparative method. Mayan languages form part of the Mesoamerican Linguistic Area, an area of linguistic convergence developed throughout millennia of interaction between the peoples of Mesoamerica. All Mayan languages display the basic diagnostic traits of this linguistic area. For example, all use relational nouns instead of prepositions to indicate spatial relationships. They also possess grammatical and typological features that set them apart from other languages of Mesoamerica, such as the use of ergativity in the grammatical treatment of verbs and their subjects and objects, specific inflectional categories on verbs, and a special word class of "positionals" which is typical of all Mayan languages. During the pre-Columbian era of Mesoamerican history, some Mayan languages were written in the Mayan hieroglyphic script. Its use was particularly widespread during the Classic period of Maya civilization (c. 250–900 AD). The surviving corpus of over 10,000 known individual Maya inscriptions on buildings, monuments, pottery and bark-paper codices, combined with the rich postcolonial literature in Mayan languages written in the Latin script, provides a basis for the modern understanding of pre-Columbian history unparalleled in the Americas.
Spoken Natively in:
Official Language in:
Moldovan (also Moldavian; limba moldovenească or лимба молдовеняскэ in Moldovan Cyrillic) is one of the names of the Romanian language as spoken in the Republic of Moldova, where it is an official language. The variety of Romanian spoken in Moldova is the Moldavian subdialect, which is also spoken in northeastern Romania. The two countries share the same literary standard. Written in Cyrillic, Moldovan is also the name of one of three official languages of the breakaway Moldovan territory of Transnistria. The Constitution of Moldova (Title I, Article 13) states that the Moldovan language is the official language of the country. In the Declaration of Independence of Moldova, the state language is called Romanian. The 1989 Language Law that proclaimed it the state language of Moldova, speaks in the preamble of a "Moldovan-Romanian linguistic identity". After political debate over the issue became inflamed again in the early 2000s, a group of Romanian linguists adopted a resolution stating that promotion of the notion of Moldovan language is an anti-scientific campaign. The term Moldavian is also used to refer collectively to the north-eastern varieties of spoken Romanian, spread approximately within the territory of the former Principality of Moldavia (now split between Moldova and Romania). The Moldavian variety is considered one of the five major spoken varieties of Romanian, all five being written identically. There is no particular linguistic break at the Prut River, the border between Romania and Moldova. In Moldova's schools, the discipline about the state language is called "Romanian language", though former Moldovan president Vladimir Voronin asked for it to be changed into "Moldovan language". The standard alphabet is Latin (currently official in the Republic of Moldova). Before 1989, two versions of Cyrillic had been used: the Moldovan Cyrillic alphabet in 1924–1932 and 1938–89, and the historical Romanian Cyrillic alphabet until 1918. As of 2010, the former remains in use only in Transnistria.
Native Speakers: 5.7 million (2005)
Spoken Natively in: Mongolia, People's Republic of China
Official Language in: Mongolia China Inner Mongolia, People's Republic of China
Mongolian language (in Mongolian script: Monggol kele.svg, Mongɣol kele; in Mongolian Cyrillic: Монгол хэл, Mongol khel) is the official language of Mongolia and the best-known member of the Mongolic language family. The number of speakers across all its dialects may be 5.2 million, including the vast majority of the residents of Mongolia and many of the Mongolian residents of the Inner Mongolia autonomous region of China. In Mongolia, the Khalkha dialect, written in Cyrillic, is predominant, while in Inner Mongolia, the language is more dialectally diverse and is written in the traditional Mongolian script. In the discussion of grammar to follow, the variety of Mongolian treated is Standard Khalkha Mongolian (i.e., the standard written language as formalized in the writing conventions and in the school grammar), but much of what is to be said is also valid for vernacular (spoken) Khalkha and other Mongolian dialects, especially Chakhar. Mongolian has vowel harmony and a complex syllabic structure for a Mongolic language that allows clusters of up to three consonants syllable-finally. It is a typical agglutinative language that relies on suffix chains in the verbal and nominal domains. While there is a basic word order, subject–object–predicate, ordering among noun phrases is relatively free, so grammatical roles are indicated by a system of about eight grammatical cases. There are five voices. Verbs are marked for voice, aspect, tense, and epistemic modality/evidentiality. In sentence linking, a special role is played by converbs. Modern Mongolian evolved from "Middle Mongol", the language spoken in the Mongol Empire of the 13th and 14th centuries. In the transition, a major shift in the vowel harmony paradigm occurred, long vowels developed, the case system was slightly reformed, and the verbal system was restructured.
Spoken Natively in: Montenegro
Official Language in: Montenegro
Montenegrin (Crnogorski jezik, Црногорски језик) is an incipient standardized register of the Serbo-Croatian language as spoken by Montenegrins used as the official language of Montenegro. The same subdialect of Shtokavian is also the basis of standard Bosnian, Croatian and Serbian, so all are mutually intelligible and are a single language by that criterion despite being distinct national standards.The idea of a Montenegrin standard language separate from Serbian appeared in 1990s and gained traction in 2000s via proponents of Montenegrin independence. Montenegrin became the official language of Montenegro with the ratification of a new constitution on 22 October 2007. The Montenegrin standard is still emerging. Its orthography was established 10 July 2009 with the addition of two letters to the alphabet, though grammar and a school curriculum are yet to be approved.
Native Speakers: 1.45 million (2000)
Spoken Natively in: Mexico
Official Language in: Mexico
Nahuatl (Nahuatl pronunciation: [ˈnaːwatɬ], with stress on the first syllable) is a language of the Nahuan branch of the Uto-Aztecan language family. It is spoken by an estimated 1.5 million Nahua people, most of whom live in Central Mexico. All Nahuan languages are indigenous to Mesoamerica. Nahuatl has been spoken in Central Mexico since at least the 7th century AD. It was the language of the Aztecs, who dominated what is now central Mexico during the Late Postclassic period of Mesoamerican history. During the centuries preceding the Spanish conquest of Mexico, the Aztec Empire had expanded to incorporate most of Mexico, and its influence caused the variety of Nahuatl spoken by the residents of Tenochtitlan becoming a prestige language in Mesoamerica. At the conquest, with the introduction of the Latin alphabet, Nahuatl also became a literary language and many chronicles, grammars, works of poetry, administrative documents and codices were written in the 16th and 17th centuries.This early literary language based on the Tenochtitlan variety has been labeled Classical Nahuatl and is among the most studied and best documented languages of the Americas. Today Nahuatl varieties are spoken in scattered communities mostly in rural areas throughout central Mexico. There are considerable differences among varieties, and some are mutually unintelligible. They have all been subject to varying degrees of influence from Spanish. No modern Nahuatl languages are identical to Classical Nahuatl, but those spoken in and around the Valley of Mexico are generally more closely related to it than those on the periphery. Under Mexico's Ley General de Derechos Lingüísticos de los Pueblos Indígenas ("General Law on the Linguistic Rights of Indigenous Peoples") promulgated in 2003, Nahuatl and the other 63 indigenous languages of Mexico are recognized as lenguas nacionales ("national languages") in the regions where they are spoken, enjoying the same status as Spanish within their region. Nahuatl is a language with a complex morphology characterized by polysynthesis and agglutination (agglutinative language), allowing the construction of long words with complex meanings out of several stems and affixes. Through centuries of coexistence with the other indigenous Mesoamerican languages, Nahuatl has absorbed many influences, coming to form part of the Mesoamerican Linguistic Area. Many words from Nahuatl have been borrowed into Spanish, and since diffused into hundreds of other languages. Most of these loanwords denote things indigenous to central Mexico which the Spanish heard mentioned for the first time by their Nahuatl names. English words of Nahuatl origin include "avocado", "chayote", "chili", "chocolate", "atlatl", "coyote", "axolotl" and "tomato".
Native Speakers: Northern Ndebele: 2-3 million (2001) Southern Ndebele: 640,000 (2006) Total: 2 million
Spoken Natively in: Northern Ndebele: Zimbabwe, Botswana, South Africa Southern Ndebele: South Africa
Official Language in: South Africa
Nbebele (There are at least two languages commonly called Ndebele) The Northern Ndebele language, isiNdebele, Sindebele, or Ndebele is an African language belonging to the Nguni group of Bantu languages, and spoken by the Ndebele or Matabele people of Zimbabwe.isiNdebele is related to the Zulu language spoken in South Africa. This is because the Ndebele people of Zimbabwe descend from followers of the Zulu leader Mzilikazi, who left KwaZulu in the early nineteenth century during the Mfecane. The Northern and Southern Ndebele languages are not variants of the same language; though they both fall in the Nguni group of Bantu languages, Northern Ndebele is essentially a dialect of Zulu, and the older Southern Ndebele language falls within a different subgroup. The shared name is due to contact between Mzilikazi's people and the original Ndebele, through whose territory they crossed during the Mfecane. The Southern Ndebele language (isiNdebele or Nrebele in Southern Ndebele) is an African language belonging to the Nguni group of Bantu languages, and spoken by the amaNdebele (the Ndebele people of South Africa). There is also another, separate dialect called Northern Ndebele or Matabele spoken in Zimbabwe and Botswana – see Sindebele language. The Zimbabwean and South African Ndebele dialect is closer to Zulu than other Nguni dialects
Native Speakers: 17 million (2007)
Spoken Natively in: Nepal, India, Bhutan
Official Language in: Nepal India (in Sikkim and Darjeeling district, West Bengal)
Nepali or Nepalese (नेपाली) is a language in the Indo-Aryan branch of the Indo-European language family. It is the official language and de facto lingua franca of Nepal and is also spoken in Bhutan, parts of India and parts of Myanmar (Burma). In India, it is one of the country's 23 official languages: Nepali has official language status in the formerly independent state of Sikkim and in West Bengal's Darjeeling district. The influence of the Nepali language can also be seen in Bhutan and some parts of Burma. Nepali developed in proximity to a number of Tibeto-Burman languages, most notably Kiranti and Gurung, and shows Tibeto-Burman influences. Historically, the language was first called Khaskura (language of the khas 'rice farmers'), then Gorkhali or Gurkhali (language of the Gurkha) before the term Nepali was taken from Nepal Bhasa. Other names include Parbatiya ("mountain language", identified with the Parbatiya people of Nepal) and Lhotshammikha (the "southern language" of the Lhotshampa people of Bhutan). The name 'Nepali' is ambiguous, as it was originally a pronunciation of Nepal Bhasa, the Tibeto-Burman language of the capital Kathmandu.
Native Speakers: 4.1 million (2006)
Spoken Natively in: South Africa
Official Language in: South Africa
Northern Sotho is a designation in English, rendered officially and among indigenous speakers as Sesotho sa Leboa. Also confusingly known by the name of its major variety, "Pedi" or "Sepedi", Northern Sotho is a designated official language of South Africa, spoken by 4,208,980 people (2001 Census) in the provinces of Gauteng, Limpopo and Mpumalanga. Urban varieties of Pedi have acquired clicks in an ongoing process of the spread of such sounds from Nguni languages
Native Speakers: 5 million
Spoken Natively in: Norway
Official Language in: Norway Nordic Council
Norwegian (norsk) is a North Germanic language spoken primarily in Norway, where it is the official language. Together with Swedish and Danish, Norwegian forms a continuum of more or less mutually intelligible local and regional variants . These Scandinavian languages together with the Faroese language and Icelandic language, as well as some extinct languages, constitute the North Germanic languages (also called Scandinavian languages). Faroese and Icelandic are hardly mutually intelligible with Norwegian in their spoken form, because continental Scandinavian has diverged from them. As established by law and governmental policy, there are two official forms of written Norwegian – Bokmål (literally "book tongue") and Nynorsk (literally "new Norwegian"). The Norwegian Language Council is responsible for regulating the two forms, and recommends the terms "Norwegian Bokmål" and "Norwegian Nynorsk" in English. Two other written forms without official status also exist: Riksmål ("national language"), which is to a large extent the same language as Bokmål, but somewhat closer to the Danish language, is regulated by the Norwegian Academy, which translates it as "Standard Norwegian". Høgnorsk ("High Norwegian") is a more purist form of Nynorsk that rejects most of the reforms from the 20th century, but is not widely used. There is no officially sanctioned standard of spoken Norwegian, and most Norwegians speak their own dialect in all circumstances. The sociolect of the urban upper and middle class in East Norway can be regarded as a de facto spoken standard for Bokmål because it adopted many characteristics from Danish when Norway was under Danish rule. This so-called standard østnorsk ("Standard Eastern Norwegian") is the form generally taught to foreign students.
From the 16th to the 19th centuries, Danish was the standard written language of Norway. As a result, the development of modern written Norwegian has been subject to strong controversy related to nationalism, rural versus urban discourse, and Norway's literary history. Historically, Bokmål is a Norwegianised variety of Danish, while Nynorsk is a language form based on Norwegian dialects and puristic opposition to Danish. The now abandoned official policy to merge Bokmål and Nynorsk into one common language called Samnorsk through a series of spelling reforms has created a wide spectrum of varieties of both Bokmål and Nynorsk. The unofficial form known as Riksmål is considered more conservative than Bokmål, and the unofficial Høgnorsk more conservative than Nynorsk. Norwegians are educated in both Bokmål and Nynorsk. A 2005 poll indicates that 86.3% use primarily Bokmål as their daily written language, 5.5% use both Bokmål and Nynorsk, and 7.5% use primarily Nynorsk. Thus 13% are frequently writing Nynorsk, though the majority speak dialects that resemble Nynorsk more closely than Bokmål. Broadly speaking, Nynorsk writing is widespread in Western Norway, though not in major urban areas, and also in the upper parts of mountain valleys in the southern and eastern parts of Norway. Examples are Setesdal, the western part of Telemark county (fylke) and several municipalities in Hallingdal, Valdres and Gudbrandsdalen. It is little used elsewhere, but 30–40 years ago it also had strongholds in many rural parts of Trøndelag (Mid-Norway) and the south part of Northern Norway (Nordland county). Today, not only is Nynorsk the official language of 4 of the 19 Norwegian counties (fylker), but also of many municipalities in 5 other counties. The Norwegian broadcasting corporation (NRK) broadcasts in both Bokmål and Nynorsk, and all governmental agencies are required to support both written languages. Bokmål is used in 92% of all written publications, Nynorsk in 8% (2000). Norwegian is one of the working languages of the Nordic Council. Under the Nordic Language Convention, citizens of the Nordic countries who speak Norwegian have the opportunity to use their native language when interacting with official bodies in other Nordic countries without being liable to any interpretation or translation costs.
Native Speakers: 2,000,000 (1999)[1
Spoken Natively in: France, Spain, Italy, Monaco
Official Language in: Spain (Aran Valley)
Occitan (English pronunciation: /ˈɒksɪˌtæn/; French pronunciation: [ɔk.si'tɑ̃]; Occitan: [utsiˈta]), known also as Lenga d'òc by its native speakers (Occitan: [ˈleŋɡɔ ˈðɔ(k)]; French: Langue d'oc), is a Romance language spoken in southern France, Italy's Occitan Valleys, Monaco, and Spain's Val d'Aran: the regions sometimes known unofficially as Occitania. It is also spoken in the linguistic enclave of Guardia Piemontese (Calabria, Italy). Occitan is a descendant of the spoken Latin language of the Roman Empire, as are languages such as Italian, Portuguese, Spanish, Romanian and Sardinian. It is an official language in Catalonia, (known as Aranese in Val d'Aran). Occitan's closest relative is Catalan. Since September 2010, the Parliament of Catalonia has considered Aranese Occitan to be the officially preferred language for use in the Val d'Aran. The term Provençal (Occitan: provençal, provençau or prouvençau, IPA: [pruβenˈsal, pʀuveⁿˈsaw]) may be used as a traditional synonym for Occitan but, nowadays, “Provençal” is mainly understood as an Occitan dialect spoken in Provence. The long-term survival of Occitan is in question. According to the UNESCO Red Book of Endangered Languages, four of the six major dialects of Occitan (Provençal, Auvergnat, Limousin and Languedocien) are considered "severely endangered", while the remaining two (Gascon and Vivaro-Alpine) are considered "definitely endangered" ("severely endangered" essentially means that only elderly people still speak the language fluently, while "definitely endangered" means that adults speak the language but are not passing it on to their children). In Bearn, at least, schoolchildren are being taught at least some of the language, as the language is enjoying somewhat of a revival in the province.
Native Speakers: 33 million (2007)
Spoken Natively in: India
Official Language in: Orissa, Jharkhand
Oriya (ଓଡ଼ିଆ oṛiā), officially spelled Odia, is an Indian language, belonging to the Indo-Aryan branch of the Indo-European language family. It is mainly spoken in the Indian states of Orissa and in parts of West Bengal, Jharkhand, Chhattisgarh and Andhra Pradesh. Oriya is one of the many official languages in India; it is the official language of Orissa and the second official language of Jharkhand. Oriya is the predominant language of Orissa, where Oriya speakers comprise around 83.33% of the population according to census surveys.
Native Speakers: ca. 640,000
Spoken Natively in: Russia (North Ossetia), South Ossetia (partially recognized), Georgia, Turkey
Official Language in: South Ossetia, North Ossetia
Ossetic or Ossetian (Ossetic: Ирон, tr. Iron), also sometimes called Ossete, is an East Iranian language spoken in Ossetia, a region on the slopes of the Caucasus Mountains. The area in Russia is known as North Ossetia–Alania, while the area south of the border is referred to as South Ossetia, recognized by Russia, Nicaragua, Venezuela and Nauru as an independent state but by the rest of the international community as part of Georgia. Ossetian speakers number about 525,000, sixty percent of whom live in North Ossetia, and ten percent in South Ossetia.
Native Speakers: 329,002
Spoken Natively in: Aruba, Bonaire, Curaçao; also in Caribbean Netherlands
Official Language in: Aruba Bonaire Curaçao Caribbean Netherlands
Papiamento (or Papiamentu) is the most widely spoken language on the Caribbean ABC islands, having the official status on the islands of Aruba and Curaçao. The language is also recognized on Bonaire by the Dutch government. Papiamento is a creole language derived from African languages and either Portuguese or Spanish, with some influences from Amerindian languages, English, and Dutch. Papiamento has two main dialects: Papiamento, spoken primarily on Aruba; and Papiamentu, spoken primarily on Bonaire and Curaçao.
Native Speakers: 40–60 million (2007–2009)
Spoken Natively in: Pakistan and Afghanistan. Also spoken among the Pashtun diaspora
Official Language in: Afghanistan
Pashto (پښتو, Pax̌to, IPA: [paʂˈto, paçˈto, puxˈto]; alternatively spelled Pakhto or Pushto), also known as Afghani or Pathani, is the native language of the Pashtun people of South-Central Asia. Pashto is a member of the Eastern Iranian languages group, and is descended from Avestan, the oldest preserved Iranian language. Pashto is spoken in Afghanistan and Pakistan, as well as among the Pashtun diaspora around the world. Pashto belongs to the Northeastern Iranic branch of the Indo-Iranian language family, although Ethnologue lists it as Southeastern Iranian. The number of Pashtuns or Pashto-speakers is estimated 50-60 million people worldwide. Pashto is one of the two official languages of Afghanistan (the other being Dari Persian), and a regional language in western and northwestern Pakistan.
Native Speakers: 60 million (2009) (110 million total speakers)
Spoken Natively in: Western Asia: Iran, Turkey, Iraq, UAE, Saudi Arabia, Qatar, Israel, Bahrain, Kuwait, Oman, Syria, Yemen, Lebanon, Azerbaijan, Georgia Central Asia: Tajikistan, Uzbekistan, Kyrgyzstan, Kazakhstan, Turkmenistan South Asia: Afghanistan, Pakistan, India, Bangladesh East Asia: Japan Europe: Russia, Germany, United Kingdom, France, Sweden, Netherlands, Austria, Norway, Greece, Ukraine, Finland, Belgium, Spain, Italy America: United States, Canada Australasia: Australia, New Zealand
Official Language in: Iran Afghanistan
Persian (local name: فارسی farsi [fɒːɾˈsiː]) is an Iranian language within the Indo-Iranian branch of the Indo-European languages. "The official language of Iran is sometimes called Farsi in English and other languages. This is a transliteration of the native name of the language, however many, including the ISO and the Academy of Persian Language and Literature, prefer the name Persian for the language. Some speakers use the older local name: Parsi (پارسی parsi [pɒːɾˈsiː])." "The name Persian derives from the province of Pārs (modern Fārs) in southwestern Iran." Persian is primarily spoken in Iran, Afghanistan (as Dari since 1958 due to political reasons), Tajikistan (as Tajik due to political reasons by the USSR), and countries which historically came under Persian influence. The Persian language is classified as a continuation of Middle Persian, the official religious and literary language of Sassanid Iran, itself a continuation of Old Persian, the language of the Persian Empire in the Achaemenid era. Persian is a pluricentric language and its grammar is similar to that of many contemporary European languages. Persian has ca. 110 million native speakers, holding official status respectively in Iran, Afghanistan and Tajikistan. For centuries Persian has also been a prestigious cultural language in Central Asia, South Asia, and Western Asia. Persian has had a considerable, mainly lexical influence on neighboring languages, particularly the Turkic languages in Central Asia, Caucasus, and Anatolia, neighboring Iranian languages, as well as Armenian, and Indo-Aryan languages, especially Urdu. It also exerted some influence on Arabic, particularly Iraqi Arabic and Khuzestani Arabic, while borrowing much vocabulary from it after the Muslim conquest of Persia. With a long history of literature in the form of Middle Persian before Islam, Persian was the first language in Muslim civilization to break through Arabic’s monopoly on writing, and the writing of poetry in Persian was established as a court tradition in many eastern courts. Some of the famous works of Persian literature are the Shahnameh of Ferdowsi, works of Rumi (Molana), Rubaiyat of Omar Khayyam, Divan of Hafiz and poems of Saadi.
Native Speakers: 40 million (2007)
Spoken Natively in: Poland; bordering regions of Ukraine, Slovakia, Czech Republic; along the Belarusian–Lithuanian and Belarusian–Latvian border; Germany, Romania, Israel.
Official Language in: European Union, Poland
Polish (język polski, polszczyzna) is a language of the Lechitic subgroup of West Slavic languages, used throughout Poland (being that country's official language) and by Polish minorities in other countries. Its written standard is the Polish alphabet, which has several additions to the letters of the basic Latin script. Despite the pressure of non-Polish administrations in Poland, who have often attempted to suppress the Polish language, a rich literature has developed over the centuries, and the language is currently the largest, in terms of speakers, of the West Slavic group. It is also the second most widely spoken Slavic language, after Russian and ahead of Ukrainian.
Native Speakers: 215 million (2010)
Spoken Natively in: Brazil, Mozambique, Angola, Portugal, Guinea-Bissau, East Timor, Macau, Cape Verde, Sao Tome and Principe
Official Language in: Brazil, Mozambique, Angola, Portugal, Guinea-Bissau, East Timor, Macau, Cape Verde, Sao Tome and Principe
Portuguese (português or língua portuguesa) is a Romance language. It is the official language of Portugal, Brazil, Mozambique, Angola, Cape Verde, Guinea-Bissau, and São Tomé and Príncipe. Portuguese has co-official status (alongside the indigenous language) in Macau in East Asia, East Timor in South East Asia and in Equatorial Guinea in Central Africa; Portuguese speakers are also found in Goa, Daman and Diu in India. With approximately 280 million speakers (210 to 215 million native), Portuguese is the 5th or 6th most spoken language in the world, the 3rd most spoken language in the western hemisphere, and the most spoken language in the southern hemisphere. Spanish author Miguel de Cervantes once called Portuguese "the sweet and gracious language" and Spanish playwright Lope de Vega referred to it as "sweet", while the Brazilian writer Olavo Bilac poetically described it as "a última flor do Lácio, inculta e bela" (the last flower of Latium, rustic and beautiful). Portuguese is also termed "the language of Camões", after one of Portugal's greatest literary figures, Luís Vaz de Camões. In March 2006, the Museum of the Portuguese Language, an interactive museum about the Portuguese language, was founded in São Paulo, Brazil, the city with the greatest number of Portuguese language speakers in the world.
Native Speakers: 100 million (2010)
Spoken Natively in: India, Pakistan
Official Language in: India (Indian states of Punjab & Haryana, secondary officially recognised language in the states of Himachal Pradesh, Delhi, & West Bengal)
Punjabi (Gurmukhi: ਪੰਜਾਬੀ, Shahmukhi: پنجابی, Devanagari: पंजाबी) is an Indo-Aryan language spoken by inhabitants of the historical Punjab region (north western India and eastern Pakistan). In Pakistan, Punjabi is the most widely spoken native tongue. There are some 104 million (2008) native speakers of the Punjabi language; an estimated 76 million in Pakistan (2008) and 28 million in India (2001), and millions in the USA, UK, Canada and Persian Gulf countries Such As UAE, Kuwait And Saudi Arabia making it the 10th most widely spoken language in the world. Native speakers of the Punjabi language and their respective diaspora are referred to as ethnic Punjabis.The Punjabi language has different dialects, spoken in the different sub-regions of greater Punjab. The Majhi dialect is Punjabi's prestige dialect and shared by both countries. This dialect is considered as textbook Punjabi and is spoken in the historical region of Majha, centralizing in Lahore and Amritsar. Along with the Hindko and Western Pahari languages, Punjabi is unusual among modern Indo-European languages because it is a tonal language.
Native Speakers: 8.9 million (2007)
Spoken Natively in: Peru, Bolivia, Colombia, Ecuador, Chile, and Argentina
Official Language in: Peru and Bolivia
Quechua (endonym "Runa Simi" is a Native South American language family spoken primarily in the Andes of South America, derived from a common ancestral language. It is the most widely spoken language family of the indigenous peoples of the Americas, with a total of probably some 8 to 10 million speakers.
Native Speakers: 24 million (2007) Second language: 4 million
Spoken Natively in: Romania, Moldova, Transnistria (Disputed region) Minority in: Israel, Serbia, Ukraine, Hungary
Official Language in: Romania Moldova Vojvodina European Union
Romanian (or Daco-Romanian; obsolete spellings Rumanian, Roumanian; self-designation: română, limba română [ˈlimba roˈmɨnə]("the Romanian language") or românește (lit. "in Romanian") is a Romance language spoken by around 24 million people as a native language, primarily in Romania and Moldova, and by another 4 million people as a second language. It has official status in Romania, Republic of Moldova, the Autonomous Province of Vojvodina in Serbia and in the autonomous Mount Athos in Greece. In the Republic of Moldova, besides the term limba română, the language is also often called limba moldovenească ("Moldovan"); to avoid the political overtones both terms have in that country, a compromise solution has been to call it limba de stat ("the state language"). Romanian speakers are scattered across many other countries, notably Italy, Spain, Ukraine, Bulgaria, the United States, Canada, Israel, Russia, Portugal, the United Kingdom, France, and Germany.
Native Speakers: 35,000 (language of best command) (2000) 60,000 (regular speakers)
Spoken Natively in: Switzerland
Official Language in: Switzerland
Romansh (also spelled Romansch, Rumants(c)h, or Romanche; Romansh: rumantsch / rumauntsch / romontsch /rumàntsch; German: Rätoromanisch; Italian: Romancio) is a Rhaeto-Romance language descended from the Vulgar Latin spoken by the Roman era occupiers of the region. It is closely related to French, Occitan, and Lombard, as well as the other Romance languages to a lesser extent. Romansh is one of the four national languages of Switzerland, along with German, French and Italian. In the 2000 Swiss census, 35,095 people (of which 27,038 in the canton of Grisons) indicated Romansh as the language of "best command", and 61,815 also as a "regularly spoken" language. Spoken by around 0.9% of Switzerland's 7.7 million inhabitants, Romansh is Switzerland's least-used national language in terms of number of speakers and the tenth most spoken language in Switzerland overall.
Native Speakers: 155 million (2010) 110 million L2 speakers
Spoken Natively in: Russia, countries of the former Soviet Union, emigrant communities around the world, notably in the United States, UK, Germany, Israel, Canada, Australia, and Latin America, Egypt.
Official Language in: Russia (state) Belarus (state) Kazakhstan (official) Kyrgyzstan (official) Tajikistan (inter-ethnic communication) Abkhazia(official) South Ossetia(state) Transnistria (official; unrecognized country) Moldova Gagauzia (official) Romania A number of municipalities in Tulcea County and Constanța County Ukraine (regional and minority) Crimea (has some of the official functions Dnipropetrovsk Oblast Donetsk Oblast Kharkiv Oblast Kherson Oblast Luhansk Oblast Mykolaiv Oblast Odessa Oblast Sevastopol Zaporizhia Oblast
Russian (ру́сский язы́к, russkiy yazyk, pronounced [ˈruskʲɪj jɪˈzɨk]) is a Slavic language spoken primarily in Russia, Belarus, Ukraine, Kazakhstan, and Kyrgyzstan. It is an unofficial but widely spoken language in Moldova, Latvia, Estonia, and to a lesser extent, the other countries that were once constituent republics of the USSR. Russian belongs to the family of Indo-European languages and is one of three living members of the East Slavic languages. Written examples of Old East Slavonic are attested from the 10th century onwards. It is the most geographically widespread language of Eurasia and the most widely spoken of the Slavic languages. It is also the largest native language in Europe, with 144 million native speakers in Russia, Ukraine and Belarus. Russian is the 8th most spoken language in the world by number of native speakers and the 4th by total number of speakers. The language is one of the six official languages of the United Nations. Russian distinguishes between consonant phonemes with palatal secondary articulation and those without, the so-called soft and hard sounds. This distinction is found between pairs of almost all consonants and is one of the most distinguishing features of the language. Another important aspect is the reduction of unstressed vowels. Stress, which is unpredictable, is not normally indicated orthographically though an optional acute accent (знак ударения, znak udareniya) may be used to mark stress (such as to distinguish between homographic words, for example замо́к (meaning lock) and за́мок (meaning castle), or to indicate the proper pronunciation of uncommon words or names).
Native Speakers: 14,000 (2001)
Spoken Natively in:
Official Language in: India, Uttarakhand one of the 22 scheduled languages of India
Sanskrit (संस्कृतम् saṃskṛtam [sə̃skɹ̩t̪əm], originally संस्कृता वाक् saṃskṛtā vāk, "refined speech"), is a historical Indo-Aryan language, the primary liturgical language of Hinduism and a literary and scholarly language in Buddhism and Jainism. Today, it is listed as one of the 22 scheduled languages of India and is an official language of the state of Uttarakhand. Sanskrit holds a prominent position in Indo-European studies. The corpus of Sanskrit literature encompasses a rich tradition of poetry and drama as well as scientific, technical, philosophical and dharma texts. Sanskrit continues to be widely used as a ceremonial language in Hindu religious rituals and Buddhist practice in the forms of hymns and mantras. Spoken Sanskrit is still in use in some villages, a few traditional institutions in India and there are many attempts at further popularization.
Native Speakers: 16, 000
Spoken Natively in: People's Republic of China
Official Language in:
Sarikoli language (also Sariqoli, Selekur, Sarikul, Sariqul, Sariköli) is a member of the Pamir subgroup of the Southeastern Iranian languages spoken by Tajiks in China. It is officially referred to in China as the "Tajik language", although it is different from the language spoken in Tajikistan.
Native Speakers: 58,552 in Scotland. 92,400 people aged three and over in Scotland had some Gaelic language ability in 2001 with estimates of additional 500 - 2000 in Nova Scotia. 1,610 speakers in the United States in 2000. 822 in Australia in 2001. 669 in New Zealand in 2006.
Spoken Natively in: United Kingdom, Canada, United States, Australia, New Zealand
Official Language in: Scotland
Scottish Gaelic (Gàidhlig; [ˈkaːlikʲ]) is a Celtic language native to Scotland. A member of the Goidelic branch of the Celtic languages, Scottish Gaelic, like Modern Irish and Manx, developed out of Middle Irish, and thus descends ultimately from Old Irish. The 2001 UK Census showed that a total of 58,652 (1.2% of the Scottish population aged over three years old) in Scotland could speak Gaelic at that time, with the Outer Hebrides being the main stronghold of the language. The census results indicate a decline of 7,300 Gaelic speakers from 1991. Despite this decline, revival efforts exist and the number of younger speakers of the language has increased. Scottish Gaelic is not an official language of the European Union, nor of the United Kingdom. (The only language that is de jure official in any part of the UK is Welsh.) However, it is classed as an autochthonous language under the European Charter for Regional or Minority Languages, which the British government has ratified. In addition, the Gaelic Language (Scotland) Act 2005 gave official recognition to the language and established an official language development body, Bòrd na Gàidhlig. Outside of Scotland, dialects of the language known as Canadian Gaelic exist in Canada on Cape Breton Island, Glengarry County in present-day Eastern Ontario and other isolated areas of the Nova Scotia mainland. The number of present day speakers in Cape Breton is around 2,000, amounting to 1.3% of the population of Cape Breton Island.
Native Speakers: 9 million (2006)
Spoken Natively in: Serbia, Montenegro, Croatia, Bosnia, and neighboring regions
Official Language in: Serbia, Bosnia and Herzegovina, Kosovo
Serbian (Serbian Cyrillic: српски, Latin: srpski, pronounced [sr̩̂pskiː]) is a standardized register of the Serbo-Croatian language spoken by Serbs, mainly in Serbia, Bosnia and Herzegovina, Montenegro, Croatia, and Macedonia. It is official in Serbia and one of the official languages of Bosnia and Herzegovina, and is the principal language of the Serbs. The dialect serving as the basis for the main literary and standard language is Shtokavian, which is also the basis of standard Croatian, Bosnian, and Montenegrin. In particular, Serbian is standardized around Šumadija-Vojvodina and Eastern Herzegovinian subdialects of Shtokavian. The Torlakian dialect of Serbian is spoken in southeast Serbia, and is not standardized, as it represents transitional form to Macedonian and Bulgarian. Serbo-Croatian is the only European language with active digraphia, using both Cyrillic and Latin alphabets. The Bosnian and Serbian varieties use both alphabets while the Croatian variety uses only the Latin alphabet. The Serbian Cyrillic alphabet was devised in 1814 by Serbian linguist Vuk Karadžić, who created the alphabet on phonemic principles. The Latin alphabet was designed by Croatian linguist Ljudevit Gaj in 1830.
Native Speakers: 8.3 million proper (2007)
Spoken Natively in: Zimbabwe, Mozambique, South Africa, Zambia, Botswana
Official Language in:
Shona (or chiShona) is a Bantu language, native to the Shona people of Zimbabwe and southern Zambia; the term is also used to identify peoples who speak one of the Shona language dialects: Zezuru, Karanga, Manyika, Ndau and Korekore. (Some researchers include Kalanga: others recognise Kalanga as a distinct language in its own right.) Shona is a principal language of Zimbabwe, along with Ndebele and the official business language, English. Shona is spoken by a large percentage of the people in Zimbabwe. Other countries that host Shona language speakers are Zambia and Botswana and Mozambique. Shona is the Bantu language most widely spoken as a native language. According to Ethnologue, Shona comprising the Karanga, Zezuru, and Korekore dialects, is spoken by about 10.8 million people. Manyika and Ndau dialects of Shona, listed separately by Ethnologue, and are spoken by 1,025,000 and 2,380,000 people, respectively. The total figure of Shona speakers is then about 14.2 million people. Zulu is the second most widely spoken Bantu language with 10.3 million speakers according to Ethnologue. Shona is a written standard language with an orthography and grammar that was codified during the early 20th century and fixed in the 1950s. The first novel in Shona, Solomon Mutswairo's Feso, was published in 1957. Shona is taught in the schools but is not the general medium of instruction in other subjects. It has a literature and is described through monolingual and bilingual dictionaries (chiefly Shona – English). Modern Shona is based on the dialect spoken by the Karanga people of Masvingo Province, the region around Great Zimbabwe, and Zezuru people of central and northern Zimbabwe. However, all Shona dialects are officially considered to be of equal significance and are taught in local schools. Shona is a member of the large family of Bantu languages. In Guthrie's zonal classification of Bantu languages, zone S10 designates a dialect continuum of closely related varieties, including Shona proper, Manyika, Nambya, and Ndau, spoken in Zimbabwe and central Mozambique; Tawara and Tewe, found in Mozambique; and Ikalanga of Botswana and Western Zimbabwe. Shona speakers most likely moved into present day Zimbabwe from the Mapungubwe and K2 communities in Limpopo South Africa before the invasion of the English settlers. A common misconception is that the speakers of the Karanga dialect were absorbed into the Ndebele culture and language turning them into Kalanga. This misconception is a direct result of the political bias in the national curriculum framework of Zimbabwe. The Kalanga language is widely spoken in Botswana where the Ndebele were never present. The Kalanga language is thought to have been the language used by the Mapungubweans (Department of Archeology Witts University). If this is accurate it follows that the Karanga dialect of Shona is a derivative of Kalanga. Karanga is closer to Kalanga than the rest of the aforementioned dialects. Karanga and Kalanga are both closer to Venda than the other Shona dialects.
Native Speakers: 26 million (2007)
Spoken Natively in: Pakistan. Also India, Hong Kong, Oman, Philippines, Indonesia, Singapore, UAE, UK, USA, Afghanistan, Sri Lanka
Official Language in: Pakistan (Sindh), India
Sindhi (Sindhi: سنڌي) is the language of the historical Sindh region, spoken by the Sindhi people. It is spoken by 53,410,910 people in Pakistan and some 5,820,485 people in India. It is the official language of the province of Sindh. In India, Sindhi is one of the scheduled languages officially recognized by the federal government. Abroad there are some 2.6 million Sindhis. Sindhi is an Indo-Aryan language of the Indo-Iranian branch of the Indo-European language family. It has influences from a local version of spoken form of Sanskrit and from Balochi spoken in the adjacent province of Balochistan. Most Sindhi speakers are concentrated in the Sindh province and in Kutch, India where Sindhi is a local language. The remaining speakers in India are composed of the Hindu Sindhis who migrated from Sindh and settled in India after partition and the Sindhi diaspora worldwide.
Native Speakers: 16 million (2007)
Spoken Natively in:
Official Language in: Sri Lanka
Sinhala ISO 15919: siṁhala, pronounced [ˈsiŋɦələ]), also known as Sinhalese (older spelling: Singhalese) in English, also known locally as Helabasa, is the mother tongue of the Sinhalese people, who make up the largest ethnic group in Sri Lanka, numbering about 15 million. Sinhala is also spoken, as a second language by other ethnic groups in Sri Lanka, totalling about 3 million. It belongs to the Indo-Aryan branch of the Indo-European languages. Sinhala is one of the official and national languages of Sri Lanka, along with Tamil. Sinhala, along with Pali, played a major role in the development of Theravada Buddhist literature. Sinhala has its own writing system, the Sinhala alphabet, which is a member of the Brahmic family of scripts, and a descendant of the ancient Indian Brahmi script. The oldest Sinhala inscriptions found are from the 6th century BCE, on pottery; the oldest existing literary works date from the 9th century CE. The closest relative of Sinhala is the language of the Maldives and Minicoy Island (India), Dhivehi.
Native Speakers: over 7 million (2001 census)
Spoken Natively in: Slovakia; minority language in Czech Republic, Serbia, Hungary
Official Language in: European Union,
Slovak (slovenčina, not to be confused with slovenski jezik or slovenščina, the native name of the Slovene language), is an Indo-European language that belongs to the West Slavic languages (together with Czech, Polish, Silesian, Kashubian, and Sorbian). Slovak is the official language of Slovakia, where it is spoken by 5 million people. There are also Slovak speakers in the United States, the Czech Republic, Serbia, Ireland, Romania, Poland, Canada, Hungary, Croatia, the United Kingdom, Australia, Austria, and Ukraine.
Native Speakers: 2.5 million
Spoken Natively in: Slovenia, Italy (in Friuli Venezia Giulia), Austria (in Carinthia and Styria), Hungary (in Vas); emigrant communities in various countries
Official Language in: Slovenia, European Union, Regional or local official language in: Austria, Hungary Italy
Slovene or Slovenian (slovenski jezik or slovenščina, not to be confused with slovenčina, the native name of Slovak) belongs to the group of South Slavic languages. It is spoken by approximately 2.5 million speakers worldwide, the majority of whom live in Slovenia. It is the first language of about 1.85 million people and is one of the 23 official and working languages of the European Union.
Native Speakers: 15 million (2007)
Spoken Natively in: Somalia, Djibouti
Official Language in: Somalia
Somali language (Somali: Af-Soomaali; Arabic: اللغة الصومالية) is a member of the East Cushitic branch of the Afro-Asiatic language family. Its nearest relatives are Afar and Oromo. Somali is the best documented of the Cushitic languages, with academic studies beginning before 1900.
Native Speakers: at least 5 million
Spoken Natively in: Lesotho, South Africa
Official Language in: Lesotho, South Africa
Sotho language, also known as Sesotho, Southern Sotho, or Southern Sesotho, is a Bantu language spoken primarily in South Africa, where it is one of the 11 official languages, and in Lesotho, where it is the national language. It is an agglutinative language which uses numerous affixes and derivational and inflexional rules to build complete words.
Native Speakers: 405 million (2010) 60 million as a second language
Spoken Natively in:
Official Language in: De jure: Bolivia, Colombia, Costa Rica, Cuba, Dominican Republic, Ecuador, El Salvador, Equatorial Guinea, Guatemala, Honduras, Panama, Paraguay, Peru, Puerto Rico, Spain, Venezuela De facto: Argentina, Chile, Mexico, Nicaragua, SADR, Uruguay
Spanish (español) is a Romance language that originated in Spain. It is also called Castilian (castellano About this sound listen (help•info)) after the particular region of Spain, Castile, where it originated. There are approximately 407 million people speaking Spanish as a native language, making it the second-most-spoken language by number of native speakers after Mandarin. Spanish has the largest amount of native speakers of any Indo-European language in the world, the largest language family on Earth. Spanish is one of the six official languages of the United Nations, and is used as an official language by the European Union and Mercosur. Spanish is a part of the Ibero-Romance group that evolved from several dialects of spoken Latin in central-northern Iberia around the ninth century and gradually spread with the expansion of the Kingdom of Castile (present northern Spain) into central and southern Iberia during the later Middle Ages. Early in its history, the Spanish vocabulary was enriched by its contact with Basque, Arabic and related Iberian Romance languages, and the language continues to adopt foreign words from a variety of other languages, as well as developing new words. Spanish was taken most notably to the Americas as well as to Africa and Asia-Pacific with the expansion of the Spanish Empire between the fifteenth and nineteenth centuries, where it became the most important language for government and trade. Spanish is the most popular second language learned by native speakers of American English. In many other countries as well, in the 21st century, the learning of Spanish as a foreign language has grown significantly, facilitated in part because of the growing population demographics and economic performance of numerous Spanish-speaking countries such as Chile, Peru, Colombia, and Mexico. Growing popularity in extensive international tourism and seeking less expensive retirement destinations for North Americans, Europeans, Australians, and Japanese are other common reasons. Spanish is the most widely understood language in the Western Hemisphere, with significant populations of native Spanish speakers ranging from the tip of Patagonia to as far north as New York City and Chicago. Additionally, there are over 10,000,000 fluent second language speakers in both Brazil and the United States. Since the early 21st century, it has arguably superseded French in becoming the second-most-studied language and the second language in international communication, after English.
Native Speakers: 300,000
Spoken Natively in: Suriname
Official Language in:
Sranan (also Sranan Tongo or Sranantongo "Surinamean tongue", Surinaams, Surinamese, Suriname Creole, Taki Taki) is a creole language spoken as a lingua franca by approximately 300,000 people in Suriname. It was previously called nengre or in Dutch negerengels ("Negro English"). Since this language is shared between the Dutch-, Javanese-, Hindustani-, and Chinese-speaking communities, most Surinamese speak it as a lingua franca, both the Surinamese in Suriname, a former Dutch colony, as well as the immigrants of Surinamese origin in the Netherlands.
Native Speakers: 6 million (2007) 40 million L2 speakers
Spoken Natively in: Burundi, Comoros (as Comorian), DR Congo, Kenya, Mayotte (mostly as Comorian), Mozambique, Oman, Rwanda, Somalia (as Kibajuni and Chimwini), Tanzania, Uganda
Official Language in: African Union Tanzania Kenya Uganda Comoros (as Comorian)
Swahili language or Kiswahili is a Bantu language spoken by various ethnic groups that inhabit several large stretches of the Mozambique Channel coastline from northern Kenya to northern Mozambique, including the Comoros Islands. It is also spoken by ethnic minority groups in Somalia. Although only five million people speak Swahili as their mother tongue, it is used as a lingua franca in much of East Africa, meaning the total number of speakers exceeds 60 million. Swahili serves as a national, or official language, of five nations: Tanzania, Kenya, Uganda, the Comoros and the Democratic Republic of the Congo. Some Swahili vocabulary is derived from Arabic through more than twelve centuries of contact with Arabic-speaking inhabitants of the coast of southeastern Africa. It has also incorporated Persian, German, Portuguese, English, and French words into its vocabulary through contact during the past five centuries.
Native Speakers: 2.0 million (1996–2006)
Spoken Natively in: Swaziland, South Africa, Lesotho, Mozambique
Official Language in: Swaziland South Africa
Swazi or Swati language (Swazi: siSwati [siswatʼi]; Zulu: isiSwazi [isiswaz̤i]) is a Bantu language of the Nguni group spoken in Swaziland and South Africa by the Swazi people. The number of speakers is estimated to be in the region of 3 million. The language is taught in Swaziland and some South African schools in Mpumalanga and KaNgwane areas. Swazi is an official language of Swaziland, (along with English), and is also one of the eleven official languages of South Africa. Although the preferred term is "Swati" among native speakers, in English it is generally referred to as Swazi: this is taken from the Zulu name for the language, isiSwazi. Swazi is most closely related to the other "Tugela" Nguni language, Phuthi; but is also very close to the "Zunda" Nguni languages: Zulu, Southern Ndebele, Northern Ndebele, and Xhosa.
Native Speakers: 8.7 million (2007)
Spoken Natively in: Sweden, Finland
Official Language in: Sweden, Finland, European Union, Nordic Council
Swedish is a North Germanic language, spoken by approximately 10 million people, predominantly in Sweden and parts of Finland, especially along its coast and on the Åland islands. It is largely mutually intelligible with Norwegian and Danish. Along with the other North Germanic languages, Swedish is a descendant of Old Norse, the common language of the Germanic peoples living in Scandinavia during the Viking Era. It is currently the largest of the North Germanic languages by numbers of speakers. Standard Swedish, used by most Swedish people, is the national language that evolved from the Central Swedish dialects in the 19th century and was well established by the beginning of the 20th century. While distinct regional varieties descended from the older rural dialects still exist, the spoken and written language is uniform and standardized. Some dialects differ considerably from the standard language in grammar and vocabulary and are not always mutually intelligible with Standard Swedish. These dialects are confined to rural areas and are spoken primarily by small numbers of people with low social mobility. Though not facing imminent extinction, such dialects have been in decline during the past century, despite the fact that they are well researched and their use is often encouraged by local authorities. The standard word order is subject–verb–object, though this can often be changed to stress certain words or phrases. Swedish morphology is similar to English; that is, words have comparatively few inflections. There are two genders, two grammatical cases, and a distinction between plural and singular. Older analyses posit the cases nominative and genitive and there are some remains of distinct accusative and dative forms as well. Adjectives are compared as in English, and are also inflected according to gender, number and definiteness. The definiteness of nouns is marked primarily through suffixes (endings), complemented with separate definite and indefinite articles. The prosody features both stress and in most dialects tonal qualities. The language has a comparatively large vowel inventory. Swedish is also notable for the voiceless dorso-palatal velar fricative, a highly variable consonant phoneme.
Native Speakers: 28 million (2007) 96% of the Philippines can speak Tagalog (2000)
Spoken Natively in: Philippines
Official Language in: Philippines Philippines (in the form of Filipino)
Tagalog (Tagalog in Latin Alphabet: Wikang Tagalog or transliterated from Baybayin Alphabet: Wikang Tagalog) (/təˈɡɑːlɒɡ/; Tagalog pronunciation: [tɐˈɡaːloɡ]) is an Austronesian language spoken as a first language by a third of the population of the Philippines and as a second language by most of the rest. It is the first language of the Philippine region IV (CALABARZON and MIMAROPA), of Bulacan and of Metro Manila. Its standardized form, officially named Filipino, is the national language and one of two official languages of the Philippines, the other being English. It is related to other Philippine languages such as Ilokano, Bisayan, and Kapampangan.
Native Speakers: 70 million (2007) 8 million as a second language
Spoken Natively in: India, Sri Lanka, Malaysia, Singapore, Réunion, Mauritius
Official Language in: Indian states: Tamil Nadu and Andaman and Nicobar Islands Puducherry, Sri Lanka, and Singapore.
Tamil (தமிழ், tamiḻ, [t̪ɐmɨɻ] ?, alternative spelling: Thamizh) is a Dravidian language spoken predominantly by Tamil people of South India and North-east Sri Lanka. It has official status in the Indian state of Tamil Nadu and in the Indian union territory of Andaman and Nicobar Islands and Puducherry. Tamil is also a national language of Sri Lanka and an official language of Singapore. It is one of the 22 scheduled languages of India and was declared a classical language by the government of India in 2004. Tamil is also spoken by significant minorities in Malaysia, Canada, USA and Mauritius as well as emigrant communities around the world. Tamil is one of the longest surviving classical languages in the world. It has been described as "the only language of contemporary India which is recognizably continuous with a classical past" and having "one of the richest literatures in the world". Tamil literature has existed for over 2000 years. The earliest epigraphic records found on rock edicts and hero stones date from around the 3rd century BCE. The earliest period of Tamil literature, Sangam literature, is dated from ca. 300 BCE – 300 CE. Tamil language inscriptions written c. 1st century BCE and 2nd century CE have been discovered in Egypt, Sri Lanka and Thailand. The two earliest manuscripts from India, to be acknowledged and registered by UNESCO Memory of the World register in 1997 and 2005 were in Tamil. More than 55% of the epigraphical inscriptions (about 55,000) found by the Archaeological Survey of India are in the Tamil language. According to a 2001 survey, there were 1,863 newspapers published in Tamil, of which 353 were dailies. It has the oldest extant literature amongst other Dravidian languages. The variety and quality of classical Tamil literature has led to its being described as "one of the great classical traditions and literatures of the world".
Native Speakers: 76 million (2007)
Spoken Natively in: India worldwide diaspora
Official Language in: India
Telugu (తెలుగు telugu, IPA: [t̪eluɡu]) is a South-Central Dravidian language primarily spoken in the state of Andhra Pradesh, India, where it is an official language. It is also spoken in the neighbouring states of Chattisgarh, Karnataka, Maharashtra, Orissa and Tamil Nadu, and is spoken in the bordering city of Yanam, in the neighboring territory of Pondicherry. According to the 2001 Census of India, Telugu is the language with the third largest number of native speakers in India (74 million), thirteenth in the Ethnologue list of most-spoken languages worldwide, and most spoken Dravidian language. It is one of the twenty-two scheduled languages of the Republic of India and one of the four classical languages. Telugu was influenced by Sanskrit and Prakrit. Telugu borrowed several features of Sanskrit that have subsequently been lost in Sanskrit's daughter languages such as Hindi and Bengali, especially in the pronunciation of some vowels and consonants. It has also been influenced by Urdu around Hyderabad city. Telugu is written in a Brahmic alphabet.
Native Speakers: 500,000 (2004) Tetun Dili widespread in East Timor as L2
Spoken Natively in: Indonesia, East Timor
Official Language in: East Timor
Tetum (also Tetun) is an Austronesian language spoken on the island of Timor. It is spoken in Belu Regency in Indonesian West Timor, and across the border in East Timor, where it is one of the two official languages. In East Timor a creolized form, Tetun Dili, is widely spoken fluently as a second language; without previous contact, Tetum and Tetun Dili are unintelligible. Besides the grammatical simplification involved in creolization, Tetun Dili has been greatly influenced by the vocabulary of Portuguese, the other official language of East Timor.
Native Speakers: 20 million (2000) Total: 60 million (including second-language speakers, 2001)
Spoken Natively in: Thailand
Official Language in: Thailand
Thai (ภาษาไทย Phasa Thai [pʰāːsǎː tʰāj] ), more precisely Central Thai or Siamese, is the national and official language of Thailand and the native language of the Thai people, Thailand's dominant ethnic group. Thai is a member of the Tai group of the Tai–Kadai language family. Some words in Thai are borrowed from Pali, Sanskrit and Old Khmer. It is a tonal and analytic language. Thai also has a complex orthography and relational markers. Thai is mutually intelligible with Lao. Native speakers of these two languages can understand one another without great difficulty.
Native Speakers: 122,000 (2004) 4 million L2 speakers
Spoken Natively in: Papua New Guinea
Official Language in: Papua New Guinea
Tok Pisin (ˌtɔːk ˈpɪsɪn/; Tok Pisin [ˌtokpiˈsin]) is a creole spoken throughout Papua New Guinea. It is an official language of Papua New Guinea and the most widely used language in that country. In parts of Western, Gulf, Central, Oro Province and Milne Bay Provinces, however, the use of Tok Pisin has a shorter history, and is less universal, especially among older people. Between five and six million people use Tok Pisin to some degree, although by no means do all of these speak it well. Between one and two million are exposed to it as a first language, in particular the children of parents or grandparents originally speaking different vernaculars (for example, a mother from Madang and a father from Rabaul). Urban families in particular, and those of police and defence force members, often communicate among themselves in Tok Pisin, either never gaining fluency in a vernacular ("tok ples"), or learning a vernacular as a second (or third) language, after Tok Pisin (and possibly English). Perhaps one million people now use Tok Pisin as a primary language.
Native Speakers: 3.7 million (2006)
Spoken Natively in: Mozambique, South Africa, Swaziland, Zimbabwe
Official Language in: South Africa
Tsonga or Xitsonga language is spoken in southern Africa by the Tsonga people, also known as the Shangaan.
Native Speakers: 3.4 million in South Africa (2006) 1.1 million in Botswana (1993)
Spoken Natively in: Botswana, South Africa, Zimbabwe, Namibia
Official Language in: Botswana South Africa
Tswana or Setswana is a language spoken in Southern Africa by about 4.5 million people. It is a Bantu language belonging to the Niger–Congo language family within the Sotho languages branch of Zone S (S.30), and is closely related to the Northern- and Southern Sotho languages, as well as the Kgalagadi language and the Lozi language. Tswana is an official language and lingua franca of Botswana spoken by almost 1.1 million of its inhabitants. However, the majority of Tswana speakers are found in South Africa where 3.4 million people speak the language. Until 1994, South African Tswana people were notionally citizens of Bophuthatswana, one of the few bantustans that actually became reality as planned by the Apartheid regime. A small number of speakers are also found in Zimbabwe and Namibia, where 29,400 and 12,300 people speak the language, respectively.
Native Speakers: 63 million (2007)
Spoken Natively in: Turkey, Germany, Bulgaria, Macedonia, Northern Cyprus, Greece, Azerbaijan, Kosovo, Romania
Official Language in: Turkey Northern Cyprus (recognized by Turkey), Cyprus
Turkish, also referred to as Istanbul Turkish or Anatolian Turkish, is the most populous of the Turkic languages, with over 70 million native speakers. Speakers are located predominantly in Turkey, with smaller groups in Germany, Bulgaria, Macedonia, Northern Cyprus, Greece, and other parts of Eastern Europe, Caucasus and Central Asia. The roots of the language can be traced to Central Asia, with the first known written records dating back nearly 1,300 years. To the west, the influence of Ottoman Turkish—the variety of the Turkish language that was used as the administrative and literary language of the Ottoman Empire—spread as the Ottoman Empire expanded. In 1928, as one of Atatürk's Reforms in the early years of the Republic of Turkey, the Ottoman script was replaced with a Latin alphabet. Concurrently, the newly founded Turkish Language Association initiated a drive to reform and standardize the language. The distinctive characteristics of Turkish are vowel harmony and extensive agglutination. The basic word order of Turkish is subject–object–verb. Turkish has no noun classes or grammatical gender. Turkish has a strong T–V distinction and usage of honorifics. Turkish uses second-person pronouns that distinguish varying levels of politeness, social distance, age, courtesy or familiarity toward the addressee. The plural second-person pronoun and verb forms are used referring to a single person out of respect. On occasion, double plural second person "sizler" may be used to refer to a much-respected person.
Native Speakers: 7 million (2007)
Spoken Natively in: Turkmenistan, Iran, Afghanistan, Stavropol krai (Russia)
Official Language in: Turkmenistan
Turkmen (türkmençe, türkmen dili, Cyrillic: түркменче, түркмен дили, Persian: تورکمن ﺗﻴﻠی, تورکمنچه), a Turkic language, is the national language of Turkmenistan. It is spoken by 4 million people in Turkmenistan, and by an additional 700,000 in northwestern Afghanistan and 1,400,000 in northeastern Iran.
Native Speakers: 30 million (2007)
Spoken Natively in: Ukraine
Official Language in: Ukraine Transnistria (unrecognized de facto state)
Ukrainian (украї́нська мо́ва / ukrayins'ka mova, [ukrɑˈjɪɲsʲkɑ ˈmɔwɑ], formerly Ruthenian - ру́ська, руси́нська мо́ва / rus'ka, rusyns'ka mova) is a member of the East Slavic subgroup of the Slavic languages. It is the official state language of Ukraine and the principal language of the Ukrainians. Written Ukrainian uses a variant of the Cyrillic script. The Ukrainian language traces its origins to the Old East Slavic of the early medieval state of Kievan Rus'. Ukrainian is a lineal descendant of the colloquial language used in Kievan Rus' (10th–13th century). From 1804 until the Russian Revolution Ukrainian was banned from schools in the Russian Empire of which Ukraine was a part at the time. It has always maintained a sufficient base in Western Ukraine where the language was never banned in its folklore songs, itinerant musicians, and prominent authors. The standard Ukrainian language is regulated by the National Academy of Sciences of Ukraine (NANU), particularly by its Institute for the Ukrainian Language, Ukrainian language-informatical fund, and Potebnya Institute of Language Studies. Ukrainian, Russian, Belarusian, and Rusyn have a high degree of mutual intelligibility. Lexically, the closest to Ukrainian is Belarusian (84% of common vocabulary), followed by Polish (70%), Serbo-Croatian (68%), Slovak (66%) and Russian (62%).
Native Speakers: 66 million (2007)
Spoken Natively in: Pakistan, India
Official Language in: Pakistan, India
Urdu /ˈʊərduː/ (Urdu: اُردُو [ˈʊrd̪u]), or more precisely Standard Urdu, is a register of the Hindi-Urdu language. It is the national language and lingua franca of Pakistan, and is also widely spoken in India, where it is one of the 22 scheduled languages and an official language of five states. In Nepal it's 10th largest language according to the latest census. Based on the Khariboli dialect of Delhi, Urdu developed under the influence of Persian, Arabic, and Turkic languages over the course of almost 900 years. It began to take shape in the region of Uttar Pradesh in the Indian subcontinent during the Delhi Sultanate (1206–1527), and continued to develop under the Mughal Empire (1526–1858). Urdu is mutually intelligible with Standard Hindi spoken in India. Both languages share the same Indo-Aryan base, and are so similar in basic structure, grammar and to large extend vocabulary and phonology, that they appear to be one language. The combined population of Urdu and Standard Hindi speakers is the fourth largest in the world. Mughals hailed from the Barlas tribe which was of Mongol origin, the tribe had embraced Turkic and Persian culture, and resided in Turkestan and Khorasan. Their mother tongue was the Chaghatai language (known to them as Turkī, "Turkic") and they were equally at home in Persian, the lingua franca of the Timurid elite. But after their arrival in the Indian subcontinent, the need to communicate with local inhabitants led to use of Indo-Aryan languages written in the Persian alphabet, with some literary conventions and vocabulary retained from Persian and Turkic; this eventually became a new standard called Hindustani, which is the direct predecessor of Urdu. Urdu is often contrasted with Hindi. Apart from religious associations, the differences are largely restricted to the standard forms: Standard Urdu is conventionally written in the Nastaliq style of the Persian alphabet and relies heavily on Persian and Arabic as a source for technical and literary vocabulary, whereas Standard Hindi is conventionally written in Devanāgarī and draws on Sanskrit. However, both have large numbers of Arabic, Persian and Sanskrit words, and most linguists consider them to be two standardized forms of the same language, and consider the differences to be sociolinguistic, though a few classify them separately. Mutual intelligibility decreases in literary and specialized contexts which rely on educated vocabulary. Due to religious nationalism since the partition of British India and continued communal tensions, native speakers of both Hindi and Urdu frequently assert them to be distinct languages, despite the numerous similarities between the two in a colloquial setting. However, it is quite easy in a longer conversation to distinguish differences in vocabulary and pronunciation of some Urdu letters.
Native Speakers: 26 million (2007)
Spoken Natively in: Uzbekistan, Kyrgyzstan, Afghanistan, Pakistan, Kazakhstan, Turkmenistan, Tajikistan, Russia, China
Official Language in: Uzbekistan
Uzbek (Oʻzbek tili or Oʻzbekcha in Latin script, Ўзбек тили or Ўзбекча in Cyrillic script; اوزبیک تیلی or اوزبیکچه in Arabic script) is a Turkic language and the official language of Uzbekistan. It has about 35.3 million native speakers, and it is spoken by the Uzbeks in Uzbekistan and elsewhere in Central Asia. Uzbek belongs to the southeastern Turkic or Karluk family of Turkic languages, from which it gets its lexicon and grammar, while other influences rose from Persian, Arabic and Russian. One of the most distinguishing aspects of Uzbek from other Turkic languages is its rounding of the vowel /a/ to /ɒ/ or /ɔ/, a feature influenced by Persian.
Native Speakers: 980,000 in South Africa (2006) 84,000 in Zimbabwe (1989)
Spoken Natively in: South Africa, Zimbabwe
Official Language in: South Africa
Venda, also known as Tshivenḓa or Luvenḓa, is a Bantu language and an official language of South Africa. The majority of Venda speakers live in the northern part of South Africa's Limpopo Province, but about 10% of its speakers live in Zimbabwe. The Venda language is related to Kalanga (Western Shona, different from Shona, official language of Zimbabwe) which is spoken in Botswana and Zimbabwe. During the Apartheid era of South Africa, the bantustan of Venda was set up to cover the Venda speakers of South Africa.
Native Speakers: 76 million (2007)
Spoken Natively in: Vietnam, Vietnamese diaspora
Official Language in: Vietnam
Vietnamese (tiếng Việt) is the national, official language of Vietnam. It is the mother tongue of Vietnamese people (Kinh), and of about three million Vietnamese residing elsewhere. It also is spoken as a first or second language by many ethnic minorities of Vietnam. It is part of the Austro-Asiatic language family of which it has, by far, the most speakers (several times that of the other combined Austro-Asiatic languages.) Much of Vietnamese vocabulary has been borrowed from Chinese, and it formerly used a modified Chinese writing system and given vernacular pronunciation. As a byproduct of French colonial rule, Vietnamese was influenced by the French language; the Vietnamese alphabet (quốc ngữ) in use today is a Latin alphabet with additional diacritics for tones, and certain letters.
Native Speakers: 770,700 total speakers (2004) — Wales: 611,000 speakers, around 21.7% of the population of Wales (all skills), with 57% (315,000) considering themselves fluent — England: 150,000 — Chubut Province, Argentina: 5,000 — United States: 2,500 — Canada: 2,200
Spoken Natively in: Wales and Argentina
Official Language in: Wales
Welsh (Cymraeg or y Gymraeg, pronounced [kəmˈrɑːɨɡ, ə ɡəmˈrɑːɨɡ]) is a member of the Brythonic branch of the Celtic languages spoken natively in Wales, by some along the Welsh border in England, and in Y Wladfa (the Welsh colony in Chubut Province, Argentina).Historically, it has also been known in English as "the British tongue", "Cambrian","Cambric" and "Cymric". A 2004 survey by the Welsh Language Board indicated that 21.7% of the population of Wales (611,000 people)were able to speak the language, compared with 20.8% in the 2001 Census. Of those 611,000 Welsh speakers, 57% (315,000) considered themselves fluent, and 78% (477,000) considered themselves fluent or "fair" speakers. 62% of speakers (340,000) claimed to speak the language daily, including 88% of fluent speakers. A greeting in Welsh is one of 55 languages included on the Voyager Golden Record chosen to be representative of Earth in NASA's Voyager program launched in 1977. The greetings are unique to each language, with the Welsh greeting being Iechyd da i chwi yn awr ac yn oesoedd which translates into English as "Good health to you now and forever". The Welsh Language Measure 2011 gave the Welsh language official status in Wales, making it the only language that is de jure official in any part of the United Kingdom.
Native Speakers: 7.8 million (2006)
Spoken Natively in: South Africa, Lesotho
Official Language in: South Africa
Xhosa /ˈkoʊsə/ (Xhosa: isiXhosa [isikǁʰóːsa]) is one of the official languages of South Africa. Xhosa is spoken by approximately 7.9 million people, or about 18% of the South African population. Like most Bantu languages, Xhosa is a tonal language, that is, the same sequence of consonants and vowels can have different meanings when said with a rising or falling or high or low intonation. One of the most distinctive features of the language is the prominence of click consonants; the word "Xhosa" begins with a click. Xhosa is written using a Latin alphabet. Three letters are used to indicate the basic clicks: c for dental clicks, x for lateral clicks, and q for post-alveolar clicks (for a more detailed explanation, see the table of consonant phonemes, below). Tones are not indicated in the written form.
Native Speakers: 1.8 million (no date) 11 million L2 speakers
Spoken Natively in: Argentina, Australia, Austria, Belarus, Belgium, Bosnia and Herzegovina, Brazil, Canada, France, Germany, Hungary, Israel, Latvia, Lithuania, Mexico, Moldova, Netherlands, Poland, Romania, Russia, Sweden, Ukraine, United Kingdom, United States, and elsewhere
Official Language in: Jewish Autonomous Oblast, Russia
Yiddish (ייִדיש yidish or אידיש idish, literally "Jewish") is a High German language of Ashkenazi Jewish origin, spoken in many parts of the world. It developed as a fusion of Hebrew and Aramaic into German dialects with the infusion of Slavic and traces of Romance languages. It is written in the Hebrew alphabet. The language originated in the Ashkenazi culture that developed from about the 10th century in the Rhineland and then spread to Central and Eastern Europe and eventually to other continents. In the earliest surviving references to it, the language is called לשון־אַשכּנז (loshn-ashknez = "language of Ashkenaz") and טײַטש (taytsh, a variant of tiutsch, the contemporary name for the language otherwise spoken in the region of origin, now called Middle High German). In common usage, the language is called מאַמע־לשון (mame-loshn, literally "mother tongue"), distinguishing it from Biblical Hebrew and Aramaic, which are collectively termed לשון־קודש (loshn-koydesh, "holy tongue"). The term "Yiddish" did not become the most frequently used designation in the literature of the language until the 18th century. For a significant portion of its history, Yiddish was the primary spoken language of the Ashkenazi Jews and once spanned a broad dialect continuum from Western Yiddish to three major groups within Eastern Yiddish, namely Litvish, Poylish and Ukrainish. Eastern and Western Yiddish are most markedly distinguished by the extensive inclusion of words of Slavic origin in the Eastern dialects. While Western Yiddish has few remaining speakers, Eastern dialects remain in wide use. Yiddish is written and spoken in many Orthodox Jewish communities around the world, although there are also a number of Orthodox Jews who do not know Yiddish. It is a home language in most Hasidic communities, where it is the first language learned in childhood, used in schools and in many social settings. Yiddish is also the academic language of the study of the Talmud according to the tradition of the Lithuanian yeshivas.
Yiddish is also used in the adjectival sense to designate attributes of Ashkenazic Jewish culture (for example, Yiddish cooking and Yiddish music).
Native Speakers: 28 million (2007)
Spoken Natively in: Nigeria, Togo, Benin
Official Language in: Nigeria
Yoruba language (natively èdè Yorùbá) is a Niger–Congo language spoken in West Africa. The number of speakers of Yoruba was estimated at around 20 million in the 1990s. The native tongue of the Yoruba people, is spoken, among other languages, in Nigeria, Benin, and Togo and in communities in other parts of Africa, Europe and the Americas. A variety of the language, Lucumi, from olukunmi is used as the liturgical language of the Santeria religion of Cuba, Puerto Rico, Dominican Republic and the United States. It is most closely related to the Owo and Itsekiri language (spoken in the Niger-Delta) and Igala spoken in central Nigeria.
Native Speakers: 10.4 million (2007) 16 million L2 speakers
Spoken Natively in: South Africa, Zimbabwe, Lesotho, Malawi, Mozambique, Swaziland
Official Language in: South Africa
Zulu (isiZulu in Zulu) is the language of the Zulu people with about 10 million speakers, the vast majority (over 95%) of whom live in South Africa. Zulu is the most widely spoken home language in South Africa (24% of the population) as well as being understood by over 50% of the population (Ethnologue 2005). It became one of South Africa's eleven official languages in 1994. According to Ethnologue, it is the second most widely spoken Bantu language after Shona. Like many other Bantu languages, it is written using the Latin alphabet.