Principal Language Groups:

India is a land of numerous languages. According to Grierson, the editor and compiler of The Linguistic Survey of India, nearly 180 languages and about 550 dialects are spoken by Indians.

These languages belong to four important groups: the Austro-Asiatic, Tibeto-Burman, Dravidian, and Indo-Aryan.

The Austro-Asiatic languages seem to be the earliest in India and are generally known through Munda speech.


Speakers of the wider Austric family are found as far east as Australia and as far west as Madagascar, near the eastern coast of Africa. They are, however, especially numerous in Southeast Asia. Anthropologists believe that the Austric people appeared in Australia around 40,000 BC.

It is, therefore, more likely that they went from Africa to Southeast Asia and Australia via the coast of the Indian subcontinent about 50,000 years ago. By that time, language seems to have been invented. Human genetic studies suggest that, 50,000 years ago, people from Africa reached the deep south of India, from where they passed through the Andaman and Nicobar Islands to Indonesia and later to Australia.


The Austric language family is divided into two subfamilies: Austro-Asiatic, spoken in the Indian subcontinent, and Austronesian, spoken in Australia and Southeast Asia. The Austro-Asiatic subfamily has two branches: Munda and Mon-Khmer. Mon-Khmer is represented by the Khasi language, which is spoken in the Khasi and Jaintia hills of Meghalaya in north-east India and also in the Nicobar Islands.

However, the Munda tongue is spoken in a much larger area. The Santhals, who constitute the largest tribal group in the subcontinent, speak it in Jharkhand, Bihar, West Bengal, and Orissa. The forms of speech of the Mundas, Santhals, Hos, and others, also known as the Mundari language, are prevalent in West Bengal, Jharkhand, and central India. In the Himalayas, Munda survivals are most apparent.



The second group of languages, Tibeto-Burman, is a branch of the Sino-Tibetan family. If we take account of China and other countries, the number of speakers of this family far exceeds that of the Austric family and even of the Indo-Aryan family. The family has some 300 languages, which are spoken in China, Tibet, and Myanmar (Burma). In the Indian subcontinent, Tibeto-Burman speech extends along the Himalayas from north-eastern Assam to north-east Punjab.

These forms are found in the north-eastern states of India, where a large number of people speak various forms of the Tibeto-Burman tongue. Various tribes use as many as 116 dialects of this language. The north-eastern states where they are spoken include Tripura, Sikkim, Assam, Meghalaya, Arunachal Pradesh, Nagaland, Mizoram, and Manipur.

The Tibeto-Burman language also prevails in the Darjeeling area of West Bengal. Although both the Austric and the Tibeto-Burman forms of speech are much older than the Dravidian and Indo-Aryan, no literature developed in those tongues because, unlike the Indo-Aryans and the Dravidians, they did not have any form of writing.

The speakers were, however, conversant with oral legends and traditions, which were first recorded by Christian missionaries in the nineteenth century. It is significant that a Tibeto-Burman term, buranji, was used by the Ahoms in medieval times in the sense of the family tree. It is likely that the Maithili term panji for the family tree was linked to the Tibeto-Burman term.



The third family of languages spoken in India is Dravidian. This form of speech covers almost the whole of south India, and is also prevalent in north-eastern Sri Lanka. Over twenty Dravidian languages are spoken in this area. The earliest form of Dravidian speech, Brahui, is found in the north-western part of the Indian subcontinent located in Pakistan.

There are two views about the migration of the Dravidian-speaking people: genetic and linguistic. According to the genetic view, the first major migration into India came from the Middle East around 30,000 years ago. According to the linguistic view, the Dravidians came from Elam around 6000 years ago.

It seems that the dispersal of the Dravidian speakers started in about 30,000 BC and continued until 4000 BC. Scholars of linguistics attribute the origin of the Dravidian language to Elam, that is, south-western Iran. This language is assigned to the fourth millennium BC, and Brahui is a later form of it. Brahui is still spoken in Iran, Turkmenistan, and Afghanistan, and also in the provinces of Baluchistan and Sindh in Pakistan.

It is said that the Dravidian language travelled via the Pakistan area to south India where it gave rise to Tamil, Telugu, Kannada, and Malayalam as its main branches, but Tamil is far more Dravidian than the other languages. Oraon or Kurukh, spoken in Jharkhand and central India, is also Dravidian, but is spoken mainly by members of the Oraon tribe.


The fourth language group, Indo-Aryan, belongs to the Indo-European family. According to scientists, genetic signals found in the steppe people throughout Central Asia appear to a good degree in the speakers of the Indo-Aryan languages in India and very little in Dravidian speakers. This suggests that the speakers of the languages of the Indo-European family migrated to India. It is said that the eastern or Arya branch of the Indo-European family split into three sub-branches known as Indo-Iranian, Dardic, and Indo-Aryan. Iranian, also called Indo-Iranian, is spoken in Iran, and the earliest specimen of it is found in the Zend Avesta.

The Dardic language belongs to eastern Afghanistan, north Pakistan, and Kashmir, though most scholars now consider Dardic speech to be a branch of the Indo-Aryan language. Indo-Aryan is spoken by a large number of people in Pakistan, India, Bangladesh, Sri Lanka, and Nepal. Nearly 500 Indo-Aryan languages are spoken in north and central India.

Old Indo-Aryan covers Vedic Sanskrit. Middle Indo-Aryan covers Prakrit, Pali, and Apabhramsha from about 500 BC to AD 1000. Both Prakrit and classical Sanskrit continued to develop in early medieval times, and many words appeared in Apabhramsha from AD 600. The modern Indo-Aryan regional languages such as Hindi, Bengali, Assamese, Oriya, Marathi, Gujarati, Punjabi, Sindhi, and Kashmiri developed in medieval times out of Apabhramsha, as is also the case with Nepali. Kashmiri is Dardic in origin, but it has been deeply influenced by Sanskrit and later Prakrit.

Although India has four groups of languages, their speakers do not form isolated units. In the past, there was continuous interaction between the various linguistic groups. Consequently, words from one language group appear in another. The process began in Vedic times.

Large numbers of Munda and Dravidian words are to be found in the Rig Veda. However, eventually the Indo-Aryan language superseded many tribal languages because of the socio-economic dominance of its speakers. Though the Indo-Aryan ruling groups used their own language, they could not exploit tribal resources and manpower without using the tribal dialects. This led to the mutual borrowing of words.

Ethnic Groups and Language Families:

In the Indian subcontinent, each of the four language families is attributed to one of the four ethnic groups into which the people of India are divided. These four groups are Negrito, Australoid, Mongoloid, and Caucasoid. This racial division was made in the nineteenth century and was based on the physical features of various peoples.

Thus, short stature, short face, and short lips are assigned to the Negrito, who live in the Andaman and Nicobar Islands and the Nilgiri Hills of Tamil Nadu. The Negrito are also placed in Kerala and Sri Lanka. It is thought that they speak some Austric language. The Australoids too are of short stature though they are taller than the Negrito. They too have dark complexions and plenty of body hair. They live mainly in central and southern regions, though also in the Himalayan areas, and speak Austric or Munda languages.

The Mongoloids are of short stature, have scanty body hair, and flat noses. They live in the sub-Himalayan and north-eastern regions and speak Tibeto-Burman languages. The Caucasoids are generally of tall stature with long faces, and show well-developed chins, fair skin, and narrow but prominent noses. They speak both the Dravidian and Indo-Aryan languages, and are not therefore linked to a single language.

It is difficult to demarcate one racial group from another, for their physical features keep changing due to climatic conditions. It is interesting that brahmanas and chamars share the same physical features in some areas, and both of them speak the same language. Brahmanas mention their clan groups called gotras to which they belong, but such gotras are not assigned to the chamars.

However, in all the bordering areas of various cultural zones, people speak two or more languages. More importantly, commingling of various peoples leads to intermixture of languages. Thus, neither do the people concerned retain their original features nor does the language retain its original character. It is, therefore, not easy to assign a particular language to any one ethnic group.