Overview of the Chinese Language

The Chinese language was originally spoken by the Han Chinese and is now spoken by about one-fifth of the world's population. It forms one of the two branches of the Sino-Tibetan language family and features much internal diversity. Depending on the classification used, there are between six and twelve regional dialect groups of Chinese, including Mandarin, Cantonese, and Min. These dialects are largely distinct, although some of them share common terms and structures. The standard form of Chinese is Standard Mandarin, the official language of both China and Taiwan and one of the six official languages of the United Nations. Cantonese is spoken in Hong Kong, Macau, and some other areas. Native speakers of any Chinese dialect are referred to as sinophones.

In addition to the dialects that have been classified, there are a number of regional dialects that have not been studied much. These include Danzhou, Xianghua, and Shaozhou Tuhua. Another related language is the one spoken by the Dungan people, which is similar to Mandarin. However, it is written using the Cyrillic alphabet and is not considered a true Chinese dialect.

Regional dialects can differ from each other to the point that they are mutually unintelligible. The dialects of northern China are more closely related to one another, while those of the south are more diverse. In some areas of Southern China, for example, neighboring cities may speak wildly different dialects.

Standard Mandarin is based on the dialect spoken in Beijing. It has been set as the standard form of communication across all of China, meaning that it is used in all government agencies, in schools, and in the media. Because there are so many dialects, it is normal for most Chinese to speak several different varieties of Chinese. Nearly everyone speaks Standard Mandarin, and most people also speak at least one other dialect, if not two or three. Mixing dialects is also fairly common, especially in Taiwan and Hong Kong, where Mandarin is becoming as common as English and Cantonese.

Linguistics of Chinese

While Chinese is actually more of a language family than one single language (similar to the various Romance languages), the fact that all the different Chinese dialects use the same writing system ties them together. Linguists disagree on whether Chinese is really one language or a language family, but most prefer to consider it one language.

The Chinese writing system is known as Zhongwen, and the standard Mandarin language is referred to as Hanyu in Chinese. Since Chinese does not differentiate between singular and plural like English and other languages, Hanyu could actually refer to either one language or multiple languages. Most Chinese consider the Han Chinese ethnicity as one race of people and the various dialects as regional tongues or even as separate languages, a somewhat unique split. However, most Chinese do not like the idea of referring to Chinese as a language family since it implies that the Chinese people are not as unified as they are.

The Written Chinese Language

The written Chinese language has not evolved as much as the spoken language, where the many dialects evolved at different times and at different rates. The written language first appeared with the oracle bones sometime before the 11th century BC. Over the centuries, writing evolved until it became mostly set during the Spring and Autumn period.

Chinese is written using complex characters called hanzi. They are traditionally written in columns that are read from top to bottom, with the columns running from right to left. The characters are independent of the spoken language: no matter which dialect’s pronunciation is used, a character keeps the same meaning. However, there are some dialectal characters that are only used in specific dialects or have different meanings in different regions. A colloquial version of written Cantonese is often used on the internet and when text messaging, but this written version is considered informal.

The characters used in Chinese first began as a system of hieroglyphs; however, it is important to note that they are not pictographs. Most characters are made up of a phonetic component and a semantic radical. Neither part of the character by itself really represents an item, thought, or word, although some of the very simple characters (like the character for water) are made up of only a phonetic component.

During the Han dynasty, the scholar Xu Shen took the list of characters then in use and split them into six categories. He classified only around four percent of them as true pictographs, with the rest being compounds that contained both a semantic element and a phonetic element. Today, Chinese dictionaries traditionally list 214 different radicals, or semantic elements.

Today’s characters are set in a standard script, although more stylish calligraphy may be used. Characters can be written in either traditional or simplified form, with the traditional ones being more complex and more artistic. Traditional characters are generally used in Taiwan, Hong Kong, and Macau, while the simplified system is used in mainland China. The simplified system is relatively new; it was created in 1954 as a means of promoting literacy by simplifying most of the complex characters into versions that are easier to memorize and write.

Today, an adult needs to know over 3,000 characters to read a newspaper, and most know between 6,000 and 7,000. The government has set the literacy bar at knowing 2,000 characters, although that is too few to truly be able to read most novels and magazines. There are over 40,000 characters listed in a comprehensive Chinese dictionary, although only about a fourth of them appear in everyday use.

History of the Language

Linguists believe that all forms of Chinese evolved from one single language, called the Proto-Sino-Tibetan language. This language is actively being researched, but so far, linguists have not been able to reconstruct it. This is because there is nothing to indicate when this proto language evolved into the ancient Chinese language. Some of the other older languages that could help determine this are not understood very well, and some of the techniques and theories that have successfully been used in the past cannot be applied to Chinese because of the structure of the language.

Old Chinese was the main language used during the Zhou dynasty (1122 BC to 256 BC). Written versions of it have been found on bronze items and in the I Ching. This version of the language has been deciphered because modern Chinese characters can actually be used to determine the pronunciation of Old Chinese characters. Evidence suggests that the language did not use tones as modern Chinese does.

Middle Chinese appeared during the sixth century and was used by the Sui, the Tang, and the Song dynasties. It can be divided into two periods, the early and late period, based on the changing rhyme table. While linguists are fairly certain about their reconstruction of Old Chinese, they are much more certain about Middle Chinese. Thanks to the modern language, foreign transliterations, and written documents, Middle Chinese has been preserved. There are still some linguists who have doubts, of course, but for the most part, many are confident the reconstruction of the language is valid.

Modern Chinese developed unevenly throughout China. Many in the north used a version of the Mandarin dialect, while the southern areas and the mountainous areas were more linguistically diverse. In fact, most of the people in the south spoke only a version of their native dialect until the middle of the twentieth century. Once the Ming dynasty established its capital at Nanjing, most Mandarin speakers used the Nanjing Mandarin dialect until the Qing dynasty began.

Once the Qing Empire was established, academies were set up to promulgate Beijing Mandarin, the standard for the empire. However, these academies did not succeed, and it wasn’t until the final 50 years of the Qing reign that the Beijing dialect finally became the standard at the imperial court. Outside of the court, there was no real standard dialect, and nearly all continued to use their regional dialect.

A standard dialect was finally created in the mid-20th century. Standard Mandarin became the set language of the government and the school system, creating one language for the entire country. It is spoken by nearly everyone in China and Taiwan, and it is becoming more popular in Hong Kong following the transfer from British to Chinese control in 1997.

Chinese Influence on Foreign Languages

The Chinese language, indeed all of Chinese culture, has influenced much of Asia and even Europe over the course of history. One of the biggest influences has been on the writing systems of both Korean and Japanese. These two languages used the Chinese hanzi characters in much of their writing.

Vietnam used traditional Chinese for its system of writing up to the 14th century. Following that, they used a modified version of the script that featured some symbols that represented Vietnamese sounds. This system was used up to the late 19th century when it was finally replaced by a modified version of the Latin script. However, the Vietnamese language still features some Chinese influences.

Many languages also include loanwords from Chinese. At least half of the Korean vocabulary comes from Chinese, and a good ten percent or so of the vocabulary of the Philippine language is made up of Chinese words. Even European languages contain some Chinese loanwords, including “tea” and “kumquat.”


Like Japanese and other Asian languages, Chinese has been Romanized so that it can be written using the Latin alphabet. There are several different ways of Romanizing Chinese, with the first appearing around the sixteenth century. Today, the standard for Romanizing Chinese is called hanyu pinyin or simply pinyin. It was first used in 1956 and is almost always used when teaching Chinese to foreign learners. Many Chinese even use it to teach young children the sounds and tones for new words.
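
Pinyin is often typed with a tone number after each syllable and then displayed with a tone mark over one vowel. The following is a minimal Python sketch of that conversion, not an official implementation. The function name mark_tone, the use of "v" as a typed stand-in for "ü", and the use of 5 for the neutral tone are assumptions made for this illustration. The placement rule used here is the commonly taught one: the mark goes on "a" or "e" if present, on the "o" in "ou", and otherwise on the last vowel.

```python
# Tone-marked forms of each pinyin vowel, in tone order 1 to 4.
TONE_MARKS = {
    "a": "āáǎà",
    "e": "ēéěè",
    "i": "īíǐì",
    "o": "ōóǒò",
    "u": "ūúǔù",
    "ü": "ǖǘǚǜ",
}


def mark_tone(syllable: str) -> str:
    """Convert a numbered syllable such as 'ma3' to its marked form.

    Hypothetical helper for illustration. A missing digit or a tone
    outside 1-4 (for example 5) is treated as the neutral tone.
    """
    if not syllable or not syllable[-1].isdigit():
        return syllable
    base, tone = syllable[:-1], int(syllable[-1])
    base = base.replace("v", "ü")  # assumption: 'v' is typed for 'ü'
    if tone not in (1, 2, 3, 4):
        return base  # neutral tone carries no mark

    # Choose which vowel receives the mark.
    if "a" in base:
        target = base.index("a")
    elif "e" in base:
        target = base.index("e")
    elif "ou" in base:
        target = base.index("o")
    else:
        vowels = [i for i, ch in enumerate(base) if ch in TONE_MARKS]
        if not vowels:
            return base
        target = vowels[-1]

    marked = TONE_MARKS[base[target]][tone - 1]
    return base[:target] + marked + base[target + 1:]


if __name__ == "__main__":
    for numbered in ["han4", "yu3", "pin1", "yin1"]:
        print(numbered, "->", mark_tone(numbered))
```

The sketch handles one lowercase syllable at a time; splitting running text into syllables and handling capital letters are left out on purpose.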


While some classify modern Chinese as monosyllabic, that is not the case. Many words in the language are disyllabic, especially in Mandarin. Generally, each morpheme, or unit of meaning, corresponds to a single syllable and character, which makes many think the language is monosyllabic. Many new words, however, are formed by linking two or more characters together, creating di-, tri-, and even tetra-syllabic words.

Each Chinese morpheme is limited to a single syllable. Many can stand alone as words, but most must be connected with another to create a compound word. Most of these compounds contain two morphemes, but some contain more. All types of modern Chinese are classified as analytic languages because the syntax is more important than the morphology. Chinese has almost no grammatical inflections: no voices, tenses, articles, or gender. The speaker or reader can tell each of these from the context of the sentence, not from the words themselves.

Chinese does have a set subject/verb/object word order, however, and like Korean and Japanese, it has many measure words. Like these other two languages, Chinese speakers often drop the subject from a sentence if the subject is understood.

While most dialects of Chinese share many grammatical traits, there are some differences between the regional languages.


Mandarin has only around 400 distinct syllables when tone is ignored, but there are over 10,000 written characters. As a result, many characters are homophones, or share a syllable and differ only in tone. There are four different tones, and while the tone of a word is generally enough to indicate the word’s meaning, sometimes it is not, and the meaning must be worked out from context.
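
The relationship between syllables, tones, and characters described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the SAMPLE list is a small hand-picked set of common Mandarin characters with their usual readings, and the function names group_by_syllable and true_homophones are made up for this example. It shows that several characters can share one syllable, that tone separates some of them, and that others remain identical in sound even when tone is taken into account.

```python
from collections import defaultdict

# Hand-picked sample (an assumption for illustration, not a full lexicon):
# (character, toneless syllable, tone number, English gloss)
SAMPLE = [
    ("妈", "ma", 1, "mother"),
    ("麻", "ma", 2, "hemp"),
    ("马", "ma", 3, "horse"),
    ("骂", "ma", 4, "to scold"),
    ("是", "shi", 4, "to be"),
    ("事", "shi", 4, "matter"),
    ("市", "shi", 4, "city"),
    ("十", "shi", 2, "ten"),
]


def group_by_syllable(entries):
    """Group characters by syllable, ignoring tone."""
    groups = defaultdict(list)
    for char, syllable, _tone, _gloss in entries:
        groups[syllable].append(char)
    return dict(groups)


def true_homophones(entries):
    """Return groups of characters that share both syllable and tone."""
    groups = defaultdict(list)
    for char, syllable, tone, _gloss in entries:
        groups[(syllable, tone)].append(char)
    return {key: chars for key, chars in groups.items() if len(chars) > 1}


if __name__ == "__main__":
    print(group_by_syllable(SAMPLE))
    print(true_homophones(SAMPLE))
```

In this sample, tone alone tells the four "ma" characters apart, while the three fourth-tone "shi" characters can only be distinguished by context or by the written character.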

Some of the southern languages, including Cantonese, have more tones and use more structure from Middle Chinese. Because of this, there are fewer multi-syllabic words in Southern Chinese dialects.

Adding New Words to Chinese

Chinese, like most other languages, contains a number of loanwords, or words assimilated from other cultures. Words have been brought into Chinese since the Silk Road was used for trade. Some of the earliest include the words for “grape” and “lion.” Others are related to Buddhism, including “Buddha” and “bodhisattva.” Most of these words come from Sanskrit or other languages of northern India. Most borrowed words keep their phonetic pronunciation. Some, however, are not exactly loanwords but loan concepts. The term for computer, for example, is made up of the Chinese words for “electric” and “brain.”

New foreign words are entering the language even today. Many recent loanwords were first used in the Shanghai dialect and then moved over to Mandarin. This transfer resulted in loanwords no longer being pronounced anything like they are in the original language. Today, Chinese morphemes are usually used to represent a concept (as with the word for computer) rather than importing the actual word. Other words that have entered Chinese but are not pronounced like they are in the original language include the terms for television, Bluetooth, and telephone. Some words are only half-translated, such as “hanbao” for “hamburger.”

Many words have moved into Chinese from Japan since the beginning of the 20th century. It was much easier for the Chinese to borrow Japanese loanwords since the two languages are more similar than Chinese and English and other European languages. Sometimes, the Chinese even borrowed their own words back from the Japanese. The Chinese character for “workings of the state” was borrowed by the Japanese for their character for economy. The Chinese later took this definition for the character.

Chinese as a Foreign Language

Many people have begun learning Chinese, especially since the People’s Republic of China set the standard Mandarin version of the language. Many students in the Western world have been studying Chinese since the early 1990s. In 2005, over 100,000 individuals took the Chinese Proficiency Test.