How much can you squeeze in 140? Arabic vs English vs French vs Chinese
(This is a loose and abridged translation of the original piece by Grégoire Fleurot, published in French on slate.fr)
Since the demonstrations in Iran last year, Twitter has become an increasingly important tool for activists who need to “get the message out”. Because its key feature is to limit messages to 140 characters, whatever the alphabet, it is interesting to compare languages and find out which one lets you squeeze the most information into 140.
Arabic: Gold – English: Silver – French: Bronze
In English, in 130 characters, you can say:
«I just came back from Tahrir square. Everyone there was calling for Mubarak to leave. Peaceful atmosphere, no policemen to be seen»
The equivalent French message needs more space and uses the entire 140:
«Je viens de rentrer de la Place Tahrir. Tout le monde y réclamait le départ de Moubarak. Ambiance pacifique, il n’y a pas de policier en vue»
The Arabic translation, meanwhile, needs only 93:
«لقد رجعت للتو من ميدان التحرير. الجميع تطالب برحيل مبارك. جو هادئ. لا يوجد شرطي في مرمى البصر»
On the other hand, if you wanted to make use of the full 140 in Arabic, you could say:
“عدت للتو من ميدان التحرير. الجميع يطالبون برحيل مبارك على الفور. وهناك أناس من جميع الأعمار والفئات الاجتماعية والقرى والمدن. مصر كلّها هنا.”
In English, this becomes a longer 197:
“I just came back from Tahrir Square. Everyone there was calling for Mubarak to leave immediately. There were people of all ages and social classes, from cities and villages. All of Egypt was there.”
And the French equivalent becomes a whopping 218:
«Je viens de rentrer de la Place Tahrir. Tout le monde y réclamait le départ immédiat de Moubarak. Il y avait des gens de tous âges et de toutes les classes sociales, des villes et des villages. Toute l’Egypte était là.»
So Arabic appears to be the most concise, followed by English, whilst French is... well... more verbose.
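To put numbers on the comparison, here is a minimal Python sketch that takes the character counts quoted above for the longer message (140, 197 and 218) and works out how long each version is relative to the Arabic one. The variable and function names are illustrative only.

```python
# Character counts quoted in the article for the longer Tahrir message.
counts = {"Arabic": 140, "English": 197, "French": 218}


def expansion_ratios(counts, baseline="Arabic"):
    """Return each language's length relative to the baseline language."""
    base = counts[baseline]
    return {lang: n / base for lang, n in counts.items()}


ratios = expansion_ratios(counts)

# Print the languages from most to least concise.
for lang, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{lang}: {ratio:.2f}x the Arabic length")
```

On these figures the English version is about 1.4 times, and the French about 1.6 times, as long as the Arabic one. This is a single sample, so treat it as an illustration rather than a measurement.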
The main reason for these differences lies in the structure of Arabic words (short vowels are generally not written, and a word typically has 3 to 6 consonants) and in the frequent use of nominal sentences, which tend to be more compact than verbal ones. A classic example of the difference: “The more, the merrier” as opposed to “The more we are, the merrier we are”.
English wins tie-breaker thks 2 SMS Txt talk
However, “Shakespeare’s language” has another advantage besides its inherent grammar. The dominance of English as the lingua franca of the internet has led to the widespread use of abbreviations and acronyms, which other languages have not necessarily embraced so widely. We are all familiar with the classics: “for” = 4, “to” = 2, “be” = b, “are” = r, etc.
This is where English gets its edge. For instance, in Arabic the word for «government» (10 characters in English) is written with 7 signs: الحكومة whereas the English abbreviation «gov» uses only 3 characters.
So we can shrink our Tahrir message to 86 without altering its *understandability*:
«Bck from Tahrir sqre. Every1 was callin 4 Mubarak 2 go. Peaceful atmosphere, no police»
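If you want to check such counts yourself, a minimal Python sketch follows. It assumes that a “character” means a Unicode code point, which is what Python's built-in `len()` counts for a string; Twitter's own counting rules may differ in edge cases. The helper name `fits_in_tweet` and the `LIMIT` constant are made up for this illustration.

```python
LIMIT = 140  # the message length limit discussed in the article

# The full English message and its abbreviated version, without the quote marks.
full = ("I just came back from Tahrir square. Everyone there was calling "
        "for Mubarak to leave. Peaceful atmosphere, no policemen to be seen")
short = ("Bck from Tahrir sqre. Every1 was callin 4 Mubarak 2 go. "
         "Peaceful atmosphere, no police")


def fits_in_tweet(message, limit=LIMIT):
    """True if the message has at most `limit` code points."""
    return len(message) <= limit


saved = len(full) - len(short)
print(len(full), len(short), saved)
```

The abbreviated version saves 44 characters, roughly a third of the original message.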
However, the Platinum winner for the amount of information per character is Chinese.
A good example is the following piece in 139 Mandarin characters from Wikipedia …:
… which translates into roughly 490 English characters:
“1960 was the year of the Sino-Soviet split, the dissolution of the socialist camp. The Chinese Republic embarked on a completely separate development path and actively worked to establish and develop friendly relations with countries in Asia, Africa, and Latin America. It was gradually recognised by Britain, France, Israel and other Western countries. The United States was still recognising the ROC government in Taiwan against the Chinese central government, to counter a PRC blockade.”