Frequently Asked Questions

Q: What is this?
A: This is a comparison of the letters (or, more technically, graphemes) used in the 100 most widely spoken languages that use the Latin alphabet.
Q: Ok, but, um... why did you make it?
A: I'm really not sure! I'm interested in languages, and I noticed that Icelandic uses some of the letters that used to appear in Old English (think Beowulf). I knew the Swedes and the Danes had the a with the circle over it (å), and I wondered what other strange letters and accents the other languages had. I thought, what must it be like for kids in these countries to learn the alphabet? How different is it from mine, and how different is it across languages?
Q: What criteria did you use for inclusion?
A: This is a complicated one. I set out to include any language that:
Q: What was your source for all of this information?
A: Omniglot, which is a tremendous free website, was the primary source. I frequently corroborated their information with Wikipedia, but the number of speakers and most of the alphabet information come from Omniglot. Technically, the Omniglot pages I used (and link to) are focused on a language's orthography, which is really the whole set of rules for writing a language, not just its alphabet. Sometimes the alphabet wasn't included, so in those cases I made an attempt to find other sources. If I couldn't, I just defaulted to what was in the Omniglot orthography page. Ethnologue is also a great resource, but I don't have the pockets for the membership, and the old print version I bought doesn't have any orthography information.
Q: Ok, fine, but what about all of those weird multiple-letter things at the far right? Surely they can't be part of the alphabet.
digraphs, three-letter ones are called trigraphs, and any sequence of letters that behaves as a unit is called a multigraph.) Sometimes though, there isn't a lot of information distinguishing the alphabet from the orthography. This is especially true of languages that are not yet widely written. For example, many of the Bantu languages have multigraphs with q and x representing click sounds. So for those (and others like them), I have erred on the side of inclusion, choosing to show all consonant-based sounds, including multigraphs. Multigraphs representing diphthongs (vowel combinations of multiple letters) have generally not been included except in a few cases where things like ‘aa’ legitimately appeared to be included in the alphabet.
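To make the terminology concrete, here is a minimal Python sketch that labels a grapheme by how many letters it contains. It is only an illustration, not how the table was actually built. The function names (`count_letters`, `classify`) are made up for this example, and it assumes that accents stored as combining marks should not count as letters of their own.

```python
import unicodedata


def count_letters(grapheme):
    """Count the letters in a grapheme, ignoring accents and punctuation.

    Assumption: NFD normalization splits an accented letter such as "å"
    into a base letter plus a combining mark, and only characters in a
    Unicode letter category (L*) are counted.
    """
    decomposed = unicodedata.normalize("NFD", grapheme)
    return sum(1 for ch in decomposed if unicodedata.category(ch).startswith("L"))


def classify(grapheme):
    """Return a rough label for a grapheme based on its letter count."""
    n = count_letters(grapheme)
    if n == 0:
        raise ValueError(f"no letters found in {grapheme!r}")
    if n == 1:
        return "single letter"
    if n == 2:
        return "digraph"
    if n == 3:
        return "trigraph"
    # Longer sequences just get the general term.
    return "multigraph"


if __name__ == "__main__":
    for g in ["å", "ch", "tsh", "nkqh"]:
        print(g, "->", classify(g))
```

Keep in mind that digraphs and trigraphs are themselves multigraphs; the sketch only reserves the general label for anything longer than three letters.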
Q: Got it. So what other rules did you have about what letters and multigraphs you included and how you organized them?
A: Basically I tried to organize everything by the base letter (or grapheme), starting with single-letter elements and then re-setting for multigraphs. Also:
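As a rough illustration of that base-letter ordering, here is a minimal Python sketch. It is a guess at the general idea rather than the process actually used for the table. The helper names (`base_letters`, `sort_key`, `order_graphemes`) are hypothetical, and it assumes the base letter can be found by stripping combining accents after NFD normalization.

```python
import unicodedata


def base_letters(grapheme):
    """Strip combining accents so that "å" is treated as "a" (assumption)."""
    decomposed = unicodedata.normalize("NFD", grapheme)
    stripped = "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)
    return stripped.lower()


def sort_key(grapheme):
    """Single letters sort first; multigraphs start over from 'a' after them."""
    base = base_letters(grapheme)
    is_multigraph = len(base) > 1
    return (is_multigraph, base, grapheme)


def order_graphemes(graphemes):
    """Return graphemes grouped by base letter, single letters before multigraphs."""
    return sorted(graphemes, key=sort_key)


if __name__ == "__main__":
    print(order_graphemes(["ch", "å", "b", "a", "ng", "é", "e"]))
```

One known limitation: letters such as ø or ð have no accent to strip under Unicode normalization, so this sketch would not place them next to o or d. Grouping those would need a hand-built mapping.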
Q: What are all these weird Niger-Congo languages?
A: They're some of the most widely spoken languages in sub-Saharan Africa, and they're fascinating! It was fun as part of this project to get out of what I now realize is a very Indo-European-centric view. I also stumbled upon the !Xóõ language, which, with its clicks, is often cited as having the highest number of distinct sounds of any language. Hearing it spoken is wild; you should check it out.
Q: Who are you, and what qualifications do you have to do this?
A: I'm Spencer Blackman, and I have absolutely no linguistic qualifications at all; I'm just sort of interested. If you're a language expert and I've gotten something wrong, let me know and I'll look into it. In either case, thanks for looking!
Q: I'm a tech nerd and I want to know more about the weird little plugins you used to get this huge ugly table to have filters and frozen columns and stuff.
A: Sure. The main table is powered by TableFilter, and the locked header and columns are courtesy of TableHeadFixer. Thanks to those guys for putting great free tools out there.