The Most Commonly Used Japanese Words by Frequency

Topics: Japan, Language

In a previous post, I expounded on language learning and laid out what I think is the fastest way to learn any language. One component of acquiring a language quickly is prioritizing the words you learn: learning the most common words first will reap huge benefits for your comprehension. There are several word frequency lists out there. Most of the ones I found were compiled from newspapers, but Mike “Pomax” Kamermans over at nihongoresources.com had the brilliant idea of using Japanese novels as material instead. His algorithm compiled over 65 million words. No word frequency list can be perfect, but I think this one is about as close as you can get.

I simply took the first 3000 words from his data and made some tweaks so the words are easier to use for studying. I removed punctuation and numbers, and compiled the words into two-page PDF files that are easy to print, so you can cross off words as you learn them. I’ve also included a text file of those 3000 words in case you want to search through them.
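For anyone who wants to reproduce that kind of cleanup, here is a minimal sketch in Python. It is not the script used to build the files below. It assumes the ranked list is already loaded as a list of strings in frequency order, and the names `is_word` and `top_words` are made up for this example. Filtering first and cutting to 3000 afterwards is also an assumption about the order of the steps.

```python
import unicodedata


def is_word(token):
    """Return False for empty tokens and for tokens made up only of
    punctuation, symbols, digits, or whitespace (Unicode categories
    P*, S*, N*, Z*). Everything else counts as a word."""
    if not token:
        return False
    return not all(unicodedata.category(ch)[0] in "PSNZ" for ch in token)


def top_words(ranked_tokens, limit=3000):
    """Keep the first `limit` entries of a frequency-ranked list after
    dropping punctuation and numbers. Input order is preserved."""
    return [t for t in ranked_tokens if is_word(t)][:limit]


# Toy ranked list (hypothetical data, not taken from the real file).
sample = ["の", "。", "に", "１", "は", "、", "を"]
print(top_words(sample, limit=3))
```

Note that kanji numerals such as 一 are classified as letters by Unicode, so this filter keeps them. Only digit characters such as 1 or １ are dropped.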

PDF files: For Printing

Japanese Word Frequency List 1-1000

Japanese Word Frequency List 1000-2000

Japanese Word Frequency List 2000-3000

Text file: For Searching

Japanese Word Frequency List 1-3000

A little bit of number crunching on the data turned up some very interesting facts.

The first 100 words on the list make up 57.2% of the text that was processed.

The first 500? 70.3%.

The first 1000? 76.2%.

The first 3000? 85.4%.

The first 10,000? 94.1%.
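Coverage figures like these come from a running total: sort the words by how often they occur, add up the counts of the top N, and divide by the count of every word in the text. Here is a small Python sketch of that arithmetic. It assumes you have one occurrence count per word. The function name `coverage` and the toy numbers are illustrative only and are not taken from Mike's data.

```python
from itertools import accumulate


def coverage(counts, cutoffs):
    """Map each cutoff N to the percentage of all word occurrences
    accounted for by the N most frequent words."""
    ordered = sorted(counts, reverse=True)
    total = sum(ordered)
    if total == 0:
        raise ValueError("counts must contain at least one occurrence")
    running = list(accumulate(ordered))  # running[i] = sum of the top i+1 counts
    result = {}
    for n in cutoffs:
        if n < 1:
            raise ValueError("cutoffs must be 1 or greater")
        # A cutoff larger than the vocabulary simply covers everything.
        index = min(n, len(running)) - 1
        result[n] = 100.0 * running[index] / total
    return result


# Toy counts for a six-word vocabulary (hypothetical data).
toy_counts = [50, 20, 10, 10, 5, 5]
print(coverage(toy_counts, [1, 3, 6]))  # {1: 50.0, 3: 80.0, 6: 100.0}
```

Run against the full frequency list with cutoffs of 100, 500, 1000, 3000, and 10,000, the same calculation should give percentages of the kind quoted above.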

But don’t let this data fool you completely. Mike (the man who generated the list) said in an email:

Usually the most frequently used words don’t need explicit learning because they are found all over the place, and the medium-presence words are more important, because they convey important things. Frequent words are usually common because they contain little information, so you have a trade-off between ‘used a lot’ and ‘give critical information’.

You can find the complete list of more than 65,000 words including punctuation, word frequency, and parts of speech at http://pomax.nihongoresources.com/index.php?entry=1222520260.

Here’s a link to my article How to Learn Any Language in 6 Months

57 Responses to “The Most Commonly Used Japanese Words by Frequency”

  1. Appreciative Researcher says:

    Actually, it’s quite useful that this list is the most common Japanese morphemes, not words. Fuller understanding of etymologies helps one deepen their comprehension.

    For example, after seeing “-er” as the suffix of many English words one can notice two distinct types: those which (basically) mean “one who ___s” and those which demonstrate adjectival comparison.

    As to the point of “わたし & 私” both being included in the list, there is a reason for this. 私 can be pronounced in ways other than simply わたし—e.g., あたし, あて, あっし, あたくし—and these different ways of vocalizing the same written character carry meaningful information. Rolling these vocalizations up into one ranking would reduce the available data and thus be less useful.

    Actually, I would even split homophones into separate rankings. For example, Lee asks about “ん.” I can think of a few reasons to consider ん a morpheme in its own right. It can mean “mm-hmm” (affirmative, nonstandard spelling), “mm-mm” (negative, nonstandard spelling), or can mean “because,” “in other words,” or just show the speaker’s resolve (when used before です). But I’ll bet one meaning is much more common than the others.

    Anyway, I think it’s great someone took the time to put this together and make it available for simple distribution. Thanks!

  4. By these calculations I should be able to read 80% of things (up to about 2300 words in my ANKI deck). Just goes to show how vital some of these core words are.

  5. Jaques says:

    Can you provide the list in English?
