Sunday, December 31, 2017

Telling Words Apart in Japanese

One thing beginners learning the Japanese language might find confusing and maybe even mysterious is how to tell the words apart in Japanese. After all, Japanese, unlike English, doesn't quite use spaces to separate words. So how do you know where a word starts and where it ends?

The trick is to rely on patterns, mostly patterns based on the alternating Japanese alphabets.

Hiragana Grammar Particles

Words in Japanese can be written with kanji, with hiragana, with katakana, or with a mix of these. Words in a text change from one script to another, and this switching around happens for a couple of frequent reasons.

For example, consider the phrase:
  • shingeki no kyojin 進撃の巨人
    Giants of the charge.
    (also known as: "Attack on Titan")

Above we have the word shingeki 進撃, a "charge," an "attack" pushing war forward. Said word is written entirely with kanji, it's a kanji compound. Then we have no の, this is hiragana. Note, here, that the script changed: from kanji to hiragana.

This no の is a grammatical particle with a number of functions. In this case, it's associating the noun that comes after it, kyojin 巨人, "giant," another kanji compound, with the noun that comes before it, shingeki. So the giants are of the charge. And it's probably a charge of giants. A bunch of giants charging forward.

So shingeki is kanji, no is hiragana, kyojin is kanji. Three words.

Another example:
  • jojo no kimyou na bouken ジョジョの奇妙な冒険
    Jojo's Bizarre Adventure.

Here we have exactly the same thing. The only difference is that Jojo ジョジョ is written with katakana, then we have the hiragana particle no の, the kanji compound kimyou 奇妙, "bizarre," the hiragana particle na な, used with adjectives, and the noun kanji compound bouken 冒険, "adventure."

In a lot of situations it's simple like that. A well-known, extremely common grammatical particle written with hiragana is used to glue multiple nouns together in a phrase. These are awfully easy to spot, because they're normally a single syllable, maybe two, in the middle of words. See:
  • baka to tesuto to shoukanjuu バカとテストと召喚獣
    (katakana, hiragana, katakana, hiragana, kanji compound)
    The idiots, the tests, and the summoned beasts.
  • higashi no eden 東のエデン
    (kanji, hiragana, katakana)
    Eden of the east.
  • kino no tabi キノの旅
    (katakana, hiragana, kanji)
    Kino's Journey (Journey of Kino)
  • ookami to koushinryou 狼と香辛料
    (kanji, hiragana, kanji compound)
    Wolf and spices.
  • kimi no na wa 君の名は
    (kanji, hiragana, kanji, hiragana)
    Your name [is...?]
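
To make the pattern concrete, here's a minimal sketch in Python that cuts a phrase wherever the script changes. The names script_of and split_by_script are made up for this example, and the Unicode ranges are a simplification (for instance, the repetition mark 々 falls under "other").

```python
from itertools import groupby


def script_of(ch: str) -> str:
    """Classify one character by its Unicode block (a simplification)."""
    code = ord(ch)
    if 0x3041 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:
        return "kanji"
    return "other"


def split_by_script(text: str) -> list[tuple[str, str]]:
    """Group consecutive characters that share a script into one run."""
    return [
        (script, "".join(chars))
        for script, chars in groupby(text, key=script_of)
    ]


for script, run in split_by_script("ジョジョの奇妙な冒険"):
    print(script, run)
```

This only looks at the script of each character, so it works on phrases like the ones above, where single particles sit between nouns. Hiragana that belongs to the end of a word would come out as a separate run.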

Note that Japanese doesn't have words for the articles "a," "an," and "the," and the way it treats plurals is different from English. So you don't always get the same number of words in English as there were in the original Japanese phrase.

Also note that it's only possible to tell the words apart because hiragana, katakana and kanji are being used normally. That is: in normal Japanese, where normal people use kanji and katakana, it's very easy to tell the words apart like above.

This is to serve as a warning for people starting to learn Japanese out there and picking something without kanji to read thinking it's easier to read a text if it only contains hiragana. It's not easier. In fact, it's not even easy. I'd have a hard time reading something written with only hiragana because Japanese is meant to be written using hiragana, katakana and kanji. A text with only one of these is abnormal.

Known Words

Sometimes, there's literally nothing to hint that two words are indeed separate words and not just one long word. This doesn't always happen, but when it happens, virtually the only way of telling them apart is knowing the words of the phrase.

Generally, though, kanji compounds have two kanji, and compounds with three kanji are rarer. So four kanji in a row are probably a couple of 2-kanji compounds rather than 1+3 or 3+1.

For example: gekkan shoujo nozaki-kun 月刊少女野崎くん, "monthly girls Nozaki-kun." It's written entirely with kanji, except for the kun くん honorific at the end, which is written with hiragana.

So how do we know it says gekkan shoujo 月刊少女 and not getsukanshou jo 月刊少女 or something? How do we know shoujo is a separate word?

Because you know it's a separate word.

If you have read that combination of kanji before, shoujo 少女, then you're already aware it's a separate word. You can tell where the word starts and where it ends because you've seen it before.
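
"Knowing the words" can be sketched as a lookup against a list of words you've already seen. The function below is only an illustration: segment_known is a made-up name, the vocabulary is a tiny hand-picked set, and trying the longest known word first is an assumption of this sketch, not a rule of Japanese.

```python
def segment_known(text: str, vocabulary: set[str]) -> list[str]:
    """Split text into known words, trying the longest candidate first.

    Characters that don't start any known word are emitted one by one.
    """
    words = []
    max_len = max(map(len, vocabulary), default=1)
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback.
            if candidate in vocabulary or size == 1:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical vocabulary: words the reader already knows.
known = {"月刊", "少女", "野崎", "くん"}
print(segment_known("月刊少女野崎くん", known))
```

With an empty vocabulary the same phrase falls apart into single characters, which is about how it feels to read it without knowing any of the words.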


Next we have okurigana. The okurigana is generally the hiragana written at the end of a word. I repeat: at the end of a word. Not after a word. At the end of that same word.

Such okurigana is normally used for conjugating verbs and inflecting adjectives. Since conjugation and inflection always follow an established pattern, it's not difficult to tell where a word, and its okurigana, ends.

For example: hauru no ugoku shiro ハウルの動く城, "Howl's castle that moves." Here we have something like before: first a katakana word, Howl's katakanized name: hauru ハウル, switch to a hiragana particle: no の, switch to the kanji at the start of a word ugo 動, and then ku く, that word's okurigana.

The word ugoku 動く, "to move," is a verb. We know it's a verb because the inflection part, written with hiragana, is ku く, and ku ends with -u. All Japanese verbs in the non-past form end in -u. See: taberu 食べる, "to eat," shinu 死ぬ, "to die," korosu 殺す, "to kill," and so on. They all end in the -u vowel.

Since kana represents whole syllables, and not consonants and vowels, in Japanese you don't write the -u vowel alone, you have a different character for every syllable, but that's just a detail. The idea is pretty much the same: if it ends with the u vowel, it's probably a verb.
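
As a rough sketch of that check: since each kana is a whole syllable, "ends in -u" means "ends in one of the kana from the u column." The set below lists the u-column kana that verbs in the non-past form end with; the name may_be_plain_verb is made up here. It's only a hint, since other words written in hiragana can end in these kana too.

```python
# U-column kana that non-past verbs end with (u, ku, gu, su, tsu, nu, bu, mu, ru).
U_COLUMN_VERB_ENDINGS = set("うくぐすつぬぶむる")


def may_be_plain_verb(word: str) -> bool:
    """Return True if the word ends in a kana that a non-past verb could end in."""
    return bool(word) and word[-1] in U_COLUMN_VERB_ENDINGS


for word in ["動く", "食べる", "死ぬ", "殺す", "城"]:
    print(word, may_be_plain_verb(word))
```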

Furthermore, grammatically, when a verb precedes a noun in Japanese it becomes an adjective clause. So in ugoku shiro 動く城, the verb ugoku, "to move," precedes shiro, "castle," so "moving castle," or "castle that moves." Because this grammatical pattern exists, it also helps recognize and differentiate words in writing.

Another example: re: zero kara hajimeru isekai seikatsu Re:ゼロから始める異世界生活. This is a more complicated one. Here we have:
  • zero ゼロ
  • kara から
    "From." (this is a two-syllable particle)
  • hajimeru 始める
    (kanji plus okurigana)
    To start.
  • isekai 異世界
    (kanji compound)
    Different world.
    (this is sekai 世界, "world," prefixed with i 異, "different")
  • seikatsu 生活
    (kanji compound)

Joined together: "re: different world livelihood started from zero."

Another example, this one with an adjective, which ends in -i, instead of a verb:
  • kono この
    This. (extremely common and basic word, is normally written without kanji)
  • subarashii 素晴らしい
    (kanji compound plus okurigana)
  • sekai 世界
    (kanji compound)
  • ni に
    To. (particle)
  • shukufuku 祝福
    (kanji compound)
  • wo! を!
    Marks the direct object of a clause.
    (normally there'd be a verb after this, but this phrase ends here, making the action implicit. We'll assume the action would be "give.")

So, joined together: [give] blessings to this wonderful world.

As we can see above, subarashii sekai 素晴らしい世界 goes 2-kanji-3-hiragana-2-kanji, and those 3-hiragana are the okurigana of the adjective subarashii 素晴らしい. Like this, we can see how such words are connected in Japanese.
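
The kanji-then-hiragana shape of a single word can be sketched like this. The function split_okurigana is a made-up name, and it assumes you already hand it one word, not a whole phrase: it simply peels the trailing hiragana off the end.

```python
def split_okurigana(word: str) -> tuple[str, str]:
    """Split one word into its leading part and its trailing hiragana."""
    cut = len(word)
    # Walk backwards while the characters are hiragana (U+3041 to U+309F).
    while cut > 0 and "\u3041" <= word[cut - 1] <= "\u309f":
        cut -= 1
    return word[:cut], word[cut:]


print(split_okurigana("素晴らしい"))  # two kanji, then three hiragana
print(split_okurigana("動く"))
print(split_okurigana("世界"))  # no okurigana at all
```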


I ran out of anime names, but not out of things to say.

In some cases, a word is conjugated, which makes things a little more confusing.

For example: shinda hito 死んだ人, "person who died." Here, we have the verb "to die," shinu 死ぬ, in its past form: shinda 死んだ. Note how the okurigana changed but the kanji remained the same. Anyway, as we can clearly see, shinda doesn't end with -u, so it's not obvious it's a verb, right?

Not exactly.

When verbs are conjugated the okurigana changes, yes. And it becomes hiragana at the end of the word which might get mistaken for hiragana that's after the word. However, that rarely happens.

The reason for that is simply that the patterns in which a word is conjugated are few and extremely regular.

For example, when verbs are conjugated to the past, the words end in ta た or da だ. Sometimes a verb will change from ending with -u to ending with -i. And there's also the te form which makes a verb end in te て or de で. Anyway, it's not that many possibilities. It's very easy to figure out.
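
Because the possibilities are so few, they fit in a handful of lines. Here's a minimal sketch that guesses a form from the last kana of a word. The name guess_verb_form and the labels it returns are made up for this example, it only covers the ta/da, te/de and -u endings mentioned above (not the -i one), and it assumes the word really is a verb.

```python
def guess_verb_form(word: str) -> str:
    """Guess a verb's form from its last kana (a rough heuristic)."""
    if not word:
        return "unknown"
    last = word[-1]
    if last in "ただ":
        return "past"
    if last in "てで":
        return "te-form"
    if last in "うくぐすつぬぶむる":
        return "non-past"
    return "unknown"


for word in ["死ぬ", "死んだ", "死んで", "食べた"]:
    print(word, guess_verb_form(word))
```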

1 comment:

  1. It always made me wonder out of all the particles, how many can be confused as actual words and not the particle itself when things are written or an extension using kana.