Japan diary

As new material is completed, it will be added to the section of this website.
Section 42 Entry 0001. Date: 2003 May 16 Friday.
For a long time I've been unhappy with my semi-literate state, and have resolved to work on my Japanese. However, despite many good resolutions, I haven't got around to it. Today, however, I hit on an idea which might get me moving: create my own online Japanese dictionary. Or, to be precise, a Japanese-English dictionary, an English-Japanese dictionary and a kanji dictionary.
It very soon became evident that this would involve me in a lot of research. To start with, I would have to find out how many letters there are in the English alphabet. Are there 24 or 26? I've never been sure about this. (It turns out that there are 26.)
A little later, when I started using Microsoft Word to write Japanese, I found I had forgotten how to do word processing in Japanese, so I had to relearn that. (I prefer not to use Windows and I seldom use Microsoft Word, but I don't have any Linux software for doing Japanese word processing.)
Once I had finally succeeded in making the kanji for "shingata haien" (the Japanese for "SARS") I wanted to cut out the kanji using a screen-capture program. However, I couldn't figure out how to make Microsoft's soft gray "this is a blank space" dots go away ... finally I typed in an "i" after the kanji combination, leaving enough blank space for me to capture exactly what I wanted.
It occurred to me that if I seriously intended to make a kanji dictionary then it wouldn't be good enough just to slice out chunks of the screen by hand, so I had to figure out how to get a defined rectangle using the PrintKey 2000 screen capture program. I decided to use 200 x 60 pixels (width x height) to capture four kanji using Microsoft's default Japanese font at 36 point on my 12.1-inch ThinkPad screen, which is 1024 x 768 pixels.
(... but I think that later I will go for 200 x 50 pixels.)
After some fussing around I succeeded in capturing the "shingata haien" kanji, but then I realized I had forgotten how to write the HTML code for putting graphics on screen ... in fact, on reflection, I don't think I ever committed this to memory. So I had to hunt around in my files and find the code (my website is hand-coded.)
And here it is:-
[image: the kanji 新型肺炎]
This character combination reads "shingata haien," "newtype pneumonia," or (to take it character by character) "new / model / lung / flame".
Or, assuming that I'm going to use capital letters for any ON reading and lower-case letters for any kun reading, "SHIN / GATA / HAI / EN," "NEW / MODEL / LUNG / FLAME".
Judging by today's progress I figure I could comfortably build a kanji dictionary at the rate of one character combination a day ... 365 a year ... I'm going to have to live for quite some time if I want to put together a dictionary of 36,500 character combinations ....
However, this exercise did persuade me to open a kanji dictionary for the first time in longer than I can remember. In fact, this morning I had to hunt around to find out where my kanji dictionary was. (I have two or three dozen Japanese study books but cannot instantly lay my hands on any of them.)
Kanji definition: kanji are Chinese characters used in the Japanese language, plus some homemade Japanese characters (the character for "samurai," for instance) which look to the untutored eye like authentic Chinese characters but which were invented in Japan.
English words are typically indigenous (sort of) (such as "water," for example) or derived from Latin (such as "aquarium") or derived from Greek (such as "hydroelectric").
In a very similar way, Japanese words are typically indigenous (such as "mizu," meaning "water") or are derived from Chinese (such as "sui," which means "water," and makes combinations such as "suiriku," "land and water," a vocabulary item used in combinations such as "amphibious operations").
A single kanji typically can be pronounced in at least two different ways. One way is the indigenous reading (the "kun yomi") and the other is the Chinese-derived reading (the "on yomi").
Note that in English compound words "aqua" and "hydro" both mean "water" but that nobody would ever ask for "a glass of hydro" or "a glass of aqua". Similarly, in Japan one would ask for "a glass of mizu" not "a glass of sui".
Section 42 Entry 0002. Date: 2003 May 16 Friday.
A prototype entry for the projected Japanese kanji dictionary:-
SHIN - NEW
atarashii - new
Format: (Items in brackets are not always present:-)
item; READING / reading; (breakdown into components); (comment on literal meaning); translation; (source); (notes).
Examples of usage, if any, will be placed like this, under a horizontal rule.
[image: the kanji 新]
SHIN (NEW)
[image: the kanji 新]
atarashii (new)
SHIN / GATA / HAI / EN (NEW / MODEL / LUNG / FIRE) (literally: "new model pneumonia") = SARS = Severe Acute Respiratory Syndrome. (AS 2003). (Note: "HAI / EN" = "pneumonia".)
"AS" in this context stands for "Asahi Shimbun" - today I bought the Japanese edition of the Asahi Shimbun. If I do decide to go ahead and actually build a dictionary, then I'm going to build it from living sources.
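The format line above could be modelled as a small record type. What follows is only a sketch under assumptions: the class name `KanjiEntry`, its field names and the `render` method are all hypothetical, chosen to mirror the items in the format line, with the bracketed (optional) items given a default of `None`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class KanjiEntry:
    """One dictionary entry; optional fields mirror the bracketed items."""
    item: str                                    # the kanji or kanji combination
    readings: Tuple[str, ...]                    # ON readings upper case, kun lower case
    translation: str
    breakdown: Optional[Tuple[str, ...]] = None  # meaning of each component
    literal: Optional[str] = None                # comment on the literal meaning
    source: Optional[str] = None                 # for example "AS 2003"
    notes: Optional[str] = None

    def render(self) -> str:
        """Format the entry in the order given by the format line."""
        parts = [" / ".join(self.readings)]
        if self.breakdown:
            parts.append("(" + " / ".join(self.breakdown) + ")")
        if self.literal:
            parts.append(f'(literally: "{self.literal}")')
        parts.append("= " + self.translation)
        if self.source:
            parts.append(f"({self.source})")
        if self.notes:
            parts.append(f"(Note: {self.notes})")
        return " ".join(parts)


# The prototype SARS entry, typed in by hand from the text above.
sars = KanjiEntry(
    item="新型肺炎",
    readings=("SHIN", "GATA", "HAI", "EN"),
    translation="SARS",
    breakdown=("NEW", "MODEL", "LUNG", "FIRE"),
    literal="new model pneumonia",
    source="AS 2003",
)
print(sars.render())
```

Keeping the optional items as separate fields, rather than one free-text string, would let a later version of the dictionary search by source or by component meaning.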
Section 42 Entry 0003. Date: 2003 May 16 Friday.
SARS conversation between two slightly inebriated salarymen on a subway train tonight, en route from Tokyo to Kanagawa:-
"SARS wa kowai ne."
Easy - "SARS is scary, isn't it?"
(They pronounce "SARS" in the standard English way, none of this "shingata haien" stuff that you get from the TV announcers on NHK.)
So the topic is obvious, but the rest of the conversation eludes me, partly because of the train noise, partly because of the trickling distraction of the music leaking from the headphones of the guy seated next to me, but mostly because my Japanese is just too weak ... it's SARS this and SARS that and SARS the other, but what they're actually saying I have no idea, chiefly because my vocabulary is too weak so I'm missing too many of the words.
The news to report, then, is that SARS (unsurprisingly) has become a routine topic of conversation.
It is SARS, more than anything, which has given me the incentive to finally get moving on my Japanese. Logic tells me that there might come a time when I want information now, immediately, from the radio or the TV, in Japanese ... a time when it won't be good enough to wait until tomorrow so I can read about the story of the moment in the International Herald Tribune and figure out what it was all about.
If a crisis comes, you really need to know the local language. This came home to me a couple of years ago when I got to the train station, late at night, and found crowds of people milling around without purpose, and something scrawled on a chalkboard by the ticket gates ... I figured out that the JR trains weren't running, and eventually got home on the subway, which is a separate system ... but, at the time, it would have been really nice to be able to read what was written on that chalkboard.
(What I later discovered was that a lightning strike had knocked out a computer, bringing a large chunk of Tokyo's train system to a grinding halt.)
Section 42 Entry 0004. Date: 2003 May 17 Saturday.
search terms relevant to this entry: word process wordprocessing word processing japanese wordprocessing jpanese character japanese character japanese characters japanese dictionary find japanese character dictionary dictioanry find chinese ideogram find ideograms chinese convert hiragana kanji wordprocessing in japanese using microsoft word japanesewordprocessing using micrsooft word using microsoft word japanese word processing displaying japanese computers howto usea japanese dictionary japanesedictionary chinese dictionary chinesedictionary japanese radical chinese radical japanese radicals chinese radicals - list of relevant search terms ends.
A database should accommodate predictable erroneous searches. This thought occurred to me today when I went looking for the "JUKU" of "Shinjuku".
My initial search, in the alphabetically-ordered index of Spahn and Hadamitzky's Kanji & Kana, turned up no such item. There were a couple of "juku" characters but neither was what I was looking for, and I knew exactly what I was looking for since I had just wordprocessed the kanji for "Shinjuku" into existence using Microsoft Word. (I'm often traveling through Shinjuku station, so I have no problems recognizing the relevant kanji.)
So how did I make the characters "Shinjuku" with Microsoft Word? Well, using a Japanese version of Microsoft Word running under a Japanese-language version of Windows 98, I started typing. (By default, the Japanese version of Microsoft Word automatically assumes you want to produce Japanese script, though at a touch of a button you can start producing English script.)
My ThinkPad was manufactured in Taiwan for the Japanese market and comes with a Japanese computer keyboard. Using the Japanese keyboard, which is essentially a standard English keyboard with a few extra keys (a key for telling programs to switch the output from Japanese script to English script, for example), I typed the English letters "s - h - i - n - j - u - k - u".
This caused the word "shinjuku" to appear on the screen in hiragana, the basic Japanese alphabet - Japanese has two syllabic alphabets, one called hiragana and the other called katakana. The hiragana output looked like this:-
[image: the hiragana しんじゅく]
At this point, the output was underlined by a dotted line.
I then hit the spacebar and Microsoft Word converted the hiragana output into the Chinese characters "shinjuku," thus:-
[image: the kanji 新宿]
At that point, by hitting the spacebar repeatedly, I could cycle through the various options for the "shinjuku" combination, including the option of outputting the "shinjuku" combination in the Japanese katakana alphabet, in which case it would look like this:-
[image: the katakana シンジュク]
Once satisfied that I had on screen what I wanted, I hit the ENTER button to tell the program that I was done with modifying that particular combination.
Want to make modifications after hitting the ENTER button? Highlight the combination you want to change. Hit the spacebar. A list of options will come up. Hit the spacebar repeatedly to cycle through the list of options. When you have what you want, hit the ENTER button to end the process.
And that is the standard way in which most Japanese people do word processing in Japanese. First, they type using the conventional A to Z Roman alphabet, using a slightly modified version of the QWERTYUIOP keyboard familiar to the West. The output typically appears on a computer screen in hiragana, and then, by hitting the spacebar (or by using specialized keys) it is converted (optionally) to Japanese versions of Chinese characters (that is, to kanji) or to katakana.
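The two steps described here (Roman letters to hiragana, then the spacebar cycling through conversion options) can be mimicked in a few lines. This is a toy sketch, not how Microsoft Word actually does it: the table `ROMAJI_TO_HIRAGANA`, the `CANDIDATES` list, its ordering, and the function names are all hypothetical, and the table covers only a handful of syllables.

```python
# Hypothetical, deliberately tiny romaji table; a real input method knows
# every syllable and also handles doubled consonants, long vowels and so on.
ROMAJI_TO_HIRAGANA = {
    "shi": "し", "ju": "じゅ", "ku": "く", "ka": "か", "ta": "た",
    "ga": "が", "ha": "は", "i": "い", "e": "え", "n": "ん",
    "mi": "み", "zu": "ず",
}

# Hypothetical candidate list; a real input method offers many more options.
CANDIDATES = {"しんじゅく": ["新宿", "シンジュク"]}


def to_hiragana(romaji: str) -> str:
    """Convert romaji to hiragana by greedy longest-match lookup."""
    out = []
    pos = 0
    while pos < len(romaji):
        for size in (3, 2, 1):
            chunk = romaji[pos:pos + size]
            if chunk in ROMAJI_TO_HIRAGANA:
                out.append(ROMAJI_TO_HIRAGANA[chunk])
                pos += len(chunk)
                break
        else:
            raise ValueError(f"no kana for {romaji[pos:]!r}")
    return "".join(out)


def convert(romaji: str, presses: int = 1) -> str:
    """Return the option shown after pressing the spacebar `presses` times."""
    kana = to_hiragana(romaji)
    options = CANDIDATES.get(kana, []) + [kana]  # plain hiragana is the last option
    return options[(presses - 1) % len(options)]


print(to_hiragana("shinjuku"))   # the hiragana stage
print(convert("shinjuku"))       # first spacebar press
print(convert("shinjuku", 2))    # second spacebar press
```

The modulo in `convert` is what makes repeated presses wrap around to the first option again.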
If this procedure sounds slow and cumbersome, that is because it is slow and cumbersome. However. It works. Although it does not always work well - this morning, Windows 98 crashed on me five times when I was using Microsoft Word.
Anyway, to get back to the "juku" character - I knew what it looked like, because I had produced it on the computer screen, but I wanted to locate it in the dictionary so I could find the correct on yomi, the Sino-Japanese reading of this character.
Having failed to locate the appropriate "juku" in Kanji & Kana, I hauled out The Kanji Dictionary by the same authors, which is the heavy artillery, listing "Over 47,000 Japanese Character Compounds". Still my "juku" remained elusive.
So I was reduced to searching for it by figuring out which part of this Chinese character was the radical (the fundamental graphical element to which the other parts of the character are subsidiary) and then figuring out how many strokes were used to make the entire character.
The kanji in question is this:-
[image: the kanji 宿]
The radical, obviously (if it's not obvious then go study Chinese characters for a few years) is the pictorial element which is a little stylized drawing of a roof (an inn or hotel has a roof, right?). That radical, the fundamental graphical element of this kanji, is:-
[image: the roof radical 宀]
This is the roof radical, which is radical number 40 in Kanji & Kana. If you open The Kanji Dictionary at the back, you find that in this dictionary's system of organization, the roof radical is radical 3m.
If you then find 3m in the dictionary, you find a list of all the "3m" characters (all the characters which are drawn using the roof radical) organized by stroke number. The first character in the list is 3m itself, then come a bunch of characters made by using 3m plus two additional strokes, and so forth.
If you know how Chinese characters are drawn, it is obvious that the "juku" of "Shinjuku" is drawn using 3m (the roof radical) plus eight more strokes. Looking in the "3m" list for 8-stroke characters, we find ten options, and it is easy to find the target (assuming that you already know what it looks like), and the target turns out to be 8.3.
Turning a few pages to 3m8.3 (that is, the third of the list of characters which use the roof radical plus eight additional strokes) we find the target, which turns out to be "SHUKU," meaning "lodging, inn".
What is the on yomi, the Chinese-derived pronunciation of this kanji? It is "shuku," not the "juku" I was looking for.
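The radical-plus-stroke-count search walked through above amounts to a two-part key. Here is a minimal sketch of that idea, assuming a hypothetical in-memory index called `RADICAL_INDEX`; only the one entry described in the text is filled in (the printed dictionary lists ten characters under this key), and the descriptor parsing assumes the "3m8.3" pattern seen above.

```python
import re

# Index keyed by (radical, additional strokes). Only the entry described
# in the text is present; everything else is left out of this sketch.
RADICAL_INDEX = {
    ("3m", 8): [
        {"descriptor": "3m8.3", "kanji": "宿", "on": "SHUKU",
         "meaning": "lodging, inn"},
    ],
}


def find_by_radical(radical: str, extra_strokes: int) -> list:
    """Return every indexed character with this radical and stroke count."""
    return RADICAL_INDEX.get((radical, extra_strokes), [])


def parse_descriptor(descriptor: str) -> tuple:
    """Split a descriptor such as '3m8.3' into (radical, strokes, position)."""
    match = re.fullmatch(r"(\d+[a-z])(\d+)\.(\d+)", descriptor)
    if match is None:
        raise ValueError(f"unrecognized descriptor: {descriptor!r}")
    radical, strokes, position = match.groups()
    return radical, int(strokes), int(position)


for entry in find_by_radical("3m", 8):
    print(entry["descriptor"], entry["kanji"], entry["on"], entry["meaning"])
```

The search still relies on the reader recognizing the target by eye among the candidates, exactly as with the printed list.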
One of the problems with Japanese is that words often undergo a phonetic shift when combined with other words. For example, the Japanese word for "model" is "kata," but in combination with "shin" this makes "shingata," or "newmodel," as in "shingata haien," literally "newmodel pneumonia," or SARS.
If you go looking for "gata" in the index of the average Japanese kanji dictionary designed for Westerners, then you're not going to find it, since the character will be indexed as "kata".
Similarly, if you go looking for the "juku" of "Shinjuku" then you're probably going to draw a blank, because it's going to be indexed as "shuku".
This, of course, is the traditional scholarly approach: reference books are generally not designed to accommodate the errors of the ignorant. But, to my way of thinking, the usability of any database is going to improve enormously if it does in fact accommodate predictable errors.
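One way a dictionary database might accommodate this particular error is to retry a failed lookup with the first consonant devoiced (the sound change itself is usually called rendaku). The sketch below is an illustration only: the `DEVOICE` table is incomplete (for instance, "j" can also come from "ch"), and `INDEX` is a hypothetical two-entry index built from the examples in the text.

```python
# Voiced initial consonant -> an unvoiced consonant it may have come from.
# Incomplete on purpose; this is a sketch, not a full account of the shift.
DEVOICE = {"g": "k", "z": "s", "j": "sh", "d": "t", "b": "h"}

# Hypothetical index holding only the two readings discussed in the text.
INDEX = {"kata": "型 (model)", "shuku": "宿 (lodging, inn)"}


def candidate_readings(reading: str):
    """Yield the reading as typed, then a devoiced guess if one applies."""
    yield reading
    for voiced, plain in DEVOICE.items():
        if reading.startswith(voiced):
            yield plain + reading[len(voiced):]


def lookup(reading: str):
    """Return (indexed reading, entry), or None if nothing matches."""
    for candidate in candidate_readings(reading):
        if candidate in INDEX:
            return candidate, INDEX[candidate]
    return None


print(lookup("juku"))   # found under "shuku"
print(lookup("gata"))   # found under "kata"
```

Returning the indexed reading alongside the entry tells the user where the character really lives, so the error is corrected as well as tolerated.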
I've had occasion to think about this quite a bit recently as I've designed and redesigned my web pages to cater for people who are searching for stuff using search engines. I posted an essay on the site about Milton's Satan and started to get hits from people searching not just for "Milton's Paradise Lost" but for "Milton's Lost Paradise," which suggests that the term "lostparadise" should occur somewhere on the page in case people are searching for that.
(Typically, search engines will not break a search term into its components. If you search for "horrorstory" then the search engine will try to locate literal occurrences of "horrorstory" rather than to search for the word "horror" in association with the word "story". This is reasonable, since a search for "waterwheel," for example, becomes nonsensical if it is converted into a search for the word "water" in association with the word "wheel," which would make a legitimate search target out of an irrelevant text such as "having been in the water for years, the wheel was rusty".)
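For a short phrase, the list of predictable variants can be generated mechanically rather than by hand. A small sketch, assuming a hypothetical helper called `search_variants`; it produces every word order, both spaced and run together, so it is only sensible for phrases of two or three words.

```python
from itertools import permutations


def search_variants(phrase: str) -> list:
    """Return the phrase, its word reorderings, and run-together forms."""
    words = phrase.lower().split()
    variants = []
    for order in permutations(words):
        for joiner in (" ", ""):
            form = joiner.join(order)
            if form not in variants:
                variants.append(form)
    return variants


print(search_variants("Paradise Lost"))
print(search_variants("horror story"))
```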
I'm still undecided as to how serious I'm going to get about building an online kanji dictionary. However, if I do go ahead and build this, then I'm going to make an effort to accommodate a range of reasonable and predictable errors ... and it's pretty easy for me to start making a list of these, since I've made quite a few during my years of studying Japanese.
The "radical plus stroke count" procedure listed above, incidentally, is a standard way of locating a Chinese character in either a dictionary of Chinese characters or a dictionary of Japanese kanji.
The alternative method is the stroke count method. Typically, a Chinese or Japanese dictionary will have an index of characters arranged by stroke count order. A very simple character may have just one, two or three strokes, such as the character for "three," which is the same in both Chinese and Japanese:-
[image: the kanji 三]
If this character is displaying correctly, you should see three strokes, one beneath the other. For some reason, right now this JPEG image is not displaying correctly when I use Internet Explorer or Mozilla under Windows 98. The middle stroke seems to be missing. What's going on here ....?
A direct link to the JPEG image is this:-
san-three.jpg
When I use the direct link, using either Internet Explorer or Mozilla, the graphic displays just fine ... I can't figure out why it's not working properly (at least not for me, not right now) when I display the web page. (All the other JPEG graphics are displaying just fine - this is the kind of thing that drives me crazy.)
Later: I finally fixed this problem by making a new JPEG graphic for the "san" character, this time using a different font in which the strokes slant slightly instead of running horizontally across the page. Also, I used a bold face character. This seems to have fixed the problem, which was that the middle stroke was failing to display.
Later still - 2003 May 21 Wednesday - got an e-mail from JF (thanks!) pointing out that the actual dimensions of the JPEG image are 51 x 51, not the 50 x 50 that my HTML code specifies ... if I alter my HTML code to reflect the true dimensions of the image then this may fix the problem ... I was using a screen capture program to capture what purported to be a 50 x 50 pixel box, and it never occurred to me that the dimensions might be otherwise ... tomorrow I'll take a shot at changing the dimensions and see what happens ... actually, I shouldn't be lazy ... let's change it right now and see if it works ... (I've just realized that I never uploaded the new version of "san" with thicker lines so the old thin-line version is still on the server) ... yep, altering the HTML code to reflect the true dimensions of the graphic fixes the display problem ....
Even later still - 2003 May 23 Friday - when I checked, I found that my screen-capture program was actually alerting me to the fact that the 50 x 50 block of pixels that I was saving was creating a 51 x 51 pixel image. So I didn't read the fine print? That's my problem. I've gotten into a bad habit - click first and then worry about why it's not working later. I think the reason why I got into this habit was that I started learning both Windows and Linux at one and the same time (previously I was owned by DOS) and got overwhelmed by the amount of data I had to absorb - if I'd read and properly absorbed all the documentation on all the programs that I encountered, it would have taken me a couple of lifetimes to even start to get productive. One of my long-term goals is to actually read through the documentation on some of the software that I'm using ... but I don't really expect to live that long.
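A mismatch like this one (51 x 51 on disk, 50 x 50 in the HTML) can be caught by reading the real dimensions out of the JPEG file and comparing them with the declared ones. The sketch below uses only the standard library and walks the JPEG segment markers until it reaches a start-of-frame segment; the function names `jpeg_size` and `dimension_mismatch` are hypothetical, and the parser is a simplification that assumes an ordinary, well-formed file.

```python
import struct


def jpeg_size(data: bytes) -> tuple:
    """Return (width, height) read from the first start-of-frame segment."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a segment marker")
        marker = data[pos + 1]
        if marker == 0xFF:  # padding byte before the real marker
            pos += 1
            continue
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        # SOF0 to SOF15 carry the frame size, except C4, C8 and CC,
        # which are other kinds of segment.
        if 0xC0 <= marker <= 0xCF and marker not in (0xC4, 0xC8, 0xCC):
            height, width = struct.unpack(">HH", data[pos + 5:pos + 9])
            return width, height
        pos += 2 + length  # skip the marker plus the segment body
    raise ValueError("no start-of-frame segment found")


def dimension_mismatch(declared: tuple, data: bytes):
    """Return the actual (width, height) if it differs from `declared`, else None."""
    actual = jpeg_size(data)
    return None if actual == declared else actual
```

To use it, read the image file in binary mode (for example the bytes of `san-three.jpg`) and call `dimension_mismatch((50, 50), data)`; a result of `None` means the HTML attributes agree with the file.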
san:-
[image: the kanji 三]
It's very easy to look at this "san" character and figure out, at a glance, that it was written with three strokes of the writing brush. So you run your eye down the list of all the characters which are written with three strokes, you find this character in that index and then you turn to the relevant entry in the dictionary.
However, things become trickier as the characters become more complex. For example, the Japanese character IN, meaning SEAL or STAMP has six strokes:-
[image: the kanji 印]
This character can also be read - that is, pronounced - "shirushi," in which case it means "sign" or "mark" ... there is a chain of stores in Japan which goes by the name "Mujirushi," meaning, literally, "no sign," and their gimmick is that the goods they market are extremely plain, the essence of simplicity, and carry no logo of any description.
The character "MU" means "NOT" and is seen all over the place. It is written with twelve strokes:-
[image: the kanji 無]
In combination with "IN," this makes the abovementioned "Mujirushi":-
[image: the kanji 無印]
Note that each character is assigned exactly the same amount of space. The character "shirushi" is written with six strokes of the writing brush and the character "mu" is written with twelve, but each is assigned the same space. When writing Japanese, an identical amount of space is assigned to an alphabetical element such as "n," which in hiragana is:-
[image: the hiragana ん]
The character in Kanji & Kana which has the greatest number of strokes is KAN, meaning MODEL or PATTERN. This character looks like this:-
[image: the kanji 鑑]
This character is written with twenty-three separate strokes, and is used to make the combination "IN / KAN," meaning "seal," as in a seal which is applied to documents (as opposed to the animal known as a "seal," which in Japanese is an entirely different word, "azarashi".) The combination "inkan" looks like this:-
[image: the kanji 印鑑]
A closing note: theoretically, it should be possible to display Japanese / Chinese characters on a computer screen without using JPEG images or some similar kludge. In fact, at one point I did have a page on this website which was in Japanese - put there as a favor for a Japanese national who temporarily needed something in Japanese on the Internet.
Making the Japanese-language page was very simple: it was produced in Japanese using Microsoft Word then Microsoft Word's "save as" function was used to save it as a web page, and then I made a link to that web page.
However, anyone running an English-language browser under an English-language operating system will probably have to tinker with the browser's settings in order to see Japanese characters displayed on screen. Furthermore, I've had some experiences with e-mail which suggest that, in practice, combining English and Japanese is a recipe for problems. Consequently, I've chosen to use JPEG graphics in this case.
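A likely reason for the tinkering, offered here as an assumption rather than something the diary confirms, is character encoding: the same Japanese text becomes different bytes under different encodings, and a browser that guesses the wrong one shows nonsense. The sketch below encodes a mixed English and Japanese string three ways and then decodes one of them with a Western codec to show the effect. The choice of encodings is illustrative; which one the test page actually uses is not stated.

```python
TEXT = "新宿 Shinjuku"


def encode_variants(text: str) -> dict:
    """Encode the same text under three encodings that handle Japanese."""
    return {name: text.encode(name) for name in ("shift_jis", "euc_jp", "utf-8")}


variants = encode_variants(TEXT)
for name, raw in variants.items():
    print(f"{name:10} {raw.hex()}")

# Reading Shift_JIS bytes as if they were Latin-1 never fails, but the
# kanji come out as unrelated symbols while the English part survives.
garbled = variants["shift_jis"].decode("latin-1")
print(repr(garbled))
```

The English half of the string survives because its letters are encoded the same way in all of these schemes, which is why a mis-set browser typically shows readable English surrounded by junk.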
Here is a link to a Japanese test page which combines English and Japanese characters. It may or may not display correctly on your browser. If you're curious about the mechanics involved, have a look at the source code (VIEW - PAGE SOURCE, or maybe something different, depending on your browser).
This I made by the process mentioned above: do the word processing in Microsoft Word then save the resulting .doc file as a web page using Microsoft Word's "save as" function. I wasn't sure if it would save as .htm or .html so I saved it somewhere, opened it using the UltraEdit-32 text editor then saved it to the desired location with the .html suffix ... I then used UltraEdit to add in, at both the top and the bottom of the document, a link back to this page.
If you want to look at some authentic Japanese websites there are some links to some Japanese language websites on this site. This list of links includes sites about the weather (a link I sometimes use when I've missed the TV weather forecast) and a link to NHK TV news (which I never use, but which I suppose I should, if I'm serious about achieving Japanese literacy.)
If these Japanese-language websites don't display properly, you may be able to get them to display properly by fooling around with your browser settings - have fun!
Website contents copyright © 1973-2006 Hugh Cook