Hi,

You need to sort your index file. It looks like dict(7) does a binary search on it. Once the index is sorted, lookups work fine.
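
To illustrate why the order matters, here is a minimal sketch of a binary search over index entries. It is not the code dict(7) actually runs. The entries, offsets, and the `lookup` helper are made up for the example, and plain code-point ordering is an assumption; the sort order has to match whatever comparison dict(7) itself uses.

```python
from bisect import bisect_left

def lookup(index, key):
    """Binary-search a list of (key, offset) pairs; return the offset or None."""
    keys = [k for k, _ in index]
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return index[i][1]
    return None

# Hypothetical index entries (key, byte offset), deliberately out of order.
unsorted_index = [("猫", 120), ("犬", 40), ("魚", 200), ("鳥", 0), ("馬", 80)]

# Sorting by key (code-point order here, an assumption) makes the search valid.
sorted_index = sorted(unsorted_index)

print(lookup(unsorted_index, "犬"))  # the entry exists but is not found
print(lookup(sorted_index, "犬"))    # found once the index is sorted
```

On the unsorted list the search halves in the wrong direction and gives up even though the entry is present, which matches the `pattern not found' symptom.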
fhs

On Tue, Jan 6, 2009 at 12:34 AM, Akshat Kumar <aku...@sounine.nanosouffle.net> wrote:
> Regarding the dict index files, what I understand is that dict(7)
> receives a pattern (may also be a byte offset or whatever, but suppose
> pattern), looks it up in the first fields of the lines in the dict
> index, and uses the corresponding byte offset in the index to find the
> full line in the dict file. Well, I've been trying to make the EDICT
> dictionary[1] usable with dict(7), using just the "simple" dict scheme
> as described in /sys/src/cmd/dict/simple.c, and have made (for now) a
> <kanji> <byte offset>
> index file from the output of mkindex (piping through to sed and
> switching the order of the kanji and byte offset). I've tried quite a
> few ways of making that index file, but have yet not succeeded in
> getting dict(7) to actually find a corresponding line in the dict file
> (`pattern not found'), given any kanji in the first fields in the
> index file as a pattern.
>
> I cannot attach the index file nor the dictionary file with this
> E-Mail, since both are too big -- though I've put them online[2] --
> but the dictionary file made available at [1] is in a slightly
> different format (inserted tab after each kanji/kana) and charset
> (EUC-JP/JIS X 0208 → UTF-8) than I have converted at [2]. If anyone
> is willing to help figure this out, I'd be very grateful.
>
>
> [1] http://www.csse.monash.edu.au/~jwb/edict_doc.html
> (see FORMAT for default formatting, and CURRENT VERSION & DOWNLOAD
> to grab edict.gz)
> [2] http://sounine.nanosouffle.net/magic/webls?dir=/comp/dict
>
>
> Please alert me if the information here is insufficient --
> I also don't mind if you go ahead and make the dict files yourself...
> just let me in on it --
> ak