Hi Troy,

First, I would like to thank you for the very detailed explanation.
I followed your text and tried to check the exact situation in my case. The module I use is "Lithuanian". Its .conf file contains Encoding=UTF-8. Both ICU and CLucene are installed and enabled in usrinst.sh.

I am not sure I understood your question about the indexed module. I haven't run mkfastmod on the "Lithuanian" module; I just sent the text to one of the sword developers, who compiled it. But I have the problem not only with "Lithuanian" but also with the Russian Bible "RST", so I think it does not depend on the module.

The problem is also not limited to my own build of sword. I downloaded the binaries of The SWORD Project for Windows, and they have the same problem.

Regards,

Linas Spraunius

On Wed, 19 Sep 2007 16:37:24 +0300, Troy A. Griffitts <[EMAIL PROTECTED]> wrote:

> Linas,
>
> Could you look in the module's .conf file which you are searching and
> determine what the Encoding= entry says. If it is not UTF-8 then sword
> will not attempt to use ICU on the text, even if it is compiled in.
>
> Having said that, the sword engine could use some attention in the area
> of utf8 processing. Could you tell me if you are using an indexed module
> (clucene compiled in, and you've indexed the module using your favorite
> frontend, or the CLI mkfastmod)? If you are using the unindexed search
> framework, I'm afraid it is not very utf8 friendly and could use some
> tweaks to make things work correctly. We've started down that path with
> the creation of a new class: StringMgr (thanks Joachim!), here:
>
> http://crosswire.org/svn/sword/trunk/src/mgr/stringmgr.cpp
>
> Though it currently only has toupper functionality.
>
> Old string routine declarations are here but should slowly go away.
> Currently their impl should just call the methods in StringMgr, but they
> might not all do that yet (I'm pretty sure stricmp does, which is the
> most widely used in the engine).
>
> http://crosswire.org/svn/sword/trunk/include/utilstr.h
>
> The search code is here:
>
> http://crosswire.org/svn/sword/trunk/src/modules/swmodule.cpp
> search for: multiword (2nd occurrence)
>
> The comparison is done using stristr (bad), which should probably be
> changed to use a new StringMgr::stristrUTF8 method, but for now you
> could simply try changing stristr to strstr and toupper_utf8 both the
> input once before the loop, and the candidate buffer just before the
> comparison. Sorry for the bad language support.
>
> -Troy.
>
> Linas S. wrote:
>>> I think that indicates that diatheke is built without ICU. Sword uses
>>> ICU to do upper case, but only if it is present.
>>
>> Unfortunately it is compiled with icu. Is it possible that I made a
>> mistake when compiling it? I enabled icu in usrinst.sh, then launched
>> it. After that - make, make install.
>>
>> Regards,
>>
>> Linas Spraunius

_______________________________________________
sword-devel mailing list: [email protected]
http://www.crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
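
For reference, the workaround Troy suggests (uppercase the search input once before the loop, uppercase each candidate buffer just before the comparison, then use plain strstr instead of stristr) might look roughly like the sketch below. This is not the code in swmodule.cpp. The names toUpperUtf8Sketch, upperCodePoint, appendUtf8 and searchEntries are hypothetical, and the uppercasing is a toy stand-in for what toupper_utf8 or ICU would do in the engine: it only handles ASCII, part of Latin Extended-A (enough for the Lithuanian letters) and basic Cyrillic, and it assumes well-formed UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: uppercase one code point. Covers ASCII, part of
// Latin Extended-A and basic Cyrillic only; a real build would rely on ICU.
static char32_t upperCodePoint(char32_t c) {
    if (c >= U'a' && c <= U'z') return c - 0x20;
    if (c >= 0x0100 && c <= 0x012F) return (c & 1) ? c - 1 : c; // odd = lowercase
    if (c >= 0x014A && c <= 0x0177) return (c & 1) ? c - 1 : c; // odd = lowercase
    if (c >= 0x0179 && c <= 0x017E) return (c & 1) ? c : c - 1; // even = lowercase
    if (c >= 0x0430 && c <= 0x044F) return c - 0x20;            // Cyrillic a..ya
    if (c >= 0x0450 && c <= 0x045F) return c - 0x50;
    return c;
}

// Append one code point to `out` as UTF-8.
static void appendUtf8(std::string &out, char32_t c) {
    if (c < 0x80) {
        out += static_cast<char>(c);
    } else if (c < 0x800) {
        out += static_cast<char>(0xC0 | (c >> 6));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else if (c < 0x10000) {
        out += static_cast<char>(0xE0 | (c >> 12));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (c >> 18));
        out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    }
}

// Hypothetical stand-in for toupper_utf8: decode, uppercase, re-encode.
static std::string toUpperUtf8Sketch(const std::string &in) {
    std::string out;
    out.reserve(in.size());
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char b = static_cast<unsigned char>(in[i]);
        std::size_t len = b < 0x80 ? 1
                        : (b >> 5) == 0x06 ? 2
                        : (b >> 4) == 0x0E ? 3
                        : (b >> 3) == 0x1E ? 4 : 0;
        if (len == 0 || i + len > in.size()) { // malformed byte: copy it through
            out += in[i++];
            continue;
        }
        char32_t cp = (len == 1) ? b : (b & (0xFF >> (len + 1)));
        for (std::size_t k = 1; k < len; ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
        appendUtf8(out, upperCodePoint(cp));
        i += len;
    }
    return out;
}

// Returns the indices of all entries that contain `query`, ignoring case.
static std::vector<std::size_t> searchEntries(const std::vector<std::string> &entries,
                                              const std::string &query) {
    // Uppercase the input once, before the loop.
    const std::string needle = toUpperUtf8Sketch(query);
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < entries.size(); ++i) {
        // Uppercase the candidate buffer just before the comparison.
        const std::string candidate = toUpperUtf8Sketch(entries[i]);
        if (std::strstr(candidate.c_str(), needle.c_str()) != nullptr)
            hits.push_back(i);
    }
    return hits;
}
```

A call such as searchEntries(verses, "dievas") would then return the positions of the matching entries regardless of case. Plain strstr is safe here because a valid UTF-8 sequence cannot match in the middle of another character, so a byte-wise substring search on two uppercased UTF-8 strings does not produce false hits across character boundaries. Note that uppercasing is only an approximation of proper Unicode case folding.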
