Re: lucene and UTF-8

2005-09-29 Thread Chris Hostetter
: I'm having some problems indexing my UTF-8 html pages. I am running lucene on Linux and I cannot understand why the generated index depends on the locale of my operating system. If I do set | grep LANG I get: LANG=el_GR which is Greek. If I set this to en_US the index generated will

Re: lucene and UTF-8

2005-09-29 Thread Andrzej Bialecki
John Cherouvim wrote: Hello, I'm having some problems indexing my UTF-8 html pages. I am running lucene on Linux and I cannot understand why the generated index depends on the locale of my operating system. If I do set | grep LANG I get: LANG=el_GR which is Greek. If I set this to en_US t

Re: lucene and UTF-8

2005-09-29 Thread John Haxby
John Cherouvim wrote: I'm having some problems indexing my UTF-8 html pages. I am running lucene on Linux and I cannot understand why the generated index depends on the locale of my operating system. If I do set | grep LANG I get: LANG=el_GR which is Greek. If I set this to en_US the inde

lucene and UTF-8

2005-09-29 Thread John Cherouvim
Hello, I'm having some problems indexing my UTF-8 html pages. I am running lucene on Linux and I cannot understand why the generated index depends on the locale of my operating system. If I do set | grep LANG I get: LANG=el_GR which is Greek. If I set this to en_US the index generated will
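
A note on the symptom described above, offered as a sketch and not as a diagnosis of the poster's actual code (which is not shown in this thread listing): in Java, a Reader created without an explicit charset (for example FileReader) decodes bytes using the platform default, and on Linux that default is typically derived from LANG. If the indexing code reads the HTML files that way, the same UTF-8 bytes would decode to different characters under el_GR and en_US, and the index would differ. The class and method names below (ExplicitCharsetDemo, readAll) are hypothetical, and StandardCharsets assumes Java 7 or later. No Lucene calls are shown; the decoded String or Reader would simply be handed to whatever indexing code is already in place.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetDemo {

    // Hypothetical helper: decode a byte stream with an explicitly named
    // charset, so the result does not depend on LANG or the JVM default.
    public static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Three Greek letters (alpha, beta, gamma) encoded as UTF-8 bytes.
        byte[] utf8 = "\u03b1\u03b2\u03b3".getBytes(StandardCharsets.UTF_8);

        // Decoding with the charset the file was actually written in.
        String good = readAll(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);

        // Decoding the same bytes with a single-byte charset, standing in
        // for a non-UTF-8 platform default; every byte becomes one char.
        String bad = readAll(new ByteArrayInputStream(utf8), StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 decode length:      " + good.length());
        System.out.println("ISO-8859-1 decode length: " + bad.length());

        // What this JVM would use if no charset were specified.
        System.out.println("Platform default charset: " + Charset.defaultCharset());
    }
}
```

For files on disk, the same idea applies by wrapping a FileInputStream in an InputStreamReader with the charset named explicitly, instead of relying on the default. Running the JVM with a different LANG and comparing the last line of output is one way to check whether the default is what changes between the two environments.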