Does Lucene support UNICODE?

2004-06-08 Thread Satish Kagathare
Hello, does Lucene support Unicode search and indexing of Unicode data (especially Devanagari Unicode data)? Does it treat UTF-8 and UTF-16 documents differently? I ask because Java strings are UTF-16. I tried indexing (using indexFiles and indexHTML from the Lucene demo) Devanagari uni
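
On the UTF-8 versus UTF-16 part of the question, a minimal sketch in plain Java (no Lucene calls) may help. It assumes only that text reaches an indexer as Java strings, so the on-disk encoding matters at the point where bytes are decoded. The class `EncodingDemo` and its `decode` helper are hypothetical names used for illustration, and `StandardCharsets` requires a newer Java than was current in 2004.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Lucene or its demo code.
class EncodingDemo {
    // Decode raw file bytes into a Java String using an explicit charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // A six-character Devanagari word, written with escapes so the
        // encoding of this source file does not matter.
        String word = "\u0939\u093F\u0928\u094D\u0926\u0940";

        // The same text as it might be stored on disk in two encodings.
        byte[] utf8 = word.getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = word.getBytes(StandardCharsets.UTF_16);

        // Decoded with the matching charset, both yield the same String.
        boolean same = decode(utf8, StandardCharsets.UTF_8)
                .equals(decode(utf16, StandardCharsets.UTF_16));
        System.out.println("same text after correct decoding: " + same);

        // Decoded with the wrong charset, the characters no longer match.
        boolean wrong = decode(utf8, StandardCharsets.ISO_8859_1).equals(word);
        System.out.println("matches after wrong decoding: " + wrong);
    }
}
```

If the sketch is representative, the thing to check when indexing fails is which charset was used to read the files, not whether they were saved as UTF-8 or UTF-16.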

Re: problems with lucene in multithreaded environment

2004-06-08 Thread Jayant Kumar
--- Doug Cutting [EMAIL PROTECTED] wrote: Jayant Kumar wrote: Thanks for the patch. It helped in increasing the search speed to a good extent. Good. I'll commit it. Thanks for testing it. But when we tried to give about 100 queries in 10 seconds, then again we found that after

lucene scoring

2004-06-08 Thread uddam chukmol
Hi all, the way Lucene computes the score is confusing. I tried to follow what happens but am stuck on what some of the parameters mean. - In DefaultSimilarity.queryNorm(float sumOfSquareWeights): how does it compute the query weight? - How does it compute the weight of each field in the

Re: Performance: compound vs. multi-file index, indexing and searching

2004-06-08 Thread Eric Jain
Can anyone comment on performance differences? I just ran a comparison, indexing about 250'000 small documents. Both the time for indexing (239s) and the final disk space used (16.6MB) were identical. Haven't compared search performance, though I suspect I can save myself the effort...

out of memory while indexing one single file

2004-06-08 Thread Yue Sun
Hi, first, I am not sure if I should post my question here, since I am using CLucene (the C++ port of Lucene) to build indexes. I hope someone here can help me. I am indexing on a Solaris machine with 1 GB of memory. I use a ram writer and an fs writer, and write into the fs index once in a while. Now I am testing

Lucene Scoring question

2004-06-08 Thread Ram Subbaroyan
I have been trying to follow Lucene scoring across multiple searchables, and I do not see where the IDF gets normalized between searchables. (Sum DF across searchables in the first half of the query and use it in the second half of query execution to calculate the right IDF across searchables.) Let's say you have
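
The two-phase idea in the parenthesis above can be sketched in plain Java: first sum document frequencies and document counts over all searchables, then compute one IDF from the totals. This is an illustration under assumptions, not Lucene's multi-searcher code. The class `GlobalIdfSketch` and its methods are hypothetical, and the formula log(numDocs / (docFreq + 1)) + 1 is assumed to match the classic DefaultSimilarity.idf.

```java
// Hypothetical sketch; not taken from Lucene's source.
class GlobalIdfSketch {
    // Assumed classic IDF formula: log(numDocs / (docFreq + 1)) + 1.
    static double idf(long docFreq, long numDocs) {
        return Math.log((double) numDocs / (docFreq + 1)) + 1.0;
    }

    // Phase 1: sum DF and document counts across searchables.
    // Phase 2: compute a single IDF from the summed values.
    static double globalIdf(long[] docFreqs, long[] numDocs) {
        if (docFreqs.length != numDocs.length) {
            throw new IllegalArgumentException("one docFreq per searchable expected");
        }
        long totalDocFreq = 0;
        long totalDocs = 0;
        for (int i = 0; i < docFreqs.length; i++) {
            totalDocFreq += docFreqs[i];
            totalDocs += numDocs[i];
        }
        return idf(totalDocFreq, totalDocs);
    }

    public static void main(String[] args) {
        // Two searchables of 100 documents each; the term is rare in the
        // first index and common in the second.
        long[] docFreqs = {1, 99};
        long[] numDocs = {100, 100};

        System.out.println("local idf, index 0: " + idf(docFreqs[0], numDocs[0]));
        System.out.println("local idf, index 1: " + idf(docFreqs[1], numDocs[1]));
        System.out.println("global idf:         " + globalIdf(docFreqs, numDocs));
    }
}
```

With per-index IDF the same term would be weighted very differently in the two indexes, so their scores would not be comparable; the summed version gives every searchable the same weight for the term.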