On 7-Dec-07, at 10:27 AM, Grant Ingersoll wrote:

Yeah, I wasn't too excited over it and I certainly didn't lose any
sleep over it, but there are some interesting things of note in there
concerning Lucene, including the claim that it fell over on indexing
WT10g docs (page 40), and I am always looking for ways to improve
things. Overall, I [...]
[...]lineProceedings6/NTCIR/NTCIR6-OVERVIEW.pdf for NTCIR-6; for CLEF, have a look at
http://www.clef-campaign.org/2006/working_notes/workingnotes2006/dinunzioOCLEF2006.pdf, ...).
For other information, search the web ;-)

Samir

-----Original Message-----
From: Mark Miller [mailto:[EMAIL PROTECTED]
Sent: Friday, 7 December 2007 21:01
To: java-dev@lucene.apache.org
Subject: Re: O/S Search Comparisons
Yes, and even if they did not use the stock defaults, I would bet there
would be complaints about what was done wrong at every turn. This seems
like a very difficult thing to do. How long does it take to fully learn
how to correctly utilize each search engine for the task at hand? I am
sure lon[...]
There is a good chance that they were using stock indexing defaults,
based on:

Lucene: "In the present work, the simple applications bundled with the
library were used to index the collection."
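For concreteness, here is a minimal sketch of what "stock defaults"
indexing looks like against the Lucene 2.x API, roughly what the bundled
demo (org.apache.lucene.demo.IndexFiles) does. The class name, index
path, and field name are illustrative, not taken from the paper:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;

    // Hypothetical example: index one document with nothing tuned.
    public class StockDefaultsIndexer {
      public static void main(String[] args) throws Exception {
        // Stock defaults apply here: mergeFactor=10, a small in-memory
        // buffer, StandardAnalyzer -- nothing sized for a TREC-scale run.
        IndexWriter writer = new IndexWriter("index", new StandardAnalyzer(), true);
        Document doc = new Document();
        doc.add(new Field("contents", "example document text",
                          Field.Store.NO, Field.Index.TOKENIZED));
        writer.addDocument(doc);
        writer.optimize();
        writer.close();
      }
    }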
I wouldn't get too excited over this. Once again, it does not seem
the evaluator understands the nature of GC-based systems, and the
memory statistics are quite out of whack. But it is hard to tell,
because there is no data on how memory consumption was actually
measured.
A far better way [...]
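The message cuts off there, but as a guess at the point: for a JVM, you
get a far more meaningful figure by reading used heap after requesting a
collection than by sampling the OS-level process size, which mostly
reflects how large the heap was allowed to grow. A sketch:

    // Sketch: approximate used heap after a GC. System.gc() is only a
    // hint, so this is still an estimate -- but much closer to real
    // demand than process RSS.
    public class HeapAfterGc {
      public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        long usedBytes = rt.totalMemory() - rt.freeMemory();
        System.out.println("Used heap after GC: " + (usedBytes / (1024 * 1024)) + " MB");
      }
    }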
On Dec 8, 2007, at 4:51 AM, Michael McCandless wrote:
Sometimes, when something like this comes up, it gives you the
opportunity to take a step back and ask what are the things we
really want Lucene to be going forward (the New Year is good for
this kind of assessment as well). What are it[...]
This is along the lines of what I have tried to get the Lucene
community to adopt for a long time.
If you want to take Lucene to the next level, it needs a "server"
implementation.
Only with this can you get efficient locks, caching, and transactions,
which lead to more efficient indexing an[...]
Well, at some point the answer is "use Solr". I think Lucene should
stay focused on being a good search library/component, and server-level
capabilities should be handled by Solr or the application layer
on top of Lucene.
That said, I still think there is a need for a layer that handles/[...]
On 8-Dec-07, at 10:04 PM, Doron Cohen wrote:
+1. I have been thinking about this too. Solr clearly demonstrates
the benefits of this kind of approach, although even it doesn't make
it seamless for users, in the sense that they still need to divvy up
the docs on the app side.
Would be nice if t[...]
I did hear back from the authors. Some of the issues were based on the
value chosen for mergeFactor (10,000), I think, but there also seemed
to be some questions about parsing the TREC collection. It was split
out into individual files, as opposed to trying to stream in the
documents like we [...]
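For anyone following along, mergeFactor is a setting on the writer; a
sketch of the contrast between the stock value and the 10,000 reported
above, continuing the indexing example earlier in the thread (2.x API):

    IndexWriter writer = new IndexWriter("index", new StandardAnalyzer(), true);

    // Stock default: merge once 10 segments accumulate at a level, so
    // merge cost is amortized over the whole indexing run.
    writer.setMergeFactor(10);

    // The evaluation's reported value: merges are deferred until
    // thousands of segments exist, so searches over the unmerged index
    // (or one final optimize()) pay the entire cost at once.
    // writer.setMergeFactor(10000);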
For the data that I normally work with (short articles), I found that
the sweet spot was around 80-120. I actually saw a slight decrease going
above that... not sure if that held forever, though. That was testing on
an earlier release (I think 2.1?). However, if you want to test
searching it wou[...]
My testing experience has shown around 100 to be good for things like
Wikipedia, etc. That is an interesting point to think about in
regards to paying the cost once optimize is undertaken, and it may be
worth exploring more. I also wonder how partial optimizes may help.
The Javadocs say:
Det[...]
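The Javadoc quote is cut off above; assuming it refers to the
optimize(int maxNumSegments) overload, a partial optimize, continuing
the same writer as the earlier sketches, is just:

    // Sketch: merge down to at most 5 segments instead of 1, trading a
    // little search-time overhead for a much cheaper merge than a full
    // optimize().
    writer.optimize(5);
    writer.close();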