Re: sorting a doc field takes more time

2008-03-14 Thread sandyg
Hi, thanks for the reply. Field, Document, Sort and SortField are all Lucene classes. After getting the results, at display time I am using the Sort class to sort the results based on a particular field. The code for sorting: Query query = parser.parse(queryString); Sort

Re: Biggest index

2008-03-14 Thread John Wang
We are running on one box in prod with 20 million docs in one index. -John On Fri, Mar 14, 2008 at 8:01 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote: > How big is your machine and how big are your docs? (unique terms, > etc.) Even if it would fit, it sounds like you are going to have to > go d

Re: Biggest index

2008-03-14 Thread Grant Ingersoll
How big is your machine and how big are your docs? (unique terms, etc.) Even if it would fit, it sounds like you are going to have to go distributed sooner or later, so you might as well start planning for it. -Grant On Mar 14, 2008, at 8:51 AM, <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> wr

RE: Search against an index on a mapped drive ...

2008-03-14 Thread Dragon Fly
Thank you all. > From: [EMAIL PROTECTED] > Subject: Re: Search against an index on a mapped drive ... > Date: Fri, 14 Mar 2008 08:58:41 -0400 > To: java-user@lucene.apache.org > > > This setup should work fine, but as others said definitely explore > options & test search performance. > > Mik

Re: HELP: how to list term score inside some document?

2008-03-14 Thread Paul Elschot
On Friday 14 March 2008 17:28:17, Rao WeiXiong wrote: > Dear: > > Is it possible to list all term scores inside a document by some > simple method? Right now I just use each term as the query to search the > whole index to get the score. This seems very cumbersome. Is there any > simple approach? Have a lo

HELP: how to list term score inside some document?

2008-03-14 Thread Rao WeiXiong
Dear: Is it possible to list all term scores inside a document by some simple method? Right now I just use each term as the query to search the whole index to get the score. This seems very cumbersome. Is there any simple approach? Cheers! weixiong

Re: Language identification ??

2008-03-14 Thread Mathieu Lecarme
Raghu Ram wrote: To complicate it further, the text for which language identification has to be done is small, in most cases a short sentence like "I like Pepsi". Can something be done for this? Drinking water? More seriously, if ngram pattern language guessing is too ambiguous, sear

Re: Language identification ??

2008-03-14 Thread Raghu Ram
To complicate it further, the text for which language identification has to be done is small, in most cases a short sentence like "I like Pepsi". Can something be done for this? On Fri, Mar 14, 2008 at 8:18 PM, Mathieu Lecarme <[EMAIL PROTECTED]> wrote: > Itamar Syn-Hershko wrote: > > Fo

Re: IndexReader deleteDocument

2008-03-14 Thread Erick Erickson
Doc IDs are assigned at index time and can change over time. That is, deleting a document and optimizing (and other operations) can and will change document IDs. So, yes, you have to do a search (either use a hits object or one of the HitCollectors) in order to delete by doc ID. You can also delete

Re: Language identification ??

2008-03-14 Thread Mathieu Lecarme
Itamar Syn-Hershko wrote: For what it's worth, I did something similar in my BidiAnalyzer so I can index both Hebrew/Semitic texts and English/Latin words without switching analyzers, giving each the proper treatment. I did it simply by testing the first char and looking at its numeric value -

RE: Language identification ??

2008-03-14 Thread Itamar Syn-Hershko
For what it's worth, I did something similar in my BidiAnalyzer so I can index both Hebrew/Semitic texts and English/Latin words without switching analyzers, giving each the proper treatment. I did it simply by testing the first char and looking at its numeric value - so it falls between Hebrew Ale
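The first-character test described in this preview can be sketched in plain Java. This is only an illustration, not the BidiAnalyzer code: the preview is cut off before it gives the actual range bounds, so the sketch assumes the standard Unicode Hebrew block via `Character.UnicodeBlock` instead of hand-written numeric limits. The class name `ScriptSniffer` and the method `scriptOf` are hypothetical, and skipping leading non-letters is an added convenience that the message does not mention.

```java
public class ScriptSniffer {
    /** True if the code point belongs to the Unicode Hebrew block (U+0590..U+05FF). */
    static boolean isHebrew(int codePoint) {
        return Character.UnicodeBlock.of(codePoint) == Character.UnicodeBlock.HEBREW;
    }

    /**
     * Classifies a token by its first letter: "hebrew", "latin" (Basic Latin only),
     * or "other" when the first letter is in another block or there is no letter.
     */
    static String scriptOf(String token) {
        for (int i = 0; i < token.length(); ) {
            int cp = token.codePointAt(i);
            if (Character.isLetter(cp)) {
                if (isHebrew(cp)) {
                    return "hebrew";
                }
                if (Character.UnicodeBlock.of(cp) == Character.UnicodeBlock.BASIC_LATIN) {
                    return "latin";
                }
                return "other";
            }
            i += Character.charCount(cp); // skip digits, punctuation, whitespace
        }
        return "other";
    }

    public static void main(String[] args) {
        // Hebrew word written with Unicode escapes to avoid source-encoding issues.
        System.out.println(scriptOf("\u05E9\u05DC\u05D5\u05DD"));
        System.out.println(scriptOf("Pepsi"));
    }
}
```

An analyzer could use such a check to route each token to Hebrew or Latin handling. Accented Latin letters fall outside Basic Latin and are reported as "other" here, so a real implementation would probably widen that branch.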

Re: Language identification ??

2008-03-14 Thread Grant Ingersoll
I think Karl Wettin has one that is a patch in JIRA. Try searching there. On Mar 14, 2008, at 1:28 AM, Raghu Ram wrote: Hi all, I guess this question is a bit off the track. Are there any language identification modules inside Lucene? If not, can somebody please suggest a good one?

Re: Search against an index on a mapped drive ...

2008-03-14 Thread Michael McCandless
This setup should work fine, but as others said definitely explore options & test search performance. Mike Dragon Fly wrote: Hi, I'd like to find out if I can do the following with Lucene (on Windows). On server A: - An index writer creates/updates the index. The index is physicall

RE: Biggest index

2008-03-14 Thread spring
Yes of course, the answers to your questions are important too. But no answer at all until now :( For me I can say (not production yet): 2 ID-Fields and one content field per doc. Search on content field only. Simple searches like "content:foo" or "content:foo*". 1.5 GB index per 1 million docs. A

Re: Search against an index on a mapped drive ...

2008-03-14 Thread Erik Hatcher
On Mar 14, 2008, at 8:22 AM, Mathieu Lecarme wrote: Dragon Fly wrote: Hi, I'd like to find out if I can do the following with Lucene (on Windows). On server A: - An index writer creates/updates the index. The index is physically stored on server A. - An index searcher searches agai

Re: Search against an index on a mapped drive ...

2008-03-14 Thread Mathieu Lecarme
Dragon Fly wrote: Hi, I'd like to find out if I can do the following with Lucene (on Windows). On server A: - An index writer creates/updates the index. The index is physically stored on server A. - An index searcher searches against the index. On server B: - Maps to the index directory.

Search against an index on a mapped drive ...

2008-03-14 Thread Dragon Fly
Hi, I'd like to find out if I can do the following with Lucene (on Windows). On server A: - An index writer creates/updates the index. The index is physically stored on server A. - An index searcher searches against the index. On server B: - Maps to the index directory. - An index searcher sea

Re: Language identification ??

2008-03-14 Thread Mathieu Lecarme
Raghu Ram wrote: Hi all, I guess this question is a bit off the track. Are there any language identification modules inside Lucene? If not, can somebody please suggest a good one? Thank you. Nutch provides a tool for that, using n-gram patterns, just as OO.o does. M. ---
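For readers unfamiliar with n-gram language guessing, the idea can be sketched as follows. This is a toy illustration and not the Nutch or OO.o implementation: it counts the character trigrams an input shares with one small sample text per language and picks the language with the most overlap. The class `NgramGuesser`, its methods, and the sample sentences in the usage are all hypothetical.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class NgramGuesser {
    /** Lower-cases the text, collapses non-letters to single spaces, returns its distinct trigrams. */
    static Set<String> trigrams(String text) {
        String s = " " + text.toLowerCase(Locale.ROOT).replaceAll("[^\\p{L}]+", " ").trim() + " ";
        Set<String> out = new HashSet<>();
        for (int i = 0; i + 3 <= s.length(); i++) {
            out.add(s.substring(i, i + 3));
        }
        return out;
    }

    /**
     * Returns the key of the sample sharing the most trigrams with the input,
     * or null if no samples are given. Ties go to whichever sample is seen first.
     */
    static String guess(String text, Map<String, String> samples) {
        Set<String> input = trigrams(text);
        String best = null;
        int bestScore = -1;
        for (Map.Entry<String, String> e : samples.entrySet()) {
            Set<String> shared = trigrams(e.getValue());
            shared.retainAll(input); // keep only trigrams present in both
            if (shared.size() > bestScore) {
                bestScore = shared.size();
                best = e.getKey();
            }
        }
        return best;
    }
}
```

Usage might look like `NgramGuesser.guess("the dog", samples)` with `samples` mapping `"en"` and `"fr"` to a sentence or two of each language. As the thread notes, very short inputs yield few trigrams, so the guess becomes unreliable; real tools train profiles on far larger corpora and weight trigrams by frequency.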

Re: Swapping between indexes

2008-03-14 Thread Sridhar Raman
One quick doubt regarding copying of indexes. Is the copy done on the indexes in memory as well, or is it only done on the committed indexes? On Fri, Mar 7, 2008 at 12:29 AM, Peter Keegan <[EMAIL PROTECTED]> wrote: > Sridhar, > > We have been using approach 2 in our production system with good r