Re: Lucene 1.9.1 - How to determine from which machine the hit comes?

2006-03-29 Thread pc123
Hi, Yes, the 2nd case, we have a MultiSearcher taking an array of Searchable (each doing the lookup to different server machines) on the client side and RemoteSearchable taking an instance of MultiSearcher on the server side. How to find out which searcher a hit comes from with a MultiSearcher?
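
A hedged sketch of the idea behind this question: a MultiSearcher reports merged document numbers, and each sub-searcher owns a contiguous range of them, so the source index can be recovered from the range offsets. As far as I recall, MultiSearcher itself offers subSearcher(int n) and subDoc(int n) for this, but check the Javadoc of your Lucene version before relying on it. The class below is a stand-alone illustration and not Lucene code; SubSearcherLookup, its field, and the sample sizes are hypothetical.

```java
/** Illustration only: traces a merged document number back to its source index. */
final class SubSearcherLookup {
    // starts[i] is the first merged document number owned by sub-index i;
    // the final entry holds the total number of documents.
    private final int[] starts;

    SubSearcherLookup(int[] maxDocs) {
        starts = new int[maxDocs.length + 1];
        for (int i = 0; i < maxDocs.length; i++) {
            starts[i + 1] = starts[i] + maxDocs[i];
        }
    }

    /** Position (in the constructor array) of the sub-index holding merged document n. */
    int subSearcher(int n) {
        if (n < 0 || n >= starts[starts.length - 1]) {
            throw new IllegalArgumentException("document number out of range: " + n);
        }
        int i = 0;
        // Linear scan; also steps over empty sub-indexes, whose start equals the next start.
        while (n >= starts[i + 1]) {
            i++;
        }
        return i;
    }

    /** Document number of merged document n inside its own sub-index. */
    int subDoc(int n) {
        return n - starts[subSearcher(n)];
    }

    public static void main(String[] args) {
        // Hypothetical sizes for three indexes on three machines.
        SubSearcherLookup lookup = new SubSearcherLookup(new int[] {100, 50, 200});
        int merged = 120;
        System.out.println("sub-index " + lookup.subSearcher(merged)
                + ", local doc " + lookup.subDoc(merged));
    }
}
```

On the client side, the returned position would correspond to the position in the Searchable array, which identifies the machine when each entry looks up a different server. The MultiSearcher wrapped by RemoteSearchable on the server side is presumably not visible from there.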

Re: Data structure of a Lucene Index

2006-03-29 Thread Prasenjit Mukherjee
I have already gone through the file format. What I was looking for is the underlying theory behind the chosen file formats. I am sure those file formats were decided based on some theoretical axioms. --prasen [EMAIL PROTECTED] wrote: On Mar 28, 2006, at 11:57 PM, Prasenjit Mukherjee wrote:

Re: BooleanQuery containing SpanNearQuery throws ArrayOutOfBoundsException .

2006-03-29 Thread Paul Elschot
Jelda, I have just added a patch for DisjunctionSumScorer.java to this issue: https://issues.apache.org/jira/browse/LUCENE-413. Could you try that patch and report the results at the jira issue? In case you need help using the patch, could you move the discussion to the java-dev list? Regards, Paul

Re: Boosting a token in the query

2006-03-29 Thread Erik Hatcher
Unfortunately I'm not sure what boosting capabilities were available in 1.2, but that syntax is correct for 1.4+ at least. And I'm sure the Explanation feature is not available in 1.2, but with more recent versions of Lucene you'd be able to see the effect of the boosts on the score.

Re: Date Field Indexing

2006-03-29 Thread Erik Hatcher
On Mar 29, 2006, at 11:37 AM, Dennis Kubes wrote: Looking at the Lucene In Action book it shows indexing Date fields with something like this: Field.Keyword("datefield", new Date()); I know that the APIs have changed for Field and I see that there are no longer date constructors. So j

Date Field Indexing

2006-03-29 Thread Dennis Kubes
Looking at the Lucene In Action book it shows indexing Date fields with something like this: Field.Keyword("datefield", new Date()); I know that the APIs have changed for Field and I see that there are no longer date constructors. So just confirming that we should use the DateTools class an
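
A hedged sketch of why a string-encoded date works as an indexed term: a fixed-width, most-significant-first format in a fixed time zone sorts lexicographically in chronological order. As far as I know, DateTools.dateToString(date, DateTools.Resolution.DAY) produces a yyyyMMdd string in GMT, but the class below does not call Lucene; SortableDateSketch and toDayString are hypothetical names that only mimic that format with the JDK.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

/** Illustration only: encodes a date as a sortable yyyyMMdd string in GMT. */
final class SortableDateSketch {

    /** Formats the date at day resolution; fixed width keeps string order chronological. */
    static String toDayString(Date date) {
        // SimpleDateFormat is not thread-safe, so a new instance is created per call.
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMdd");
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        return format.format(date);
    }

    public static void main(String[] args) {
        String earlier = toDayString(new Date(0L));             // start of the Unix epoch
        String later = toDayString(new Date(1143590400000L));   // midnight GMT, 2006-03-29
        System.out.println(earlier + " sorts before " + later + ": "
                + (earlier.compareTo(later) < 0));
    }
}
```

The resulting string would then be stored as an untokenized field value so that it is indexed as a single term; the exact Field constructor to use depends on the Lucene version.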

Should I reindex with remove then add or can I add then remove?

2006-03-29 Thread Sindri Traustason
Hi! I have a question about how I should go about reindexing an existing record in an index. Currently my method that reindexes items is like this: public void updateInIndex( Item item ) throws IOException{ Document doc = ItemDocumentFactory.createDocument(item);

What is the largest index(s) size lucene can support

2006-03-29 Thread Mordo, Aviran (EXP N-NANNATEK)
I know Lucene can have multiple indexes and run a parallel search across indexes. The question I have is what is the largest number of documents Lucene can support with multiple distributed indexes. Or, to be more specific, can Lucene support BILLIONS of documents (across multiple indexes), and

Re: Hi Experts

2006-03-29 Thread gekkokid
Hi, Lucene is a component that indexes data and allows you to search that indexed data. You need to be able to program in Java (various ports for other languages are available) or find a crawler you can adapt to download the required data from the internet (still requires basic knowledge of Ja

Boosting a token in the query

2006-03-29 Thread Madhusudan, Veda (Norcross, DAV)
Is there a way to boost a token while querying? For example, in the query +(DESC:sheets DESC:sheet), can the token 'sheets' be boosted and given higher precedence over 'sheet' so the results matching 'sheets' appear before those for 'sheet'? I am using Lucene 1.2. I tried using the boost fac

Re: Lucene 1.9.1 - How to determine from which machine the hit comes?

2006-03-29 Thread Erik Hatcher
On Mar 29, 2006, at 5:50 AM, pc123 wrote: I am searching over multiple indices in multiple machines using RemoteSearchable. Thus I get hits from various indices residing in different machines. My Client and server program is similar to one given in Lucene in Action book (Searching multiple i

Re: Does Optimize preserve index order?

2006-03-29 Thread chan kang
Thank you~ The sorting doesn't seem to take that long (not as long as I expected), but unfortunately I didn't get to measure it this time... Maybe next time I'll try measuring. Now I've got another problem. My final goal is to keep the index sorted reverse-chronologically, so that, when searching,

Re: Data structure of a Lucene Index

2006-03-29 Thread Erik Hatcher
On Mar 28, 2006, at 11:57 PM, Prasenjit Mukherjee wrote: It seems to me that Lucene doesn't use a B-tree for its index storage. Is there any paper/article which explains the theory behind the data structure of a single index (segment)? I am not referring to the merge algorithm, I am curious to know the

Re: Paging results

2006-03-29 Thread Erik Hatcher
On Mar 29, 2006, at 5:09 AM, Marios Skounakis wrote: I am executing searches that return between 2000 and 1 documents and sorting the results by relevance (or sometimes alphabetically). In every query, I need to discard some of the results based on their docId. I have a list of the do

Lucene 1.9.1 - How to determine from which machine the hit comes?

2006-03-29 Thread pc123
I am searching over multiple indices in multiple machines using RemoteSearchable. Thus I get hits from various indices residing in different machines. My Client and server program is similar to one given in Lucene in Action book (Searching multiple indexes remotely). Is it possible to determine fr

Parallel MultiSearcher

2006-03-29 Thread pksunilpk
What I have understood from Lucene Remote Parallel Multi Searcher Search Procedure is first compute the weight for the Query in each Index sequentially (one by one, eg: - calculate "query weight" of index1 first and then index2) and then perform searching of each index one by one and merge the res

Re: Paging results

2006-03-29 Thread Volodymyr Bychkoviak
Hi Marios Skounakis wrote: Hi all, I have the following issue (I am giving a quantified example so we can talk more concretely). My documents have a docId field, stored as a keyword field. I am executing searches that return between 2000 and 1 documents and sorting the results by rele

RE: Hi Experts

2006-03-29 Thread Aditya Liviandi
Well you'll have to index the internet. Then when you've done that then you can try going against google. Oh, and you'll have to update that index every now and then to keep your index of the internet updated. Good luck.

RE: Hi Experts

2006-03-29 Thread Babu, KameshNarayana (GE, Research, consultant)
Thanks Aditya, Lucene is used only to search on the local machine, right? How can Lucene search the internet? Do we have any tools which can index the internet themselves and display the results? I know this is very silly. -Original Message- From: Aditya Liviandi [mailto:[EMAIL PROTECTED

Paging results

2006-03-29 Thread Marios Skounakis
Hi all, I have the following issue (I am giving a quantified example so we can talk more concretely). My documents have a docId field, stored as a keyword field. I am executing searches that return between 2000 and 1 documents and sorting the results by relevance (or sometimes alphabeti
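
A hedged sketch of one way to combine the two requirements described in this thread, discarding results by docId and paging: walk the ranked results in order, skip excluded ids, and count only the surviving results toward the page offset. This is a stand-alone illustration, not the approach proposed in the replies; FilteredPagingSketch, its parameters, and the sample ids are hypothetical, and the ranked list stands in for whatever the search returns.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustration only: pages through ranked results while discarding excluded docIds. */
final class FilteredPagingSketch {

    /** Returns page pageIndex (zero-based) of the ranked ids that are not excluded. */
    static List<String> page(List<String> rankedDocIds, Set<String> excluded,
                             int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        List<String> result = new ArrayList<>();
        int toSkip = pageIndex * pageSize;   // surviving results that belong to earlier pages
        for (String docId : rankedDocIds) {
            if (excluded.contains(docId)) {
                continue;                    // discarded results never count toward paging
            }
            if (toSkip > 0) {
                toSkip--;
                continue;
            }
            result.add(docId);
            if (result.size() == pageSize) {
                break;                       // page is full, stop walking the results
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> ranked = Arrays.asList("a", "b", "c", "d", "e", "f");
        Set<String> excluded = new HashSet<>(Arrays.asList("b", "e"));
        System.out.println(page(ranked, excluded, 0, 2));
        System.out.println(page(ranked, excluded, 1, 2));
    }
}
```

Keeping the excluded ids in a hash set makes each membership check cheap, so the cost is dominated by walking the results up to the end of the requested page.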

RE: BooleanQuery containing SpanNearQuery throws ArrayOutOfBoundsException .

2006-03-29 Thread Ramana Jelda
Thanks for your reply. For a smaller index it is working fine. I will keep trying to reproduce the exception. Please let me know if there is a quick fix to do locally. Thanks & Regards, Jelda > -Original Message- > From: Paul Elschot [mailto:[EMAIL PROTECTED] > Sent: Tuesday, March

RE: Hi Experts

2006-03-29 Thread Babu, KameshNarayana (GE, Research, consultant)
" -Original Message- From: Ranjan K. Baisak [mailto:[EMAIL PROTECTED] Sent: Wednesday, March 29, 2006 12:32 PM To: java-user@lucene.apache.org Subject: RE: Hi Experts you wrote " I am using HTMLparser to parse all html pages and to get required information out of that. Let me tell you my c