Re: Exception while adding document in 3.0

2010-02-02 Thread Glen Newton
Documents cannot be re-used in v3.0? http://wiki.apache.org/lucene-java/ImproveIndexingSpeed -glen http://zzzoot.blogspot.com/ On 2 February 2010 02:55, Simon Willnauer simon.willna...@googlemail.com wrote: Ganesh, do you reuse your Document instances in any way or do you create new docs

RE: Exception while adding document in 3.0

2010-02-02 Thread Uwe Schindler
They can. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Glen Newton [mailto:glen.new...@gmail.com] Sent: Tuesday, February 02, 2010 9:03 AM To: java-user@lucene.apache.org Subject: Re: Exception while

RE: [Bulk] RE: Exception while adding document in 3.0

2010-02-02 Thread Uwe Schindler
They can be reused, but the exception and a look at the code show that you are doing it wrong. You can reuse Documents, but only under two conditions: a) Within a single thread. Do not share one document across a multithreaded app - use one document per thread! And using one document in more than one thread is what you
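Condition (a) above, one reusable instance per thread, can be sketched in plain Java with a ThreadLocal. This is only an illustration under stated assumptions: ReusableDoc is a hypothetical stand-in, not Lucene's Document class, and the names PerThreadReuse, index and current are invented for the example. The second condition is cut off in the snippet above, so it is not covered here.

```java
// Hypothetical stand-in for a mutable, reusable document; NOT Lucene's Document.
final class ReusableDoc {
    private String id;
    private String body;

    // Overwrite the field values in place instead of allocating a new object.
    void set(String id, String body) {
        this.id = id;
        this.body = body;
    }

    String snapshot() {
        return id + ":" + body;
    }
}

final class PerThreadReuse {
    // Each thread lazily receives its own instance, so no instance is shared.
    private static final ThreadLocal<ReusableDoc> DOC =
            ThreadLocal.withInitial(ReusableDoc::new);

    // Refill the calling thread's instance and return what would be indexed.
    static String index(String id, String body) {
        ReusableDoc doc = DOC.get();
        doc.set(id, body);
        return doc.snapshot();
    }

    // Expose the calling thread's instance (used only to show the isolation).
    static ReusableDoc current() {
        return DOC.get();
    }
}
```

Calling PerThreadReuse.index(...) from several worker threads reuses one object per thread while never handing the same object to two threads at once.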

Re: [Bulk] RE: [Bulk] RE: Exception while adding document in 3.0

2010-02-02 Thread Ganesh
Yes. I am using the objects across threads. Thanks for pointing that out. But I still didn't see much of a performance increase from reusing Document and Field objects. Regards Ganesh - Original Message - From: Uwe Schindler u...@thetaphi.de To: java-user@lucene.apache.org Sent: Tuesday,

Re: [Bulk] RE: [Bulk] RE: Exception while adding document in 3.0

2010-02-02 Thread Ian Lea
But still i didn't see much increase in performance by reusing documents and field objects. Oh well, some you win, some you lose. Although it sounds like you're still winning. Are you following all the other advice on http://wiki.apache.org/lucene-java/ImproveIndexingSpeed. Maybe the best

Re: How to use search index while indexing

2010-02-02 Thread Hayri
Ian Lea wrote: Sounds like a job for near realtime search aka NRT. Take a look at IndexWriter.getReader(). http://wiki.apache.org/lucene-java/NearRealtimeSearch http://www.lucidimagination.com/blog/2009/04/10/real-time-search-with-lucene/ And more with the help of your favourite search

Re: ComplexPhraseQueryParser (Expanded Form and Boosting)

2010-02-02 Thread Karsten F.
Hi Nariman, As I understand it, ComplexPhraseQueryParser is no longer supported. http://issues.apache.org/jira/browse/LUCENE-1486#action_12782254 Instead, with Lucene 3.1, the new org.apache.lucene.queryParser.standard.parser.StandardSyntaxParser will do this job.

Re: How to use search index while indexing

2010-02-02 Thread Ian Lea
I'm not sure that I understand the question. Can you not use a searcher based on the reader returned by IndexWriter.getReader() to determine if the doc is already in the index? Or just use IndexWriter.updateDocument to save or replace as appropriate. -- Ian. On Tue, Feb 2, 2010 at 12:21 PM,

Re: ComplexPhraseQueryParser (Expanded Form and Boosting)

2010-02-02 Thread Ahmet Arslan
Second concern: boosting a phrase (java developer^10.0) doesn't seem to be applied when you look at the result explanations when using the ComplexPhraseQueryParser - it's respected on single word queries and it's respected on phrases using the basic QueryParser. I just tested and able to

RE: ComplexPhraseQueryParser (Expanded Form and Boosting)

2010-02-02 Thread Haghighi, Nariman
I'm not able to see the boost applied even with an additional term added. The original query: +(JOB_TITLE:java developer^15.0 TEXT:java developer) +LANGUAGE:EN +GATEWAY:work Modified to: +(JOB_TITLE:java developer^15.0 JOB_TITLE:java TEXT:java developer) +LANGUAGE:EN +GATEWAY:work

Re: How further reward documents matching more query terms?

2010-02-02 Thread Phan The Dai
Dear Ian Lea, Thanks very much for your reply. Please tell me more about coord: what is its default, how can it be customized, and why do we have to define it? Thank you very much for understanding my question. On Sat, Jan 30, 2010 at 2:46 AM, Ian Lea ian@gmail.com wrote: I presume that quote is from the
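As a rough illustration of the coord factor asked about above: in Lucene's DefaultSimilarity, coord is the fraction of query terms that a document matched (overlap divided by maxOverlap), and it is usually customized by overriding coord in a Similarity subclass. The sketch below reproduces only the arithmetic in plain Java. CoordSketch and squaredCoord are hypothetical names, and the squared variant is just one assumed way to reward fuller matches more strongly.

```java
final class CoordSketch {
    // Default-style coordination factor: matched query terms / total query terms.
    static float defaultCoord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // Hypothetical sharper variant: squaring penalizes partial matches harder,
    // so documents matching more of the query terms are rewarded further.
    static float squaredCoord(int overlap, int maxOverlap) {
        float c = defaultCoord(overlap, maxOverlap);
        return c * c;
    }
}
```

For a four-term query, a document matching two terms gets a factor of 0.5 under the default formula and 0.25 under the squared variant, while a document matching all four terms gets 1.0 in both cases.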

Re: Getting DF IDF

2010-02-02 Thread Phan The Dai
In my view, you can do everything with BooleanQuery. On Mon, Feb 1, 2010 at 10:44 PM, Asif Nawaz asifna...@hotmail.com wrote: Hi, I am new to Lucene. I have a query string of multiple terms. i) I want to return the query string with stop words removed, and a stemmed version of the query.

confused by the lucene boolean query with wildcard result

2010-02-02 Thread java8964 java8964
Hi, I have the following test case pointing to the index generated by our application. The result confuses me and I don't know the reason. Lucene version: 2.9.0 JDK 1.6.0_18 public class IndexTest1 { public static void main(String[] args) { try { FSDirectory directory

Re: Can't get tokenization/stop works working

2010-02-02 Thread jchang
I am using org.apache.lucene.analysis.snowball.SnowballAnalyzer. Looking through Luke, I see that www.fubar.com was indexed, not fubar. So, clearly, I'm not stripping out the stop words www and com. Any ideas?

RE: During the wild card search, will lucene 2.9.0 to convert the search string to lower case?

2010-02-02 Thread java8964 java8964
Is there an analyzer in Lucene like the keyword analyzer that also lowercases the data? Or do I have to write a custom analyzer myself? Thanks From: java8...@hotmail.com To: java-user@lucene.apache.org Subject: RE: During the wild card search, will lucene 2.9.0 to convert the search

RE: Can't get tokenization/stop works working

2010-02-02 Thread Digy
Seeing www.fubar.com in the index means that your analyzer returns it as a single token. To strip out www and com, you have to use an analyzer that returns www, fubar and com as separate tokens. Try a different analyzer (or write your own as below). //a C# example public class
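The point above, that stop words can only be removed once the host name has been split into separate tokens, can be sketched in plain Java without any Lucene classes. This is not the analyzer from the message (which is cut off); HostTokenSketch and its stop set are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class HostTokenSketch {
    // Illustrative stop set taken from the thread; not a Lucene default list.
    private static final Set<String> STOP =
            new HashSet<>(Arrays.asList("www", "com"));

    // Lowercase, split on anything that is not a letter or digit (so dots
    // separate tokens), then drop empty pieces and stop words.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        for (String piece : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!piece.isEmpty() && !STOP.contains(piece)) {
                out.add(piece);
            }
        }
        return out;
    }
}
```

With this splitting rule, HostTokenSketch.tokens("www.fubar.com") yields the single token fubar, whereas a tokenizer that keeps www.fubar.com whole never gives the stop filter a chance to match www or com.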

Limiting search result for web search engine

2010-02-02 Thread Mike Polzin
I am working on building a web search engine and I would like to build a results page similar to what Google does. The functionality I am looking to include is what I refer to as rolling up sites, meaning that even if a particular site (defined by its base URL) has many relevant hits on various

Re: Limiting search result for web search engine

2010-02-02 Thread Anshum
Hi Mike, Not really through queries, but you may do this by writing a custom collector. You'd need some supporting data structure to mark/hash the occurrence of a domain in your result set. -- Anshum Gupta Naukri Labs! http://ai-cafe.blogspot.com The facts expressed here belong to everybody, the
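The supporting data structure mentioned above can be sketched as a hash map that counts how often each domain has already appeared in the ranked results. The sketch is plain Java applied to a list of URLs rather than a real Lucene Collector. DomainRollup, rollUp and maxPerHost are hypothetical names, and using the URL host as the "base URL" is an assumption.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DomainRollup {
    // Keep at most maxPerHost results per host, preserving the rank order.
    // Assumes well-formed URLs; URI.create throws IllegalArgumentException otherwise.
    static List<String> rollUp(List<String> rankedUrls, int maxPerHost) {
        Map<String, Integer> seen = new HashMap<>();
        List<String> kept = new ArrayList<>();
        for (String url : rankedUrls) {
            String host = URI.create(url).getHost();
            if (host == null) {
                continue; // no host component, nothing to group by
            }
            int count = seen.merge(host, 1, Integer::sum);
            if (count <= maxPerHost) {
                kept.add(url);
            }
        }
        return kept;
    }
}
```

Inside a custom collector the same map would be consulted for each collected hit, which means the domain of every document has to be cheaply available at collection time.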