RE: field sorted searches with unbounded hit count

2011-06-23 Thread Toke Eskildsen
On Thu, 2011-06-23 at 22:41 +0200, Tim Eck wrote: > I don't want to accuse anyone of bad code but always preallocating a > potentially large array in org.apache.lucene.util.PriorityQueue seems > non-ideal for the search I want to run. The current implementation of IndexSearcher uses threaded s
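To illustrate the preallocation concern raised in the quoted message, here is a minimal sketch of a top-N collector whose backing storage grows on demand instead of being sized for the full hit limit up front. It uses only `java.util.PriorityQueue`; the class `LazyTopN` and the method `smallest` are hypothetical names chosen for illustration and are not part of Lucene, and this is not how `org.apache.lucene.util.PriorityQueue` is implemented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper (not a Lucene class): keeps the n smallest values seen
// without allocating n slots in advance.
class LazyTopN {
    // Returns at most n of the smallest values, in ascending order.
    static List<Long> smallest(long[] values, int n) {
        // Max-heap over the values kept so far. java.util.PriorityQueue starts
        // with a small backing array and grows it as elements are added.
        PriorityQueue<Long> heap = new PriorityQueue<>(Collections.reverseOrder());
        for (long v : values) {
            if (heap.size() < n) {
                heap.add(v);
            } else if (!heap.isEmpty() && v < heap.peek()) {
                // v beats the largest kept value, so it replaces it.
                heap.poll();
                heap.add(v);
            }
        }
        List<Long> out = new ArrayList<>(heap);
        Collections.sort(out);
        return out;
    }
}
```

With this shape, passing `Integer.MAX_VALUE` as `n` for an "unbounded" search costs memory proportional to the actual number of hits, not to the requested limit.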

Re: Lucene sort performance roots?

2011-06-23 Thread Denis Bazhenov
Yes, sorry. I should have explained it. We are sorting by field value. We have around 1M documents which we search and return to the user in reverse order by creation date. The creation date is indexed in a separate field in Lucene, of course. On Jun 24, 2011, at 4:52 PM, Dawid We

Re: Lucene sort performance roots?

2011-06-23 Thread Dawid Weiss
Can you describe the kind of sorting you're doing? Maybe the data is already sorted (and in RAM) and you're only getting it out? Dawid On Fri, Jun 24, 2011 at 3:32 AM, Denis Bazhenov wrote: > Well, maybe it's a bit of a controversial question, but anyway... > > Lucene is a great toolkit for search ap

Re: field sorted searches with unbounded hit count

2011-06-23 Thread Simon Willnauer
On Thu, Jun 23, 2011 at 10:41 PM, Tim Eck wrote: > Thanks for the idea Ian. I still need to think about it, but the race between > running the total count search and then the sorted search worries me. I have > pretty specific visibility guarantees I must provide on this data (with > respec

Re: questions about searching lucene 3.2

2011-06-23 Thread Simon Willnauer
On Thu, Jun 23, 2011 at 3:46 PM, Bob Rhodes wrote: > Yeah I agree that this is the issue. I did get my query to work using the > "ClassicAnalyzer". I guess maybe I need to upgrade my indexes which will be a > big job. Any advice here is appreciated. > I didn't have any luck passing Version.LUCENE_

Lucene sort performance roots?

2011-06-23 Thread Denis Bazhenov
Well, maybe it's a bit of a controversial question, but anyway... Lucene is a great toolkit for search applications. And it's so fast in most cases. I think I understand why it's faster than relational databases for information retrieval. For example, Lucene uses a very efficient index that allows

Is {Filter}ing faster than {Query}ing in Lucene?

2011-06-23 Thread Denis Bazhenov
While reading "Lucene in Action 2nd edition" I came across the description of Filter classes, which can be used for result filtering in Lucene. Lucene has a lot of filters that mirror Query classes. For example, NumericRangeQuery and NumericRangeFilter. The book says that NRF does exactly th

RE: SEVERE: org.apache.solr.common.SolrException: Error loading class 'solr.KeywordMarkerFilterFactory'

2011-06-23 Thread abhayd
Thanks. Our schema has After removing things worked fine. Maybe we used a 3.1 schema definition and that mi

Re: Suggestion: make some more TokenFilters KeywordAttribute aware

2011-06-23 Thread Sujit Pal
Thanks Simon, I have opened a JIRA and attached a patch. I have verified that I haven't broken anything, and I have used these patched files to test in my local application and have verified that they work. https://issues.apache.org/jira/browse/LUCENE-3236 -sujit On Thu, 2011-06-23 at 08:21 +02

spaces in the field name

2011-06-23 Thread Nilesh Vijaywargiay
I have a situation where the field name contains spaces. So a query like *short text: value* doesn't return any results, as the query structure internally would be *defaultField:short text:value*. Any workaround for including spaces in your field name? Nilesh

RE: field sorted searches with unbounded hit count

2011-06-23 Thread Tim Eck
Thanks for the idea Ian. I still need to think about it, but the race between running the total count search and then the sorted search worries me. I have pretty specific visibility guarantees I must provide on this data (with respect to concurrent updates). It'd be a bummer to have to bloc

Computing document frequencies for specific queries in Lucene

2011-06-23 Thread aengle1429
Hello, I currently am trying to get the following results... let's say I have 3 XML files that I parse using SAX: bob bob bob 3m 3m bob bob bob bob bob 3m bob bob bob bob I am currently indexing th

Re: Search multiple directories simultaneously

2011-06-23 Thread Cheng
Thanks, man. Very concise and easy to follow. Can I ask how searching multiple indexes will impact performance? I have probably 50GB of data in each of the 10-20 folders. On Fri, Jun 24, 2011 at 1:04 AM, Uwe Schindler wrote: > IndexReader index1 = IndexReader.open(dir1); > IndexReader index2 = IndexR

RE: SEVERE: org.apache.solr.common.SolrException: Error loading class 'solr.KeywordMarkerFilterFactory'

2011-06-23 Thread Uwe Schindler
Solr 1.4 neither has this class nor references it. Are you sure you have not added some Lucene/Solr 3.1 or 3.2 JAR files somewhere in your classpath? - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From:

RE: Search multiple directories simultaneously

2011-06-23 Thread Uwe Schindler
IndexReader index1 = IndexReader.open(dir1);
IndexReader index2 = IndexReader.open(dir2);
IndexReader index3 = IndexReader.open(dir3);
...
IndexReader all = new MultiReader(index1, index2, index3,...);
IndexSearcher searcher = new IndexSearcher(all);
...search your indexes...
all.close();
index1.

SEVERE: org.apache.solr.common.SolrException: Error loading class 'solr.KeywordMarkerFilterFactory'

2011-06-23 Thread abhayd
Hi, we upgraded to Solr 1.4. We are getting the error SEVERE: org.apache.solr.common.SolrException: Error loading class 'solr.KeywordMarkerFilterFactory' at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375) at org.apache.solr.core.SolrResourceLoader.newIns

Search multiple directories simultaneously

2011-06-23 Thread Cheng
Hi, I have multiple index folders (or directories), each holding index files for a specific purpose. I want to search over these folders (or directories) in a single query. Is it possible? Thanks

RE: questions about searching lucene 3.2

2011-06-23 Thread Bob Rhodes
Yeah, I agree that this is the issue. I did get my query to work using the "ClassicAnalyzer". I guess maybe I need to upgrade my indexes, which will be a big job. Any advice here is appreciated. I didn't have any luck passing Version.LUCENE_24 to the StandardAnalyzer. The query still didn't work

Re: ComplexPhraseQueryParser with multiple fields

2011-06-23 Thread lichman
The same as touch*. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/ComplexPhraseQueryParser-with-multiple-fields-tp2879290p3099824.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. -

Re: ComplexPhraseQueryParser with multiple fields

2011-06-23 Thread Ahmet Arslan
> But now there's another issue. > I'm using SOLR and Lucene 3.1.0 and when sending a query > "Wildcard* phrase*" > it works as expected - but, when sending the query > "wildcard*" (Only one > word within the phrase) I'm getting another exception: > > HTTP ERROR: 500 > Unknown query type "org.ap

Re: ComplexPhraseQueryParser with multiple fields

2011-06-23 Thread lichman
Thanks! Now it works. But now there's another issue. I'm using SOLR and Lucene 3.1.0 and when sending a query "Wildcard* phrase*" it works as expected - but, when sending the query "wildcard*" (Only one word within the phrase) I'm getting another exception: HTTP ERROR: 500 Unknown query type "o

Re: ComplexPhraseQueryParser with multiple fields

2011-06-23 Thread Ahmet Arslan
> By the way - I'm using the > ComplexPhraseQueryParser that I've downloaded > from: > > https://issues.apache.org/jira/browse/SOLR-1604 > > And I've tried to use packages: > > - org.apache.lucene.search > - org.apache.lucene.queryParser > > Both, when compiled and added to the SOLR lib dir,

Re: ComplexPhraseQueryParser with multiple fields

2011-06-23 Thread lichman
By the way - I'm using the ComplexPhraseQueryParser that I've downloaded from: https://issues.apache.org/jira/browse/SOLR-1604 And I've tried to use packages: - org.apache.lucene.search - org.apache.lucene.queryParser Both, when compiled and added to the SOLR lib dir, caused the exception. A

Re: field sorted searches with unbounded hit count

2011-06-23 Thread Ian Lea
One possibility would be to execute the search first just to get the number of hits - see TotalHitCountCollector in recent versions of lucene, not sure when it was added - and use the hit count from that as the max docs to return. The counting only search would typically be very quick, certainly m
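The count-first approach suggested above can be sketched without any Lucene classes. This is only an illustration over an in-memory array: `CountThenSort`, `countHits`, and `sortedHits` are hypothetical names, and in Lucene itself the counting pass would be done with the `TotalHitCountCollector` mentioned in the message.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.LongPredicate;

// Hypothetical two-pass search over an in-memory array (not Lucene code).
class CountThenSort {
    // Pass 1: count the matches only; no values are retained.
    static int countHits(long[] docs, LongPredicate query) {
        int hits = 0;
        for (long d : docs) {
            if (query.test(d)) {
                hits++;
            }
        }
        return hits;
    }

    // Pass 2: collect and sort, with the buffer sized from the pass-1 count.
    // If the data changed between the passes, matches beyond maxDocs are
    // silently dropped here; that window is the race a caller has to handle.
    static List<Long> sortedHits(long[] docs, LongPredicate query, int maxDocs) {
        List<Long> out = new ArrayList<>(maxDocs);
        for (long d : docs) {
            if (query.test(d) && out.size() < maxDocs) {
                out.add(d);
            }
        }
        Collections.sort(out);
        return out;
    }
}
```

A caller would run `countHits` first and pass its result as `maxDocs` to `sortedHits`, so the second pass never allocates more than the number of hits actually seen.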

Re: IndexWriter.optimize not using it breaks my test case :(

2011-06-23 Thread Ian Lea
From the 3.2.0 javadocs: "Optimize is a fairly costly operation, so you should only do it if your search performance really requires it. Many search applications do fine never calling optimize." See the FAQ and javadocs on searchers and writers for thread safety info. One thing that optimize d

Re: how to approach phrase queries and term grouping

2011-06-23 Thread Ian Lea
Have you read Lucene In Action 2nd edition? Highly recommended for anyone new to lucene and includes info and code on synonyms and position increments. The code is available somewhere as a free download. You may also want to read up on slop and span queries. See for example http://www.lucidimagi

Re: Lucene Searching

2011-06-23 Thread digy digy
Maybe, you need queryParser.setLowercaseExpandedTerms(false) DIGY On Thu, Jun 23, 2011 at 9:37 AM, Pranav goyal wrote: > I tried it and it worked, although it's having one peculiarity. > > When I search for Item_1 : it gives me 110 hits but when I use *Item_1* it > gives me 0 hits. What mistake

Re: Lucene Searching

2011-06-23 Thread Ian Lea
Looks OK to me. You are searching on Item without adding any docs with that field, you could use writer.updateDocument() rather than delete and add, but those are just quibbles and don't explain your searching problem. Having done most of the hard work, why don't you adapt the code you posted int

Re: Lucene Searching

2011-06-23 Thread Pranav goyal
Here's the code which I am implementing (Indexing and Searching codes are in different files) Indexing Part : d=new Document(); File indexDir = new File("index-dir"); KeywordAnalyzer analyzer = new KeywordAnalyzer(); IndexWriterConfig conf = new IndexWriterConfig

Re: Lucene Searching

2011-06-23 Thread Ian Lea
What exactly is "it"? Show us what you are indexing, how, and how you are building the query and we may be able to help. Whenever I see a report of incorrect results on a Mixed Case field I always suspect that the term is being lowercased on indexing and not at searching, or vice versa. -- Ian.
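The case mismatch suspected above is easy to reproduce with a toy exact-match index. The sketch below is plain Java with hypothetical names (`CaseMismatchDemo`, `index`, `search`); it stands in for an analyzer that keeps case at indexing time and a query path that lowercases the term, or the reverse, and is not Lucene code.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Toy exact-match "index" used to show an indexing/search case mismatch.
class CaseMismatchDemo {
    // Builds the set of indexed terms, optionally lowercasing each one.
    static Set<String> index(boolean lowercase, String... terms) {
        Set<String> indexed = new HashSet<>();
        for (String t : terms) {
            indexed.add(lowercase ? t.toLowerCase(Locale.ROOT) : t);
        }
        return indexed;
    }

    // Looks a term up, optionally lowercasing the query term first.
    static boolean search(Set<String> indexed, String term, boolean lowercase) {
        return indexed.contains(lowercase ? term.toLowerCase(Locale.ROOT) : term);
    }

    public static void main(String[] args) {
        Set<String> idx = index(false, "Item_1"); // case preserved at index time
        System.out.println(search(idx, "Item_1", false)); // same treatment on both sides
        System.out.println(search(idx, "Item_1", true));  // query lowercased, index not
    }
}
```

The lookup only succeeds when both sides apply the same case treatment, which is why it helps to see exactly how the field is indexed and how the query is built.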

Re: ComplexPhraseQueryParser with multiple fields

2011-06-23 Thread lichman
Which patch are you referring to? The last one? And sure... I'll do the voting thing. -- View this message in context: http://lucene.472066.n3.nabble.com/ComplexPhraseQueryParser-with-multiple-fields-tp2879290p3099032.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. ---