Re: Indexing Weighted Tags per Document

2014-10-28 Thread Ramkumar R. Aiyengar
There are a few possible approaches here; we had a similar use case and went for the second one below. I primarily deal with Solr, so I don't know of Lucene-only examples, but hopefully you can dig these up. (1) You can attach payloads to each occurrence of the tag, and modify the scoring to use t

Re: Indexing Weighted Tags per Document

2014-10-28 Thread Ralf Bierig
The second solution sounds great and a lot more natural than payloads. I know how to override the Similarity class, but it would only be called at search time and would then already use the existing term frequency. Looking up the probabilities every time a search is performed is probably also

MyAnalyzer and Lucene version <= 4.9.1

2014-10-28 Thread Ralf Bierig
How do I write my own Analyzer in Lucene <= 4.9.1? Here is my code; the method tokenStream is now final and can no longer be overridden. How is one supposed to extend it? --- code --- class PayloadAnalyzer extends Analyzer { private PayloadEncoder encoder; PayloadAnalyzer(PayloadEn

Re: Making lucene indexing multi threaded

2014-10-28 Thread Erick Erickson
bq: When I loop the result set, I reuse the same Document instance. I really, really, _really_ hope you're calling new for the Document in the loop. Otherwise that single document will eventually contain all the data from your entire corpus! I'd expect some other errors to pop out if you are reall
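The reuse problem described above can be illustrated with a minimal sketch. FakeDocument is a hypothetical stand-in, not the Lucene class; the only behaviour it borrows is that add() appends a field rather than replacing one. The row values and method names are likewise made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a document: add() appends a field, it never replaces one.
final class FakeDocument {
    private final List<String> fields = new ArrayList<>();

    void add(String field) {
        fields.add(field);
    }

    int fieldCount() {
        return fields.size();
    }
}

final class DocumentReuseSketch {
    // Problem case: one instance is shared by all rows, so fields pile up.
    static List<Integer> fieldCountsWhenReusing(List<String> rows) {
        List<Integer> counts = new ArrayList<>();
        FakeDocument doc = new FakeDocument();
        for (String row : rows) {
            doc.add(row);
            // Size of what would be handed to the index writer for this row.
            counts.add(doc.fieldCount());
        }
        return counts;
    }

    // Intended case: a fresh instance is created for every row.
    static List<Integer> fieldCountsWhenFresh(List<String> rows) {
        List<Integer> counts = new ArrayList<>();
        for (String row : rows) {
            FakeDocument doc = new FakeDocument();
            doc.add(row);
            counts.add(doc.fieldCount());
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "b", "c");
        System.out.println(fieldCountsWhenReusing(rows)); // prints [1, 2, 3]
        System.out.println(fieldCountsWhenFresh(rows));   // prints [1, 1, 1]
    }
}
```

With a shared instance, the third "document" already carries the fields of the first two rows, which is the growth the message warns about.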

RE: MyAnalyzer and Lucene version <= 4.9.1

2014-10-28 Thread Uwe Schindler
Hi, You have to implement createComponents(). The old Lucene 3 way no longer works because Analyzers have to provide reusable TokenStreams. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: R

Re: Questions about the Lucene query language

2014-10-28 Thread Prad Nelluru
Thanks, Jack! From: Jack Krupansky Sent: Monday, October 27, 2014 8:41 PM To: java-user@lucene.apache.org Subject: Re: Questions about the Lucene query language Pure negative queries are not supported, but all you need to do is include *:*, which translate
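As a rough sketch of the advice quoted above: a purely negative query can be anchored on the match-all clause *:* before the exclusions are appended. The helper below only builds the query string; handing it to a Lucene query parser is not shown, and the class name, method name and field names (status, lang) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class NegativeQuerySketch {
    // Starts from the match-all clause and appends each exclusion as a prohibited clause.
    static String excludeAll(List<String> clauses) {
        StringBuilder query = new StringBuilder("*:*");
        for (String clause : clauses) {
            query.append(" -").append(clause);
        }
        return query.toString();
    }

    public static void main(String[] args) {
        // Hypothetical field names, for illustration only.
        System.out.println(excludeAll(Arrays.asList("status:draft", "lang:de")));
        // prints *:* -status:draft -lang:de
    }
}
```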

Re: MyAnalyzer and Lucene version <= 4.9.1

2014-10-28 Thread Ralf Bierig
Thanks a lot! :) Ralf On 28.10.2014 16:12, Uwe Schindler wrote: Hi, You have to implement createComponents(). The old Lucene 3 way no longer works because Analyzers have to provide reusable TokenStreams. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi

Query with many clauses

2014-10-28 Thread Pawel Rog
Hi, I have to run a query with a lot of boolean SHOULD clauses. Queries like these were of course slow, so I decided to change the query to a filter wrapped in a ConstantScoreQuery, but that didn't help either. The profiler shows that most of the time is spent on seekExact in BlockTreeTermsReader$FieldReader$SegmentT