Re: Grouping and tokens

2013-02-18 Thread Jack Krupansky
Please clarify exactly what you want to group by - give a specific example that makes it clear what terms should affect grouping and which shouldn't. -- Jack Krupansky -Original Message- From: Ramprakash Ramamoorthy Sent: Monday, February 18, 2013 6:12 AM To:

RE: What is equivalent to Document.setBoost() from Lucene 3.6 in Lucene 4.1?

2013-02-18 Thread Uwe Schindler
The problem is: Lucene has never supported *real* per-document boosts. Those boosts were always per-field. As we only work per-field, it depends on the query how your results score. If you have a TermQuery, the per-field boost is used (the one from the field queried), but e.g. if you have

Re: What is equivalent to Document.setBoost() from Lucene 3.6 in Lucene 4.1?

2013-02-18 Thread Paul Taylor
On 18/02/2013 16:26, Uwe Schindler wrote: The problem is: Lucene has never supported *real* per-document boosts. Those boosts were always per-field. As we only work per-field, it depends on the query how your results score. If you have a TermQuery, the per-field boost is used (the one from the

IndexSearcher.close() removed in 4.0

2013-02-18 Thread saisantoshi
I understand from the JIRA ticket (LUCENE-3640) that IndexSearcher.close() is a no-op, but I am not clear on why it is a no-op. Could someone shed some light on this? We were using this method in the older versions; is it safe to remove this call now? Just want to understand the

CJK evaluation. StandardAnalyzer and query time.

2013-02-18 Thread Lucenius
Hello community, I am doing an evaluation in the context of CJK. I compare some indexing strategies like unigram, bigram, unigram + bigram and word-based indexing. 1. I used the StandardAnalyzer for unigram. I think it works for Chinese but it is doing some other stuff for Japanese and Korean.

Re: IndexSearcher.close() removed in 4.0

2013-02-18 Thread Simon Willnauer
On Mon, Feb 18, 2013 at 7:32 PM, saisantoshi saisantosh...@gmail.com wrote: I understand from the JIRA ticket (LUCENE-3640) that IndexSearcher.close() is a no-op, but I am not clear on why it is a no-op. Could someone shed some light on this? We were using this method in the older

Re: Need Help: How to Get the enumeration of Terms Ending with a given word

2013-02-18 Thread Simon Willnauer
On Thu, Feb 14, 2013 at 11:42 AM, VIGNESH S vigneshkln...@gmail.com wrote: Hi, I have two questions. 1. How to get the enumeration of terms ending with a given word? I saw we can get an enumeration of terms starting at a given word with the IndexReader.terms(term()) method unless you want to iterate

Re: IndexSearcher.close() removed in 4.0

2013-02-18 Thread Eric Charles
Hi, Why not have IS#close() call the wrapped IR#close()? I would be happier having to deal only with the Searcher once it is created and forget that it wraps a Reader: I create a Searcher, I close it. Thx, Eric On 18/02/2013 22:20, Simon Willnauer wrote: On Mon, Feb 18, 2013 at 7:32 PM,

Re: Grouping and tokens

2013-02-18 Thread Ramprakash Ramamoorthy
On Mon, Feb 18, 2013 at 9:47 PM, Jack Krupansky j...@basetechnology.com wrote: Please clarify exactly what you want to group by - give a specific example that makes it clear what terms should affect grouping and which shouldn't. Assume I am indexing library data. Say there are the following

Re: Grouping and tokens

2013-02-18 Thread Jack Krupansky
Okay, so, fields that would normally need to be tokenized must be stored as both raw strings for grouping and tokenized text for keyword search. Simply use copyField to copy from one to the other. -- Jack Krupansky -Original Message- From: Ramprakash Ramamoorthy Sent: Monday,

Re: Grouping and tokens

2013-02-18 Thread Jack Krupansky
Oops, sorry for the Solr answer. In Lucene you need to simply index the same value, once as a raw string and a second time as a tokenized text field. Grouping would use the raw string version of the data. -- Jack Krupansky -Original Message- From: Jack Krupansky Sent: Monday,
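
The dual-indexing idea in the reply above can be sketched in plain Java. This is a toy in-memory model, not the Lucene API: the class DualFieldSketch, the Entry type, the whitespace tokenizer, and the author names are all hypothetical and chosen only for illustration. It shows the same value kept twice, tokenized for keyword search and raw for grouping, with grouping reading only the raw form.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy in-memory model (NOT the Lucene API): every value is kept twice,
// once as a raw string for grouping and once as tokens for keyword search.
class DualFieldSketch {

    // Hypothetical record: "raw" plays the role of an untokenized field,
    // "tokens" the role of a tokenized text field.
    static final class Entry {
        final String raw;
        final List<String> tokens;

        Entry(String value) {
            this.raw = value;
            // Assumption: lowercase + whitespace split stands in for a real analyzer.
            this.tokens = Arrays.asList(value.toLowerCase(Locale.ROOT).split("\\s+"));
        }
    }

    // Keyword search consults only the tokenized form.
    static List<Entry> search(List<Entry> index, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<Entry> hits = new ArrayList<>();
        for (Entry e : index) {
            if (e.tokens.contains(needle)) {
                hits.add(e);
            }
        }
        return hits;
    }

    // Grouping consults only the raw form, so a multi-word value stays one group.
    static Map<String, Integer> groupCounts(List<Entry> hits) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Entry e : hits) {
            counts.merge(e.raw, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample data.
        List<Entry> index = Arrays.asList(
            new Entry("Charles Dickens"),
            new Entry("Charles Dickens"),
            new Entry("Charles Darwin"));
        System.out.println(groupCounts(search(index, "charles")));
    }
}
```

If grouping read the tokens instead, "Charles Dickens" and "Charles Darwin" would both fall under a shared "charles" group, which is the problem the raw copy avoids.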