Re: sort by field and score

2012-11-27 Thread Ian Lea
What are you getting for the scores? If it's NaN I think you'll need to use a TopFieldCollector. See for example http://www.gossamer-threads.com/lists/lucene/java-user/86309 -- Ian. On Tue, Nov 27, 2012 at 3:51 AM, Andy Yu ukour...@gmail.com wrote: Hi All, Now I want to sort by a field

Re: Does anyone have tips on managing cached filters?

2012-11-27 Thread Trejkaz
On Tue, Nov 27, 2012 at 9:31 AM, Robert Muir rcm...@gmail.com wrote: On Thu, Nov 22, 2012 at 11:10 PM, Trejkaz trej...@trypticon.org wrote: As for actually doing the invalidation, CachingWrapperFilter itself doesn't appear to have any mechanism for invalidation at all, so I imagine I will be

Re: Does anyone have tips on managing cached filters?

2012-11-27 Thread Robert Muir
On Tue, Nov 27, 2012 at 6:17 AM, Trejkaz trej...@trypticon.org wrote: Ah, yeah... I should have been clearer on what I meant there. If you want to make a filter which relies on data that isn't in the index, there is no mechanism for invalidation. One example of it is if you have a filter
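
The invalidation problem raised in this thread can be sketched without Lucene at all. The following is only an illustration under stated assumptions: `VersionedFilterCache`, `readerKey`, and `externalVersion` are hypothetical names, not part of CachingWrapperFilter or any Lucene API. The idea is to cache one bit set per reader key and to recompute it when a caller-supplied version number for the external data changes.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

/**
 * Hypothetical sketch, not a Lucene class: caches one BitSet per reader key
 * and recomputes it when the external data version changes.
 */
final class VersionedFilterCache<R> {
    /** Cached bits plus the external version they were computed against. */
    private static final class Entry {
        final long version;
        final BitSet bits;

        Entry(long version, BitSet bits) {
            this.version = version;
            this.bits = bits;
        }
    }

    // Weak keys: an entry can be collected once its reader key is unreachable.
    private final Map<R, Entry> cache = new WeakHashMap<>();
    private final Function<R, BitSet> compute;
    private int computeCount = 0;

    VersionedFilterCache(Function<R, BitSet> compute) {
        this.compute = compute;
    }

    /** Returns cached bits, recomputing if the external version has moved on. */
    synchronized BitSet get(R readerKey, long externalVersion) {
        Entry entry = cache.get(readerKey);
        if (entry == null || entry.version != externalVersion) {
            entry = new Entry(externalVersion, compute.apply(readerKey));
            cache.put(readerKey, entry);
            computeCount++;
        }
        return entry.bits;
    }

    /** Number of times the filter was actually computed (useful for checks). */
    synchronized int computeCount() {
        return computeCount;
    }
}
```

With this shape, the caller bumps the version whenever the data outside the index changes (for example, when the text file behind the filter is edited), and stale entries are replaced lazily on the next lookup.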

info on how lucene conducsts a search?

2012-11-27 Thread geeky2
Hello all, can someone point me to info or docs on how a lucene search is conducted? i would like to have a better understanding of how this works in general - but also from a design perspective. for instance - a question that keeps coming up is, should we add content to a given core - or break

Re: what is the offsets and payload in DocsAndPositionsEnum for ??

2012-11-27 Thread Wu, Stephen T., Ph.D.
I think we're looking at doing something related. I haven't explored the Enums or know how to make a postings codec... But what is flexible indexing in Lucene 4.0 if it's not the ability to make new postings codecs? We're trying to incorporate attributes onto terms/spans in indexes. We'd also

Re: info on how lucene conducsts a search?

2012-11-27 Thread Ian Lea
http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/search/package-summary.html#package_description might help. Or Google something like how does lucene work. The question on cores might be better asked on the solr list, assuming you are talking about Solr cores. But I bet the answer

Re: info on how lucene conducsts a search?

2012-11-27 Thread geeky2
hello, thanks for the info. as you suggested - i did do a general search and found this slide presentation - which had some good general info. i am not sure what the source of this preso is, how qualified the author is (although he/she seems very good) or how current the information is?

Re: info on how lucene conducsts a search?

2012-11-27 Thread geeky2
Ian Lea wrote The question on cores might be better asked on the solr list, assuming you are talking about Solr cores. But I bet the answer will be a variant on either it depends or, my favourite, whatever works for you. yes - i am referring to solr cores. i was hoping to find a more

Re: info on how lucene conducsts a search?

2012-11-27 Thread Apostolis Xekoukoulotakis
http://wiki.apache.org/lucene-java/LucenePapers Many people have come to this list asking the same question, including myself. Most answers are practical ones. But lucene has so many interesting ideas in it, which trigger everyone's academic curiosity, without caring for the results.

Re: what is the offsets and payload in DocsAndPositionsEnum for ??

2012-11-27 Thread David Causse
Hi, We use payloads but we can't use the whole lucene API. For example, we use them for relation queries such as: @quote(@speaker(obama) @discourse(health)) Search for all documents that contain a quote by Obama talking about health. We encode linguistic information (standoff

Re: How does lucene handle the wildcard and fuzzy queries ?

2012-11-27 Thread Jack Krupansky
The proper answer to all of these questions is the same and very simple: If you want internal details, read the source code first. If you have specific questions then, fine, ask specific questions - but only after you've checked the code first. Also, questions or issues related to internals

Re: handling different scores related to queries

2012-11-27 Thread Jack Krupansky
Call the IndexSearcher#explain method to get the technical details on how any query is scored. Call Explanation#toString to get the English description for the scoring. Or, using Solr, add the debugQuery=true parameter to your query request and look at the explain section for scoring

Re: info on how lucene conducsts a search?

2012-11-27 Thread Ian Lea
As you can tell from the title, Lucene In Action is more about using lucene than how it works internally, but yes, it is good and is worth buying. If you're worried about how up to date it is, keep a copy of the release notes and migration guides for later versions to hand. -- Ian. On Tue,

What is flexible indexing in Lucene 4.0 if it's not the ability to make new postings codecs?

2012-11-27 Thread Wu, Stephen T., Ph.D.
Following up on a previous question... What is flexible indexing in Lucene 4.0? We assumed it was the ability to easily make new postings formats/codecs -- but a response below says that would be tricky? stephen On 11/27/12 11:48 AM, David Causse dcau...@spotter.com wrote: Hi, We use

Re: What is flexible indexing in Lucene 4.0 if it's not the ability to make new postings codecs?

2012-11-27 Thread Michael McCandless
Flexible indexing is the ability to make your own codec, which controls the reading and writing of all index parts (postings, stored fields, term vectors, deleted docs, etc.). So for example if you want to store some postings as a bit set instead of the block format that's the default coming up
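
To make the "bit set instead of a list" idea concrete, here is a small sketch in plain Java. It is not a Lucene codec or postings format: `DocIdList`, `ArrayDocIdList`, and `BitSetDocIdList` are hypothetical names, and the sorted array merely stands in for whatever encoded form a real format would use. It only shows that the same set of document IDs can sit behind one interface in two different in-memory representations.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical interface: a read-only list of doc IDs in increasing order. */
interface DocIdList {
    /** Returns the first doc ID >= target (target must be >= 0), or -1 if none. */
    int advance(int target);
}

/** Doc IDs kept as a sorted int array; compact when few docs match. */
final class ArrayDocIdList implements DocIdList {
    private final int[] sortedDocs;

    ArrayDocIdList(int[] sortedDocs) {
        this.sortedDocs = sortedDocs.clone();
    }

    @Override
    public int advance(int target) {
        int idx = Arrays.binarySearch(sortedDocs, target);
        if (idx < 0) {
            idx = -idx - 1; // insertion point: first element greater than target
        }
        return idx < sortedDocs.length ? sortedDocs[idx] : -1;
    }
}

/** Doc IDs kept as one bit per document; attractive when many docs match. */
final class BitSetDocIdList implements DocIdList {
    private final BitSet bits = new BitSet();

    BitSetDocIdList(int[] docs) {
        for (int doc : docs) {
            bits.set(doc);
        }
    }

    @Override
    public int advance(int target) {
        return bits.nextSetBit(target); // -1 when no set bit remains
    }
}
```

Both classes answer `advance` identically for the same input, so a caller does not need to know which representation is in use. The trade-off is the usual one: the array costs memory per matching document, while the bit set costs one bit per document in the index regardless of how many match.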

native, versioned XML-DBMS (that is full text search in versioned document collections)

2012-11-27 Thread Johannes.Lichtenberger
Hello, as posted some time ago I'm working on a native, versioned XML-DBMS [1]. I'd like to provide a full text index and I recently read about customized Codecs which can be plugged in. Usually data (for instance XML nodes) are stored on RecordPages. I'm still not sure if it is possible and

Re: Does anyone have tips on managing cached filters?

2012-11-27 Thread Trejkaz
On Wed, Nov 28, 2012 at 2:09 AM, Robert Muir rcm...@gmail.com wrote: I don't understand how a filter could become invalid even though the reader has not changed. I did state two ways in my last email, but just to re-iterate: (1): The filter reflects a query constructed from lines in a text

Re: Does anyone have tips on managing cached filters?

2012-11-27 Thread Robert Muir
On Wed, Nov 28, 2012 at 12:27 AM, Trejkaz trej...@trypticon.org wrote: On Wed, Nov 28, 2012 at 2:09 AM, Robert Muir rcm...@gmail.com wrote: I don't understand how a filter could become invalid even though the reader has not changed. I did state two ways in my last email, but just to