Difference between CategoryPath and Plain FacetFields with hierarchy

2015-03-04 Thread Gimantha Bandara
Hi, I am new to Lucene faceting and taxonomy. I saw a few examples in some blogs and in the facets guide. Some have used CategoryPath with TaxonomyWriters, TaxonomyReaders and FacetSearchParams. Some have used FacetFields without using TaxonomyWriters and TaxonomyReaders. What is the difference between

Re: understanding the norm encode and decode

2015-03-04 Thread Adrien Grand
Hi, Floats require 32 bits but norms are encoded on a single byte. So there is a precision loss when encoding float values into a single byte. In your example, 0.75 and 0.89 are sufficiently close to each other so that they are encoded to the same byte. On Wed, Mar 4, 2015 at 4:48 AM, wangdong w
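
The precision loss described above can be reproduced with a small standalone sketch. It does not call Lucene. It assumes a one-byte float layout with 3 mantissa bits and 5 exponent bits (zero exponent 15), modeled on Lucene's SmallFloat.floatToByte315, and may not match every Lucene version or similarity. The class and method names (NormByteSketch, encode, decode) are hypothetical.

```java
public class NormByteSketch {
    // Assumed layout: 3 mantissa bits, 5 exponent bits, zero exponent 15.
    static final int MANTISSA_BITS = 3;
    static final int ZERO_EXPONENT = 15;
    static final int BASE = (63 - ZERO_EXPONENT) << MANTISSA_BITS;

    // Squeeze a 32-bit float into one byte by dropping low mantissa bits.
    public static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> (24 - MANTISSA_BITS);
        if (small <= BASE) {
            // Zero and negatives map to 0; tiny positives map to the smallest code.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= BASE + 0x100) {
            // Too large for one byte: clamp to the largest code.
            return (byte) -1;
        }
        return (byte) (small - BASE);
    }

    // Expand the byte back to a float; the dropped mantissa bits come back as zeros.
    public static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - MANTISSA_BITS);
        bits += (63 - ZERO_EXPONENT) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        float[] samples = {0.89f, 0.90f, 1.0f};
        for (float f : samples) {
            byte b = encode(f);
            System.out.println(f + " -> byte " + (b & 0xff) + " -> " + decode(b));
        }
    }
}
```

Under this assumed layout, 0.89f and 0.90f land on the same byte and both decode to 0.875f, because only the top three mantissa bits survive the round trip.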

Re: understanding the norm encode and decode

2015-03-04 Thread Ahmet Arslan
Hi Adrien, I read somewhere that norms are stored using docValues. In my understanding, docValues can store lossless float values. So the question is, why do several decode/encode methods still exist in similarity implementations? Intuitively, switching to docValues for norms should prevent prec

Re: understanding the norm encode and decode

2015-03-04 Thread Adrien Grand
Norms and doc values are indeed using the same API. However, implementations differ a bit (e.g. norms are stored in memory and use different compression schemes). The precision loss is up to the similarity. You could write a similarity impl which keeps full float precision, but scoring being fuzzy a

Re: Part of speech search with lucene

2015-03-04 Thread David Villarejo
Hi Mike, Your solution works! I've been trying it with PhraseQuery and it works pretty well. Thank you so much. David. 2015-03-03 23:00 GMT+01:00 Michael Sokolov : > I believe you can accomplish what you are talking about using PhraseQuery, > say: note that it has > > public void add(Term term,

Re: Part of speech search with lucene

2015-03-04 Thread Michael Sokolov
You're welcome; thanks for letting us know. -Mike On 03/04/2015 01:21 PM, David Villarejo wrote: Hi Mike, Your solution works! I've been trying it with PhraseQuery and it works pretty well. Thank you so much. David. 2015-03-03 23:00 GMT+01:00 Michael Sokolov : I believe you can accomplish w

Re: understanding the norm encode and decode

2015-03-04 Thread wangdong
Thank you for your discussion. I am a junior user of Lucene, so I am not familiar with some of the deeper concepts you mentioned. My question is simple. I just want to know how to get 0.75 from decode(encode(0.89)) in the official documentation. Why not 0.875? (0.875=0.5+0.25+0.125) Thanks, andrew On 2015/3/

substring query

2015-03-04 Thread Stephen Rudd
I have created a slightly hairy document collection that contains tens of millions of DNA sequence words that I wish to process to find rarer and unique words. Each of the words is between 100 characters (nucleotides) and 1000 characters in length. I have been able to use WildcardQuery and Fuzzy

Re: substring query

2015-03-04 Thread Herb Roitblat
Do you want to search for shingles? On 3/4/2015 9:16 PM, Stephen Rudd wrote: I have created a slightly hairy document collection that contains tens of millions of DNA sequence words that I wish to process to find rarer and unique words. Each of the words is between 100 characters (nucleotides)
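
In case the term is unfamiliar: a shingle is commonly understood as an overlapping fixed-length window over a sequence, which for nucleotide strings corresponds to a k-mer. The sketch below is plain Java with a hypothetical helper (ShingleSketch.shingles) and an arbitrary window length. It only illustrates the idea and does not use Lucene's analysis classes; whether shingles suit the use case in the question is not settled in this thread.

```java
import java.util.ArrayList;
import java.util.List;

public class ShingleSketch {
    // Return every overlapping substring of length k, in order of position.
    public static List<String> shingles(String sequence, int k) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i + k <= sequence.length(); i++) {
            out.add(sequence.substring(i, i + k));
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical input; real words in the collection are 100 to 1000 characters.
        System.out.println(shingles("ACGTAC", 4));
    }
}
```

Indexing such windows as terms would turn a substring lookup into an exact term lookup, at the cost of a larger index.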

Re: Difference between CategoryPath and Plain FacetFields with hierarchy

2015-03-04 Thread Gimantha Bandara
Hi, Any help on this? Or can someone point me to the Faceted User Guide for 4.10.3? I cannot find it. Is it only available for older versions? On Wed, Mar 4, 2015 at 2:38 PM, Gimantha Bandara wrote: > Hi, > > I am new to Lucene faceting and taxonomy. I saw a few examples in some blogs > and in the facets g