Re: Facets in Lucene 4.7.2

2014-06-17 Thread Sandeep Khanzode
Hi, Thanks again! This time, I have indexed data with the following specs. I run into 40 seconds for the FastTaxonomyFacetCounts to create all the facets. Is this as per your measurements? Subsequent runs fare much better probably because of the Windows file system cache. How can I speed

Re: Facets in Lucene 4.7.2

2014-06-17 Thread Shai Erera
Hi, 40 seconds for faceted search is ... crazy. Also, note how the times don't differ much even though the number of hits is much higher (29K vs 15.1M) ... That, together with what you say about subsequent queries being much faster (a few seconds), suggests that something is seriously messed up w/ your

Re: Facets in Lucene 4.7.2

2014-06-17 Thread Sandeep Khanzode
Hi, Thanks for your response. It does sound pretty bad which is why I am not sure whether there is an issue with the code, the index, the searcher, or just the machine, as you say.  I will try with another machine just to make sure and post the results. Meanwhile, can you tell me if there is

Re: Facets in Lucene 4.7.2

2014-06-17 Thread Shai Erera
Nothing suspicious ... code looks fine. The call to FastTaxonomyFacetCounts actually computes the counts ... that's the expensive part of faceted search. How big is your taxonomy (number of categories)? Is it hierarchical (i.e. are your dimensions flat, or deep like A/1/2/3/)? What does your

Re: Facets in Lucene 4.7.2

2014-06-17 Thread Sandeep Khanzode
If I am counting correctly, the $facets field in the index shows a count of approx. 28k. That does not sound like much, I guess. All my facets are flat and the FacetsConfig only defines a couple of them to be multi-valued. Let me know if I am not counting the taxonomy size correctly. The

Re: Facets in Lucene 4.7.2

2014-06-17 Thread Shai Erera
You can get the size of the taxonomy by calling taxoReader.getSize(). What does the 28K of the $facets field denote - the number of terms (drill-down)? If so, that sounds like your taxonomy is of that size. And indeed, this is a tiny taxonomy ... How many facets do you record per document? This

Re: Facets in Lucene 4.7.2

2014-06-16 Thread Sandeep Khanzode
Hi Shai, Thanks for the response. Appreciated! I understand that this particular use case has to be handled in a different way. Can you please help me with the below questions?  1.] Is there any API that gives me the count of a specific dimension from FacetCollector in response to a search

Re: Facets in Lucene 4.7.2

2014-06-16 Thread Sandeep Khanzode
Correction on [4] below. I do get doc/pos/tim/tip/dvd/dvm files in either case. What I meant was that the number of those files appears different in both cases. Also, does commit() stop the world and behave serially to flush the contents?   --- Thanks n Regards, Sandeep Ramesh

Re: Facets in Lucene 4.7.2

2014-06-14 Thread Shai Erera
Hi, currently there's no way to add e.g. terms to already indexed documents; you have to re-index them. The only updatable field types Lucene currently offers are DocValues fields. If the list of markers/flags is fixed in your case, and you can map them to an integer, I think you could use a
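
The "map them to an integer" idea in the message above can be sketched with plain Java. This is only an illustration and does not complete the truncated sentence: the class name FlagMask, the Marker enum and its three values are hypothetical placeholders, and the sketch assumes a fixed list of at most 64 flags so that each one fits in a bit of a long.

```java
import java.util.EnumSet;
import java.util.Set;

public class FlagMask {
    // Hypothetical fixed list of markers; the names are placeholders, not from the thread.
    enum Marker { REVIEWED, FLAGGED, ARCHIVED }

    // Pack a set of markers into one long, one bit per marker (bit index = ordinal).
    static long encode(Set<Marker> markers) {
        long mask = 0L;
        for (Marker m : markers) {
            mask |= 1L << m.ordinal();
        }
        return mask;
    }

    // Recover the set of markers from a packed long.
    static Set<Marker> decode(long mask) {
        Set<Marker> result = EnumSet.noneOf(Marker.class);
        for (Marker m : Marker.values()) {
            if ((mask & (1L << m.ordinal())) != 0) {
                result.add(m);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        long mask = encode(EnumSet.of(Marker.REVIEWED, Marker.ARCHIVED));
        System.out.println(mask + " -> " + decode(mask));
    }
}
```

A single long per document like this is the kind of value that could be stored in a numeric DocValues field, which the message names as the updatable field type. Adding or clearing a flag then means writing a new integer for that document. The ordinal-based bit positions only stay stable if the enum order never changes, which is why the list has to be fixed.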

Facets in Lucene 4.7.2

2014-06-13 Thread Sandeep Khanzode
Hi,   I am evaluating Lucene Facets for a project. Since there is a lot of change in 4.7.2 for Facets, I am relying on UTs for reference. Please let me know if there are other sources of information.  I have a couple of questions: 1.] All categories in my application are flat, not

Re: Facets in Lucene 4.7.2

2014-06-13 Thread Shai Erera
Hi, you can check the demo code here: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_4_8/lucene/demo/src/java/org/apache/lucene/demo/facet/. This code is updated with each release, so you always get working code examples, even when the API changes. If you don't mind managing

Re: Facets in Lucene 4.7.2

2014-06-13 Thread Sandeep Khanzode
Hi Shai,   Thanks so much for the clear explanation. I agree on the first question. Taxonomy Writer with a separate index would probably be my approach too. For the second question: I am a little new to the Facets API so I will try to figure out the approach that you outlined below. However,