Re: Determine whether a MatchAllQuery or a Query with atleast one Term

2015-11-29 Thread Sandeep Khanzode
in the tree. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Sandeep Khanzode [mailto:sandeep_khanz...@yahoo.com.INVALID] > Sent: Saturday, November 28, 2015 12:22 PM > To: java-user@lucen

Re: Determine whether a MatchAllQuery or a Query with atleast one Term

2015-11-28 Thread Sandeep Khanzode
?  ---Thanks n Regards, Sandeep Ramesh Khanzode On Saturday, November 28, 2015 12:30 PM, Michael Wilkowski wrote: Instanceof? MW Sent from Mi phone On 28 Nov 2015 06:57, "Sandeep Khanzode" wrote: > Hi, > I have a question. > In my program, I need to chec

Determine whether a MatchAllQuery or a Query with atleast one Term

2015-11-27 Thread Sandeep Khanzode
Hi, I have a question. In my program, I need to check whether the input query is a MatchAll Query that contains no terms, or a Query (any variant) that has at least one term. For typical term queries, it seems reasonable to do this with Query.extractTerms(Set<> terms), which gives the list of t
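
The replies in this thread suggest `instanceof` and mention the query tree. A minimal sketch of that idea follows, assuming the query can be walked as a tree of nested clauses. The types `Node`, `MatchAll`, `Term`, and `Bool` are hypothetical plain-Java stand-ins invented for illustration; they are not Lucene classes, and a real implementation would have to check against the query types of the Lucene version in use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for a query tree; these are NOT Lucene classes.
class QueryTermCheck {
    interface Node {}

    // Matches every document and carries no terms.
    static final class MatchAll implements Node {}

    // A leaf clause that carries exactly one term.
    static final class Term implements Node {
        final String field;
        final String text;
        Term(String field, String text) { this.field = field; this.text = text; }
    }

    // A composite clause that holds child clauses.
    static final class Bool implements Node {
        final List<Node> clauses = new ArrayList<>();
        Bool add(Node n) { clauses.add(n); return this; }
    }

    // Returns true if any clause anywhere in the tree is a Term.
    static boolean hasAtLeastOneTerm(Node q) {
        if (q instanceof Term) {
            return true;
        }
        if (q instanceof Bool) {
            for (Node child : ((Bool) q).clauses) {
                if (hasAtLeastOneTerm(child)) {
                    return true;
                }
            }
        }
        // MatchAll, an empty Bool, or an unknown node type: no term found.
        return false;
    }

    public static void main(String[] args) {
        Node matchAll = new MatchAll();
        Node nested = new Bool()
                .add(new MatchAll())
                .add(new Bool().add(new Term("title", "lucene")));
        System.out.println(hasAtLeastOneTerm(matchAll)); // false
        System.out.println(hasAtLeastOneTerm(nested));   // true
    }
}
```

The recursion treats unknown node types as term-free, which is a design choice of this sketch; a stricter variant could throw on types it does not recognize.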

Upgrading Lucene Indices and maintaining same resultset

2015-05-27 Thread Sandeep Khanzode
Hi All, We have a large Lucene 3.6-based index set that is currently in use. What is the upgrade path to (a) 4.x or (b) 5.x with respect to data migration, etc.? What are the steps, and is it technically possible? I read that 3.x to 5.x is not possible, and throws IndexTooStal

Re: BitSet in Filters

2014-08-12 Thread Sandeep Khanzode
index into the bitset. That's baked in to very low levels and isn't going to change AFAIK. Best, Erick On Mon, Aug 11, 2014 at 11:53 PM, Sandeep Khanzode wrote: Hi, >  >The current usage of BitSets in filters in Lucene is limited to applying only >on docIDs i.e. I ca

BitSet in Filters

2014-08-11 Thread Sandeep Khanzode
Hi, The current usage of BitSets in filters in Lucene is limited to applying only on docIDs, i.e. I can only construct a filter out of a BitSet if I have the document IDs handy. However, with every update/delete, i.e. any CRUD modification, these will change, and I have to again redo the whole proce

Sort, Search & Facets

2014-07-07 Thread Sandeep Khanzode
Hi, I am using Lucene 4.7.2, and my primary use case is to do three things: (a) search, (b) sort the search results by a number of fields, and (c) facet on probably an equal number of fields (probably the most standard use cases anyway). Let us say I have a corpus of more than

DocIDs from Facet Results

2014-07-07 Thread Sandeep Khanzode
Hi, For Lucene 4.7.2 Facets, once we invoke FacetCollector and get the topNChildren into FacetResult, is there any mechanism that for a particular search result, I could get the docIds corresponding to any facet? Say, I have a facet defined on Field1. Upon Search and FacetCollection, I get FVa

Re: Incremental Field Updates

2014-07-01 Thread Sandeep Khanzode
ource fields. Under the covers this actually reads the document > out of the stored fields, deletes the old one and adds it > over again. > > FWIW, > Erick > > On Tue, Jul 1, 2014 at 5:32 AM, Sandeep Khanzode > wrote: > > Hi, > > > > I wanted to know of th

Incremental Field Updates

2014-07-01 Thread Sandeep Khanzode
Hi, I wanted to know of the best approach to follow if a few fields in my indexed documents are changing at run time (after index and before or during search), but a majority of them are created at index time. I could see the JIRA given below but it is scheduled for Lucene 4.9, I believe.   T

Searching on Large Indexes

2014-06-27 Thread Sandeep Khanzode
Hi, I have an index that runs into 200-300GB. It is not frequently updated. What are the best strategies to query on this index? 1.] Should I, at index time, split the content, like a hash based partition, into multiple separate smaller indexes and aggregate the results programmatically? 2.] Sh
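
Option 1 above (hash-based partitioning with programmatic aggregation) can be sketched in plain Java as follows. This is only an illustration under stated assumptions: `ShardSketch`, `Hit`, `shardFor`, and `mergeTopK` are hypothetical names, no Lucene API is used, and the per-shard hit lists are assumed to have been produced by searching each smaller index separately. Scores computed against separate indexes are generally not directly comparable, because each index has its own term statistics, so the merge step here is a simplification.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of hash partitioning plus result merging; not a Lucene API.
class ShardSketch {
    // One search hit as returned by a single shard.
    static final class Hit {
        final String docKey;
        final float score;
        Hit(String docKey, float score) { this.docKey = docKey; this.score = score; }
    }

    // Index time: choose the shard for a document from the hash of its key.
    // floorMod keeps the result in [0, numShards) even for negative hash codes.
    static int shardFor(String docKey, int numShards) {
        return Math.floorMod(docKey.hashCode(), numShards);
    }

    // Query time: combine the per-shard hit lists and keep the k best by score.
    static List<Hit> mergeTopK(List<List<Hit>> perShardHits, int k) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : perShardHits) {
            all.addAll(hits);
        }
        all.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return new ArrayList<>(all.subList(0, Math.min(k, all.size())));
    }

    public static void main(String[] args) {
        System.out.println("doc-42 goes to shard " + shardFor("doc-42", 4));

        List<List<Hit>> perShard = Arrays.asList(
                Arrays.asList(new Hit("a", 0.9f), new Hit("b", 0.2f)),
                Arrays.asList(new Hit("c", 0.5f)));
        for (Hit h : mergeTopK(perShard, 2)) {
            System.out.println(h.docKey + " " + h.score);
        }
    }
}
```

Routing by the hash of a stable document key means an update or delete for a given document always lands on the same shard, which is the main reason to prefer it over round-robin assignment in this sketch.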

IndexDocValues

2014-06-26 Thread Sandeep Khanzode
I came across this type when I checked this blog: http://blog.trifork.com/2011/10/27/introducing-lucene-index-doc-values/ The blog mentions that the IndexDocValues are created as sorting types indexed specifically for the purpose and reduce the overhead created by the FieldCache. I could not l

SortedDocValuesField

2014-06-26 Thread Sandeep Khanzode
Hi, I was comparing the sort performance of SortedDocValuesField with that of a normal field, i.e. StringField, in the same sort. So, I used the same string/BytesRef value in both fields and launched the two sorts in separate JVM processes. I used a RAMDirectory and cre

Re: Custom Sorting

2014-06-25 Thread Sandeep Khanzode
f this is a DB call, it will NOT perform. In order to be performant, you'll need to cache the values. Which is what is being done _for_ you by the FieldCache. So I think this is really a false path, or an "XY" problem. Why do you think you need to do this? Best, Erick On Tue, Jun

Custom Sorting

2014-06-24 Thread Sandeep Khanzode
Hi, I am trying to implement a sort order for search results in Lucene 4.7.2. If I want to use data for ordering that is not stored in Lucene as Fields, is there any way this can be done? Basically, I would have certain data that is logically associated with a document but stored elsewhere, like

Re: Facets in Lucene 4.7.2

2014-06-17 Thread Sandeep Khanzode
ns of categories), I don't think you can intentionally mess up something that much to end up w/ 40-45s response times! Shai On Tue, Jun 17, 2014 at 8:51 PM, Sandeep Khanzode < sandeep_khanz...@yahoo.com.invalid> wrote: > Hi, > > Thanks for your response. It does sound pretty

Re: Facets in Lucene 4.7.2

2014-06-17 Thread Sandeep Khanzode
ng mechanism for facets, through CachedOrdinalsReader. But I wouldn't go there until you verify that your IO system is good (try another machine, OS, disk ...), and that the 40s times are truly from the faceting code. Shai On Tue, Jun 17, 2014 at 4:21 PM, Sandeep Khanzode < sandeep_khanz

Re: Facets in Lucene 4.7.2

2014-06-17 Thread Sandeep Khanzode
her threads can continue indexing (where before this flush would be a stop-the-world action, preventing indexing for a while). Shai On Mon, Jun 16, 2014 at 4:57 PM, Sandeep Khanzode < sandeep_khanz...@yahoo.com.invalid> wrote: > Correction on [4] below. I do get doc/pos/tim/tip/d

Re: Facets in Lucene 4.7.2

2014-06-16 Thread Sandeep Khanzode
Khanzode On Monday, June 16, 2014 7:10 PM, Sandeep Khanzode wrote: Hi Shai, Thanks for the response. Appreciated! I understand that this particular use case has to be handled in a different way. Can you please help me with the below questions?  1.] Is there any API that gives me the count of

Re: Facets in Lucene 4.7.2

2014-06-16 Thread Sandeep Khanzode
roach vs re-indexing the documents since the current implementation of updatable doc-values fields isn't optimized for a few document updates between index reopens. See here: http://shaierera.blogspot.com/2014/04/benchmarking-updatable-docvalues.html Shai On Fri, Jun 13, 2014 at 1

Re: Facets in Lucene 4.7.2

2014-06-13 Thread Sandeep Khanzode
f you can model that as a NumericDocValuesField added to documents (w/ the different markers/flags translated to numbers), then you can use Lucene's updatable numeric DocValues and write a custom Facets to aggregate on that NumericDocValues field. Shai On Fri, Jun 13, 2014 at 11:48

Facets in Lucene 4.7.2

2014-06-13 Thread Sandeep Khanzode
Hi, I am evaluating Lucene Facets for a project. Since Facets changed a lot in 4.7.2, I am relying on the UTs for reference. Please let me know if there are other sources of information. I have a couple of questions: 1.] All categories in my application are flat, not hierarchical.