Re: SortingMergePolicy for already sorted segments

2014-06-16 Thread Ravikumar Govindarajan
Shai, This is the code snippet I use inside my class... public class MySorter extends Sorter { @Override public DocMap sort(AtomicReader reader) throws IOException { final Map docVsId = loadSortTerm(reader); final Sorter.DocComparator comparator = new Sorter.DocComparator() { @Override

RE: Lucene Upgrade from 2.9.x to 4.7.x

2014-06-16 Thread Buddhavarapu, Suresh
Thanks Uwe. I tried this path and I do not find any .cfs files. All that I see in my index directory after running the upgrader are the following files. -rw--- 1 root root 245 Jun 16 22:38 _1.fdt -rw--- 1 root root 45 Jun 16 22:38 _1.fdx -rw--- 1 root root 2809 Jun 16 22:38 _1.fnm -rw---

Re: SortingMergePolicy for already sorted segments

2014-06-16 Thread Shai Erera
I'm not sure that I follow ... where do you see DocMap being loaded up front? Specifically, Sorter.sort may return null if the readers are already sorted ... I think we already optimized for the case where the readers are sorted. Shai On Tue, Jun 17, 2014 at 4:04 AM, Ravikumar Govindarajan < rav
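
The "return null when already sorted" idea mentioned above can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use Lucene's Sorter or DocMap classes, and the class name SortedCheckSketch, the method sortOrNull, and the long[] sort keys are all hypothetical. The point is that a cheap linear scan can skip building the old-to-new doc-id mapping entirely when the input is already in order.

```java
import java.util.Arrays;

/** Illustration only; NOT Lucene's Sorter API. All names here are hypothetical. */
final class SortedCheckSketch {

    /**
     * Returns null if the keys are already in non-decreasing order,
     * otherwise an array mapping each old doc id to its new doc id.
     */
    static int[] sortOrNull(final long[] keys) {
        // Cheap linear scan: bail out early if nothing needs to move.
        boolean alreadySorted = true;
        for (int i = 1; i < keys.length; i++) {
            if (keys[i - 1] > keys[i]) {
                alreadySorted = false;
                break;
            }
        }
        if (alreadySorted) {
            return null; // caller can treat null as "no remapping needed"
        }

        // newToOld[n] = old doc id that ends up at new position n.
        Integer[] newToOld = new Integer[keys.length];
        for (int i = 0; i < keys.length; i++) {
            newToOld[i] = i;
        }
        // Arrays.sort on objects is stable, so ties keep their old order.
        Arrays.sort(newToOld, (a, b) -> Long.compare(keys[a], keys[b]));

        // Invert the permutation to get old -> new.
        int[] oldToNew = new int[keys.length];
        for (int newId = 0; newId < newToOld.length; newId++) {
            oldToNew[newToOld[newId]] = newId;
        }
        return oldToNew;
    }
}
```

With keys {1, 2, 2, 5} the method returns null without allocating a mapping; with {30, 10, 20} it returns the old-to-new mapping {2, 0, 1}.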

Re: Faceted Search User's Guide for Lucene 4.8.1

2014-06-16 Thread Shai Erera
I understand but since the facet module was (and still is) experimental, we were also experimenting w/ its APIs and ways to simplify them. The userguide was a mess - while valuable to newcomers, it was impossible to keep it up to date with the API changes ever since it was contributed to Lucene. Th

Re: ShingleAnalyzerWrapper question

2014-06-16 Thread Manjula Wijewickrema
Dear Steve, It works. Thanks. On Wed, Jun 11, 2014 at 6:18 PM, Steve Rowe wrote: > You should give sw rather than analyzer in the IndexWriter constructor. > > Steve > www.lucidworks.com > On Jun 11, 2014 2:24 AM, "Manjula Wijewickrema" > wrote: > > > Hi, > > > > In my programme, I can index and

SortingMergePolicy for already sorted segments

2014-06-16 Thread Ravikumar Govindarajan
I am planning to use SortingMergePolicy where all the merge-participating segments are already sorted... I understand that I need to define a DocMap with old-new doc-id mappings. Is it possible to optimize the eager loading of DocMap and make it kind of lazy load on-demand? Ex: Pass List to the c

Re: Index Not Finding Results some times

2014-06-16 Thread Andrew Norman
Ah, that now makes sense. Changed the code and now it works. Thanks for the help. On Monday, 16 June 2014, Allison, Timothy B. wrote: > The problem is that you are using an analyzer at index time but then not > at search time. > > StandardAnalyzer will convert "Name1" to "name1" at index time.

RE: Index Not Finding Results some times

2014-06-16 Thread Allison, Timothy B.
The problem is that you are using an analyzer at index time but then not at search time. StandardAnalyzer will convert "Name1" to "name1" at index time. At search time, because you aren't using a query parser (which would by default lowercase your terms) you are literally searching for "Name1"
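
The mismatch described above can be reproduced with a toy inverted index in plain Java. This is a sketch under stated assumptions, not Lucene code: the class AnalyzerMismatchSketch and its methods are hypothetical, and normalize() stands in for an analyzer by doing nothing more than lowercasing. Terms are normalized when indexed, so a raw lookup for the original mixed-case string misses, while a lookup that applies the same normalization hits.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy inverted index; NOT Lucene code. All names here are hypothetical. */
final class AnalyzerMismatchSketch {
    // term -> ids of the documents containing it
    private final Map<String, List<Integer>> postings = new HashMap<>();

    /** Stand-in for an analyzer: this sketch only lowercases. */
    static String normalize(String term) {
        return term.toLowerCase(Locale.ROOT);
    }

    /** Index time: the term is normalized before it is stored. */
    void index(int docId, String term) {
        postings.computeIfAbsent(normalize(term), k -> new ArrayList<>()).add(docId);
    }

    /** Search without analysis: looks up the term exactly as typed. */
    List<Integer> searchRaw(String term) {
        return postings.getOrDefault(term, new ArrayList<>());
    }

    /** Search with the same normalization that was used at index time. */
    List<Integer> searchAnalyzed(String term) {
        return searchRaw(normalize(term));
    }
}
```

After index(1, "Name1"), searchRaw("Name1") finds nothing because only "name1" was stored, whereas searchAnalyzed("Name1") returns document 1. An input that is already lowercase matches either way, which would be consistent with only some exact-match lookups failing.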

Re: Faceted Search User's Guide for Lucene 4.8.1

2014-06-16 Thread Nicola Buso
Hi Shai, I'm going to update from 4.6.1 to 4.8.1 :-( On Wed, 2014-06-11 at 14:05 +0300, Shai Erera wrote: > Hi > > We removed the userguide a long time ago, and replaced it with better > documentation on the classes and package.html, as well as demo code that > you can find here: > https://svn.apa

AW: [lucene 4.6] NPE when calling IndexReader#openIfChanged

2014-06-16 Thread Clemens Wyss DEV
In this second heap dump I drilled down only one of the SegmentReaders (of the StandardDirectoryReader). Looks like StandardSegmentReader has 24 subReaders (SegmentReaders) Class Name | Shallow Heap | Retained Heap | Percentage -

Re: Facets in Lucene 4.7.2

2014-06-16 Thread Shai Erera
Hi 1.] Is there any API that gives me the count of a specific dimension from > FacetCollector in response to a search query. Currently, I use the > getTopChildren() with some value and then check the FacetResult object for > the actual number of dimensions hit along with their occurrences. Also, t

Re: Facets in Lucene 4.7.2

2014-06-16 Thread Sandeep Khanzode
Correction on [4] below. I do get doc/pos/tim/tip/dvd/dvm files in either case. What I meant was the number of those files appears different in both cases. Also, does commit() stop the world and behave serially to flush the contents? --- Thanks n Regards, Sandeep Ramesh Khanzo

Re: Lucene 4.8.1 - Taxonomy

2014-06-16 Thread Shai Erera
Err ... are you sure there's an index in the directory that you point Luke at? I see that the exception points to "." which suggests the local directory from where Luke was run. There's nothing special about the taxonomy index, as far as Luke is concerned. However, note that I do not recommend t

Re: Facets in Lucene 4.7.2

2014-06-16 Thread Sandeep Khanzode
Hi Shai, Thanks for the response. Appreciated! I understand that this particular use case has to be handled in a different way. Can you please help me with the below questions?  1.] Is there any API that gives me the count of a specific dimension from FacetCollector in response to a search que

Lucene 4.8.1 - Taxonomy

2014-06-16 Thread Mrugesh Patel
Hi, I would like to open taxonomy indices in a tool (like Luke). Please could you help? Currently I am able to open other lucene indices in Luke 4.8.1 but unable to open taxonomy indices. When I try to open taxonomy indices in Luke 4.8.1 then it shows "org.apache.lucene.index.IndexNotFoundE

Index Not Finding Results some times

2014-06-16 Thread Andrew Norman
Hi, I am using Lucene 3.6.2 (I cannot upgrade due to 3rd party dependencies). I have written the following code below to illustrate the problem. I create a single document, add three fields, put it into the index. When I attempt to find the document using exact matches I can find the document 2 o

RE: Lucene Upgrade from 2.9.x to 4.7.x

2014-06-16 Thread Uwe Schindler
Hi, You must first download the 3.6.2 Lucene version and upgrade using the upgrade tool from the lucene-core-3.6.2.jar. After this, your index is in Lucene 3.6 format, which can be read with Lucene 4. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u.

RE: Lucene Upgrade from 2.9.x to 4.7.x

2014-06-16 Thread Buddhavarapu, Suresh
I was trying the Demo application from 4.7.2 on an index created by 2.9.1. I get an org.apache.lucene.index.IndexFormatTooOldException exception. I tried the upgrader tool. Same exception again. Is there an upgrader tool that can work with a 2.9.1 index? Or do I have to build one? Any guidelines w

Re: Hunspell low level interface in Lucene 4.8

2014-06-16 Thread Robert Muir
You don't have to wrap every word in a tokenstream, they can be reused! Sorry, but I think this is really the best API if you want to use lucene's analyzers. You can use the tokenstream API with 4.8 and benchmark it against using that stemmer api with 4.7 :) On Mon, Jun 16, 2014 at 4:16 AM, Micha

Re: [lucene 4.6] NPE when calling IndexReader#openIfChanged

2014-06-16 Thread Michael McCandless
Wait, in fst.ByteStore I see "only" 5'485'824 -- does this mean 5485824 bytes, or ~5.2 MB? This is probably "correct", meaning this is the RAM to hold the terms index. But I can't see from your heap dump output where the other ~51.3 MB is being used by StandardDirectoryReader. Mike McCandless h

Re: Hunspell low level interface in Lucene 4.8

2014-06-16 Thread Michal Lopuszynski
Hi Robert, thank you for your answer! Hmmm... I need a plain stemmer, i.e. a functionality taking a word and returning a list of stems. Wrapping every word in a tokenstream, which does a lot of things I do not need, seems like overkill and a waste of resources... Is there any problem with keeping