Re: Consistent NRT searching with SearcherLifetimeManager and multiple instances

2023-12-14 Thread Steven Schlansker
simple as adding a searcher manager refresh listener to each replica that acquires and records every time we load new infos - and it might all just work! Maybe. We'll see... If there are still troubles, we can then add retry-to-different-instance as well. Thanks for the help :)

Consistent NRT searching with SearcherLifetimeManager and multiple instances

2023-12-13 Thread Steven Schlansker
Hi lucene-users, We use the lucene-replicator to have a single indexing node push commits and NRT updates to a set of replicas. Currently, each replica has the full dataset - there is no sharding. We use a SearcherLifetimeManager to try to provide consistent pagination over results. So when w

Re: How to Safely Rollback after Upgrading Lucene?

2023-06-23 Thread Steven Schlansker
Hi, > On Jun 23, 2023, at 2:34 PM, Yixun Xu wrote: > > Hello, > > I have a service that creates and manages Lucene indices. The service is > using Lucene 8 and I want to upgrade to Lucene 9, and I would like to be > able to rollback the upgrade in case I encounter any issues (*). The > problem

Re: Need your perspective on Garbage Collection

2023-01-03 Thread Steven Schlansker
> On 03.01.2023 at 13:49, _ SATNAM wrote: > > Hi, > > The issue is that my garbage collection is running quite often. I configured my > > JVM as recommended (went through several articles and blogs on Lucene) and also > > provided enough RAM and memory (not so large as to trigger GC). The main cause of > > concern is

Re: Replicator PrimaryNode waits forever for remotes to close

2022-07-01 Thread Steven Schlansker
vering this and proposing a fix! > > Mike McCandless > > http://blog.mikemccandless.com > > > On Wed, Jun 29, 2022 at 7:36 PM Steven Schlansker > wrote: > Hi Lucene fans, > > We use lucene-replicator to copy our indexes from a primary to replica nodes. > Usua

Replicator PrimaryNode waits forever for remotes to close

2022-06-29 Thread Steven Schlansker
welcome change? Is there a better way to avoid hanging here, other than to be bug-free? It's quite challenging to figure out where the CopyState wasn't released, as only a count is kept. Thanks! Steven Schlansker

Querying into a Collector visits documents multiple times

2021-09-21 Thread Steven Schlansker
Hi Lucene users, I am developing a search application that needs to do some basic summary statistics. We use Lucene 8.9.0. To improve performance for e.g. summing a value across 10,000 documents, we are using DocValues as columnar storage. In order to retrieve the DocValues without collecting all

Re: Index keeps growing, then shrinks on restart

2014-11-13 Thread Steven Schlansker
On Nov 10, 2014, at 3:03 PM, Rob Nikander wrote: > Hi, > > I have an index that's about 700 MB, and it grows over days until it > causes problems with disk size, at about 5 GB. If the JVM process ends, the > index shrinks back to about 700 MB. I'm calling IndexWriter.commit() all the > time.

Re: Can RAMDirectory work for gigabyte data which needs refreshing of the index all the time?

2014-05-16 Thread Steven Schlansker
On May 7, 2014, at 6:46 AM, Cheng wrote: > > I have an index of multiple gigabytes which serves 5-10 threads and needs > refreshing very often. I wonder if RAMDirectory is a good candidate for > this purpose. If not, what kind of directory is better? We found that loading and unloading RAMDir

Re: BytesRef equals() method

2014-01-22 Thread Steven Schlansker
On Wed, 22 Jan 2014 07:14:59 +0100 Yann-Erwan Perio wrote: > On Tue, Jan 21, 2014 at 7:54 PM, Steven Schlansker > wrote: > > Certainly, but my problem still persists if I do not do it. I spent > the whole night debugging the code, to no avail. As a matter of fact, > when

Re: BytesRef equals() method

2014-01-21 Thread Steven Schlansker
On Jan 21, 2014, at 7:32 AM, Yann-Erwan Perio wrote: > Hello, > > I have been working a bit with BytesRef recently, and I wonder whether > the content of the equals() method, and more specifically the content > of the bytesEquals(BytesRef other) method, is the intended one. > > I was made awar

Re: DocValues formats hold large byte[][]s even when using MMapDirectory

2013-10-02 Thread Steven Schlansker
ch release of Lucene will require a full reindex when using this, which is a serious bummer. So I think I'll hold out for 4.5 and hope that that solves my problem. Thanks for the help! > > On Wed, Oct 2, 2013 at 2:11 PM, Steven Schlansker wrote: >> Hi, >> >> I hav

DocValues formats hold large byte[][]s even when using MMapDirectory

2013-10-02 Thread Steven Schlansker
Hi, I have a search application using Lucene 4.4.0 with various BinaryDocValues and SortedSetDocValues. We use MMapDirectory to help keep the Java heap small / GC pause times short and instead rely on the OS buffer cache to keep things fast, which I gather is generally considered a "best practi

Re: In memory index (current status in Lucene)

2013-07-01 Thread Steven Schlansker
On Jul 1, 2013, at 2:41 PM, Lance Norskog wrote: > My current open source project is a Directory that is just like RAMDirectory, > but everything is memory-mapped. The idea is it creates a disk file, opens > it, and immediately deletes the file. The file still exists until the > IndexReader/W

Re: In memory index (current status in Lucene)

2013-06-28 Thread Steven Schlansker
On Jun 28, 2013, at 2:29 PM, Emmanuel Espina wrote: > I'm building a distributed index (mostly as a research project for > school) and I'm evaluating indexing the entire collection in memory > (like google, facebook and others have done years ago). The obvious > reason for this is performance c

Re: Seemingly very difficult to wrap an Analyzer with CharFilter

2013-06-14 Thread Steven Schlansker
On Jun 12, 2013, at 5:26 PM, Michael Sokolov wrote: > On 6/12/2013 7:02 PM, Steven Schlansker wrote: >> On Jun 12, 2013, at 3:44 PM, Michael Sokolov >> wrote: >> >>> You may not have noticed that CharFilter extends Reader. The expected >>> pattern he

Re: Seemingly very difficult to wrap an Analyzer with CharFilter

2013-06-12 Thread Steven Schlansker
Thanks for the pointer. Steven > On 6/11/2013 7:52 PM, Steven Schlansker wrote: >> Hi everyone, >> >> I am trying to add a CharFilter to my Analyzer. I started with a >> StandardAnalyzer wrapped with an ASCIIFoldingFilter. Then I realized that >> it does not hand

Seemingly very difficult to wrap an Analyzer with CharFilter

2013-06-11 Thread Steven Schlansker
Hi everyone, I am trying to add a CharFilter to my Analyzer. I started with a StandardAnalyzer wrapped with an ASCIIFoldingFilter. Then I realized that it does not handle searches for names that include punctuation well, for example I want a PrefixQuery "pf" to match "P.F. Chang's" or "zaras"

Re: PrefixQuery with short prefix does not match documents

2013-05-28 Thread Steven Schlansker
se/LUCENE-4845 ... > > Mike McCandless > > http://blog.mikemccandless.com > > > On Fri, May 24, 2013 at 7:06 PM, Steven Schlansker > wrote: >> Hi everyone, >> >> I am building an autocomplete index. The index contains both the names and >> a small

PrefixQuery with short prefix does not match documents

2013-05-24 Thread Steven Schlansker
.TopTermsScoringBooleanQueryRewrite(1)); Query mainQuery = new BooleanQuery(); mainQuery.add(allowedTypes, Occur.MUST); mainQuery.add(prefixQuery, Occur.MUST); Am I missing something obvious? Thanks, Steven Schlansker

Re: Using an AnalyzerWrapper with ASCIIFoldingFilter

2013-03-15 Thread Steven Schlansker
On Mar 15, 2013, at 11:25 AM, "Uwe Schindler" wrote: > Hi, > > The API did not really change. The API definitely did change, as before you would override the now-final tokenStream method. But you are correct that this was not the root of the problem. > The bug is in your test: > If you wou

Using an AnalyzerWrapper with ASCIIFoldingFilter

2013-03-15 Thread Steven Schlansker
Hi everyone, I am trying to port forward to 4.2 some Lucene 3.2-era code that uses the ASCIIFoldingFilter. The token stream handling has changed significantly since then, and I cannot figure out what I am doing wrong. It seems that I should extend AnalyzerWrapper so that I can intercept the To