Compare / Diff between stored values of two lucene indexes.

2023-02-13 Thread David Port Louis
Hello, I'm trying to take the diff between the stored doc values of two Lucene indexes, just like taking a diff between two tables in SQL. My current approach is to query both indexes in Java and check the results for equality. Is there a more efficient method to perform the same? T
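
One way to picture the SQL-style diff described above is a keyed comparison. The following is a minimal sketch, not the poster's code and not a Lucene API: it assumes each index has already been read into an in-memory map from a unique document ID to that document's stored field values. The class name `StoredValueDiff` and the `-`/`+`/`~` output markers are hypothetical choices made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Sketch: diff two "tables" of stored values keyed by a unique document ID. */
final class StoredValueDiff {

    /**
     * Returns one entry per differing ID, in sorted ID order:
     * "- id" if the ID exists only in the left map,
     * "+ id" if it exists only in the right map,
     * "~ id" if both maps have it but the stored fields differ.
     */
    static List<String> diff(Map<String, Map<String, String>> left,
                             Map<String, Map<String, String>> right) {
        // Union of all IDs, sorted so the output is deterministic.
        Set<String> ids = new TreeSet<>(left.keySet());
        ids.addAll(right.keySet());

        List<String> out = new ArrayList<>();
        for (String id : ids) {
            Map<String, String> l = left.get(id);
            Map<String, String> r = right.get(id);
            if (r == null) {
                out.add("- " + id);
            } else if (l == null) {
                out.add("+ " + id);
            } else if (!l.equals(r)) {
                out.add("~ " + id);
            }
        }
        return out;
    }
}
```

Holding both maps in memory only suits small indexes. If both sides can be read in ID order, the same comparison can be done as a streaming merge without materialising either side.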

Lucene indexes error

2021-07-08 Thread Mohd Ahtesham
Dear concern, I'm getting below error from my application code when trying to index data. = java.io.EOFException: read past EOF: MMapIndexInput(path="/attachments/produpg/TNIndexes/8OYDBB3YXK250409/LOGS_Indexes_STAGI

Lucene indexes corrupted : Missing .si file

2021-07-06 Thread Bhavit Singh Sengar
Hello Fellas, I am in dire need of your expertise. We have a java process that saves its logs in Lucene indexes. We re-start this process every midnight. While starting, it initializes IndexWriter. We recently had an incident in our production environment when we tried to restart our java process

Re: Disk Free decrease in a directory containing only live lucene indexes

2020-01-21 Thread Uwe Schindler
chrieb Riccardo Tasso : >Hi, > I'm running a lucene based application on a linux system. > >The application writes and read many lucene indexes under the same >directory, which doesn't contain other data. > >We are monitoring the indexes directory and we noticed tha

Disk Free decrease in a directory containing only live lucene indexes

2020-01-21 Thread Riccardo Tasso
Hi, I'm running a Lucene-based application on a Linux system. The application writes and reads many Lucene indexes under the same directory, which doesn't contain other data. We are monitoring the indexes directory and we noticed that the disk usage as calculated by the df util

Re: Lucene indexes getting deleted after application restart

2016-07-06 Thread Michael McCandless
Call IW.commit on a periodic basis, e.g. every N (!= 1) docs, or every M bytes or something? Mike McCandless http://blog.mikemccandless.com On Wed, Jul 6, 2016 at 1:57 PM, Desteny Child wrote: > Hi! > > In my Spring/Lucene application I'm using Lucene IndexWriter, > TrackingIndexWriter, Search
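
The advice above (commit every N documents rather than after every single one) can be sketched as a small counter. This is an illustration under assumptions, not code from the thread: `Committable` is a hypothetical stand-in for the one writer method the sketch needs, and `PeriodicCommitter` is a made-up helper name. With Lucene the `commit()` call would be delegated to the application's `IndexWriter`.

```java
/** Hypothetical stand-in for the writer; only commit() is needed here. */
interface Committable {
    void commit();
}

/** Counts added documents and commits once per batchSize documents. */
final class PeriodicCommitter {
    private final Committable writer;
    private final int batchSize;
    private int pending = 0;

    PeriodicCommitter(Committable writer, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.writer = writer;
        this.batchSize = batchSize;
    }

    /** Call after each document has been added to the writer. */
    void documentAdded() {
        pending++;
        if (pending >= batchSize) {
            writer.commit();
            pending = 0;
        }
    }

    /** Call on shutdown so a final partial batch is committed too. */
    void flush() {
        if (pending > 0) {
            writer.commit();
            pending = 0;
        }
    }
}
```

Calling `flush()` from a shutdown hook or a container destroy callback covers the case where the application stops between two batch boundaries.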

Lucene indexes getting deleted after application restart

2016-07-06 Thread Desteny Child
Hi! In my Spring/Lucene application I'm using Lucene IndexWriter, TrackingIndexWriter, SearcherManager and ControlledRealTimeReopenThread. I use open mode - IndexWriterConfig.OpenMode.CREATE_OR_APPEND. Right now I'm trying to index thousands of documents. For this purpose I have added Apache

Re: Lucene indexes reverting to past state

2015-08-26 Thread Michael McCandless
Are you calling IndexWriter.commit when you shut down the app? Mike McCandless http://blog.mikemccandless.com On Tue, Aug 25, 2015 at 11:49 PM, Loamy Hound wrote: > *Summary:* > > Lucene indexes appear to revert to some past state after an application > restart. > > *Back

Lucene indexes reverting to past state

2015-08-25 Thread Loamy Hound
*Summary:* Lucene indexes appear to revert to some past state after an application restart. *Background:* We're running an enterprise application written in Java/Spring/Hibernate, deployed within Jetty, with a Postgres backend. See below for version info. We use Lucene to index ce

Re: Is housekeeping of Lucene indexes block index update but allow search ?

2014-08-11 Thread Gaurav gupta
Kumaran, Below is the code snippet for concurrent writes (i.e. concurrent updates/deletes etc.) along with search operations using the NRT Manager APIs. Let me know if you need any other details or have any suggestions for me: public class LuceneEngineInstance implements IndexEngineInstance { p

Re: Is housekeeping of Lucene indexes block index update but allow search ?

2014-08-05 Thread Kumaran Ramasubramanian
Hi Gaurav, Thanks for the clarification. If possible, please share your NRT manager API related code example. I believe it will help me understand a little better. - Kumaran R On Tue, Aug 5, 2014 at 12:39 PM, Gaurav gupta wrote: > Thanks Kumaran and Erik for resolving my queries. > > K

Re: Is housekeeping of Lucene indexes block index update but allow search ?

2014-08-05 Thread Gaurav gupta
Thanks Kumaran and Erik for resolving my queries. Kumaran, You are right that only one IndexWriter can write, as it acquires the lock, but using the NRT manager APIs - TrackingIndexWriter multiple concurrent upd

Re: Is housekeeping of Lucene indexes block index update but allow search ?

2014-08-04 Thread Erick Erickson
Right. 1> Occasionally the merge will require 2x the disk space. (3x in compound file system). The merging is, indeed, done in the background, it is NOT a blocking operation. 2> n/a. It shouldn't block at all. Here's a cool video by Mike McCandless on the merging process, plus some explanations:

Re: Is housekeeping of Lucene indexes block index update but allow search ?

2014-08-04 Thread Kumaran R
Hi Gaurav 1. Once you open an index for writing, a lock prevents any further writer until you close that index, but searching is not locked. During a merge, the index needs 3X (not sure, maybe 2X?) more storage space; I believe that is the reason search is not blocked. (Other experts can clarify more

Is housekeeping of Lucene indexes block index update but allow search ?

2014-08-04 Thread Gaurav gupta
Hi, We are planning to use Lucene 4.8.1 over Oracle (1 to 2 TB data) and seeking information on "How Lucene conduct housekeeping or maintenance of indexes over a period of time". *Is it a blocking operation for write and search or it will not block anything while merging is going on? * I found :

Re: Lucene Indexes explanantion

2013-06-10 Thread Jack Krupansky
ious terms: - Indexed terms - Stored values - Payloads - DocValues -- Jack Krupansky -Original Message- From: nikhil desai Sent: Monday, June 10, 2013 8:36 PM To: java-user@lucene.apache.org Subject: Re: Lucene Indexes explanantion I don't think I could get much from what you said, c

Re: Lucene Indexes explanantion

2013-06-10 Thread nikhil desai
. And there are > DocValues as well. > > > -- Jack Krupansky > > -Original Message- From: nikhil desai > Sent: Monday, June 10, 2013 8:06 PM > To: java-user@lucene.apache.org > Subject: Re: Lucene Indexes explanantion > > > Sure. Thanks Jack. > I don'

Re: Lucene Indexes explanantion

2013-06-10 Thread Jack Krupansky
@lucene.apache.org Subject: Re: Lucene Indexes explanantion Sure. Thanks Jack. I don't have much experience working with Lucene, however, here is what I am trying to resolve. I learned that the Custom attributes cannot be used for indexing or searching purposes. However I wanted the attributes to be use

Re: Lucene Indexes explanantion

2013-06-10 Thread nikhil desai
intelligent questions. > > -- Jack Krupansky > > -Original Message- From: nikhil desai > Sent: Monday, June 10, 2013 1:24 PM > To: java-user@lucene.apache.org > Subject: Lucene Indexes explanantion > > > Hello, > > My first time post in this group. > >

Re: Lucene Indexes explanantion

2013-06-10 Thread Jack Krupansky
amount of Java code using Lucene. Otherwise, you won't have enough context to understand or even ask intelligent questions. -- Jack Krupansky -Original Message- From: nikhil desai Sent: Monday, June 10, 2013 1:24 PM To: java-user@lucene.apache.org Subject: Lucene Indexes expl

Lucene Indexes explanantion

2013-06-10 Thread nikhil desai
Hello, This is my first post in this group. I have been using Lucene recently and have a question. Where can I find a good explanation of indexes? Or rather, how indexing (not really the mathematical aspect) happens in Lucene, and which attributes (charTerm, Offset etc.) come into play? And the way it i

Re: How do I best store my IRC log data in lucene indexes?

2013-01-25 Thread Ian Lea
Adding a message type field is the way to do it. Then you can use QueryWrapperFilter and CachingWrapperFilter, something like Term t = new Term("messtype", messtype); TermQuery tq = new TermQuery(t); QueryWrapperFilter qwf = new QueryWrapperFilter(tq); CachingWrapperFilter cwf = new CachingWrappe

Re: How do I best store my IRC log data in lucene indexes?

2013-01-25 Thread crocket
Do you mean http://lucene.apache.org/core/4_1_0/core/org/apache/lucene/analysis/CachingTokenFilter.html by a cached filter? And how would you restrict searches to particular message types fast with a cached filter? I'm a beginner. On Fri, Jan 25, 2013 at 6:51 PM, Ian Lea wrote: > Unless there's

Re: How do I best store my IRC log data in lucene indexes?

2013-01-25 Thread crocket
How do you propose I differentiate different message types if I put all of them in one index directory? I thought of adding a message type field, but it doesn't seem to be a good way. On Fri, Jan 25, 2013 at 6:51 PM, Ian Lea wrote: > Unless there's good reason not to (massive size? different s

Re: How do I best store my IRC log data in lucene indexes?

2013-01-25 Thread Ian Lea
Unless there's good reason not to (massive size? different systems? conflicting update schedules?) I'd store everything in the one index. Consider a cached filter for fast restriction of searches to particular message types. -- Ian. On Thu, Jan 24, 2013 at 1:06 PM, crocket wrote: > I have th

Re: Adding metadata to Lucene indexes?

2011-11-04 Thread Francisco A. Lozano
Thank you :) this is very useful. Until today I maintained a "metadata" key=>value text file inside the lucene directories, but this feature looks better. Francisco A. Lozano On Fri, Nov 4, 2011 at 08:39, Uwe Schindler wrote: > You can read the Map without opening an IndexReader just by a sta

RE: Adding metadata to Lucene indexes?

2011-11-04 Thread Uwe Schindler
-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Uwe Schindler [mailto:u...@thetaphi.de] > Sent: Friday, November 04, 2011 8:31 AM > To: java-user@lucene.apache.org > Subject: RE: Adding metadata to Lucene indexes? > > It must be p

RE: Adding metadata to Lucene indexes?

2011-11-04 Thread Uwe Schindler
o: java-user@lucene.apache.org > Subject: Re: Adding metadata to Lucene indexes? > > This metadata Map needs to be written on every commit, or if I just use plain > commit() without the Map<> it keeps the old values? > > > Francisco A. Lozano > > > > On Thu, Nov 3, 201

Re: Adding metadata to Lucene indexes?

2011-11-03 Thread Francisco A. Lozano
ndexReader. >>> >>> - >>> Uwe Schindler >>> H.-H.-Meier-Allee 63, D-28213 Bremen >>> http://www.thetaphi.de >>> eMail: u...@thetaphi.de >>> >>>> -Original Message- >>>> From: Ian Lea [mailto:ian@

Re: Adding metadata to Lucene indexes?

2011-11-03 Thread Greg Bowyer
e 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Ian Lea [mailto:ian@gmail.com] Sent: Thursday, November 03, 2011 4:05 PM To: java-user@lucene.apache.org Subject: Re: Adding metadata to Lucene indexes? You could add a dedicated document to the index storing

Re: Adding metadata to Lucene indexes?

2011-11-03 Thread Jochen
riginal Message- From: Ian Lea [mailto:ian@gmail.com] Sent: Thursday, November 03, 2011 4:05 PM To: java-user@lucene.apache.org Subject: Re: Adding metadata to Lucene indexes? You could add a dedicated document to the index storing whatever you want. There is no requirement for lucene doc

RE: Adding metadata to Lucene indexes?

2011-11-03 Thread Uwe Schindler
om: Ian Lea [mailto:ian@gmail.com] > Sent: Thursday, November 03, 2011 4:05 PM > To: java-user@lucene.apache.org > Subject: Re: Adding metadata to Lucene indexes? > > You could add a dedicated document to the index storing whatever you want. > There is no requirement for lucene docs

Re: Adding metadata to Lucene indexes?

2011-11-03 Thread Ian Lea
You could add a dedicated document to the index storing whatever you want. There is no requirement for lucene docs to bear any relation to each other. -- Ian. On Wed, Nov 2, 2011 at 10:09 AM, Jochen wrote: > Hi, > > is it possible to add metadata to a Lucene index (not to the indivudual > Fie

Adding metadata to Lucene indexes?

2011-11-03 Thread Jochen
Hi, is it possible to add metadata to a Lucene index (not to the individual Fields or Documents contained in the index)? We need to periodically update an index by importing an XML document, and are looking for a nice cozy place to store an import date and a checksum that tells us if our inpu

Re: Adding Encryption to lucene indexes

2011-08-14 Thread Grant Ingersoll
You might try searching JIRA, I believe there is an issue in there that attempts to provide an encrypted Directory implementation. You might also just use file system encryption. On Aug 12, 2011, at 8:09 PM, Chris Zakian wrote: > Hey, thanks for your reply Shaneal, > > I do have a person to

Re: Adding Encryption to lucene indexes

2011-08-12 Thread Chris Zakian
Hey, thanks for your reply Shaneal, I do have a person to consult with about the crypto code, it is just a matter of figuring out which streams to grab. So encrypting all of the write operations in IndexOutput (and DataOutput) and decrypting to plaintext in IndexInput on the way out should let me

Re: Adding Encryption to lucene indexes

2011-08-12 Thread Shaneal Manek
For starters, you probably shouldn't be writing your own crypto code (unless you're a professional cryptographer, or your project has access to one to audit your code). See, for example, http://chargen.matasano.com/chargen/2009/7/22/if-youre-typing-the-letters-a-e-s-into-your-code-youre-doing.html.

Adding Encryption to lucene indexes

2011-08-12 Thread Chris Zakian
Hello, I am currently adding Lucene (in combination with hibernate search) to a medical record service. As such, I need to encrypt the indexes so that unauthorized people don't have access to them by bypassing the system's database security. I was wondering if anyone had a) implemented a security

Re: Checksum and transactional safety for lucene indexes

2010-09-24 Thread Yonik Seeley
On Tue, Sep 21, 2010 at 12:53 AM, Lance Norskog wrote: > If an index file is not completely written to disk, it never become > available. Lucene has a file describing the current active index segments. > It writes all new files to the disk, and changes the description file > (segments.gen) only af

Re: Checksum and transactional safety for lucene indexes

2010-09-24 Thread Pulkit Singhal
In order to determine the integrity of an index file, I found that the easiest way was to use IndexReader.open(directory) and if there were any problems with the data then catch the exceptions and make a new one. I also see that the API offers IndexReader.indexExists() ... would that be a better a

Re: Checksum and transactional safety for lucene indexes

2010-09-20 Thread Lance Norskog
If an index file is not completely written to disk, it never becomes available. Lucene has a file describing the current active index segments. It writes all new files to the disk, and changes the description file (segments.gen) only after that. If the index files are corrupted, all bets are of

Checksum and transactional safety for lucene indexes

2010-09-20 Thread Pulkit Singhal
Hello Everyone, What happens if: a) lucene index gets written half-way to the disk and then something goes wrong? b) the index gets corrupted on the file system? When we open that directory location again using FSDirectory implementations: a) Is there any provision for the code to clean out the p

Re: Deciding memory requirements for Lucene indexes proactively -- How to?

2010-05-18 Thread Ian Lea
> Is there a way (perhaps a formulae) to accurately > judge  the memory requirement for a Lucene index? > (May be based on number of documents or index > size etc?) The short answer is no, although there are some things you can estimate based on the number of fields, terms etc. Sorting will use m

Deciding memory requirements for Lucene indexes proactively -- How to?

2010-05-17 Thread Maduranga Kannangara
Hi guys, Is there a way (perhaps a formula) to accurately judge the memory requirement for a Lucene index? (Maybe based on the number of documents or index size etc.?) The reason I am asking is that we had two indexes running on separate Tomcat instances and we decided to move both these webapps (Solr)

Re: Understanding lucene indexes and disk I/O

2010-04-13 Thread Michael McCandless
On Tue, Apr 13, 2010 at 11:55 AM, Burton-West, Tom wrote: > At some point maybe the File Formats Document could be updated to make it > clear that the tii has an entry similar to the IndexInterval'th tis entry but > instead of holding frq/prx deltas it holds absolute pointers. Is it worth > e

RE: Understanding lucene indexes and disk I/O

2010-04-13 Thread Burton-West, Tom
, 2010 5:27 AM To: java-user@lucene.apache.org Subject: Re: Understanding lucene indexes and disk I/O Hi Tom, Fear not: we only scan up to 128 terms, to find the specific term. First, the terms dict index (tii) is fully loaded into RAM, and then a binary search is done on this (in-RAM) to find t

Re: Understanding lucene indexes and disk I/O

2010-04-13 Thread Michael McCandless
Hi Tom, Fear not: we only scan up to 128 terms, to find the specific term. First, the terms dict index (tii) is fully loaded into RAM, and then a binary search is done on this (in-RAM) to find the nearest index term just before the term you want. Then, we seek to that spot in the main terms dict
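
The lookup described above (binary search over an in-RAM index of every 128th term, then a short scan in the main terms dictionary) can be sketched in plain Java. This is a simplified illustration and not Lucene's implementation: the on-disk terms dictionary is modelled as a sorted in-memory list, and the names `TermsIndexLookup`, `INDEX_INTERVAL` and `find` are assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a two-level term lookup: sparse in-RAM index plus bounded scan. */
final class TermsIndexLookup {
    /** Every INDEX_INTERVAL-th term is copied into the in-RAM index. */
    static final int INDEX_INTERVAL = 128;

    private final List<String> sortedTerms;                     // stands in for the main terms dict
    private final List<String> indexTerms = new ArrayList<>();  // stands in for the terms dict index

    TermsIndexLookup(List<String> sortedTerms) {
        this.sortedTerms = sortedTerms;
        for (int i = 0; i < sortedTerms.size(); i += INDEX_INTERVAL) {
            indexTerms.add(sortedTerms.get(i));
        }
    }

    /** Returns the position of term in the dictionary, or -1 if it is absent. */
    int find(String term) {
        // Step 1: binary search the in-RAM index for the last index term <= term.
        int lo = 0;
        int hi = indexTerms.size() - 1;
        int block = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (indexTerms.get(mid).compareTo(term) <= 0) {
                block = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (block < 0) {
            return -1; // term sorts before the first term in the dictionary
        }
        // Step 2: "seek" to that spot and scan at most INDEX_INTERVAL terms.
        int start = block * INDEX_INTERVAL;
        int end = Math.min(start + INDEX_INTERVAL, sortedTerms.size());
        for (int i = start; i < end; i++) {
            int cmp = sortedTerms.get(i).compareTo(term);
            if (cmp == 0) {
                return i;
            }
            if (cmp > 0) {
                break; // passed the place where the term would be
            }
        }
        return -1;
    }
}
```

The point of the design is that the scan length is bounded by the index interval, so the cost of a lookup does not grow with the size of the terms file beyond the logarithmic binary search.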

Understanding lucene indexes and disk I/O

2010-04-12 Thread Burton-West, Tom
Hi all, Please let me know if this should be posted instead to the Lucene java-dev list. We have very large tis files (about 36 GB). I have not been too concerned as I assumed that due to the indexing of the tis file by the tii file, only a small portion of the file needed to be read. However

Re: Synchronizing Lucene indexes across 2 application servers

2009-06-20 Thread Ken Krugler
I've a web application which uses Lucene for search functionality. Lucene search requests are served by web services sitting on 2 application servers (IIS 7).The 2 application servers are Load balanced using "netscaler". Both these servers have a batch job running which updates search indexes on

Re: Synchronizing Lucene indexes across 2 application servers

2009-06-19 Thread Otis Gospodnetic
09 12:10:42 AM > Subject: Synchronizing Lucene indexes across 2 application servers > > > I've a web application which uses Lucene for search functionality. Lucene > search requests are served by web services sitting on 2 application servers > (IIS 7).The 2 application serve

Re: Synchronizing Lucene indexes across 2 application servers

2009-06-19 Thread Ian Lea
Or have a third master index, as Joel suggests, apply all updates to that index, only, then at the end of each batch index update run, use rsync or equivalent to push the master index out to the 2 search servers and then tell them to reopen their indexes. -- Ian. On Fri, Jun 19, 2009 at 9:23 AM

Re: Synchronizing Lucene indexes across 2 application servers

2009-06-19 Thread Joel Halbert
Do they have to be kept in synch in real time? Does each server handle writes to its own index which then need to be propagated to the other server's index? From a simplicity point of view, to minimise the amount of self consistency checking that needs to happen I would suggest even having a thi

Synchronizing Lucene indexes across 2 application servers

2009-06-18 Thread mitu2009
at any of the 2 application servers could be serving search request depending upon its availability. Any inputs please? Thanks for reading! -- View this message in context: http://www.nabble.com/Synchronizing-Lucene-indexes-across-2-application-servers-tp24105223p24105223.html Sent from the Luce

Re: Getting errors reading lucene indexes using recent lucene from Solr

2009-05-14 Thread jayson.minard
e >>> that Mark put in (r773862) >>> >>> -Yonik >>> >>> - >>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >>> For additional commands, e-mail: jav

Re: Getting errors reading lucene indexes using recent lucene from Solr

2009-05-14 Thread jayson.minard
on of Lucene >> that Mark put in (r773862) >> >> -Yonik >> >> --------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional comma

Re: Getting errors reading lucene indexes using recent lucene from Solr

2009-05-14 Thread jayson.minard
3862) > > -Yonik > > - > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > > -- View this message in context: http://w

Re: Getting errors reading lucene indexes using recent lucene from Solr

2009-05-14 Thread jayson.minard
java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > > -- View this message in context: http://www.nabble.com/Getting-errors-reading-lucene-indexes-using-recent-lucene-from-Solr-tp23545868p23546170.html Sent from the L

Re: Getting errors reading lucene indexes using recent lucene from Solr

2009-05-14 Thread Yonik Seeley
On Thu, May 14, 2009 at 2:16 PM, Yonik Seeley wrote: > Hmmm, OK... so you created the index with Lucene and are reading it with Solr. > What version of Lucene did you use to create? Oops, vice versa, right? -Yonik - To unsubscr

Re: Getting errors reading lucene indexes using recent lucene from Solr

2009-05-14 Thread jayson.minard
Reader.java:218) > at > org.apache.lucene.index.SegmentReader.document(SegmentReader.java:914) > > > Anyone know what could cause this error? Is it something writing the > index incorrectly, or a read-side issue, or other? > > Thanks > -- Jayson > -- View this message in context: http://w

Re: Getting errors reading lucene indexes using recent lucene from Solr

2009-05-14 Thread Yonik Seeley
On Thu, May 14, 2009 at 2:01 PM, jayson.minard wrote: > > When using the Solr trunk tip, get this error now when reading an index > created by Solr with Lucene directly: > > java.lang.IndexOutOfBoundsException: Index: 24, Size: 0 >        at java.util.ArrayList.RangeCheck(ArrayList.java:547) >    

Getting errors reading lucene indexes using recent lucene from Solr

2009-05-14 Thread jayson.minard
incorrectly, or a read-side issue, or other? Thanks -- Jayson -- View this message in context: http://www.nabble.com/Getting-errors-reading-lucene-indexes-using-recent-lucene-from-Solr-tp23545868p23545868.html Sent from the Lucene - Java Users mailing list archive at Nabble.com

RE: Lucene indexes

2009-02-24 Thread Steven A Rowe
On 2/24/2009 at 5:36 PM, Chris Hostetter wrote: > Shingling is (lucene specific?) vernacular for word based ngrams "Shingle" is not a Lucene-specific term - here's an entry, e.g., from an IBM "Glossary of terms for enterprise search" at

RE: Lucene indexes

2009-02-24 Thread Chris Hostetter
: The problem that I am trying to solve is : How to index phrases (rather : than phrase querying)? I have a Questions/Answers corpus, the : architecture I am using for IR creates one index for questions and : another one for answers (based on single terms) and then matches between : them. I wa

Re: Lucene indexes

2009-02-24 Thread Shashi Kant
you correctly). HTH, Shashi - Original Message From: Nada Mimouni To: java-user@lucene.apache.org Sent: Tuesday, February 24, 2009 9:22:19 AM Subject: RE: Lucene indexes Thank you Erick. I am totally aware that Lucene uses inverted index (class: IndexWriter). I have read in the

RE: Lucene indexes

2009-02-24 Thread Nada Mimouni
[mailto:erickerick...@gmail.com] Sent: Tue 2/24/2009 2:13 PM To: java-user@lucene.apache.org Subject: Re: Lucene indexes I have to ask why do you care? Which is another way of asking what problem you're trying to solve that you think this information would help with. As far as I know Lucene

Re: Lucene indexes

2009-02-24 Thread Erick Erickson
I have to ask why do you care? Which is another way of asking what problem you're trying to solve that you think this information would help with. As far as I know Lucene is an inverted index, period. You use IndexWriter to create it. Really the best way to get a sense for which classes to use is

Lucene indexes

2009-02-24 Thread Nada Mimouni
Hello everybody, 1) What is the difference between : - inverted index - nextword index - common index 2) Which one(s) is(are) supported by Lucene? 3) Which class(es) create this(those) index(es)? Thank you in advance for your help. Nada Mimouni --

[ANN] katta-0.1.0 release - distribute lucene indexes in a grid

2008-09-18 Thread Stefan Groschupf
After 5 months of work we are happy to announce the first developer preview release of Katta. This release contains all the functionality needed to serve a large, sharded Lucene index on many servers. Katta stands on the shoulders of the giants Lucene, Hadoop and ZooKeeper. Main features: + Plays wel

Re: How can we know if 2 lucene indexes are same?

2008-09-06 Thread Shalin Shekhar Mangar
Sounds good. If other people find it useful, I'm all for it. I just didn't want to burden you guys to change the index format only for my use-case :-) On Sat, Sep 6, 2008 at 3:44 PM, Michael McCandless < [EMAIL PROTECTED]> wrote: > > I think this may also be useful for a reader to have richer log

Re: How can we know if 2 lucene indexes are same?

2008-09-06 Thread Michael McCandless
I think this may also be useful for a reader to have richer logic as to whether it's time to reopen. It's a way for the writer to do some minimal communication to the readers on what changes were just committed. Ie we have the static IndexReader.getVersion method, that opens the latest

Re: How can we know if 2 lucene indexes are same?

2008-09-05 Thread markharw00d
I think this could be a generally useful feature? +1. I could definitely use a "commitUserData" option for the same reasons. Thinking more on this, we may not need to modify the index format at all for this use-case. This is easily achieved in the current system by adding a dummy document

Re: How can we know if 2 lucene indexes are same?

2008-09-05 Thread Shalin Shekhar Mangar
On Fri, Sep 5, 2008 at 9:52 PM, Michael McCandless < [EMAIL PROTECTED]> wrote: > > Well this is certainly a nice challenging problem :) Yes it is :-) I think this could be a generally useful feature? > > So you're thinking IndexWriter.commit() would take an optional opaque > argument (maybe a S

Re: How can we know if 2 lucene indexes are same?

2008-09-05 Thread Michael McCandless
Shalin Shekhar Mangar wrote: On Fri, Sep 5, 2008 at 6:03 PM, Michael McCandless < [EMAIL PROTECTED]> wrote: Large segment merges will also send huge traffic. You may just want to send all updates (document adds/deletes) to all slaves directly? It'd be nice if you could somehow NOT sync

Re: How can we know if 2 lucene indexes are same?

2008-09-05 Thread 叶双明
Just think about the cost of indexing that many documents on each slave. It may slow down the responses from live slaves. I think there must be something like a search service at the slaves that includes an IndexSearcher or an equivalent object, and indexing that many documents with an IndexWriter, isn't the

Re: How can we know if 2 lucene indexes are same?

2008-09-05 Thread Shalin Shekhar Mangar
On Fri, Sep 5, 2008 at 6:03 PM, Michael McCandless < [EMAIL PROTECTED]> wrote: > > Large segment merges will also send huge traffic. You may just want to > send all updates (document adds/deletes) to all slaves directly? It'd be > nice if you could somehow NOT sync the effects of segment merging

Re: How can we know if 2 lucene indexes are same?

2008-09-05 Thread 叶双明
This is getting more and more complex. Actually, I have a small index system that can configure multiple index servers for querying. In my opinion, because index update operations are synchronized between the different threads that update the index, for indexing new data: can process data that want to index at the ma

Re: How can we know if 2 lucene indexes are same?

2008-09-05 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Fri, Sep 5, 2008 at 6:20 PM, Jason Rutherglen <[EMAIL PROTECTED]> wrote: > In Ocean I had to use a transaction log and execute everything that > way like SQL database replication. Then let each node handle it's own > merging process. Syncing the indexes is used to get a new node up to > speed,

Re: How can we know if 2 lucene indexes are same?

2008-09-05 Thread Jason Rutherglen
In Ocean I had to use a transaction log and execute everything that way like SQL database replication. Then let each node handle its own merging process. Syncing the indexes is used to get a new node up to speed, otherwise it's avoided for the reasons mentioned in the previous email. On Fri, Se

Re: How can we know if 2 lucene indexes are same?

2008-09-05 Thread Michael McCandless
Shalin Shekhar Mangar wrote: Let me try to explain. I have a master where indexing is done. I have multiple slaves for querying. If I commit+optimize on the master and then rsync the index, the data transferred on the network is huge. An alternate way is to commit on master, transfer the

Re: How can we know if 2 lucene indexes are same?

2008-09-05 Thread Shalin Shekhar Mangar
Let me try to explain. I have a master where indexing is done. I have multiple slaves for querying. If I commit+optimize on the master and then rsync the index, the data transferred on the network is huge. An alternate way is to commit on master, transfer the delta to the slave and issue an optim

Re: How can we know if 2 lucene indexes are same?

2008-09-05 Thread 叶双明
Do you use the index at the slave as a backup for the index at the master? And in case the master breaks down, you can redirect queries to the slave? When you add a Document to the master, do you also add it to the slave? Sorry, I'm not clear about what your problem is; can you give more detail about what you are worried abou

Re: How can we know if 2 lucene indexes are same?

2008-09-04 Thread Noble Paul നോബിള്‍ नोब्ळ्
I am not using the same index with different writers. These are two separate indexes both have their own reader/writer I just wanted to minimize the network load by avoiding the download of an optimized index if the contents are indeed same. --noble On Thu, Sep 4, 2008 at 7:36 PM, Michael McCandl

Re: How can we know if 2 lucene indexes are same?

2008-09-04 Thread 叶双明
I see now, thanks Michael McCandless, good explain!! 2008/9/4, Michael McCandless <[EMAIL PROTECTED]>: > > > Sorry, I should have said: you must always use the same writer, ie as of > 2.3, while IndexWriter.optimize (or normal segment merging) is running, > under one thread, another thread can use

Re: How can we know if 2 lucene indexes are same?

2008-09-04 Thread Michael McCandless
Sorry, I should have said: you must always use the same writer, ie as of 2.3, while IndexWriter.optimize (or normal segment merging) is running, under one thread, another thread can use that *same* writer to add/delete/update documents, and both are free to make changes to the index. Be

Re: How can we know if 2 lucene indexes are same?

2008-09-04 Thread 叶双明
I don't agree with Michael McCandless. :) I know that after 2.3, add and delete can run in one IndexWriter at the same time, and Lucene also has an update method which deletes documents by term and then adds the new document. In my test, either LockObtainFailedException with thread sleep sentence: org.apac

Re: How can we know if 2 lucene indexes are same?

2008-09-04 Thread Michael McCandless
Actually, as of 2.3, this is no longer true: merges and optimizing run in the background, and allow add/update/delete documents to run at the same time. I think it's probably best to use application logic (outside of Lucene) to keep track of what updates happened to the master while the

Re: How can we know if 2 lucene indexes are same?

2008-09-04 Thread 叶双明
No documents can be added to the index while it is being optimized, and optimizing can't run while documents are being added to the index. So, barring other errors, I think we can believe the two indexes are indeed the same. :) 2008/9/4 Noble Paul നോബിള്‍ नोब्ळ् <[EMAIL PROTECTED]> > The use case is as follow

Re: How can we know if 2 lucene indexes are same?

2008-09-03 Thread Noble Paul നോബിള്‍ नोब्ळ्
The use case is as follows: I have two indexes, one at the master and one at the slave. The user occasionally commits on the master and the delta is replicated every time. But when the optimize happens the transfer size can be really large. So I am thinking of doing the optimize separatel

Re: How can we know if 2 lucene indexes are same?

2008-08-29 Thread Karl Wettin
On 29 Aug 2008 at 11:35, Noble Paul നോബിള്‍ नोब्ळ् wrote: hi, I wish to know if the contents of two indexes have the same data. Will all the files be exactly the same if I put the same set of documents into both? If you insert the documents in the same order with the same settings and both indices are

Re: Re: How can we know if 2 lucene indexes are same?

2008-08-29 Thread tom
AUTOMATIC REPLY Tom Roberts is out of the office till 2nd September 2008. LUX reopens on 1st September 2008 - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]

How can we know if 2 lucene indexes are same?

2008-08-29 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi, I wish to know if the contents of two indexes have the same data. Will all the files be exactly the same if I put the same set of documents into both? --Noble

Re: Lucene indexes and relationship

2007-09-15 Thread Grant Ingersoll
Sounds like faceting, I think. Have you looked at Solr? -Grant On Sep 15, 2007, at 1:15 AM, Mohammad Norouzi wrote: Hello, In our application, we have many categories (indexes) in which different kinds of information have been indexed. We provided a facility for our users to opt their cate

Lucene indexes and relationship

2007-09-14 Thread Mohammad Norouzi
Hello, In our application, we have many categories (indexes) in which different kinds of information have been indexed. We provided a facility for our users to choose the category to search, and we also provided a way for them to select more than one category to search; afterwards, we must return back t

Re: lucene indexes back up strategies

2007-05-01 Thread larry hughes
ation on this? Seems like this deletion policy is the key for making live backups. Thank you. LH Michael McCandless-3 wrote: > > > > "larry hughes" <[EMAIL PROTECTED]> wrote: > >> I'm pondering on long term maintenance issues with Lucene indexe

Re: lucene indexes back up strategies

2007-04-27 Thread Chris Hostetter
: > Wow, I did not know Lucene 2.1 can do all of this. The problem is that I'm : > currently using 2.0. Is there something similar to what you just mentioned : > in dealing with 2.0 indexes--backing up piecewise? Thanks again. : : Hmm, OK. Pre-2.1 Lucene will overwrite at least the file "segmen

Re: lucene indexes back up strategies

2007-04-27 Thread Michael McCandless
"larry hughes" <[EMAIL PROTECTED]> wrote: > Wow, I did not know Lucene 2.1 can do all of this. The problem is that I'm > currently using 2.0. Is there something similar to what you just mentioned > in dealing with 2.0 indexes--backing up piecewise? Thanks again. Hmm, OK. Pre-2.1 Lucene will

Re: lucene indexes back up strategies

2007-04-27 Thread larry hughes
> "larry hughes" <[EMAIL PROTECTED]> wrote: > >> I'm pondering on long term maintenance issues with Lucene indexes >> and would like to know of anyone's suggestions or recommendations to >> backing up these indexes. My goal is to have a weekly, or

Re: lucene indexes back up strategies

2007-04-27 Thread Michael McCandless
"larry hughes" <[EMAIL PROTECTED]> wrote: > I'm pondering on long term maintenance issues with Lucene indexes > and would like to know of anyone's suggestions or recommendations to > backing up these indexes. My goal is to have a weekly, or even > daily,

lucene indexes back up strategies

2007-04-27 Thread larry hughes
I'm pondering on long term maintenance issues with Lucene indexes and would like to know of anyone's suggestions or recommendations to backing up these indexes. My goal is to have a weekly, or even daily, snapshot of the current index to make sure it is recoverable if the index gets
