Re: Merge Index Filling up Disk Space

2006-12-26 Thread Erick Erickson
First, it probably would have been a good thing to start a new thread on this topic, since it's only vaguely related to disk space ... That said, sure. Note that there's no requirement in lucene that all documents in an index have the same fields. Also, there's no reason you can't use two separat

Re: Merge Index Filling up Disk Space

2006-12-26 Thread Harini Raghavan
Hi, I have another related problem. I am adding news articles for a company to the Lucene index. As of now, if an article is mapped to more than one company, it is added to the index that many times. As the number of companies mapped to each article increases, this will not be a scalable impl

Re: Merge Index Filling up Disk Space

2006-12-26 Thread Michael McCandless
Harini Raghavan wrote: Yes, I think I hit an IOException. I assumed that the .tmp files are not required and deleted them manually from the index directory as they were more than 10G. Is that ok? Yes, they are indeed not necessary, so deleting them is fine. This (deleting partially created file

Re: Merge Index Filling up Disk Space

2006-12-26 Thread Harini Raghavan
Yes, I think I hit an IOException. I assumed that the .tmp files are not required and deleted them manually from the index directory as they were more than 10G. Is that ok? Michael McCandless wrote: Harini Raghavan wrote: Thank you for the response. I don't have readers open on the index, bu

Re: Merge Index Filling up Disk Space

2006-12-22 Thread Michael McCandless
Harini Raghavan wrote: Thank you for the response. I don't have readers open on the index, but while the optimize/merge was running I was searching on the index. Would that make any difference? You're welcome! Right, a searcher opens an IndexReader. So this means you should see peak @ 3X th
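
A rough sizing sketch based on the multipliers quoted in this thread: about 2X the index size during the last merge when no readers are open, and about 3X when a searcher (which holds an IndexReader) is open during the optimize. The function name and parameters below are hypothetical, and the multipliers are the posters' estimates for Lucene 1.9.1, not measured values.

```python
def estimate_peak_disk_gb(index_size_gb, searchers_open):
    """Estimate peak disk usage during an optimize/merge.

    Assumption taken from the thread: peak is roughly 2X the index size
    with no readers open, and roughly 3X if a searcher keeps an
    IndexReader open while the merge runs.
    """
    multiplier = 3 if searchers_open else 2
    return index_size_gb * multiplier


if __name__ == "__main__":
    # The index discussed in the thread is almost 20G.
    print(estimate_peak_disk_gb(20, searchers_open=False))
    print(estimate_peak_disk_gb(20, searchers_open=True))
```

Treat the result as a lower bound for planning free space; leftover files from a failed merge, such as the .tmp files mentioned above, add to it.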

Re: Merge Index Filling up Disk Space

2006-12-22 Thread Mark Miller
A Searcher uses a Reader to read the index for searching. - Mark Harini Raghavan wrote: Hi Mike, Thank you for the response. I don't have readers open on the index, but while the optimize/merge was running I was searching on the index. Would that make any difference? Also after the optimizin

Re: Merge Index Filling up Disk Space

2006-12-22 Thread Harini Raghavan
Hi Mike, Thank you for the response. I don't have readers open on the index, but while the optimize/merge was running I was searching on the index. Would that make any difference? Also, after optimizing the index I had some .tmp files which were > 10G and did not get merged. Could that also

Re: Merge Index Filling up Disk Space

2006-12-21 Thread Yonik Seeley
On 12/21/06, Michael McCandless <[EMAIL PROTECTED]> wrote: I *think* it's really max 2X even with compound file (if no readers)? Because, in IndexWriter.mergeSegments we: 1. Create the newly merged segment in non-compound format (brings us up to 2X, when it's the last merge). 2. Co

Re: Merge Index Filling up Disk Space

2006-12-21 Thread Michael McCandless
Yonik Seeley wrote: On 12/21/06, Michael McCandless <[EMAIL PROTECTED]> wrote: Harini Raghavan wrote: > I am using lucene 1.9.1 for search functionality in my j2ee application > using JBoss as app server. The lucene index directory size is almost 20G > right now. There is a Quartz job that is

Re: Merge Index Filling up Disk Space

2006-12-21 Thread Yonik Seeley
On 12/21/06, Michael McCandless <[EMAIL PROTECTED]> wrote: Harini Raghavan wrote: > I am using lucene 1.9.1 for search functionality in my j2ee application > using JBoss as app server. The lucene index directory size is almost 20G > right now. There is a Quartz job that is adding data to the inde

Re: Merge Index Filling up Disk Space

2006-12-21 Thread Michael McCandless
Harini Raghavan wrote: I am using Lucene 1.9.1 for search functionality in my J2EE application using JBoss as app server. The Lucene index directory size is almost 20G right now. There is a Quartz job that is adding data to the index every minute and around 2 documents get added to the index e

RE: Merge Index Filling up Disk Space

2006-12-21 Thread Rob Staveley (Tom)
Hi All, I am using Lucene 1.9.1 for search functionality in my J2EE application using JBoss as app server. The Lucene index directory size is almost 20G right now. There is a Quartz job that is adding data to the index every minute and arou

Re: Merge Index Filling up Disk Space

2006-12-21 Thread Mark Miller
When Lucene optimizes the index (which it partly does on its own as the index grows), it creates a copy of the index, so you can expect the space requirements for an index to be double the index size at an absolute minimum. If you are adding 20,000 docs a day and working with an index that is already 20

Merge Index Filling up Disk Space

2006-12-21 Thread Harini Raghavan
Hi All, I am using Lucene 1.9.1 for search functionality in my J2EE application using JBoss as app server. The Lucene index directory size is almost 20G right now. There is a Quartz job that is adding data to the index every minute and around 2 documents get added to the index every day. When t