Re: Updates during Optimize

2011-04-13 Thread stockii
"The current limitation or pause is when the RAM buffer is flushing to disk"

- So when an optimize starts and runs for ~4 hours, are you saying that DIH
flushes the docs into the index during this pause?

--- System ---

One server, 12 GB RAM, 2 Solr instances, 7 cores,
1 core with 31 million documents, other cores 100,000

- Solr1 for search requests - commit every minute - 5 GB Xmx
- Solr2 for update requests - delta every minute - 4 GB Xmx
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Updates-during-Optimize-tp2811183p2815064.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Updates during Optimize

2011-04-13 Thread Mark Miller
Not cleanly, currently. SOLR-2193 (Re-architect Update Handler) should take
care of this, though.

- Mark

On Apr 12, 2011, at 8:21 AM, stockii wrote:

 Hello.
 
 When I start an optimize (which takes more than 4 hours), no updates from
 DIH are possible.
 I thought Solr copies the whole index and then runs the optimize on the
 copy, rather than locking the index and optimizing it in place ... =(
 
 Is there any way to do both at the same time?
 
 
 
 


Re: Updates during Optimize

2011-04-12 Thread Shawn Heisey

On 4/12/2011 6:21 AM, stockii wrote:

Hello.

When I start an optimize (which takes more than 4 hours), no updates from
DIH are possible.
I thought Solr copies the whole index and then runs the optimize on the
copy, rather than locking the index and optimizing it in place ... =(

Is there any way to do both at the same time?


You can't index and optimize at the same time, and I'm pretty sure that 
there isn't any way to make it possible that wouldn't involve a major 
rewrite of Lucene, and possibly Solr.  The devs would have to say 
differently if my understanding is wrong.


The optimize takes place at the Lucene level.  I can't give you much 
in-depth information, but I can give you some high level stuff.  What 
it's doing is equivalent to a merge, down to one segment.  This is not 
the same as a straight file copy.  It must read the entire Lucene data 
structure and build a new one from scratch.  The process removes deleted 
documents and will also upgrade the version number of the index if it 
was written with an older version of Lucene.  It's very likely that the 
reading side of the process is nearly as comprehensive as the CheckIndex 
program, but it also has to write out a new index segment.
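
As a rough illustration of the merge described above, here is a toy model in Python. It is only a sketch: the Segment class and the optimize function are made up for this example and have nothing to do with Lucene's real data structures. It just shows why every segment has to be read in full and why deleted documents disappear from the result.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Toy stand-in for an index segment (hypothetical, not Lucene's format)."""
    docs: dict                                  # doc id -> stored text
    deleted: set = field(default_factory=set)   # ids marked as deleted


def optimize(segments):
    """Merge all segments into one new segment, skipping deleted documents."""
    merged = {}
    for seg in segments:                        # every segment is read in full
        for doc_id, text in seg.docs.items():
            if doc_id not in seg.deleted:
                merged[doc_id] = text           # live docs are written out again
    return [Segment(docs=merged)]


segments = [
    Segment({1: "a", 2: "b"}, deleted={2}),
    Segment({3: "c"}),
    Segment({4: "d", 5: "e"}, deleted={4}),
]
result = optimize(segments)
print(len(result), sorted(result[0].docs))      # prints: 1 [1, 3, 5]
```

The old segments are left untouched while the new one is built, which is why the work grows with the size of the whole index rather than with the number of recent changes.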


The net result -- the process gives your CPU and especially your I/O 
subsystem a workout, simultaneously.  If you were to make your I/O 
subsystem faster, you would probably see a major improvement in your 
optimize times.


On my installation, it takes about 11 minutes to optimize one of my 16GB 
shards, each with 9 million docs.  These live in virtual machines that 
are stored on a six-drive RAID10 array using 7200RPM SATA disks.  One of 
my pie-in-the-sky upgrade dreams is to replace that with a four-drive 
RAID10 array using SSD.  The other two drives would be regular SATA -- a 
mirrored OS partition.
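
To put those figures in perspective, a quick back-of-the-envelope calculation in Python. It assumes "16GB" means 16 GiB, which the message does not specify, and it only divides index size by elapsed time. An optimize both reads and writes, so the real disk traffic is probably higher than this number.

```python
# Assumption: "16GB" is treated as 16 GiB; the message does not say which.
shard_bytes = 16 * 1024**3
docs = 9_000_000
seconds = 11 * 60            # "about 11 minutes"

mib_per_sec = shard_bytes / seconds / 1024**2
docs_per_sec = docs / seconds

print(f"{mib_per_sec:.1f} MiB/s, {docs_per_sec:.0f} docs/s")
# prints: 24.8 MiB/s, 13636 docs/s
```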


Thanks,
Shawn



Re: Updates during Optimize

2011-04-12 Thread Jason Rutherglen
You can index and optimize at the same time.  The current limitation
or pause is when the RAM buffer is flushing to disk; however, that's
changing with the DocumentsWriterPerThread implementation, e.g.,
LUCENE-2324.
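
For reference, a minimal Python sketch of how an optimize is commonly requested from Solr's update handler over HTTP. The core URL and the build_optimize_request helper are hypothetical names for this example. The optimize, maxSegments and waitSearcher request parameters are, to my knowledge, the ones the update handler accepts, but check them against your Solr version. The request is only built here, not sent, and whether updates keep flowing while it runs depends on the Lucene/Solr version, as discussed in this thread.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_optimize_request(core_url, max_segments=1, wait_searcher=False):
    """Build (but do not send) an optimize request for a Solr core.

    core_url is a hypothetical base URL such as http://localhost:8983/solr/core0.
    """
    params = urlencode({
        "optimize": "true",
        "maxSegments": max_segments,
        "waitSearcher": str(wait_searcher).lower(),
    })
    return Request(f"{core_url}/update?{params}")


req = build_optimize_request("http://localhost:8983/solr/core0")
print(req.full_url)
```

Passing the request to urllib.request.urlopen would actually trigger the optimize, so only do that against an index you are prepared to have busy for a while.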

On Tue, Apr 12, 2011 at 8:34 AM, Shawn Heisey s...@elyograg.org wrote:
 You can't index and optimize at the same time, and I'm pretty sure that
 there isn't any way to make it possible that wouldn't involve a major
 rewrite of Lucene, and possibly Solr.  The devs would have to say
 differently if my understanding is wrong.
