Re: Autocommit blocking adds? AutoCommit Speedup?

2009-05-17 Thread Mike Klaas

Hi Jayson,

It is on my list of things to do.  I've been having a very busy week
and am also working all weekend.  I hope to get to it next week
sometime, if no one else has taken it.


cheers,
-mike

On 8-May-09, at 10:15 PM, jayson.minard wrote:



First cut of updated handler now in:
https://issues.apache.org/jira/browse/SOLR-1155

Needs review from those who know Lucene better, and a double check
for errors in locking or other areas of the code.  Thanks.

--j


jayson.minard wrote:
[earlier quoted messages and code elided]





Re: Autocommit blocking adds? AutoCommit Speedup?

2009-05-17 Thread jayson.minard

Thanks Mike. I'm running it in a few environments that do not have
post-commit hooks and so far have not seen any issues.  A white-box review
will help catch things that occur only rarely, or any misuse of internal
data structures that I don't know well enough to judge.

--j



Mike Klaas wrote:
[quoted message elided]



RE: Autocommit blocking adds? AutoCommit Speedup?

2009-05-14 Thread Gargate, Siddharth
Hi all,
I am also facing the same issue where autocommit blocks all
other requests. I have around 100,000 documents with an average size of
100K each, and it took more than 20 hours to index.
I currently have autocommit maxTime set to 7 seconds and mergeFactor to 25.
Do I need more configuration changes?
I also see that memory usage climbs to the peak of the specified heap (6 GB
in my case), and Solr looks to spend most of its time in GC.
As I understand it, the fix for SOLR-1155 means the commit will run in the
background while new documents are queued in memory, but I am afraid of the
memory consumed by that queue if a commit takes much longer to complete.

Thanks,
Siddharth

-Original Message-
From: jayson.minard [mailto:jayson.min...@gmail.com] 
Sent: Saturday, May 09, 2009 10:45 AM
To: solr-user@lucene.apache.org
Subject: Re: Autocommit blocking adds? AutoCommit Speedup?


[quoted message and earlier thread elided]



Re: Autocommit blocking adds? AutoCommit Speedup?

2009-05-14 Thread Jack Godwin
20+ hours? I index 3 million records in 3 hours.  Is your autocommit
triggering a snapshot?  What do you have listed in the event listeners?
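
(For context: a snapshot is usually wired up as a postCommit event listener
in solrconfig.xml.  A sketch of what such a listener looks like, along the
lines of the stock snapshooter example -- not necessarily Siddharth's
actual config:)

  <listener event="postCommit" class="solr.RunExecutableListener">
    <str name="exe">solr/bin/snapshooter</str>
    <str name="dir">.</str>
    <bool name="wait">true</bool>
  </listener>

A listener with wait=true runs inside the commit path, so a slow snapshot
directly lengthens the time adds stay blocked.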

Jack

On 5/14/09, Gargate, Siddharth sgarg...@ptc.com wrote:
[quoted message elided]



-- 
Sent from my mobile device


RE: Autocommit blocking adds? AutoCommit Speedup?

2009-05-14 Thread jayson.minard

Siddharth,

The settings you have in your solrconfig for ramBufferSizeMB and
maxBufferedDocs control how much memory may be used during indexing, beyond
the overhead of the documents in flight at a given moment (deserialized
into memory but not yet handed to Lucene).  There are streaming versions of
the client/server that help with that as well, by trying to process
documents as they arrive.
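
(A minimal sketch of that streaming idea, assuming the SolrJ
StreamingUpdateSolrServer from SOLR-906; the URL, field names, and queue
sizes are illustrative:)

  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;
  import org.apache.solr.common.SolrInputDocument;

  public class StreamingIndexer {
    public static void main(String[] args) throws Exception {
      // Queue up to 100 docs and drain them with 4 background threads,
      // so the caller isn't blocked while updates are in flight.
      SolrServer server =
          new StreamingUpdateSolrServer("http://localhost:8983/solr", 100, 4);

      for (int i = 0; i < 10000; i++) {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "doc-" + i);            // illustrative fields
        doc.addField("text", "body of document " + i);
        server.add(doc);                           // returns quickly; queued
      }
      server.commit();  // make the whole batch visible to searchers
    }
  }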

The patch SOLR-1155 does not add memory use; rather, it lets threads
proceed through to Lucene without blocking inside Solr as often.  So
instead of stuck threads holding documents in memory, you have moving
threads holding the same documents.

So the buffer sizes mentioned above, along with the number of documents you
send at a time, drive your memory footprint.  Send smaller batches (less
efficient) or stream; or make sure you have enough memory for the number of
docs you send at a time.

For indexing I slow my commits down if there is no need for the documents to
become available for query right away.  For pure indexing, a long autoCommit
time and a large max document count before auto-committing helps.  Committing
isn't what flushes documents out of memory; it is what makes the on-disk
version part of the overall index.  Over-committing will slow you way down,
especially if you have any listeners on the commits doing a lot of work
(i.e. Solr distribution).
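
(A sketch of the relevant solrconfig.xml knobs for a pure indexer; the
values here are illustrative, not recommendations:)

  <indexDefaults>
    <!-- flush a segment once this much RAM holds buffered documents -->
    <ramBufferSizeMB>256</ramBufferSizeMB>
    <mergeFactor>10</mergeFactor>
  </indexDefaults>

  <updateHandler class="solr.DirectUpdateHandler2">
    <!-- commit rarely: long maxTime, large maxDocs -->
    <autoCommit>
      <maxDocs>100000</maxDocs>
      <maxTime>600000</maxTime> <!-- ten minutes, in milliseconds -->
    </autoCommit>
  </updateHandler>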

Also, if you are querying on the indexer that can eat memory and compete
with the memory you are trying to reserve for indexing.  So a split model of
indexing and querying on different instances lets you tune each the best;
but then you have a gap in time from indexing to querying as the trade-off.

It is hard to say what is going on with GC without knowing what garbage
collection settings you are passing to the VM, and what version of the Java
VM you are using.  Which garbage collector are you using, and with what
tuning parameters?

I tend to use the parallel GC on my indexers, with the GC overhead limit
turned off, allowing for some pauses (which users don't see on a back-end
indexer) but good collection with lower heap fragmentation.  I tend to use
concurrent mark-and-sweep GC on my query slaves, with tuned incremental mode
and pacing: a low-pause collector that takes advantage of the cores on my
servers and can incrementally keep up with the needs of a query slave.
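
(Roughly, in JVM flags of that era -- the heap sizes and the Jetty start.jar
invocation are illustrative:)

  # back-end indexer: throughput collector, longer pauses acceptable
  java -Xms6g -Xmx6g -XX:+UseParallelGC -XX:-UseGCOverheadLimit -jar start.jar

  # query slave: low-pause CMS collector with incremental mode and pacing
  java -Xms6g -Xmx6g -XX:+UseConcMarkSweepGC \
       -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing -jar start.jar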

-- Jayson


Gargate, Siddharth wrote:
[quoted message elided]



Re: Autocommit blocking adds? AutoCommit Speedup?

2009-05-14 Thread jayson.minard

Indexing speed comes down to a lot of factors: the settings discussed above,
VM settings, the size of the documents, how many are sent at a time, how
busy you can keep the indexer (one thread sending documents lets the indexer
relax, whereas N threads keep pressure on it), how often you commit, and of
course the hardware you are running on.  Disk I/O is a big factor, along
with having enough cores and memory to buffer and process the documents.

Comparing two sets of numbers is tough.  We have indexes that range from
indexing a few million documents an hour up through 18-20M per hour in an
indexing cluster for distributed search.

--j


Jack Godwin wrote:
[quoted message elided]



Re: Autocommit blocking adds? AutoCommit Speedup?

2009-05-08 Thread Jim Murphy

Any pointers to this newer, more concurrent behavior in Lucene?  I can try an
experiment where I downgrade the iwCommit lock to the iwAccess lock to allow
updates to happen during a commit.

Would you expect that to work?

Thanks for bootstrapping me on this. 

Jim



Yonik Seeley-2 wrote:
[quoted message elided]



Re: Autocommit blocking adds? AutoCommit Speedup?

2009-05-08 Thread Yonik Seeley
On Fri, May 8, 2009 at 4:27 PM, Jim Murphy jim.mur...@pobox.com wrote:

 Any pointers to this newer more concurrent behavior in lucene?

At the API level, what we care about is IndexWriter.commit() instead of close().

Also, we shouldn't have to worry about other parts of the code closing
the writer on us since things like deleteByQuery no longer need to
close the writer to work.
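
(A minimal sketch of the commit()-instead-of-close() pattern against the
Lucene 2.4-era API; the index path, analyzer, and fields are illustrative:)

  import org.apache.lucene.analysis.standard.StandardAnalyzer;
  import org.apache.lucene.document.Document;
  import org.apache.lucene.document.Field;
  import org.apache.lucene.index.IndexWriter;

  public class CommitDemo {
    public static void main(String[] args) throws Exception {
      // keep one writer open for the life of the process
      IndexWriter writer = new IndexWriter("/path/to/index",
          new StandardAnalyzer(), IndexWriter.MaxFieldLength.LIMITED);

      Document doc = new Document();
      doc.add(new Field("id", "1",
          Field.Store.YES, Field.Index.NOT_ANALYZED));
      writer.addDocument(doc);

      // flush, fsync, and publish a new segments file; other threads can
      // keep calling addDocument() concurrently -- no close()/reopen cycle
      writer.commit();
    }
  }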

core.getSearcher()... if we don't lock until it's finished, then what
could currently happen is that you could wind up with a newer version
of the index than you thought you might.  I think this should be fine
though.

We'd need to think about what type of synchronization may be needed
for postCommit and postOptimize hooks too.

Here's the relevant code:

iwCommit.lock();
try {
  log.info("start " + cmd);

  if (cmd.optimize) {
    openWriter();
    writer.optimize(cmd.maxOptimizeSegments);
  }

  closeWriter();

  callPostCommitCallbacks();
  if (cmd.optimize) {
    callPostOptimizeCallbacks();
  }
  // open a new searcher in the sync block to avoid opening it
  // after a deleteByQuery changed the index, or in between deletes
  // and adds of another commit being done.
  core.getSearcher(true,false,waitSearcher);



-Yonik
http://www.lucidimagination.com


Re: Autocommit blocking adds? AutoCommit Speedup?

2009-05-08 Thread jayson.minard

Created issue:

https://issues.apache.org/jira/browse/SOLR-1155



Jim Murphy wrote:
[quoted message elided]



Re: Autocommit blocking adds? AutoCommit Speedup?

2009-05-08 Thread Jim Murphy



Yonik Seeley-2 wrote:
 
...your code snippet elided and edited below ...
 



Don't take this code as correct (or even compiling), but is this the essence?
I moved shared access to the writer inside the read lock and kept the other
non-commit bits under the write lock.  I'd need to rethink the locking in a
more fundamental way, but is this close to the idea?



 public void commit(CommitUpdateCommand cmd) throws IOException {

   if (cmd.optimize) {
     optimizeCommands.incrementAndGet();
   } else {
     commitCommands.incrementAndGet();
   }

   Future[] waitSearcher = null;
   if (cmd.waitSearcher) {
     waitSearcher = new Future[1];
   }

   boolean error = true;

   // write lock: optimize restructures the index, so exclude all writers
   iwCommit.lock();
   try {
     log.info("start " + cmd);

     if (cmd.optimize) {
       closeSearcher();
       openWriter();
       writer.optimize(cmd.maxOptimizeSegments);
     }
   } finally {
     iwCommit.unlock();
   }

   // read lock: let adds on other threads proceed while the commit runs
   iwAccess.lock();
   try {
     writer.commit();
   } finally {
     iwAccess.unlock();
   }

   // write lock again for the callbacks and the searcher swap
   iwCommit.lock();
   try {
     callPostCommitCallbacks();
     if (cmd.optimize) {
       callPostOptimizeCallbacks();
     }
     // open a new searcher in the sync block to avoid opening it
     // after a deleteByQuery changed the index, or in between deletes
     // and adds of another commit being done.
     core.getSearcher(true, false, waitSearcher);

     // reset commit tracking
     tracker.didCommit();

     log.info("end_commit_flush");

     error = false;
   } finally {
     iwCommit.unlock();
     addCommands.set(0);
     deleteByIdCommands.set(0);
     deleteByQueryCommands.set(0);
     numErrors.set(error ? 1 : 0);
   }

   // if we are supposed to wait for the searcher to be registered, then we
   // should do it outside of the synchronized block so that other update
   // operations can proceed
   if (waitSearcher != null && waitSearcher[0] != null) {
     try {
       waitSearcher[0].get();
     } catch (InterruptedException e) {
       SolrException.log(log, e);
     } catch (ExecutionException e) {
       SolrException.log(log, e);
     }
   }
 }






Re: Autocommit blocking adds? AutoCommit Speedup?

2009-05-08 Thread jayson.minard

Can we move this to patch files within the JIRA issue, please?  That will make
it easier to review and to help out, as a patch against current trunk.

--j


Jim Murphy wrote:
[quoted messages and code elided]



Re: Autocommit blocking adds? AutoCommit Speedup?

2009-05-08 Thread jayson.minard

First cut of updated handler now in:
https://issues.apache.org/jira/browse/SOLR-1155

Needs review from those who know Lucene better, and a double check for errors
in locking or other areas of the code.  Thanks.

--j


jayson.minard wrote:
[quoted messages and code elided]



Autocommit blocking adds? AutoCommit Speedup?

2009-05-07 Thread Jim Murphy

Question 1: I see in DirectUpdateHandler2 that there is a read/write lock
used between addDoc and commit.

My mental model of the process was this: clients can add/update documents
until the autocommit threshold is hit.  At that point the commit tracker
schedules a background commit.  The commit runs and does NOT BLOCK
subsequent adds.  Clearly that's not happening, because when the autocommit
background thread runs it takes the iwCommit lock, blocking anyone in addDoc
trying to take the iwAccess lock.

Is this just the way it is, or is it possible to configure Solr to process
the pending documents in the background, queuing new documents in memory as
before?
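
(In sketch form, that read/write-lock arrangement looks roughly like this --
the names follow DirectUpdateHandler2, but the code is illustrative, not the
actual implementation:)

  import java.util.concurrent.locks.ReentrantReadWriteLock;

  public class UpdateHandlerLocking {
    private final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
    // adds share the index writer, so they share the read lock
    private final ReentrantReadWriteLock.ReadLock iwAccess = rwl.readLock();
    // commit wants exclusive access, so it takes the write lock
    private final ReentrantReadWriteLock.WriteLock iwCommit = rwl.writeLock();

    public void addDoc() {
      iwAccess.lock();   // blocks only while a commit holds iwCommit
      try {
        // writer.addDocument(...);
      } finally {
        iwAccess.unlock();
      }
    }

    public void commit() {
      iwCommit.lock();   // waits for in-flight adds, then blocks new ones
      try {
        // flush/close the writer, open a new searcher, ...
      } finally {
        iwCommit.unlock();
      }
    }
  }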

Question 2: I ask because autocommits are taking a LONG time to complete,
like 10-25 seconds.  I have a 40M-document index, many tens of GB.  What can
I do to speed this up?

Thanks

Jim



Re: Autocommit blocking adds? AutoCommit Speedup?

2009-05-07 Thread Yonik Seeley
On Thu, May 7, 2009 at 5:03 PM, Jim Murphy jim.mur...@pobox.com wrote:
[quoted question elided]

Background: in the past, you had to close the Lucene IndexWriter so that
all changes would be flushed to disk (and you could then open a new
IndexReader to see the changes).  You obviously can't be adding new
documents while you're trying to close the writer - hence the locking.
It was also the case that readers and writers had to be opened and
closed in the right order to handle things like deletes (which had to go
through the reader).

This is no longer the case, and we should revisit the locking.  I do
think we should be able to continue indexing while doing a commit.

-Yonik
http://www.lucidimagination.com


Re: Autocommit blocking adds? AutoCommit Speedup?

2009-05-07 Thread Jim Murphy

Interesting.  So is there a JIRA ticket open for this already?  Any chance of
getting it into 1.4?  It's seriously kicking our butts right now.  We write
into our masters with ~50ms response times until we hit the autocommit; then
add/update response time is 10-30 seconds.  Ouch.

I'd be willing to work on submitting a patch given a better understanding of
the issue.

Jim



Re: Autocommit blocking adds? AutoCommit Speedup?

2009-05-07 Thread Yonik Seeley
On Thu, May 7, 2009 at 8:37 PM, Jim Murphy jim.mur...@pobox.com wrote:
 Interesting.  So is there a JIRA ticket open for this already? Any chance of
 getting it into 1.4?

No ticket currently open, but IMO it could make it for 1.4.

 It's seriously kicking our butts right now.  We write
 into our masters with ~50ms response times till we hit the autocommit then
 add/update response time is 10-30 seconds.  Ouch.

It's probably been made a little worse lately since Lucene now does
fsync on index files before writing the segments file that points to
those files.  A necessary evil to prevent index corruption.

 I'd be willing to work on submitting a patch given a better understanding of
 the issue.

Great, go for it!

-Yonik
http://www.lucidimagination.com