I'm still having trouble with this.  My program will run for a while, then
hang at the same place.  Here is my add/commit process:

I am using StreamingUpdateSolrServer with a queue size of 100 and 3 threads.
My indexing process spawns 8 threads, each of which loops through its own
subset of RSS feeds.  Once a thread has processed a new article, it
constructs a new SolrInputDocument, wraps it in a temporary
Collection<SolrInputDocument> containing just that one document, and calls
server.add(docs).  I never call commit() or optimize() from my Java code
(I used to, but I took that out).

On the server side, I have these related settings:
  <updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
      <maxDocs>300</maxDocs>
      <maxTime>10000</maxTime>
    </autoCommit>
  </updateHandler>

I also have replication set up since this is the master; here are the
settings:
<requestHandler name="/replication" class="solr.ReplicationHandler" >
    <lst name="master">
      <str name="replicateAfter">commit</str>
      <str name="replicateAfter">startup</str>
      <str name="confFiles">schema.xml,stopwords.txt</str>
    </lst>
</requestHandler>

Those are the only extra settings I've changed.  I also have a cron job
that runs every minute and executes this command:
curl http://localhost:8985/solr/mycore/update -F stream.body=' <commit />'

Without it, I don't see numDocs increase on the admin statistics page.
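
For completeness, the SolrJ equivalent of that commit would be roughly the
following (a sketch only; the URL matches the curl command above):

  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

  public class ExplicitCommit {
      public static void main(String[] args) throws Exception {
          // Same effect as the cron curl: posts an explicit commit to the core.
          CommonsHttpSolrServer server =
                  new CommonsHttpSolrServer("http://localhost:8985/solr/mycore");
          server.commit();
      }
  }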

This process will soon be used ONLY for indexing.  Is there a better way to
optimize it?  The slaves replicate from this master every 60 seconds, and I
want documents to be available to the slaves as soon as possible.  I also
currently have a search process that holds some IndexSearchers open on the
Solr index (it's a pure Lucene program); could that be causing issues?  That
process never opens an IndexWriter (rough sketch below).
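
That search process opens the index roughly like this (a simplified sketch;
opening the reader read-only is an assumption here, not copied from the
actual code):

  import java.io.File;

  import org.apache.lucene.index.IndexReader;
  import org.apache.lucene.search.IndexSearcher;
  import org.apache.lucene.store.Directory;
  import org.apache.lucene.store.FSDirectory;

  public class BackgroundSearcher {
      public static IndexSearcher openSearcher(String indexDir) throws Exception {
          Directory dir = FSDirectory.open(new File(indexDir));
          // Read-only reader (assumption): this process never opens an
          // IndexWriter, so it should never take the index write lock.
          IndexReader reader = IndexReader.open(dir, true);
          return new IndexSearcher(reader);
      }
  }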

Thanks!


On Tue, Jul 13, 2010 at 10:52 AM, Max Lynch <ihas...@gmail.com> wrote:

> Great, thanks!
>
>
> On Tue, Jul 13, 2010 at 2:55 AM, Fornoville, Tom <tom.fornovi...@truvo.com
> > wrote:
>
>> If you're only adding documents you can also have a go with
>> StreamingUpdateSolrServer instead of the CommonsHttpSolrServer.
>> Couple that with the suggestion of master/slave so the searches don't
>> interfere with the indexing and you should have a pretty responsive
>> system.
>>
>> -----Original Message-----
>> From: Robert Petersen [mailto:rober...@buy.com]
>> Sent: maandag 12 juli 2010 22:30
>> To: solr-user@lucene.apache.org
>> Subject: RE: CommonsHttpSolrServer add document hangs
>>
>> You could try a master slave setup using replication perhaps, then the
>> slave serves searches and indexing commits on the master won't hang up
>> searches at least...
>>
>> Here is the description:  http://wiki.apache.org/solr/SolrReplication
>>
>>
>> -----Original Message-----
>> From: Max Lynch [mailto:ihas...@gmail.com]
>> Sent: Monday, July 12, 2010 11:57 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: CommonsHttpSolrServer add document hangs
>>
>> Thanks Robert,
>>
>> My script did start going again, but it was waiting for about half an
>> hour
>> which seems a bit excessive to me.  Is there some tuning I can do on the
>> solr end to optimize for my use case, which is very heavy on commits and
>> very light on searches (I do most of my searches on the raw Lucene index
>> in
>> the background)?
>>
>> Thanks.
>>
>> On Mon, Jul 12, 2010 at 12:06 PM, Robert Petersen <rober...@buy.com>
>> wrote:
>>
>> > Maybe solr is busy doing a commit or optimize?
>> >
>> > -----Original Message-----
>> > From: Max Lynch [mailto:ihas...@gmail.com]
>> > Sent: Monday, July 12, 2010 9:59 AM
>> > To: solr-user@lucene.apache.org
>> > Subject: CommonsHttpSolrServer add document hangs
>> >
>> > Hey guys,
>> > I'm using Solr 1.4.1 and I've been having some problems lately with
>> code
>> > that adds documents through a CommonsHttpSolrServer.  It seems that
>> > randomly
>> > the call to theserver.add() will hang.  I am currently running my code
>> > in a
>> > single thread, but I noticed this would happen in multi threaded code
>> as
>> > well.  The jar version of commons-httpclient is 3.1.
>> >
>> > I got a thread dump of the process, and one thread seems to be waiting
>> > on
>> > the org.apache.commons.httpclient.MultiThreadedHttpConnectionManager
>> as
>> > shown below.  All other threads are in a RUNNABLE state (besides the
>> > Finalizer daemon).
>> >
>> >     [java] Full thread dump Java HotSpot(TM) 64-Bit Server VM
>> (16.3-b01
>> > mixed mode):
>> >     [java]
>> >     [java] "MultiThreadedHttpConnectionManager cleanup" daemon prio=10
>> > tid=0x00007f441051c800 nid=0x527c in Object.wait()
>> [0x00007f4417e2f000]
>> >     [java]    java.lang.Thread.State: WAITING (on object monitor)
>> >     [java]     at java.lang.Object.wait(Native Method)
>> >     [java]     - waiting on <0x00007f443ae5b290> (a
>> > java.lang.ref.ReferenceQueue$Lock)
>> >     [java]     at
>> > java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
>> >     [java]     - locked <0x00007f443ae5b290> (a
>> > java.lang.ref.ReferenceQueue$Lock)
>> >     [java]     at
>> > java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
>> >     [java]     at
>> >
>> org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$Referen
>> > ceQueueThread.run(MultiThreadedHttpConnectionManager.java:1122)
>> >
>> > Any ideas?
>> >
>> > Thanks.
>> >
>>
>
>
