Re: Commit and OpenSearcher not working as expected.

2012-12-16 Thread shreejay
Hi Mark, 

That was a typo in my post. I am using openSearcher only, but I still see the
same entries in the logs. 

/update/?commit=true&openSearcher=false




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Commit-and-OpenSearcher-not-working-as-expected-tp4027419p4027451.html
Sent from the Solr - User mailing list archive at Nabble.com.
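
For reference, a minimal SolrJ sketch of issuing the same commit
programmatically (host and collection name are placeholders; SolrJ 4.x API
assumed). Note the camelCase "openSearcher" parameter, which is the fix Mark
suggests below - Solr silently ignores the unrecognized lowercase form, so the
commit falls back to the default openSearcher=true:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
import org.apache.solr.client.solrj.request.UpdateRequest;

public class HardCommitNoSearcher {
    public static void main(String[] args) throws Exception {
        // Placeholder endpoint; substitute your own host/collection.
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr/ABCCollection");

        UpdateRequest req = new UpdateRequest();
        // Hard commit, without blocking on flush or on a new searcher.
        req.setAction(AbstractUpdateRequest.ACTION.COMMIT, false, false);
        // Parameter names are case-sensitive: "openSearcher", not "opensearcher".
        req.setParam("openSearcher", "false");
        req.process(server);

        server.shutdown();
    }
}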


Re: Commit and OpenSearcher not working as expected.

2012-12-16 Thread Mark Miller
Try openSearcher instead?

- Mark

On Dec 16, 2012, at 8:18 PM, shreejay  wrote:

> Hello. 
> 
> I am running a commit on a SolrCloud collection using a cron job. The
> command is as follows:
> 
> aa.aa.aa.aa:8983/solr/ABCCollection/update?commit=true&opensearcher=false
> 
> But when I check the logs I see that the commit has been called with
> openSearcher=true. 
> 
> The DirectUpdateHandler2 section in my solrconfig file looks like this:
> <updateHandler class="solr.DirectUpdateHandler2">
> 
>   <autoCommit>
>     <maxDocs>0</maxDocs>
>     <maxTime>0</maxTime>
>     <openSearcher>false</openSearcher>
>   </autoCommit>
> 
>   <autoSoftCommit>
>     <maxTime>0</maxTime>
>   </autoSoftCommit>
> 
>   <updateLog>
>     <str name="dir">${solr.data.dir:}</str>
>   </updateLog>
> 
> </updateHandler>
> 
> 
> 
> And these are the logs :
> http://pastebin.com/bGh2GRvx
> 
> 
> I am not sure why openSearcher is being invoked. I am indexing a ton of
> documents right now, and am not using search at all. I also read in the wiki
> that keeping openSearcher=false is recommended for SolrCloud:
> http://wiki.apache.org/solr/SolrConfigXml#Update_Handler_Section
> 
> 
> Is there some place else where openSearcher has to be set while calling a
> commit? 
> 
> 
> --Shreejay
> 
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Commit-and-OpenSearcher-not-working-as-expected-tp4027419.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Commit when a segment is written

2012-06-21 Thread Erick Erickson
I don't think autocommit is deprecated; it's just commented out of the example
config, and using commitWithin (assuming you're working from SolrJ) is
preferred where possible.

But what governs "a particular set of docs"? What are the criteria that
determine when you want to commit? Flushes and commits are orthogonal. A
segment is kept open through multiple flushes. That is, there can be many
flushes and the documents still aren't searchable until the first commit
(but it sounds like you're aware of that).

Have you tried using autocommit? And what version of Solr are you using?

And finally, what is your use case for frequent commits? If you're going after
NRT functionality, have you looked at the NRT stuff in 4.x?

Best
Erick

On Thu, Jun 21, 2012 at 8:01 AM, Ramprakash Ramamoorthy
 wrote:
> Dear all,
>
>        I am using Lucene/Solr for my log search tool. Is there a way I can
> perform a commit operation on my IndexWriter when a particular set of docs
> is flushed from memory to disk? My RamBufferSize is 24MB and my
> MergeFactor is 10.
>
>        Or is calling commit at frequent intervals, irrespective of the
> flushes, the only way? I wish the autocommit feature was not deprecated.
>
>
> --
> With Thanks and Regards,
> Ramprakash Ramamoorthy,
> Engineer Trainee,
> Zoho Corporation.
> +91 9626975420
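
A minimal SolrJ sketch of the commitWithin approach Erick mentions (URL and
field names are hypothetical; SolrJ 4.x API assumed) - the document is added
with a time-to-searchability budget instead of an explicit commit:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.common.SolrInputDocument;

public class CommitWithinExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical core URL for a log-search index.
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr/logs");

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "log-1");
        doc.addField("message", "example log line");

        UpdateRequest req = new UpdateRequest();
        req.add(doc);
        // Ask Solr to make this doc searchable within 60s; no explicit
        // commit call, so Solr batches commits on its own schedule.
        req.setCommitWithin(60000);
        req.process(server);

        server.shutdown();
    }
}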


Re: commit question

2012-05-16 Thread Mark Miller

On May 16, 2012, at 5:23 AM, marco crivellaro wrote:

> Hi all,
> this might be a silly question but I've found different opinions on the
> subject.
> 
> When a search is run after a commit is performed, will the results include
> all documents committed up to that commit?
> 
> use case (sync):
> 1- add document
> 2- commit
> 3- search (faceted)
> 
> will faceted search on point 3 include the document added at point 1?
> 
> thank you,
> Marco Crivellaro
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/commit-question-tp3984044.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Yes - as long as that commit has the option to wait for a new searcher (waitSearcher) set 
to true. Otherwise it would be a race.

- Mark Miller
lucidimagination.com
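
A small SolrJ sketch of the sequence Marco asks about, with the waitSearcher
behavior Mark describes (URL hypothetical; SolrJ 4.x API assumed) - blocking
on the new searcher guarantees the following search sees the document:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrInputDocument;

public class CommitThenSearch {
    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "42");
        server.add(doc);                      // 1 - add document

        // 2 - commit; waitFlush=true, waitSearcher=true blocks until the
        // new searcher is registered, avoiding the race Mark mentions.
        server.commit(true, true);

        // 3 - search (faceted or not): guaranteed to see the new doc.
        QueryResponse rsp = server.query(new SolrQuery("id:42"));
        System.out.println("found: " + rsp.getResults().getNumFound()); // 1

        server.shutdown();
    }
}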

Re: commit fail

2012-04-30 Thread Erick Erickson
In the 3.6 world, LukeRequestHandler does some...er...really expensive
things when you click into the admin/schema browser. This is _much_
better in trunk BTW.

So, as Yonik says, LukeRequestHandler probably accounts for
one of the threads.

Does this occur when nobody is playing around with the admin
handler?

Erick

On Sat, Apr 28, 2012 at 10:03 AM, Yonik Seeley
 wrote:
> On Sat, Apr 28, 2012 at 7:02 AM, mav.p...@holidaylettings.co.uk
>  wrote:
>> Hi,
>>
>> This is what the thread dump looks like.
>>
>> Any ideas?
>
> Looks like the thread taking up CPU is in LukeRequestHandler
>
>> '1062730578@qtp-1535043768-5' Id=16, RUNNABLE on lock=, total cpu
>> time=16156160.ms user time=16153110.ms
>> at org.apache.solr.handler.admin.LukeRequestHandler.getIndexedFieldsInfo(LukeRequestHandler.java:320)
>
> That probably accounts for the 1 CPU doing things... but it's not
> clear at all why commits are failing.
>
> Perhaps the commit is succeeding, but the client is just not waiting
> long enough for it to complete?
>
> -Yonik
> lucenerevolution.com - Lucene/Solr Open Source Search Conference.
> Boston May 7-10


Re: commit fail

2012-04-28 Thread Yonik Seeley
On Sat, Apr 28, 2012 at 7:02 AM, mav.p...@holidaylettings.co.uk
 wrote:
> Hi,
>
> This is what the thread dump looks like.
>
> Any ideas?

Looks like the thread taking up CPU is in LukeRequestHandler

> '1062730578@qtp-1535043768-5' Id=16, RUNNABLE on lock=, total cpu
> time=16156160.ms user time=16153110.ms
> at org.apache.solr.handler.admin.LukeRequestHandler.getIndexedFieldsInfo(LukeRequestHandler.java:320)

That probably accounts for the 1 CPU doing things... but it's not
clear at all why commits are failing.

Perhaps the commit is succeeding, but the client is just not waiting
long enough for it to complete?

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: commit fail

2012-04-28 Thread mav.p...@holidaylettings.co.uk
Hi,

This is what the thread dump looks like.

Any ideas?

Mav

Java HotSpot(TM) 64-Bit Server VM 20.1-b02  Thread Count: current=19, peak=20, daemon=6

'DestroyJavaVM' Id=26, RUNNABLE on lock=, total cpu time=198450.ms user time=196890.ms

'Timer-2' Id=25, TIMED_WAITING on lock=java.util.TaskQueue@33799a1e, total cpu time=0.ms user time=0.ms
at java.lang.Object.wait(Native Method)
at java.util.TimerThread.mainLoop(Timer.java:509)
at java.util.TimerThread.run(Timer.java:462)

'pool-3-thread-1' Id=24, WAITING on lock=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@747541f8, total cpu time=0.ms user time=0.ms
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:662)

'pool-1-thread-1' Id=23, WAITING on lock=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3e3e3c83, total cpu time=480.ms user time=460.ms
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:662)

'Timer-1' Id=21, TIMED_WAITING on lock=java.util.TaskQueue@67f6dc61, total cpu time=180.ms user time=120.ms
at java.lang.Object.wait(Native Method)
at java.util.TimerThread.mainLoop(Timer.java:509)
at java.util.TimerThread.run(Timer.java:462)

'2021372560@qtp-1535043768-9 - Acceptor0 SocketConnector@0.0.0.0:8983' Id=20, RUNNABLE on lock=, total cpu time=60.ms user time=60.ms
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:408)
at java.net.ServerSocket.implAccept(ServerSocket.java:462)
at java.net.ServerSocket.accept(ServerSocket.java:430)
at org.mortbay.jetty.bio.SocketConnector.accept(SocketConnector.java:99)
at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:708)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

'1384828782@qtp-1535043768-8' Id=19, TIMED_WAITING on lock=org.mortbay.thread.QueuedThreadPool$PoolThread@528acf6e, total cpu time=274160.ms user time=273060.ms
at java.lang.Object.wait(Native Method)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:626)

'1715374531@qtp-1535043768-7' Id=18, RUNNABLE on lock=, total cpu time=15725890.ms user time=15723380.ms
at sun.management.ThreadImpl.getThreadInfo1(Native Method)
at sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:154)
at org.apache.jsp.admin.threaddump_jsp._jspService(org.apache.jsp.admin.threaddump_jsp:264)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:109)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:389)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:486)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:380)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:327)
at org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:126)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:275)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.ContextHandlerCollection.hand

Re: commit stops

2012-04-27 Thread Bill Bell
We also see extreme slowness using Solr 3.6 when trying to commit a delete. We 
also get hangs. We do at most one commit a week. Rebuilding from scratch using 
DIH works fine and has never hung.

Bill Bell
Sent from mobile


On Apr 27, 2012, at 5:59 PM, "mav.p...@holidaylettings.co.uk" 
 wrote:

> Thanks for the reply
> 
> The client expects a response within 2 minutes and after that will report
> an error. When we build fresh it seems to work and the operation takes a
> second or two to complete. Once it gets to a stage where it hangs, it simply
> won't accept any further commits. I did an index check and all was ok.
> 
> I don't see any major commit happening at any time; it seems to just
> hang. Even starting up and shutting down takes ages.
> 
> We make 3 - 4 commits a day.
> 
> We use solr 3.5
> 
> No autocommit
> 
> 
> 
> On 28/04/2012 00:56, "Yonik Seeley"  wrote:
> 
>> On Fri, Apr 27, 2012 at 9:18 AM, mav.p...@holidaylettings.co.uk
>>  wrote:
>>> We have an index of about 3.5gb which seems to work fine until it
>>> suddenly stops accepting new commits.
>>> 
>>> Users can still search on the front end but nothing new can be
>>> committed and it always times out on commit.
>>> 
>>> Any ideas?
>> 
>> Perhaps the commit happens to cause a major merge which may take a
>> long time (and solr isn't going to allow overlapping commits).
>> How long does a commit request take to time out?
>> 
>> What Solr version is this?  Do you have any kind of auto-commit set
>> up?  How often are you manually committing?
>> 
>> -Yonik
>> lucenerevolution.com - Lucene/Solr Open Source Search Conference.
>> Boston May 7-10
> 


Re: commit fail

2012-04-27 Thread Yonik Seeley
On Fri, Apr 27, 2012 at 8:23 PM, mav.p...@holidaylettings.co.uk
 wrote:
> Hi again,
>
> This is the only log entry I can find regarding the failed commits…
>
> Still timing out as far as the client is concerned and there is actually 
> nothing happening on the server in terms of load (staging environment).
>
> One CPU core seems constantly busy with Solr, but it is unclear what is happening.

You can get a thread dump to see what the various threads are doing
(use the solr admin, or kill -3).  Sounds like it could just be either
merging in progress or a commit in progress.

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10
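
For the thread dump Yonik asks for, besides kill -3 and the admin page, a
small self-contained JMX sketch along these lines dumps every live thread
with its stack - essentially what the admin threaddump JSP visible in mav's
dump above is doing:

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumper {
    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Dump all live threads, including locked monitors/synchronizers.
        // (Note: ThreadInfo.toString() may truncate very deep stacks.)
        for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
            System.out.print(info);
        }
    }
}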


Re: commit stops

2012-04-27 Thread mav.p...@holidaylettings.co.uk
Thanks for the reply

The client expects a response within 2 minutes and after that will report
an error. When we build fresh it seems to work and the operation takes a
second or two to complete. Once it gets to a stage where it hangs, it simply
won't accept any further commits. I did an index check and all was ok.

I don't see any major commit happening at any time; it seems to just
hang. Even starting up and shutting down takes ages.

We make 3 - 4 commits a day.

We use solr 3.5

No autocommit



On 28/04/2012 00:56, "Yonik Seeley"  wrote:

>On Fri, Apr 27, 2012 at 9:18 AM, mav.p...@holidaylettings.co.uk
> wrote:
>> We have an index of about 3.5gb which seems to work fine until it
>>suddenly stops accepting new commits.
>>
>> Users can still search on the front end but nothing new can be
>>committed and it always times out on commit.
>>
>> Any ideas?
>
>Perhaps the commit happens to cause a major merge which may take a
>long time (and solr isn't going to allow overlapping commits).
>How long does a commit request take to time out?
>
>What Solr version is this?  Do you have any kind of auto-commit set
>up?  How often are you manually committing?
>
>-Yonik
>lucenerevolution.com - Lucene/Solr Open Source Search Conference.
>Boston May 7-10



Re: commit stops

2012-04-27 Thread Yonik Seeley
On Fri, Apr 27, 2012 at 9:18 AM, mav.p...@holidaylettings.co.uk
 wrote:
> We have an index of about 3.5gb which seems to work fine until it suddenly 
> stops accepting new commits.
>
> Users can still search on the front end but nothing new can be committed and 
> it always times out on commit.
>
> Any ideas?

Perhaps the commit happens to cause a major merge which may take a
long time (and solr isn't going to allow overlapping commits).
How long does a commit request take to time out?

What Solr version is this?  Do you have any kind of auto-commit set
up?  How often are you manually committing?

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: commit stops

2012-04-27 Thread mav.p...@holidaylettings.co.uk
One more thing I noticed: the schema browser in the admin interface also 
eventually times out…

Any ideas from anyone ?

From: Mav Peri
To: "solr-user@lucene.apache.org" 
mailto:solr-user@lucene.apache.org>>
Subject: commit stops

Hi

We have an index of about 3.5gb which seems to work fine until it suddenly 
stops accepting new commits.

Users can still search on the front end but nothing new can be committed and it 
always times out on commit.

Any ideas?

Thanks in advance




2012-04-27 14:14:20.537:WARN::Committed before 500 null||org.mortbay.jetty.EofException
at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:791)
at org.mortbay.jetty.AbstractGenerator$Output.flush(AbstractGenerator.java:569)
at org.mortbay.jetty.HttpConnection$Output.flush(HttpConnection.java:1012)
at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:278)
at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:122)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:212)
at org.apache.solr.common.util.FastWriter.flush(FastWriter.java:115)
at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:344)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:265)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at org.mortbay.io.ByteArrayBuffer.writeTo(ByteArrayBuffer.java:368)
at org.mortbay.io.bio.StreamEndPoint.flush(StreamEndPoint.java:129)
at org.mortbay.io.bio.StreamEndPoint.flush(StreamEndPoint.java:161)
at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:714)
... 25 more
2012-04-27 14:14:20.537:WARN::/solr/update/
java.lang.IllegalStateException: Committed
at org.mortbay.jetty.Response.resetBuffer(Response.java:1023)
at org.mortbay.jetty.Response.sendError(Response.java:240)
at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:380)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:283)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
27-Apr-2012 14:14:20 org.apache.solr.common.SolrException log
SEVERE: org.mortbay.jetty.EofException
at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.

Re: Commit Strategy for SolrCloud when Talking about 200 million records.

2012-03-23 Thread Mark Miller

On Mar 23, 2012, at 12:49 PM, I-Chiang Chen wrote:

> Caused by: java.lang.OutOfMemoryError: Map failed

Hmm...looks like this is the key info here. 

- Mark Miller
lucidimagination.com













Re: Commit Strategy for SolrCloud when Talking about 200 million records.

2012-03-23 Thread I-Chiang Chen
We saw a couple of distinct errors, and all machines in a shard are identical:

-On the leader of the shard
Mar 21, 2012 1:58:34 AM org.apache.solr.common.SolrException log
SEVERE: shard update error StdNode:
http://blah.blah.net:8983/solr/master2-slave1/:org.apache.solr.common.SolrException: Map failed
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:488)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:251)
at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:319)
at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:300)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

followed by

Mar 21, 2012 1:58:52 AM org.apache.solr.common.SolrException log
SEVERE: shard update error StdNode:
http://blah.blah.net:8983/solr/master2-slave1/:org.apache.solr.common.SolrException: java.io.IOException: Map failed
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:488)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:251)
at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:319)
at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:300)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

followed by

Mar 21, 2012 1:58:55 AM org.apache.solr.update.processor.DistributedUpdateProcessor doFinish
INFO: Could not tell a replica to recover
org.apache.solr.client.solrj.SolrServerException: http://blah.blah.net:8983/solr
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:496)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:251)
at org.apache.solr.update.processor.DistributedUpdateProcessor.doFinish(DistributedUpdateProcessor.java:347)
at org.apache.solr.update.processor.DistributedUpdateProcessor.finish(DistributedUpdateProcessor.java:816)
at org.apache.solr.update.processor.LogUpdateProcessor.finish(LogUpdateProcessorFactory.java:176)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:433)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
at org.apache.commons.httpclient.HttpParser.readRawLine(HttpParser.java:78)
at org.apache.commons.httpclient.HttpParser.readLine(HttpParser.java:106)
at org.apache.commons.httpclient.HttpConnecti

Re: Commit Strategy for SolrCloud when Talking about 200 million records.

2012-03-23 Thread Markus Jelsma
We did some tests too with many millions of documents and auto-commit enabled. 
It didn't take long for the indexer to stall and in the meantime the number of 
open files exploded, to over 16k, then 32k.

On Friday 23 March 2012 12:20:15 Mark Miller wrote:
> What issues? It really shouldn't be a problem.
> 
> On Mar 22, 2012, at 11:44 PM, I-Chiang Chen  wrote:
> > At this time we are not leveraging the NRT functionality. This is the
> > initial data load process, where the idea is to just add all 200 million
> > records first, then do a single commit at the end to make them
> > searchable. We actually disabled auto commit at this time.
> > 
> > We have tried to leave auto commit enabled during the initial data load
> > process and ran into multiple issues that led to a botched loading
> > process.
> > 
> > On Thu, Mar 22, 2012 at 2:15 PM, Mark Miller wrote:
> >> On Mar 21, 2012, at 9:37 PM, I-Chiang Chen wrote:
> >>> We are currently experimenting with SolrCloud functionality in Solr
> >>> 4.0. The goal is to see if Solr 4.0 trunk in its current state is
> >>> able to handle roughly 200 million documents. The documents are not
> >>> big: around 40 fields, no more than a KB, most of which are empty
> >>> the majority of the time.
> >>> 
> >>> The setup we have is 4 servers w/ 2 shards w/ 2 servers per shard. We
> >>> are running in Tomcat.
> >>> 
> >>> The question is, given the approximate data volume, is it realistic
> >>> to expect the above setup to handle it?
> >> 
> >> So 100 million docs per machine essentially? Totally depends on the
> >> hardware and what features you are using - but def in the realm of
> >> possibility.
> >> 
> >>> Given the number of documents, should we commit every x documents or
> >>> rely on auto commits?
> >> 
> >> The number of docs shouldn't really matter here. Do you need near real
> >> time search?
> >> 
> >> You should be able to commit about as frequently as you'd like with NRT
> >> (eg every 1 second if you'd like) - either using soft auto commit or
> >> commitWithin.
> >> 
> >> Then you want to do a hard commit less frequently - every minute (or
> >> more or less) with openSearcher=false.
> >> 
> >> eg
> >> 
> >>    <autoCommit>
> >>      <maxTime>15000</maxTime>
> >>      <openSearcher>false</openSearcher>
> >>    </autoCommit>
> >>> 
> >>> --
> >>> -IC
> >> 
> >> - Mark Miller
> >> lucidimagination.com

-- 
Markus Jelsma - CTO - Openindex


Re: Commit Strategy for SolrCloud when Talking about 200 million records.

2012-03-23 Thread Mark Miller
What issues? It really shouldn't be a problem. 


On Mar 22, 2012, at 11:44 PM, I-Chiang Chen  wrote:

> At this time we are not leveraging the NRT functionality. This is the
> initial data load process, where the idea is to just add all 200 million
> records first, then do a single commit at the end to make them searchable.
> We actually disabled auto commit at this time.
> 
> We have tried to leave auto commit enabled during the initial data load
> process and ran into multiple issues that led to a botched loading process.
> 
> On Thu, Mar 22, 2012 at 2:15 PM, Mark Miller  wrote:
> 
>> 
>> On Mar 21, 2012, at 9:37 PM, I-Chiang Chen wrote:
>> 
>>> We are currently experimenting with SolrCloud functionality in Solr 4.0.
>>> The goal is to see if Solr 4.0 trunk in its current state is able to
>>> handle roughly 200 million documents. The documents are not big: around
>>> 40 fields, no more than a KB, most of which are empty the majority of
>>> the time.
>>> 
>>> The setup we have is 4 servers w/ 2 shards w/ 2 servers per shard. We are
>>> running in Tomcat.
>>> 
>>> The question is, given the approximate data volume, is it realistic to
>>> expect the above setup to handle it?
>> 
>> So 100 million docs per machine essentially? Totally depends on the
>> hardware and what features you are using - but def in the realm of
>> possibility.
>> 
>>> Given the number of documents, should we commit every x documents or
>>> rely on auto commits?
>> 
>> The number of docs shouldn't really matter here. Do you need near real
>> time search?
>> 
>> You should be able to commit about as frequently as you'd like with NRT
>> (eg every 1 second if you'd like) - either using soft auto commit or
>> commitWithin.
>> 
>> Then you want to do a hard commit less frequently - every minute (or more
>> or less) with openSearcher=false.
>> 
>> eg
>> 
>>    <autoCommit>
>>      <maxTime>15000</maxTime>
>>      <openSearcher>false</openSearcher>
>>    </autoCommit>
>> 
>>> 
>>> --
>>> -IC
>> 
>> - Mark Miller
>> lucidimagination.com
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
> 
> 
> -- 
> -IC


Re: Commit Strategy for SolrCloud when Talking about 200 million records.

2012-03-22 Thread I-Chiang Chen
At this time we are not leveraging the NRT functionality. This is the
initial data load process, where the idea is to just add all 200 million
records first, then do a single commit at the end to make them searchable.
We actually disabled auto commit at this time.

We have tried to leave auto commit enabled during the initial data load
process and ran into multiple issues that led to a botched loading process.

On Thu, Mar 22, 2012 at 2:15 PM, Mark Miller  wrote:

>
> On Mar 21, 2012, at 9:37 PM, I-Chiang Chen wrote:
>
> > We are currently experimenting with SolrCloud functionality in Solr 4.0.
> > The goal is to see if Solr 4.0 trunk in its current state is able to
> > handle roughly 200 million documents. The documents are not big: around
> > 40 fields, no more than a KB, most of which are empty the majority of
> > the time.
> >
> > The setup we have is 4 servers w/ 2 shards w/ 2 servers per shard. We are
> > running in Tomcat.
> >
> > The question is, given the approximate data volume, is it realistic to
> > expect the above setup to handle it?
>
> So 100 million docs per machine essentially? Totally depends on the
> hardware and what features you are using - but def in the realm of
> possibility.
>
> > Given the number of documents, should we commit every x documents or
> > rely on auto commits?
>
> The number of docs shouldn't really matter here. Do you need near real
> time search?
>
> You should be able to commit about as frequently as you'd like with NRT
> (eg every 1 second if you'd like) - either using soft auto commit or
> commitWithin.
>
> Then you want to do a hard commit less frequently - every minute (or more
> or less) with openSearcher=false.
>
> eg
>
>   <autoCommit>
>     <maxTime>15000</maxTime>
>     <openSearcher>false</openSearcher>
>   </autoCommit>
>
> >
> > --
> > -IC
>
> - Mark Miller
> lucidimagination.com
>
>
>
>
>
>
>
>
>
>
>
>


-- 
-IC


Re: Commit Strategy for SolrCloud when Talking about 200 million records.

2012-03-22 Thread Mark Miller

On Mar 21, 2012, at 9:37 PM, I-Chiang Chen wrote:

> We are currently experimenting with SolrCloud functionality in Solr 4.0.
> The goal is to see if Solr 4.0 trunk in its current state is able to
> handle roughly 200 million documents. The documents are not big: around 40
> fields, no more than a KB, most of which are empty the majority of the time.
> 
> The setup we have is 4 servers w/ 2 shards w/ 2 servers per shard. We are
> running in Tomcat.
> 
> The question is, given the approximate data volume, is it realistic to
> expect the above setup to handle it?

So 100 million docs per machine essentially? Totally depends on the hardware 
and what features you are using - but def in the realm of possibility.

> Given the number of documents, should we commit every x documents or
> rely on auto commits?

The number of docs shouldn't really matter here. Do you need near real time 
search?

You should be able to commit about as frequently as you'd like with NRT (eg 
every 1 second if you'd like) - either using soft auto commit or commitWithin.

Then you want to do a hard commit less frequently - every minute (or more or 
less) with openSearcher=false.

eg

  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

> 
> -- 
> -IC

- Mark Miller
lucidimagination.com
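
Putting Mark's advice together, a sketch of the bulk-load pattern under
discussion (URL, field names, and batch sizes are hypothetical; SolrJ 4.x API
assumed): stream adds in batches with no intermediate commits, rely on
server-side autoCommit with openSearcher=false for durability, and issue one
final commit to make everything searchable:

import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BulkLoader {
    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

        List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
        for (int i = 0; i < 200000; i++) {    // stand-in for the full load
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", Integer.toString(i));
            batch.add(doc);
            if (batch.size() == 1000) {       // send in modest batches
                server.add(batch);            // no commit here
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            server.add(batch);
        }
        // One hard commit at the end makes everything searchable.
        server.commit();
        server.shutdown();
    }
}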



Re: Commit call - ReadTimeoutException -> usage scenario for big update requests and the ioexception case

2012-02-07 Thread Torsten Krah

On 07.02.2012 15:12, Erick Erickson wrote:

> Right, I suspect you're hitting merges.

Guess so.

> How often are you committing?

One time, after all work is done.

> In other words, why are you committing explicitly?
> It's often better to use commitWithin on the add command
> and just let Solr do its work without explicitly committing.

Tika extracts my docs and I fetch the results (memory, disk) externally.
If all went ok as expected, I take those docs and add them to my Solr
server instance. After I am done with adds + deletes I do a commit - one
commit for all those docs, adding and deleting. If something went wrong
before or while adding, updating or deleting docs, I call rollback and
everything is as before (I am doing the update from one source only, so I
can be sure that no one can call commit in between).

commitWithin would break my ability to roll back things; that is why I
want to call commit explicitly here.




> Going forward, this is fixed in trunk by the DocumentWriterPerThread
> improvements.

Will this be backported to the upcoming 3.6?



> Best
> Erick
> 
> On Mon, Feb 6, 2012 at 11:09 AM, Torsten Krah
>  wrote:
> 
>> Hi,
>> 
>> i wonder if it is possible to commit data to solr without having to
>> catch SocketReadTimeout exceptions.
>> 
>> I am calling commit(false, false) using a streaming server instance -
>> but i still have to wait > 30 seconds and catch the timeout from the http
>> method.
>> It does not matter if it's 30 or 60; it will fail depending on how long it
>> takes until the update request is processed, or can i tweak things here?
>> 
>> So what's the way to go here? Any other option, or must i catch those
>> exceptions and go on like done now?
>> The operation itself does finish successfully - later on, when it's done -
>> on the server side, and all stuff is committed and searchable.
>> 
>> 
>> regards
>> 
>> Torsten
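
A sketch of the all-or-nothing update cycle Torsten describes (URL and ids
are hypothetical; SolrJ 4.x API assumed). This assumes a single writer, as he
notes, since rollback discards everything uncommitted index-wide:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class AllOrNothingUpdate {
    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        try {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-1");
            server.add(doc);                  // adds...
            server.deleteById("stale-doc-7"); // ...and deletes, then
            server.commit();                  // one commit for all of it
        } catch (Exception e) {
            server.rollback();                // back to the last commit
            throw e;
        } finally {
            server.shutdown();
        }
    }
}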







Re: Commit call - ReadTimeoutException -> usage scenario for big update requests and the ioexception case

2012-02-07 Thread Erick Erickson
Right, I suspect you're hitting merges. How often are you
committing? In other words, why are you committing explicitly?
It's often better to use commitWithin on the add command
and just let Solr do its work without explicitly committing.

Going forward, this is fixed in trunk by the DocumentWriterPerThread
improvements.

Best
Erick

On Mon, Feb 6, 2012 at 11:09 AM, Torsten Krah
 wrote:
> Hi,
>
> i wonder if it is possible to commit data to solr without having to
> catch SocketReadTimeout exceptions.
>
> I am calling commit(false, false) using a streaming server instance -
> but i still have to wait > 30 seconds and catch the timeout from the http
> method.
> It does not matter if it's 30 or 60; it will fail depending on how long it
> takes until the update request is processed, or can i tweak things here?
>
> So what's the way to go here? Any other option, or must i catch those
> exceptions and go on like done now?
> The operation itself does finish successfully - later on, when it's done -
> on the server side, and all stuff is committed and searchable.
>
>
> regards
>
> Torsten
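
A sketch of the workaround implied above, assuming a SolrJ 4.x-era client (on
3.x, the analogous CommonsHttpSolrServer exposes the same setSoTimeout knob):
raise the read timeout and treat a socket read timeout on commit as
non-fatal, since the commit finishes server-side anyway:

import java.net.SocketTimeoutException;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class TolerantCommit {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        server.setSoTimeout(120000); // raise the read timeout to 2 minutes

        try {
            // waitFlush=false, waitSearcher=false: don't block on the searcher.
            server.commit(false, false);
        } catch (SolrServerException e) {
            if (e.getRootCause() instanceof SocketTimeoutException) {
                // The commit keeps running server-side; log and move on.
                System.err.println("commit timed out client-side: " + e);
            } else {
                throw e;
            }
        }
        server.shutdown();
    }
}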


Re: Commit and sessions

2012-01-27 Thread Sami Siren
On Fri, Jan 27, 2012 at 3:25 PM, Jan Høydahl  wrote:
> Hi,
>
> Yep, anything added between two commits must be regarded as lost in case of 
> crash.
> You can of course minimize this interval by using a low "commitWithin". But 
> after a crash you should always investigate whether the last minutes of adds 
> made it.

In addition to what Jan said, I think you also need to watch out for
out-of-memory exceptions and a full disk, because I think you
lose your docs (since the last commit) in those cases too.

--
 Sami Siren


Re: Commit and sessions

2012-01-27 Thread Jan Høydahl
Hi,

Yep, anything added between two commits must be regarded as lost in case of 
crash.
You can of course minimize this interval by using a low "commitWithin". But 
after a crash you should always investigate whether the last minutes of adds 
made it.

A transaction log feature is being developed, but not there yet: 
https://issues.apache.org/jira/browse/SOLR-2700

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 27. jan. 2012, at 13:05, Per Steffensen wrote:

> Hi
> 
> If I have added some documents to solr, but not done an explicit commit yet, and 
> I get a power outage, will I then lose data? Or, asked another way, does 
> data go into the persistent store before commit? How do I avoid the possibility of 
> losing data?
> 
> Does solr have some kind of session concept, so that different threads can 
> add documents to the same solr, and when one of them says "commit" only 
> the documents added by that thread get committed? Or is it always "all 
> documents added by any thread since the last commit" that get committed?
> 
> Regards, Per Steffensen
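
To Per's second question, a small sketch illustrating the answer implicit in
Jan's reply: commits in Solr are global to the index, not per thread or
session (URL and ids hypothetical; SolrJ 4.x API assumed) - a commit issued
by one client publishes documents added by every client:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class CommitIsGlobal {
    public static void main(String[] args) throws Exception {
        // Two clients pointed at the same index: no per-session isolation.
        SolrServer writerA = new HttpSolrServer("http://localhost:8983/solr/collection1");
        SolrServer writerB = new HttpSolrServer("http://localhost:8983/solr/collection1");

        SolrInputDocument a = new SolrInputDocument();
        a.addField("id", "from-A");
        writerA.add(a);

        SolrInputDocument b = new SolrInputDocument();
        b.addField("id", "from-B");
        writerB.add(b);

        // This single commit publishes BOTH documents, whoever added them.
        writerA.commit();

        writerA.shutdown();
        writerB.shutdown();
    }
}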



Re: Commit without an update handler?

2012-01-05 Thread Martin Koch
Yes.

However, something must actually have been updated in the index before a
commit on the master causes the slave to update (this is what was confusing
me).

Since I'll be updating the index fairly often, this will not be a problem
for me.

If, however, the external file field is updated often, but the index proper
isn't, this could be a problem.

Thanks,
/Martin

On Thu, Jan 5, 2012 at 2:56 PM, Erick Erickson wrote:

> Hmmm, does it work just to put this in the master's index, let
> replication do its tricks, and issue your commit on the master?
>
> Or am I missing something here?
>
> Best
> Erick
>
> On Tue, Jan 3, 2012 at 1:33 PM, Martin Koch  wrote:
> > Hi List
> >
> > I have a Solr cluster set up in a master/slave configuration where the
> > master acts as an indexing node and the slaves serve user requests.
> >
> > To avoid accidental posts of new documents to the slaves, I have disabled
> > the update handlers.
> >
> > However, I use an externalFileField. When the file is updated, I need to
> > issue a commit to reload the new file. This requires an update handler.
> Is
> > there an update handler that doesn't accept new documents, but will
> > effect a commit?
> >
> > Thanks,
> > /Martin
>
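
A sketch of the master-side commit Erick suggests (hypothetical master URL;
SolrJ 4.x API assumed); note Martin's caveat above that replication only
fires if the index itself changed:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class CommitOnMaster {
    public static void main(String[] args) throws Exception {
        // Commit on the master only; slaves pull the change via replication,
        // so their update handlers can stay disabled.
        SolrServer master = new HttpSolrServer("http://master:8983/solr/core0");
        // After rewriting the external file, a commit here reloads it.
        master.commit();
        master.shutdown();
    }
}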


Re: Commit without an update handler?

2012-01-05 Thread Erick Erickson
Hmmm, does it work just to put this in the master's index, let
replication do its tricks, and issue your commit on the master?

Or am I missing something here?

Best
Erick

On Tue, Jan 3, 2012 at 1:33 PM, Martin Koch  wrote:
> Hi List
>
> I have a Solr cluster set up in a master/slave configuration where the
> master acts as an indexing node and the slaves serve user requests.
>
> To avoid accidental posts of new documents to the slaves, I have disabled
> the update handlers.
>
> However, I use an externalFileField. When the file is updated, I need to
> issue a commit to reload the new file. This requires an update handler. Is
> there an update handler that doesn't accept new documents, but will effect
> a commit?
>
> Thanks,
> /Martin


Re: commit to jira and change Status and Resolution

2011-09-02 Thread Erick Erickson
Bug (ahem, that is, nudge) the committers over on the dev list to pick
it up and commit it. They'll alter the status, etc.


Best
Erick

On Thu, Sep 1, 2011 at 2:37 AM, Bernd Fehling
 wrote:
> Hi list,
>
> I have fixed an issue and created a patch (SOLR-2726) but how to
> change "Status" and "Resolution" in jira?
>
> And how to commit this, any idea?
>
> Regards,
> Bernd
>


Re: commit time and lock

2011-07-25 Thread Jonathan Rochkind

Thanks, this is helpful.

I do indeed periodically update or delete just about every doc in the 
index, so it makes sense that optimization might be necessary even 
post-1.4, but I'm still on 1.4 -- add this to another thing to look into 
rather than assume after I upgrade.


Indeed I was aware that it would trigger a pretty complete index 
replication, but since it seemed to greatly improve performance (in 1.4), 
so it goes. But yes, I'm STILL only updating once a day, even with 
all that. (And in fact, I'm only replicating once a day too, ha.)


On 7/25/2011 10:50 AM, Erick Erickson wrote:

Yeah, the 1.4 code base is "older". That is, optimization will have more
effect on that vintage code than on 3.x and trunk code.

I should have been a bit more explicit in that other thread. In the case
where you add a bunch of documents, optimization doesn't buy you all
that much currently. If you delete a bunch of docs (or update a bunch of
existing docs), then optimization will reclaim resources. So you *could*
have a case where the size of your index shrank drastically after
optimization (say you updated the same 100K documents 10 times then
optimized).

But even that is "it depends" (tm). The new segment merging, as I remember,
will possibly reclaim deleted resources, but I'm parroting people who actually
know, so you might want to verify that if it

Optimization will almost certainly trigger a complete index replication to any
slaves configured, though.

So the usual advice is to optimize maybe once a day or week during off hours
as a starting point unless and until you can verify that your
particular situation
warrants optimizing more frequently.

Best
Erick

On Fri, Jul 22, 2011 at 11:53 AM, Jonathan Rochkind  wrote:

How old is 'older'?  I'm pretty sure I'm still getting much faster performance 
on an optimized index in Solr 1.4.

This could be due to the nature of my index and queries (which include some 
medium sized stored fields, and extensive facetting -- facetting on up to a 
dozen fields in every request, where each field can include millions of unique 
values. Amazing I can do this with good performance at all!).

It's also possible i'm wrong about that faster performance, i haven't done 
robustly valid benchmarking on a clone of my production index yet. But it 
really looks like that way to me, from what investigation I have done.

If the answer is that optimization is believed no longer neccesary on versions 
LATER than 1.4, that might be the simplest explanation.

From: Pierre GOSSE [pierre.go...@arisem.com]
Sent: Friday, July 22, 2011 10:23 AM
To: solr-user@lucene.apache.org
Subject: RE: commit time and lock

Hi Mark

I've read that in a thread titled "Weird optimize performance degradation", where Erick Erickson 
states that "Older versions of Lucene would search faster on an optimized index, but this is no longer 
necessary.", and more recently in a thread you initiated a month ago, "Question about 
optimization".

I'll also be very interested if anyone has more precise ideas/data on the 
benefits and tradeoffs of optimize vs merge ...

Pierre


-----Original Message-----
From: Marc SCHNEIDER [mailto:marc.schneide...@gmail.com]
Sent: Friday, July 22, 2011 15:45
To: solr-user@lucene.apache.org
Subject: Re: commit time and lock

Hello,

Pierre, can you tell us where you read that?
"I've read here that optimization is not always a requirement to have an
efficient index, due to some low level changes in lucene 3.xx"

Marc.

On Fri, Jul 22, 2011 at 2:10 PM, Pierre GOSSE wrote:


Solr will still respond to searches during optimization, but commits will have to
wait for the end of the optimization process.

During optimization a new index is generated on disk by merging every
single file of the current index into one big file, so your server will be
busy, especially regarding disk access. This may affect your response time
and has a very negative effect on index replication if you have a
master/slave architecture.

I've read here that optimization is not always a requirement to have an
efficient index, due to some low-level changes in Lucene 3.x, so maybe you
don't really need optimization. What version of Solr are you using? Maybe
someone can point toward a relevant link about optimization other than the Solr
wiki:
http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations

Pierre


-----Message d'origine-
De : Jonty Rhods [mailto:jonty.rh...@gmail.com]
Envoyé : vendredi 22 juillet 2011 12:45
À : solr-user@lucene.apache.org
Objet : Re: commit time and lock

Thanks for clarity.

One more thing I want to know about optimization.

Right now I am planning to optimize the server in 24 hour. Optimization is
also time taking ( last time took around 13 minutes), so I want to know
that
:

Re: commit time and lock

2011-07-25 Thread Erick Erickson
Yeah, the 1.4 code base is "older". That is, optimization will have more
effect on that vintage code than on 3.x and trunk code.

I should have been a bit more explicit in that other thread. In the case
where you add a bunch of documents, optimization doesn't buy you all
that much currently. If you delete a bunch of docs (or update a bunch of
existing docs), then optimization will reclaim resources. So you *could*
have a case where the size of your index shrank drastically after
optimization (say you updated the same 100K documents 10 times then
optimized).

But even that is "it depends" (tm). The new segment merging, as I remember,
will possibly reclaim deleted resources, but I'm parroting people who actually
know, so you might want to verify that if it

Optimization will almost certainly trigger a complete index replication to any
slaves configured, though.

So the usual advice is to optimize maybe once a day or week during off hours
as a starting point unless and until you can verify that your
particular situation
warrants optimizing more frequently.

Best
Erick

On Fri, Jul 22, 2011 at 11:53 AM, Jonathan Rochkind  wrote:
> How old is 'older'?  I'm pretty sure I'm still getting much faster 
> performance on an optimized index in Solr 1.4.
>
> This could be due to the nature of my index and queries (which include some 
> medium sized stored fields, and extensive facetting -- facetting on up to a 
> dozen fields in every request, where each field can include millions of 
> unique values. Amazing I can do this with good performance at all!).
>
> It's also possible i'm wrong about that faster performance, i haven't done 
> robustly valid benchmarking on a clone of my production index yet. But it 
> really looks like that way to me, from what investigation I have done.
>
> If the answer is that optimization is believed no longer necessary on 
> versions LATER than 1.4, that might be the simplest explanation.
> 
> From: Pierre GOSSE [pierre.go...@arisem.com]
> Sent: Friday, July 22, 2011 10:23 AM
> To: solr-user@lucene.apache.org
> Subject: RE: commit time and lock
>
> Hi Mark
>
> I've read that in a thread titled "Weird optimize performance degradation", 
> where Erick Erickson states that "Older versions of Lucene would search 
> faster on an optimized index, but this is no longer necessary.", and more 
> recently in a thread you initiated a month ago, "Question about optimization".
>
> I'll also be very interested if anyone has more precise ideas/data on the 
> benefits and tradeoffs of optimize vs merge ...
>
> Pierre
>
>
> -----Original Message-----
> From: Marc SCHNEIDER [mailto:marc.schneide...@gmail.com]
> Sent: Friday, July 22, 2011 15:45
> To: solr-user@lucene.apache.org
> Subject: Re: commit time and lock
>
> Hello,
>
> Pierre, can you tell us where you read that?
> "I've read here that optimization is not always a requirement to have an
> efficient index, due to some low level changes in lucene 3.xx"
>
> Marc.
>
> On Fri, Jul 22, 2011 at 2:10 PM, Pierre GOSSE wrote:
>
>> Solr will still respond to searches during optimization, but commits will have to
>> wait for the end of the optimization process.
>>
>> During optimization a new index is generated on disk by merging every
>> single file of the current index into one big file, so your server will be
>> busy, especially regarding disk access. This may affect your response time
>> and has a very negative effect on index replication if you have a
>> master/slave architecture.
>>
>> I've read here that optimization is not always a requirement to have an
>> efficient index, due to some low-level changes in Lucene 3.x, so maybe you
>> don't really need optimization. What version of Solr are you using? Maybe
>> someone can point toward a relevant link about optimization other than the Solr
>> wiki:
>> http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations
>>
>> Pierre
>>
>>
>> -----Original Message-----
>> From: Jonty Rhods [mailto:jonty.rh...@gmail.com]
>> Sent: Friday, July 22, 2011 12:45
>> To: solr-user@lucene.apache.org
>> Subject: Re: commit time and lock
>>
>> Thanks for the clarity.
>>
>> One more thing I want to know about optimization.
>>
>> Right now I am planning to optimize the server every 24 hours. Optimization
>> also takes time (last time it took around 13 minutes), so I want to know:
>>
>> 1. when optimization is under process, will the solr server respond
>&

Re: commit time and lock

2011-07-24 Thread William Bell
What do the committers think about adding an index queue to Solr?

Then we could have lots of one-off index requests that would queue up...

On Fri, Jul 22, 2011 at 3:14 AM, Pierre GOSSE  wrote:
> Solr still responds to search queries during a commit; only new indexing 
> requests will have to wait (until the end of the commit?). So I don't think your 
> users will experience increased response times during commits (unless your 
> server is much undersized).
>
> Pierre
>
> -----Original Message-----
> From: Jonty Rhods [mailto:jonty.rh...@gmail.com]
> Sent: Thursday, July 21, 2011 20:27
> To: solr-user@lucene.apache.org
> Subject: Re: commit time and lock
>
> Actually I'm worried about the response time. I'm committing around 500
> docs every 5 minutes. As I know (correct me if I'm wrong), at the
> time of committing the solr server stops responding. My concern is how to
> minimize the response time so the user does not need to wait, or whether any
> other logic will be required for my case. Please suggest.
>
> regards
> jonty
>
> On Tuesday, June 21, 2011, Erick Erickson  wrote:
>> What is it you want help with? You haven't told us what the
>> problem you're trying to solve is. Are you asking how to
>> speed up indexing? What have you tried? Have you
>> looked at: http://wiki.apache.org/solr/FAQ#Performance?
>>
>> Best
>> Erick
>>
>> On Tue, Jun 21, 2011 at 2:16 AM, Jonty Rhods  wrote:
>>> I am using solrj to index the data. I have around 5000 docs indexed. At
>>> the time of commit, due to the lock, the server stops giving responses, so I was
>>> calculating the commit time:
>>>
>>> double starttemp = System.currentTimeMillis();
>>> server.add(docs);
>>> server.commit();
>>> System.out.println("total time in commit = " + (System.currentTimeMillis() -
>>> starttemp)/1000);
>>>
>>> It takes around 9 seconds to commit the 5000 docs with 15 fields. However, I
>>> cannot confirm whether the index lock starts
>>> at server.add(docs) or only at server.commit().
>>>
>>> If I am changing from above to following
>>>
>>> server.add(docs);
>>> double starttemp = System.currentTimeMillis();
>>> server.commit();
>>> System.out.println("total time in commit = " + (System.currentTimeMillis() -
>>> starttemp)/1000);
>>>
>>> then the commit time becomes less than 1 second. I am not sure which one is
>>> right.
>>>
>>> please help.
>>>
>>> regards
>>> Jonty
>>>
>>
>



-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076
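
A client-side approximation of the queue Bill suggests (class name, URL, and
batch sizes are hypothetical; HttpSolrServer is the SolrJ 4.x class - on the
1.4/3.x versions discussed here, CommonsHttpSolrServer works the same way):
one consumer thread drains a BlockingQueue and batches the adds, so many
one-off index requests never hit Solr individually:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class IndexQueue implements Runnable {
    private final BlockingQueue<SolrInputDocument> queue =
            new LinkedBlockingQueue<SolrInputDocument>();
    private final SolrServer server =
            new HttpSolrServer("http://localhost:8983/solr/collection1");

    public void submit(SolrInputDocument doc) throws InterruptedException {
        queue.put(doc); // callers return immediately
    }

    public void run() {
        List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
        try {
            while (true) {
                batch.add(queue.take());    // block for the first doc
                queue.drainTo(batch, 999);  // then grab up to 999 more
                server.add(batch);          // one batched add...
                server.commit();            // ...and one commit per batch
                batch.clear();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}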


Re: commit time and lock

2011-07-22 Thread Shawn Heisey

On 7/22/2011 9:32 AM, Pierre GOSSE wrote:

Merging does not happen often enough to keep deleted documents to a low enough 
count?

Maybe there's a need to have "partial" optimization available in solr, meaning 
that a segment with too many deleted documents could be copied to a new file without 
unnecessary data. That way, cleaning out deleted data could be compatible with having light 
replications.

I'm worried by this idea of deleted documents influencing relevance scores; any 
pointers to how important this influence may be?


I've got a pretty high mergeFactor, for fast indexing. Also, I want to 
know for sure and control when merges happen, so I am not leaving it up 
to Lucene/Solr.


Right now the largest number of deleted documents on any shard at this 
moment is 45347.  The shard (17.65GB) contains 9663271 documents, in six 
segments.  That will be one HUGE segment (from the last optimize) and 
five very very tiny segments, each with only a few thousand documents in 
them.  Tonight when the document distribution process runs, that index 
will be optimized again.  Tomorrow night a different shard will be 
optimized.


Deleted documents can (and do) happen anywhere in the index, so even if 
I had a lot of largish segments rather than one huge segment, it's very 
likely that just expunging deletes would still result in the entire 
index being merged, so I am not losing anything by doing a full 
optimize, and I am gaining a small bit of performance.


The 45000 deletes mentioned above represent less than half a percent of 
the shard, so the influence on relevance is *probably* not large ... but 
that's not something I can say definitively.  I think it all depends on 
what people are searching for and how common the terms in the deleted 
documents are.


Thanks,
Shawn
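
A sketch of the scheduled nightly optimize Shawn describes, run from cron
during off-hours (shard URL hypothetical; SolrJ 4.x class names - on 1.4/3.x
the analogous CommonsHttpSolrServer works the same way). Optimize blocks and
rewrites the index, so schedule it accordingly:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class NightlyOptimize {
    public static void main(String[] args) throws Exception {
        SolrServer shard = new HttpSolrServer("http://localhost:8983/solr/shard1");
        // waitFlush=true, waitSearcher=true, maxSegments=1:
        // merge down to a single segment, expunging deleted docs.
        shard.optimize(true, true, 1);
        shard.shutdown();
    }
}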



RE: commit time and lock

2011-07-22 Thread Jonathan Rochkind
How old is 'older'?  I'm pretty sure I'm still getting much faster performance 
on an optimized index in Solr 1.4. 

This could be due to the nature of my index and queries (which include some 
medium sized stored fields, and extensive facetting -- facetting on up to a 
dozen fields in every request, where each field can include millions of unique 
values. Amazing I can do this with good performance at all!). 

It's also possible i'm wrong about that faster performance, i haven't done 
robustly valid benchmarking on a clone of my production index yet. But it 
really looks like that way to me, from what investigation I have done. 

If the answer is that optimization is believed no longer necessary on versions 
LATER than 1.4, that might be the simplest explanation. 

From: Pierre GOSSE [pierre.go...@arisem.com]
Sent: Friday, July 22, 2011 10:23 AM
To: solr-user@lucene.apache.org
Subject: RE: commit time and lock

Hi Mark

I've read that in a thread titled "Weird optimize performance degradation", 
where Erick Erickson states that "Older versions of Lucene would search faster 
on an optimized index, but this is no longer necessary.", and more recently in 
a thread you initiated a month ago, "Question about optimization".

I'll also be very interested if anyone has more precise ideas/data on the 
benefits and tradeoffs of optimize vs merge ...

Pierre


-----Original Message-----
From: Marc SCHNEIDER [mailto:marc.schneide...@gmail.com]
Sent: Friday, July 22, 2011 15:45
To: solr-user@lucene.apache.org
Subject: Re: commit time and lock

Hello,

Pierre, can you tell us where you read that?
"I've read here that optimization is not always a requirement to have an
efficient index, due to some low level changes in lucene 3.xx"

Marc.

On Fri, Jul 22, 2011 at 2:10 PM, Pierre GOSSE wrote:

> Solr will still respond to searches during optimization, but commits will have to
> wait for the end of the optimization process.
>
> During optimization a new index is generated on disk by merging every
> single file of the current index into one big file, so your server will be
> busy, especially regarding disk access. This may affect your response time
> and has a very negative effect on index replication if you have a
> master/slave architecture.
>
> I've read here that optimization is not always a requirement to have an
> efficient index, due to some low level changes in lucene 3.xx, so maybe you
> don't really need optimization. What version of solr are you using ? Maybe
> someone can point toward a relevant link about optimization other than solr
> wiki
> http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations
>
> Pierre
>
>
> -----Original Message-----
> From: Jonty Rhods [mailto:jonty.rh...@gmail.com]
> Sent: Friday, July 22, 2011 12:45
> To: solr-user@lucene.apache.org
> Subject: Re: commit time and lock
>
> Thanks for the clarity.
>
> One more thing I want to know about optimization.
>
> Right now I am planning to optimize the server every 24 hours. Optimization
> also takes time (last time it took around 13 minutes), so I want to know:
>
> 1. While optimization is in progress, will the Solr server respond or not?
> 2. If the server will not respond, how can we do the optimization faster, or
> in some other way, so our users will not have to wait for the optimization
> process to finish?
>
> regards
> Jonty
>
>
>
> On Fri, Jul 22, 2011 at 2:44 PM, Pierre GOSSE  >wrote:
>
> > Solr still responds to search queries during commit; only new indexing
> > requests will have to wait (until the end of the commit?). So I don't think your
> > users will experience increased response times during commits (unless your
> > server is much undersized).
> >
> > Pierre
> >
> > -----Original Message-----
> > From: Jonty Rhods [mailto:jonty.rh...@gmail.com]
> > Sent: Thursday, July 21, 2011 20:27
> > To: solr-user@lucene.apache.org
> > Subject: Re: commit time and lock
> >
> > Actually I'm worried about the response time. I'm committing around 500
> > docs every 5 minutes. As I know (correct me if I'm wrong), at the
> > time of committing the Solr server stops responding. My concern is how to
> > minimize the response time so the user does not need to wait, or whether
> > any other logic is required for my case. Please suggest.
> >
> > regards
> > jonty
> >
> > On Tuesday, June 21, 2011, Erick Erickson 
> wrote:
> > > What is it you want help with? You haven't told us what the
> > > problem you're trying to solve is. Are you asking how 

RE: commit time and lock

2011-07-22 Thread Pierre GOSSE
Merging does not happen often enough to keep deleted documents to a low enough
count?

Maybe there's a need to have "partial" optimization available in Solr, meaning
that segments with too many deleted documents could be copied to a new file
without the unnecessary data. That way, cleaning out deleted data could be
compatible with having light replications.

I'm worried by this idea of deleted documents influencing relevance scores; any
pointer to how important this influence may be?

Pierre

-----Original Message-----
From: Shawn Heisey [mailto:s...@elyograg.org]
Sent: Friday, July 22, 2011 16:42
To: solr-user@lucene.apache.org
Subject: Re: commit time and lock

On 7/22/2011 8:23 AM, Pierre GOSSE wrote:
> I've read that in a thread titled "Weird optimize performance degradation",
> where Erick Erickson states that "Older versions of Lucene would search
> faster on an optimized index, but this is no longer necessary.", and more
> recently in a thread you initiated a month ago, "Question about optimization".
>
> I'll also be very interested if anyone has a more precise idea/data on the
> benefits and tradeoffs of optimize vs merge ...

My most recent testing has been with Solr 3.2.0.  I have noticed some 
speedup after optimizing an index, but the gain is not 
earth-shattering.  My index consists of 7 shards.  One of them is small, 
and receives all new documents every two minutes.  The others are large, 
and aside from deletes, are mostly static.  Once a day, the oldest data 
is distributed from the small shard to its proper place in the other six 
shards.

The small shard is optimized once an hour, and usually takes less than a 
minute.  I optimize one large shard every day, so each one gets 
optimized once every six days.  That optimize takes 10-15 minutes.  The 
only reason that I optimize is to remove deleted documents, whatever 
speedup I get is just icing on the cake.  Deleted documents take up 
space and continue to influence the relevance scoring of queries, so I 
want to remove them.

Thanks,
Shawn



Re: commit time and lock

2011-07-22 Thread Shawn Heisey

On 7/22/2011 8:23 AM, Pierre GOSSE wrote:

I've read that in a thread titled "Weird optimize performance degradation", where Erick Erickson
states that "Older versions of Lucene would search faster on an optimized index, but this is no longer
necessary.", and more recently in a thread you initiated a month ago, "Question about
optimization".

I'll also be very interested if anyone has a more precise idea/data on the
benefits and tradeoffs of optimize vs merge ...


My most recent testing has been with Solr 3.2.0.  I have noticed some 
speedup after optimizing an index, but the gain is not 
earth-shattering.  My index consists of 7 shards.  One of them is small, 
and receives all new documents every two minutes.  The others are large, 
and aside from deletes, are mostly static.  Once a day, the oldest data 
is distributed from the small shard to its proper place in the other six 
shards.


The small shard is optimized once an hour, and usually takes less than a 
minute.  I optimize one large shard every day, so each one gets 
optimized once every six days.  That optimize takes 10-15 minutes.  The 
only reason that I optimize is to remove deleted documents, whatever 
speedup I get is just icing on the cake.  Deleted documents take up 
space and continue to influence the relevance scoring of queries, so I 
want to remove them.
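That influence is visible at the Lucene level: term statistics still count
deleted documents until a merge physically drops them. A tiny sketch of
checking how many deleted documents remain (Lucene 3.x-era API; the index path
is a placeholder):

import java.io.File;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.FSDirectory;

public class DeletedDocCount {
    public static void main(String[] args) throws Exception {
        IndexReader reader = IndexReader.open(FSDirectory.open(new File("/path/to/index")));
        System.out.println("maxDoc  (incl. deleted): " + reader.maxDoc());
        System.out.println("numDocs (live only)    : " + reader.numDocs());
        System.out.println("deleted, still on disk : " + (reader.maxDoc() - reader.numDocs()));
        reader.close();
    }
}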


Thanks,
Shawn



RE: commit time and lock

2011-07-22 Thread Pierre GOSSE
Hi Mark

I've read that in a thread titled "Weird optimize performance degradation",
where Erick Erickson states that "Older versions of Lucene would search faster
on an optimized index, but this is no longer necessary.", and more recently in
a thread you initiated a month ago, "Question about optimization".

I'll also be very interested if anyone has a more precise idea/data on the
benefits and tradeoffs of optimize vs merge ...

Pierre


-----Original Message-----
From: Marc SCHNEIDER [mailto:marc.schneide...@gmail.com]
Sent: Friday, July 22, 2011 15:45
To: solr-user@lucene.apache.org
Subject: Re: commit time and lock

Hello,

Pierre, can you tell us where you read that?
"I've read here that optimization is not always a requirement to have an
efficient index, due to some low level changes in lucene 3.xx"

Marc.

On Fri, Jul 22, 2011 at 2:10 PM, Pierre GOSSE wrote:

> Solr will respond to searches during optimization, but commits will have to
> wait for the end of the optimization process.
>
> During optimization a new index is generated on disk by merging every
> single file of the current index into one big file, so your server will be
> busy, especially regarding disk access. This may alter your response time
> and has a very negative effect on the replication of the index if you have a
> master/slave architecture.
>
> I've read here that optimization is not always a requirement to have an
> efficient index, due to some low-level changes in Lucene 3.xx, so maybe you
> don't really need optimization. What version of Solr are you using? Maybe
> someone can point toward a relevant link about optimization other than the Solr
> wiki
> http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations
>
> Pierre
>
>
> -----Original Message-----
> From: Jonty Rhods [mailto:jonty.rh...@gmail.com]
> Sent: Friday, July 22, 2011 12:45
> To: solr-user@lucene.apache.org
> Subject: Re: commit time and lock
>
> Thanks for the clarity.
>
> One more thing I want to know about optimization.
>
> Right now I am planning to optimize the server every 24 hours. Optimization
> also takes time (last time it took around 13 minutes), so I want to know:
>
> 1. While optimization is in progress, will the Solr server respond or not?
> 2. If the server will not respond, how can we do the optimization faster, or
> in some other way, so our users will not have to wait for the optimization
> process to finish?
>
> regards
> Jonty
>
>
>
> On Fri, Jul 22, 2011 at 2:44 PM, Pierre GOSSE  >wrote:
>
> > Solr still responds to search queries during commit; only new indexing
> > requests will have to wait (until the end of the commit?). So I don't think your
> > users will experience increased response times during commits (unless your
> > server is much undersized).
> >
> > Pierre
> >
> > -----Original Message-----
> > From: Jonty Rhods [mailto:jonty.rh...@gmail.com]
> > Sent: Thursday, July 21, 2011 20:27
> > To: solr-user@lucene.apache.org
> > Subject: Re: commit time and lock
> >
> > Actually I'm worried about the response time. I'm committing around 500
> > docs every 5 minutes. As I know (correct me if I'm wrong), at the
> > time of committing the Solr server stops responding. My concern is how to
> > minimize the response time so the user does not need to wait, or whether
> > any other logic is required for my case. Please suggest.
> >
> > regards
> > jonty
> >
> > On Tuesday, June 21, 2011, Erick Erickson 
> wrote:
> > > What is it you want help with? You haven't told us what the
> > > problem you're trying to solve is. Are you asking how to
> > > speed up indexing? What have you tried? Have you
> > > looked at: http://wiki.apache.org/solr/FAQ#Performance?
> > >
> > > Best
> > > Erick
> > >
> > > On Tue, Jun 21, 2011 at 2:16 AM, Jonty Rhods 
> > wrote:
> > >> I am using SolrJ to index the data. I have around 5 docs indexed. At
> > >> the time of commit, due to the lock, the server stops responding, so I was
> > >> calculating the commit time:
> > >>
> > >> double starttemp = System.currentTimeMillis();
> > >> server.add(docs);
> > >> server.commit();
> > >> System.out.println("total time in commit = " +
> > (System.currentTimeMillis() -
> > >> starttemp)/1000);
> > >>
> > >> It is taking around 9 seconds to commit the 5000 docs with 15 fields.
> > >> However, I cannot confirm whether the index lock starts at
> > >> server.add(docs); or only at server.commit();.
> > >>
> > >> If I am changing from above to following
> > >>
> > >> server.add(docs);
> > >> double starttemp = System.currentTimeMillis();
> > >> server.commit();
> > >> System.out.println("total time in commit = " +
> > (System.currentTimeMillis() -
> > >> starttemp)/1000);
> > >>
> > >> then the commit time becomes less than 1 second. I am not sure which
> > >> one is right.
> > >>
> > >> please help.
> > >>
> > >> regards
> > >> Jonty
> > >>
> > >
> >
>


Re: commit time and lock

2011-07-22 Thread Marc SCHNEIDER
Hello,

Pierre, can you tell us where you read that?
"I've read here that optimization is not always a requirement to have an
efficient index, due to some low level changes in lucene 3.xx"

Marc.

On Fri, Jul 22, 2011 at 2:10 PM, Pierre GOSSE wrote:

> Solr will respond to searches during optimization, but commits will have to
> wait for the end of the optimization process.
>
> During optimization a new index is generated on disk by merging every
> single file of the current index into one big file, so your server will be
> busy, especially regarding disk access. This may alter your response time
> and has a very negative effect on the replication of the index if you have a
> master/slave architecture.
>
> I've read here that optimization is not always a requirement to have an
> efficient index, due to some low-level changes in Lucene 3.xx, so maybe you
> don't really need optimization. What version of Solr are you using? Maybe
> someone can point toward a relevant link about optimization other than the Solr
> wiki
> http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations
>
> Pierre
>
>
> -----Original Message-----
> From: Jonty Rhods [mailto:jonty.rh...@gmail.com]
> Sent: Friday, July 22, 2011 12:45
> To: solr-user@lucene.apache.org
> Subject: Re: commit time and lock
>
> Thanks for the clarity.
>
> One more thing I want to know about optimization.
>
> Right now I am planning to optimize the server every 24 hours. Optimization
> also takes time (last time it took around 13 minutes), so I want to know:
>
> 1. While optimization is in progress, will the Solr server respond or not?
> 2. If the server will not respond, how can we do the optimization faster, or
> in some other way, so our users will not have to wait for the optimization
> process to finish?
>
> regards
> Jonty
>
>
>
> On Fri, Jul 22, 2011 at 2:44 PM, Pierre GOSSE  >wrote:
>
> > Solr still responds to search queries during commit; only new indexing
> > requests will have to wait (until the end of the commit?). So I don't think your
> > users will experience increased response times during commits (unless your
> > server is much undersized).
> >
> > Pierre
> >
> > -----Original Message-----
> > From: Jonty Rhods [mailto:jonty.rh...@gmail.com]
> > Sent: Thursday, July 21, 2011 20:27
> > To: solr-user@lucene.apache.org
> > Subject: Re: commit time and lock
> >
> > Actually I'm worried about the response time. I'm committing around 500
> > docs every 5 minutes. As I know (correct me if I'm wrong), at the
> > time of committing the Solr server stops responding. My concern is how to
> > minimize the response time so the user does not need to wait, or whether
> > any other logic is required for my case. Please suggest.
> >
> > regards
> > jonty
> >
> > On Tuesday, June 21, 2011, Erick Erickson 
> wrote:
> > > What is it you want help with? You haven't told us what the
> > > problem you're trying to solve is. Are you asking how to
> > > speed up indexing? What have you tried? Have you
> > > looked at: http://wiki.apache.org/solr/FAQ#Performance?
> > >
> > > Best
> > > Erick
> > >
> > > On Tue, Jun 21, 2011 at 2:16 AM, Jonty Rhods 
> > wrote:
> > >> I am using SolrJ to index the data. I have around 5 docs indexed. At
> > >> the time of commit, due to the lock, the server stops responding, so I was
> > >> calculating the commit time:
> > >>
> > >> double starttemp = System.currentTimeMillis();
> > >> server.add(docs);
> > >> server.commit();
> > >> System.out.println("total time in commit = " +
> > (System.currentTimeMillis() -
> > >> starttemp)/1000);
> > >>
> > >> It is taking around 9 seconds to commit the 5000 docs with 15 fields.
> > >> However, I cannot confirm whether the index lock starts at
> > >> server.add(docs); or only at server.commit();.
> > >>
> > >> If I am changing from above to following
> > >>
> > >> server.add(docs);
> > >> double starttemp = System.currentTimeMillis();
> > >> server.commit();
> > >> System.out.println("total time in commit = " +
> > (System.currentTimeMillis() -
> > >> starttemp)/1000);
> > >>
> > >> then the commit time becomes less than 1 second. I am not sure which
> > >> one is right.
> > >>
> > >> please help.
> > >>
> > >> regards
> > >> Jonty
> > >>
> > >
> >
>


RE: commit time and lock

2011-07-22 Thread Pierre GOSSE
Solr will respond to searches during optimization, but commits will have to
wait for the end of the optimization process.

During optimization a new index is generated on disk by merging every single
file of the current index into one big file, so your server will be busy,
especially regarding disk access. This may alter your response time and has a
very negative effect on the replication of the index if you have a master/slave
architecture.

I've read here that optimization is not always a requirement to have an
efficient index, due to some low-level changes in Lucene 3.xx, so maybe you
don't really need optimization. What version of Solr are you using? Maybe
someone can point toward a relevant link about optimization other than the Solr
wiki
http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations

Pierre


-----Original Message-----
From: Jonty Rhods [mailto:jonty.rh...@gmail.com]
Sent: Friday, July 22, 2011 12:45
To: solr-user@lucene.apache.org
Subject: Re: commit time and lock

Thanks for the clarity.

One more thing I want to know about optimization.

Right now I am planning to optimize the server every 24 hours. Optimization
also takes time (last time it took around 13 minutes), so I want to know:

1. While optimization is in progress, will the Solr server respond or not?
2. If the server will not respond, how can we do the optimization faster, or in
some other way, so our users will not have to wait for the optimization process
to finish?

regards
Jonty



On Fri, Jul 22, 2011 at 2:44 PM, Pierre GOSSE wrote:

> Solr still responds to search queries during commit; only new indexing
> requests will have to wait (until the end of the commit?). So I don't think your
> users will experience increased response times during commits (unless your
> server is much undersized).
>
> Pierre
>
> -----Original Message-----
> From: Jonty Rhods [mailto:jonty.rh...@gmail.com]
> Sent: Thursday, July 21, 2011 20:27
> To: solr-user@lucene.apache.org
> Subject: Re: commit time and lock
>
> Actually I'm worried about the response time. I'm committing around 500
> docs every 5 minutes. As I know (correct me if I'm wrong), at the
> time of committing the Solr server stops responding. My concern is how to
> minimize the response time so the user does not need to wait, or whether
> any other logic is required for my case. Please suggest.
>
> regards
> jonty
>
> On Tuesday, June 21, 2011, Erick Erickson  wrote:
> > What is it you want help with? You haven't told us what the
> > problem you're trying to solve is. Are you asking how to
> > speed up indexing? What have you tried? Have you
> > looked at: http://wiki.apache.org/solr/FAQ#Performance?
> >
> > Best
> > Erick
> >
> > On Tue, Jun 21, 2011 at 2:16 AM, Jonty Rhods 
> wrote:
> >> I am using SolrJ to index the data. I have around 5 docs indexed. At
> >> the time of commit, due to the lock, the server stops responding, so I was
> >> calculating the commit time:
> >>
> >> double starttemp = System.currentTimeMillis();
> >> server.add(docs);
> >> server.commit();
> >> System.out.println("total time in commit = " +
> (System.currentTimeMillis() -
> >> starttemp)/1000);
> >>
> >> It is taking around 9 seconds to commit the 5000 docs with 15 fields.
> >> However, I cannot confirm whether the index lock starts at
> >> server.add(docs); or only at server.commit();.
> >>
> >> If I am changing from above to following
> >>
> >> server.add(docs);
> >> double starttemp = System.currentTimeMillis();
> >> server.commit();
> >> System.out.println("total time in commit = " +
> (System.currentTimeMillis() -
> >> starttemp)/1000);
> >>
> >> then the commit time becomes less than 1 second. I am not sure which one
> >> is right.
> >>
> >> please help.
> >>
> >> regards
> >> Jonty
> >>
> >
>


Re: commit time and lock

2011-07-22 Thread Jonty Rhods
Thanks for the clarity.

One more thing I want to know about optimization.

Right now I am planning to optimize the server every 24 hours. Optimization
also takes time (last time it took around 13 minutes), so I want to know:

1. While optimization is in progress, will the Solr server respond or not?
2. If the server will not respond, how can we do the optimization faster, or in
some other way, so our users will not have to wait for the optimization process
to finish?

regards
Jonty



On Fri, Jul 22, 2011 at 2:44 PM, Pierre GOSSE wrote:

> Solr still responds to search queries during commit; only new indexing
> requests will have to wait (until the end of the commit?). So I don't think your
> users will experience increased response times during commits (unless your
> server is much undersized).
>
> Pierre
>
> -----Original Message-----
> From: Jonty Rhods [mailto:jonty.rh...@gmail.com]
> Sent: Thursday, July 21, 2011 20:27
> To: solr-user@lucene.apache.org
> Subject: Re: commit time and lock
>
> Actually I'm worried about the response time. I'm committing around 500
> docs every 5 minutes. As I know (correct me if I'm wrong), at the
> time of committing the Solr server stops responding. My concern is how to
> minimize the response time so the user does not need to wait, or whether
> any other logic is required for my case. Please suggest.
>
> regards
> jonty
>
> On Tuesday, June 21, 2011, Erick Erickson  wrote:
> > What is it you want help with? You haven't told us what the
> > problem you're trying to solve is. Are you asking how to
> > speed up indexing? What have you tried? Have you
> > looked at: http://wiki.apache.org/solr/FAQ#Performance?
> >
> > Best
> > Erick
> >
> > On Tue, Jun 21, 2011 at 2:16 AM, Jonty Rhods 
> wrote:
> >> I am using SolrJ to index the data. I have around 5 docs indexed. At
> >> the time of commit, due to the lock, the server stops responding, so I was
> >> calculating the commit time:
> >>
> >> double starttemp = System.currentTimeMillis();
> >> server.add(docs);
> >> server.commit();
> >> System.out.println("total time in commit = " +
> (System.currentTimeMillis() -
> >> starttemp)/1000);
> >>
> >> It is taking around 9 seconds to commit the 5000 docs with 15 fields.
> >> However, I cannot confirm whether the index lock starts at
> >> server.add(docs); or only at server.commit();.
> >>
> >> If I am changing from above to following
> >>
> >> server.add(docs);
> >> double starttemp = System.currentTimeMillis();
> >> server.commit();
> >> System.out.println("total time in commit = " +
> (System.currentTimeMillis() -
> >> starttemp)/1000);
> >>
> >> then the commit time becomes less than 1 second. I am not sure which one
> >> is right.
> >>
> >> please help.
> >>
> >> regards
> >> Jonty
> >>
> >
>


RE: commit time and lock

2011-07-22 Thread Pierre GOSSE
Solr still responds to search queries during commit; only new indexing
requests will have to wait (until the end of the commit?). So I don't think your users
will experience increased response times during commits (unless your server is
much undersized).

Pierre

-----Original Message-----
From: Jonty Rhods [mailto:jonty.rh...@gmail.com]
Sent: Thursday, July 21, 2011 20:27
To: solr-user@lucene.apache.org
Subject: Re: commit time and lock

Actually I'm worried about the response time. I'm committing around 500
docs every 5 minutes. As I know (correct me if I'm wrong), at the
time of committing the Solr server stops responding. My concern is how to
minimize the response time so the user does not need to wait, or whether
any other logic is required for my case. Please suggest.

regards
jonty

On Tuesday, June 21, 2011, Erick Erickson  wrote:
> What is it you want help with? You haven't told us what the
> problem you're trying to solve is. Are you asking how to
> speed up indexing? What have you tried? Have you
> looked at: http://wiki.apache.org/solr/FAQ#Performance?
>
> Best
> Erick
>
> On Tue, Jun 21, 2011 at 2:16 AM, Jonty Rhods  wrote:
>> I am using SolrJ to index the data. I have around 5 docs indexed. At
>> the time of commit, due to the lock, the server stops responding, so I was
>> calculating the commit time:
>>
>> double starttemp = System.currentTimeMillis();
>> server.add(docs);
>> server.commit();
>> System.out.println("total time in commit = " + (System.currentTimeMillis() -
>> starttemp)/1000);
>>
>> It is taking around 9 seconds to commit the 5000 docs with 15 fields.
>> However, I cannot confirm whether the index lock starts at
>> server.add(docs); or only at server.commit();.
>>
>> If I am changing from above to following
>>
>> server.add(docs);
>> double starttemp = System.currentTimeMillis();
>> server.commit();
>> System.out.println("total time in commit = " + (System.currentTimeMillis() -
>> starttemp)/1000);
>>
>> then the commit time becomes less than 1 second. I am not sure which one
>> is right.
>>
>> please help.
>>
>> regards
>> Jonty
>>
>
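A tiny sketch of what Pierre describes above -- searches still being served
while a commit runs in another thread. The URL is made up for the illustration:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class SearchDuringCommit {
    public static void main(String[] args) throws Exception {
        final SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        Thread committer = new Thread(new Runnable() {
            public void run() {
                try { server.commit(); } catch (Exception e) { e.printStackTrace(); }
            }
        });
        committer.start();
        // While the commit is in flight, queries are still answered:
        QueryResponse rsp = server.query(new SolrQuery("*:*"));
        System.out.println("hits during commit: " + rsp.getResults().getNumFound());
        committer.join();
    }
}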


Re: commit time and lock

2011-07-21 Thread Jonty Rhods
Actually I'm worried about the response time. I'm committing around 500
docs every 5 minutes. As I know (correct me if I'm wrong), at the
time of committing the Solr server stops responding. My concern is how to
minimize the response time so the user does not need to wait, or whether
any other logic is required for my case. Please suggest.

regards
jonty

On Tuesday, June 21, 2011, Erick Erickson  wrote:
> What is it you want help with? You haven't told us what the
> problem you're trying to solve is. Are you asking how to
> speed up indexing? What have you tried? Have you
> looked at: http://wiki.apache.org/solr/FAQ#Performance?
>
> Best
> Erick
>
> On Tue, Jun 21, 2011 at 2:16 AM, Jonty Rhods  wrote:
>> I am using SolrJ to index the data. I have around 5 docs indexed. At
>> the time of commit, due to the lock, the server stops responding, so I was
>> calculating the commit time:
>>
>> double starttemp = System.currentTimeMillis();
>> server.add(docs);
>> server.commit();
>> System.out.println("total time in commit = " + (System.currentTimeMillis() -
>> starttemp)/1000);
>>
>> It is taking around 9 seconds to commit the 5000 docs with 15 fields.
>> However, I cannot confirm whether the index lock starts at
>> server.add(docs); or only at server.commit();.
>>
>> If I am changing from above to following
>>
>> server.add(docs);
>> double starttemp = System.currentTimeMillis();
>> server.commit();
>> System.out.println("total time in commit = " + (System.currentTimeMillis() -
>> starttemp)/1000);
>>
>> then the commit time becomes less than 1 second. I am not sure which one
>> is right.
>>
>> please help.
>>
>> regards
>> Jonty
>>
>
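One way to avoid blocking on an explicit commit after every batch, in a setup
like the one described above, is commitWithin: ask Solr to make the documents
visible within a time window in the background. A minimal sketch, assuming a
Solr/SolrJ version that supports commitWithin:

import java.util.Collection;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.common.SolrInputDocument;

public class AddWithCommitWithin {
    public static void add(SolrServer server, Collection<SolrInputDocument> docs) throws Exception {
        UpdateRequest req = new UpdateRequest();
        req.add(docs);
        req.setCommitWithin(5 * 60 * 1000); // commit within 5 minutes, asynchronously
        req.process(server);
    }
}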


Re: commit time and lock

2011-06-22 Thread Ranveer

Dear all,

Kindly help me..

thanks

On Tuesday 21 June 2011 11:46 AM, Jonty Rhods wrote:

I am using SolrJ to index the data. I have around 5 docs indexed. At
the time of commit, due to the lock, the server stops responding, so I was
calculating the commit time:

double starttemp = System.currentTimeMillis();
server.add(docs);
server.commit();
System.out.println("total time in commit = " + (System.currentTimeMillis() -
starttemp)/1000);

It is taking around 9 seconds to commit the 5000 docs with 15 fields.
However, I cannot confirm whether the index lock starts at
server.add(docs); or only at server.commit();.

If I am changing from above to following

server.add(docs);
double starttemp = System.currentTimeMillis();
server.commit();
System.out.println("total time in commit = " + (System.currentTimeMillis() -
starttemp)/1000);

then the commit time becomes less than 1 second. I am not sure which one
is right.

please help.

regards
Jonty
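Both measurements above are "right" -- they just measure different things.
The first includes the add (network transfer plus analysis and buffering);
the second times only the commit. A small variant that times the two phases
separately (class and method names invented for the sketch):

import java.util.Collection;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.common.SolrInputDocument;

public class TimedUpdate {
    public static void time(SolrServer server, Collection<SolrInputDocument> docs) throws Exception {
        long t0 = System.currentTimeMillis();
        server.add(docs);  // streams, analyzes and buffers the documents
        long t1 = System.currentTimeMillis();
        server.commit();   // flush + possible merges + open a new searcher
        long t2 = System.currentTimeMillis();
        System.out.println("add    took " + (t1 - t0) + " ms");
        System.out.println("commit took " + (t2 - t1) + " ms");
    }
}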





Re: commit time and lock

2011-06-21 Thread Erick Erickson
What is it you want help with? You haven't told us what the
problem you're trying to solve is. Are you asking how to
speed up indexing? What have you tried? Have you
looked at: http://wiki.apache.org/solr/FAQ#Performance?

Best
Erick

On Tue, Jun 21, 2011 at 2:16 AM, Jonty Rhods  wrote:
> I am using SolrJ to index the data. I have around 5 docs indexed. At
> the time of commit, due to the lock, the server stops responding, so I was
> calculating the commit time:
>
> double starttemp = System.currentTimeMillis();
> server.add(docs);
> server.commit();
> System.out.println("total time in commit = " + (System.currentTimeMillis() -
> starttemp)/1000);
>
> It is taking around 9 seconds to commit the 5000 docs with 15 fields.
> However, I cannot confirm whether the index lock starts at
> server.add(docs); or only at server.commit();.
>
> If I am changing from above to following
>
> server.add(docs);
> double starttemp = System.currentTimeMillis();
> server.commit();
> System.out.println("total time in commit = " + (System.currentTimeMillis() -
> starttemp)/1000);
>
> then the commit time becomes less than 1 second. I am not sure which one
> is right.
>
> please help.
>
> regards
> Jonty
>


Re: Commit taking very long

2011-06-07 Thread Erick Erickson
Are you optimizing? That is unnecessary when committing, and is often the
culprit.


Best
Erick

On Tue, Jun 7, 2011 at 5:42 AM, Rohit Gupta  wrote:
> Hi,
>
> My commit seems to be taking too much time; as you can see from the DataImport
> status given below, committing 1000 docs is taking longer than 24 minutes
>
> 
> busy
> A command is still running...
> -
> 
> 0:24:43.156
> 1001
> 1658
> 0
> 2011-06-07 09:15:17
> -
> 
> Indexing completed. Added/Updated: 1000 documents. Deleted 0 documents.
> 
> 
>
> What could be causing this? I have tried looking for a reason or a way to
> improve this, but I am just not able to find one. At this rate my documents
> would never get indexed, given that I have more than 100,000 records coming
> into the database every hour.
>
> Regards,
> Rohit


Re: commit configuration

2011-05-26 Thread Markus Jelsma
http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/example/solr/conf/solrconfig.xml

Look for autocommit and maxDocs.

> Hi,
> 
> I'm using DIH and want to perform commits every N processed documents; how
> can I do this?
> thanks in advance


Re: commit=true has no effect

2010-11-24 Thread stockii

It's so strange ...

- I copied the solrconfig.xml from the core that works: no change.
- I deleted all the fields in my query and changed it to a simple query with two
fields: still no commit ...


-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/commit-true-has-no-effect-tp1952567p1959587.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: commit=true has no effect

2010-11-24 Thread stockii

DIH Response XML:

2
1
0
2010-11-24 09:56:11
2010-11-24 09:56:11
2010-11-24 09:56:11
2010-11-24 09:56:11
1
0
0:0:0.234

Here I am missing the lines:

Indexing completed. Added/Updated: 1 documents. Deleted 0
documents.
2010-11-24 09:57:08
2010-11-24 09:57:08

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/commit-true-has-no-effect-tp1952567p1959524.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: commit=true has no effect

2010-11-24 Thread stockii

Here is my query:

http://lucene.472066.n3.nabble.com/commit-true-has-no-effect-tp1952567p1959429.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: commit=true has no effect

2010-11-23 Thread stockiii

Okay, sorry and thanks for the reply.

I know the links that you posted, and I know most of the DIH settings from the wiki.
I'm not new to Solr ... After a delta, DIH tells me that some documents
changed, but it doesn't want to commit. The query is not broken -- I checked this,
changed the query and experimented with it, but with no effect. In a few hours I can
post the queries.

I tried all the DIH options: waitFlush = waitSearcher = false, clean, optimize,
commit of course ;) ... but with no effect. A full import imports the
missing documents, but delta does not. Delta finds the entries in the DB but they don't
change...

solrconfig.xml is not different from my other cores!

The DIH status says nothing about commit or optimize ...


Is anyone else having the same problem?!
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/commit-true-has-no-effect-tp1952567p1957709.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: commit=true has no effect

2010-11-23 Thread Erick Erickson
Patience, my friend. It's still early in the morning and people are thinking
about Thanksgiving ...

We need more details. My first guess is that "only the sql statement changed"
means that something's wrong with the new SQL. There's a little-known
debug console for DIH you might want to investigate, something like:
http://localhost:8983/solr/admin/dataimport.jsp?handler=dataimport
also see:
http://www.packtpub.com/article/indexing-data-solr-1.4-enterprise-search-server-2

If that doesn't help much, perhaps you could provide more details on how you
know things are failing. What have you tried? Also, post your SQL statement.

Best
Erick

On Tue, Nov 23, 2010 at 8:48 AM, stockii  wrote:

>
> =(  anyone a  idea ?
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/commit-true-has-no-effect-tp1952567p1953391.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: commit=true has no effect

2010-11-23 Thread stockii

=(  anyone a  idea ? 
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/commit-true-has-no-effect-tp1952567p1953391.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Commit/Optimise question

2010-10-31 Thread Savvas-Andreas Moysidis
Thanks Erick. For the record, we are using 1.4.1 and SolrJ.

On 31 October 2010 01:54, Erick Erickson  wrote:

> What version of Solr are you using?
>
> About committing. I'd just let the solr defaults handle that. You configure
> this in the autocommit section of solrconfig.xml. I'm pretty sure this
>  gets
> triggered even if you're using SolrJ.
>
> That said, it's probably wise to issue a commit after all your data is
> indexed
> too, just to flush any remaining documents since the last autocommit.
>
> Optimize should not be issued until you're all done, if at all. If
> you're not deleting (or updating) documents, don't bother to optimize
> unless the number of files in your index directory gets really large.
> Recent Solr code almost removes the need to optimize unless you
> delete documents, but I confess I don't know the revision number
> "recent" refers to, perhaps only trunk...
>
> HTH
> Erick
>
> On Thu, Oct 28, 2010 at 9:56 AM, Savvas-Andreas Moysidis <
> savvas.andreas.moysi...@googlemail.com> wrote:
>
> > Hello,
> >
> > We currently index our data through a SQL-DIH setup but due to our model
> > (and therefore sql query) becoming complex we need to index our data
> > programmatically. As we didn't have to deal with commit/optimise before,
> we
> > are now wondering whether there is an optimal approach to that. Is there
> a
> > batch size after which we should fire a commit or should we execute a
> > commit
> > after indexing all of our data? What about optimise?
> >
> > Our document corpus is > 4m documents and through DIH the resulting index
> > is
> > around 1.5G
> >
> > We have searched previous posts but couldn't find a definite answer. Any
> > input much appreciated!
> >
> > Regards,
> > -- Savvas
> >
>


Re: Commit/Optimise question

2010-10-30 Thread Erick Erickson
What version of Solr are you using?

About committing: I'd just let the Solr defaults handle that. You configure
this in the autocommit section of solrconfig.xml. I'm pretty sure this gets
triggered even if you're using SolrJ.

That said, it's probably wise to issue a commit after all your data is indexed
too, just to flush any remaining documents since the last autocommit.
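A minimal sketch of that flow; the batch size and all names are arbitrary
choices for the illustration:

import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BatchIndexer {
    public static void index(SolrServer server, Iterable<SolrInputDocument> allDocs) throws Exception {
        List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
        for (SolrInputDocument doc : allDocs) {
            batch.add(doc);
            if (batch.size() >= 1000) {
                server.add(batch); // no explicit commit; autocommit handles visibility
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            server.add(batch);
        }
        server.commit();           // one final commit to flush the tail
    }
}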

Optimize should not be issued until you're all done, if at all. If
you're not deleting (or updating) documents, don't bother to optimize
unless the number of files in your index directory gets really large.
Recent Solr code almost removes the need to optimize unless you
delete documents, but I confess I don't know the revision number
"recent" refers to, perhaps only trunk...

HTH
Erick

On Thu, Oct 28, 2010 at 9:56 AM, Savvas-Andreas Moysidis <
savvas.andreas.moysi...@googlemail.com> wrote:

> Hello,
>
> We currently index our data through a SQL-DIH setup but due to our model
> (and therefore sql query) becoming complex we need to index our data
> programmatically. As we didn't have to deal with commit/optimise before, we
> are now wondering whether there is an optimal approach to that. Is there a
> batch size after which we should fire a commit or should we execute a
> commit
> after indexing all of our data? What about optimise?
>
> Our document corpus is > 4m documents and through DIH the resulting index
> is
> around 1.5G
>
> We have searched previous posts but couldn't find a definite answer. Any
> input much appreciated!
>
> Regards,
> -- Savvas
>


Re: commit is taking very very long time

2010-07-23 Thread Mark Miller
On 7/23/10 5:59 PM, Alexey Serba wrote:

> Another option is to set optimize=false in the DIH call (it's true by
> default).

Ouch - that should really be changed then.

- Mark


Re: commit is taking very very long time

2010-07-23 Thread Alexey Serba
> I am not sure why some commits take a very long time.
Hmm... Because it merges index segments... How large is your index?

> Also is there a way to reduce the time it takes?
You can disable the commit in the DIH call and use autoCommit instead. It's
kind of a hack because you postpone the commit operation and make it async.

Another option is to set optimize=false in the DIH call (it's true by
default). Also, you can try to increase the mergeFactor parameter, but it
would affect search performance.


[Solved] Re: Commit takes 1 to 2 minutes, CPU usage affects other apps

2010-05-12 Thread Markus Fischer

Hi,

On 07.05.2010 22:47, Chris Hostetter wrote:

so it's the full request time, and would be inclusive of any postCommit
event handlers -- that's important to know.  the logs will help clear up
whether the underlying "commit" is really taking up a large amount of time
or if it's some postCommit event (like spellcheck index building, or
snapshooting, etc...)


I finally was able to solve the issue, and you hit the nail on the head:
the spellchecker was the culprit. I assume it was a) an oversight or b)
ignorance on our part about the spellchecker.


I wasn't aware that building the spellchecker index took so long (2 or
more minutes for around 800,000 documents); we disabled buildOnCommit
and now only build the spellchecker once every night.
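For anyone doing the same, a sketch of triggering the build on demand, e.g.
from a nightly cron'd client. The handler path /spell is an assumption -- use
whatever request handler hosts your spellcheck component:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class NightlySpellcheckBuild {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        ModifiableSolrParams p = new ModifiableSolrParams();
        p.set("q", "*:*");
        p.set("spellcheck", "true");
        p.set("spellcheck.build", "true"); // triggers the (slow) index build once
        QueryRequest req = new QueryRequest(p);
        req.setPath("/spell");
        req.process(server);
    }
}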




:>  what do your Solr logs say about the commit, and the subsequent
:>  newSearcher?
:
: How can I get such logs? I was asking for this in the second mail to this

it depends on what servlet container you use to run Solr, and how it's
configured. In the simple "java -jar start.jar" jetty setup used for
the Solr example, jetty dumps them to STDOUT in your console. But most
servlet containers will write log messages to a file someplace (and even
jetty will log to a file if it's configured to do so -- most production
instances are)


The start.jar was incredibly helpful to dissect things.

Thanks Chris and everyone who helped me!

- Markus



Re: Commit takes 1 to 2 minutes, CPU usage affects other apps

2010-05-07 Thread Chris Hostetter

: The measurement was done outside our Solr client which sends the update
: and then the commit to the handler. I also see the update-URL call in
: the Tomcat Manager taking up that amount of time.

so it's the full request time, and would be inclusive of any postCommit 
event handlers -- that's important to know.  the logs will help clear up 
whether the underlying "commit" is really taking up a large amount of time
or if it's some postCommit event (like spellcheck index building, or
snapshooting, etc...)

: > what do your Solr logs say about the commit, and the subsequent 
: > newSearcher?
: 
: How can I get such logs? I was asking for this in the second mail to this

it depends on what servlet container you use to run Solr, and how it's 
configured. In the simple "java -jar start.jar" jetty setup used for
the Solr example, jetty dumps them to STDOUT in your console. But most
servlet containers will write log messages to a file someplace (and even
jetty will log to a file if it's configured to do so -- most production
instances are)


-Hoss



Re: Commit takes 1 to 2 minutes, CPU usage affects other apps

2010-05-05 Thread Erick Erickson
The mail servers are often not too friendly with attachments, so people
either inline configs or put them on a server and post the URL.

HTH
Erick

On Wed, May 5, 2010 at 12:06 PM, Markus Fischer  wrote:

> Hi,
>
> On 05.05.2010 03:49, Chris Hostetter wrote:
> >
> > : Are you accidentally building the spellchecker database on each commit?
> >   ...
> > : > This could also be caused by performing an optimize after the commit,
> or it
> > : > could be caused by auto warming the caches, or a combination of both.
> >
> > The heart of the matter being: it's pretty much impossible to guess what
> > is taking up all this time (and eating up all that CPU) w/o seeing your
> > configs, and having a better idea of how you are timing this "1 to 2
> > minutes" ... is this what your client sending the commit reports? what
> > exactly is the command it's executing?
>
> The measurement was done outside our Solr client which sends the update
> and then the commit to the handler. I also see the update-URL call in
> the Tomcat Manager taking up that amount of time.
>
> > what do your Solr logs say about the commit, and the subsequent
> > newSearcher?
>
> How can I get such logs? I was asking for this in the second mail to this
> thread; basically it's the same problem I'm having, as I've no clue where
> it spends that time.
>
> I'll get the configs asap; and yes, we're using a spellchecker too. Is
> this list fine with attachments? Or should I just paste the XML stuff
> inline? Or use pastebin?
>
> thanks for the help!
>
> - Markus
>


Re: Commit takes 1 to 2 minutes, CPU usage affects other apps

2010-05-05 Thread Markus Fischer
Hi,

On 05.05.2010 03:49, Chris Hostetter wrote:
> 
> : Are you accidentally building the spellchecker database on each commit?
>   ...
> : > This could also be caused by performing an optimize after the commit, or 
> it
> : > could be caused by auto warming the caches, or a combination of both.
> 
> The heart of the matter being: it's pretty much impossible to guess what 
> is taking up all this time (and eating up all that CPU) w/o seeing your 
> configs, and having a better idea of how you are timing this "1 to 2 
> minutes" ... is this what your client sending the commit reports? what 
> exactly is the command it's executing?

The measurement was done outside our Solr client which sends the update
and then the commit to the handler. I also see the update-URL call in
the Tomcat Manager taking up that amount of time.

> what do your Solr logs say about the commit, and the subsequent 
> newSearcher?

How can I get such logs? I was asking for this in the second mail to this
thread; basically it's the same problem I'm having, as I've no clue where
it spends that time.

I'll get the configs asap; and yes, we're using a spellchecker too. Is
this list fine with attachments? Or should I just paste the XML stuff
inline? Or use pastebin?

thanks for the help!

- Markus


Re: Commit takes 1 to 2 minutes, CPU usage affects other apps

2010-05-04 Thread Chris Hostetter

: Are you accidentally building the spellchecker database on each commit?
...
: > This could also be caused by performing an optimize after the commit, or it
: > could be caused by auto warming the caches, or a combination of both.

The heart of the matter being: it's pretty much impossible to guess what 
is taking up all this time (and eating up all that CPU) w/o seeing your 
configs, and having a better idea of how you are timing this "1 to 2 
minutes" ... is this what your client sending the commit reports? what 
exactly is the command it's executing?

what do your Solr logs say about the commit, and the subsequent 
newSearcher?


-Hoss



Re: Commit takes 1 to 2 minutes, CPU usage affects other apps

2010-05-04 Thread Lance Norskog
Are you accidentally building the spellchecker database on each commit?

An option is to use the MergePolicy stuff to avoid merging during
normal commits, but I failed to understand the interactions of
configuration numbers. It's a bit of a jungle in there.

On Tue, May 4, 2010 at 5:43 AM,   wrote:
> Hi,
>
> This could also be caused by performing an optimize after the commit, or it
> could be caused by auto warming the caches, or a combination of both.
>
> If you are using the Data Import Handler the default for a delta import is
> commit and optimize, which caused us a similar problem except we were
> optimizing a 7 million document, 23Gb index with every delta import which
> was taking over 10 minutes. As soon as we added optimize=false to the
> command updates took a few seconds. You can always add separate calls to
> perform the optimize when it's convenient for you.
>
> To see if the problem is auto warming take a look at the warm up time for
> the searcher. If this is the cause you will need to consider lowering the
> autowarmCount for your caches.
>
>
> Colin.
>
>> -Original Message-
>> From: Markus Fischer [mailto:mar...@fischer.name]
>> Sent: Tuesday, May 04, 2010 6:22 AM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Commit takes 1 to 2 minutes, CPU usage affects other apps
>>
>> On 04.05.2010 11:01, Peter Sturge wrote:
>> > It might be worth checking the VMWare environment - if you're using
>> the
>> > VMWare scsi vmdk and it's shared across multiple VMs and there's a
>> lot of
>> > disk contention (i.e. multiple VMs are all busy reading/writing
>> to/from the
>> > same disk channel), this can really slow down I/O operations.
>>
>> Ok, thanks, I'll try to get the information from my hoster.
>>
>> I noticed that the committing seems to be constant in time: it doesn't
>> matter whether I'm updating only one document or 50 (usually it won't
>> be more). Maybe these numbers are too low anyway to cause any real impact
>> ...
>>
>> - Markus
>
>
>
>
>



-- 
Lance Norskog
goks...@gmail.com


RE: Commit takes 1 to 2 minutes, CPU usage affects other apps

2010-05-04 Thread cbennett
Hi,

This could also be caused by performing an optimize after the commit, or it
could be caused by auto warming the caches, or a combination of both.

If you are using the Data Import Handler the default for a delta import is
commit and optimize, which caused us a similar problem except we were
optimizing a 7 million document, 23Gb index with every delta import which
was taking over 10 minutes. As soon as we added optimize=false to the
command updates took a few seconds. You can always add separate calls to
perform the optimize when it's convenient for you.
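A sketch of the same delta-import call issued from SolrJ; the server URL and
the /dataimport handler path are assumptions for the illustration:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class DeltaImportNoOptimize {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        ModifiableSolrParams p = new ModifiableSolrParams();
        p.set("command", "delta-import");
        p.set("commit", "true");
        p.set("optimize", "false"); // override DIH's commit+optimize default
        QueryRequest req = new QueryRequest(p);
        req.setPath("/dataimport");
        req.process(server);
    }
}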

To see if the problem is auto warming take a look at the warm up time for
the searcher. If this is the cause you will need to consider lowering the
autowarmCount for your caches. 


Colin.

> -Original Message-
> From: Markus Fischer [mailto:mar...@fischer.name]
> Sent: Tuesday, May 04, 2010 6:22 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Commit takes 1 to 2 minutes, CPU usage affects other apps
> 
> On 04.05.2010 11:01, Peter Sturge wrote:
> > It might be worth checking the VMWare environment - if you're using
> the
> > VMWare scsi vmdk and it's shared across multiple VMs and there's a
> lot of
> > disk contention (i.e. multiple VMs are all busy reading/writing
> to/from the
> > same disk channel), this can really slow down I/O operations.
> 
> Ok, thanks, I'll try to get the information from my hoster.
> 
> I noticed that the committing seems to be constant in time: it doesn't
> matter whether I'm updating only one document or 50 (usually it won't
> be more). Maybe these numbers are too low anyway to cause any real impact
> ...
> 
> - Markus






Re: Commit takes 1 to 2 minutes, CPU usage affects other apps

2010-05-04 Thread Markus Fischer

On 04.05.2010 11:01, Peter Sturge wrote:

It might be worth checking the VMWare environment - if you're using the
VMWare scsi vmdk and it's shared across multiple VMs and there's a lot of
disk contention (i.e. multiple VMs are all busy reading/writing to/from the
same disk channel), this can really slow down I/O operations.


Ok, thanks, I'll try to get the information from my hoster.

I noticed that the committing seems to be constant in time: it doesn't
matter whether I'm updating only one document or 50 (usually it won't be
more). Maybe these numbers are too low anyway to cause any real impact ...


- Markus



Re: Commit takes 1 to 2 minutes, CPU usage affects other apps

2010-05-04 Thread Peter Sturge
It might be worth checking the VMWare environment - if you're using the
VMWare scsi vmdk and it's shared across multiple VMs and there's a lot of
disk contention (i.e. multiple VMs are all busy reading/writing to/from the
same disk channel), this can really slow down I/O operations.


On Tue, May 4, 2010 at 8:52 AM, Markus Fischer  wrote:

> Hi,
>
>
> On 04.05.2010 03:24, Mark Miller wrote:
>
>> On 5/3/10 9:06 AM, Markus Fischer wrote:
>>
>>> we recently began having trouble with our Solr 1.4 instance. We've about
>>> 850k documents in the index which is about 1.2GB in size; the JVM which
>>> runs tomcat/solr (no other apps are deployed) has been given 2GB.
>>>
>>> We've a forum and run a process every minute which indexes the new
>>> messages. The number of messages updated ranges from 0 to 20 on
>>> average. The commit takes about one or two minutes, but usually, a few
>>> seconds after it finishes, the next batch of documents is processed
>>> and the story starts again.
>>>
>>> Our environment is being provided by a company purely using VMWare
>>> infrastructure; the Solr index itself is on an NFS share for which we get
>>> some 33MB/s throughput.
>>>
>>
>> That is certainly not a normal commit time for an index of that size.
>>
>> Note that Solr 1.4 can have issues when working on NFS, but I don't know
>> that it would have anything to do with this.
>>
>> Are you using the simple lock factory rather than the default native
>> lock factory? (as you should do when running on NFS)
>>
>
> I've switched the lockType to "simple" but didn't see any timing
> difference; it's still somewhere between one and two minutes.
>
> In my last test case I tested with the index having been updated with
> only a single document.
>
> I'm not very familiar with getting more debug information or similar out of
> Solr; is there a way to enable something to find out what it's actually doing
> and what costs so much time?
>
> thanks so far,
> - Markus
>
>


Re: Commit takes 1 to 2 minutes, CPU usage affects other apps

2010-05-04 Thread Markus Fischer

Hi,

On 04.05.2010 03:24, Mark Miller wrote:

On 5/3/10 9:06 AM, Markus Fischer wrote:

we recently began having trouble with our Solr 1.4 instance. We've about
850k documents in the index which is about 1.2GB in size; the JVM which
runs tomcat/solr (no other apps are deployed) has been given 2GB.

We've a forum and run a process every minute which indexes the new
messages. The number of messages updated ranges from 0 to 20 on
average. The commit takes about one or two minutes, but usually, a few
seconds after it finishes, the next batch of documents is processed
and the story starts again.

Our environment is being provided by a company purely using VMWare
infrastructure; the Solr index itself is on an NFS share for which we get
some 33MB/s throughput.


That is certainly not a normal commit time for an index of that size.

Note that Solr 1.4 can have issues when working on NFS, but I don't know
that it would have anything to do with this.

Are you using the simple lock factory rather than the default native
lock factory? (as you should do when running on NFS)


I've switched the lockType to "simple" but didn't see any timing
difference; it's still somewhere between one and two minutes.


In my last test case I tested with the index having been updated with
only a single document.


I'm not very familiar with getting more debug information or similar out
of Solr; is there a way to enable something to find out what it's actually
doing and what costs so much time?


thanks so far,
- Markus



Re: Commit takes 1 to 2 minutes, CPU usage affects other apps

2010-05-03 Thread Mark Miller

On 5/3/10 9:06 AM, Markus Fischer wrote:

Hi,

we recently began having trouble with our Solr 1.4 instance. We've about
850k documents in the index which is about 1.2GB in size; the JVM which
runs tomcat/solr (no other apps are deployed) has been given 2GB.

We've a forum and run a process every minute which indexes the new
messages. The number of messages updated ranges from 0 to 20 on
average. The commit takes about one or two minutes, but usually, a few
seconds after it finishes, the next batch of documents is processed
and the story starts again.

So actually it's like Solr is running commits all day long and CPU usage
ranges from 80% to 120%.

This continuous CPU usage caused ill effects on other services running
on the same machine.

Our environment is being provided by a company purely using VMWare
infrastructure; the Solr index itself is on an NFS share for which we get
some 33MB/s throughput.

So, an easy solution would be to just put more resources into it, e.g. a
separate machine. But before I make that decision I'd like to find out
whether the app behaves properly under these circumstances, or if it's
possible to shorten the commit time down to a few seconds so the CPU is
not drained for that long.

thanks for any pointers,

- Markus



That is certainly not a normal commit time for an index of that size.

Note that Solr 1.4 can have issues when working on NFS, but I don't know 
that it would have anything to do with this.


Are you using the simple lock factory rather than the default native 
lock factory? (as you should do when running on NFS)


--
- Mark

http://www.lucidimagination.com


RE: commit fails on weblogic

2010-01-22 Thread Joe Kessel

Within the WebLogic console I have unchecked Enable Keepalives and have
been able to get past this error on commit, but it now fails on optimize. Using
TCPMon it was noticed that multiple requests were on the same connection,
including the commit.

As I've read that Solr runs fine on WebLogic, I assume this issue is with the
StreamingUpdateSolrServer.

 

Thanks,

Joe

 

Caused by: org.apache.commons.httpclient.ProtocolException: Unbuffered entity 
enclosing request can not be repeated.
at 
org.apache.commons.httpclient.methods.EntityEnclosingMethod.writeRequestBody(EntityEnclosingMethod.java:483)
at 
org.apache.commons.httpclient.HttpMethodBase.writeRequest(HttpMethodBase.java:1973)
at 
org.apache.commons.httpclient.HttpMethodBase.execute(HttpMethodBase.java:993)
at 
org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:397)
at 
org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:170)
at 
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:396)
at 
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:324)
at 
org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer$Runner.run(StreamingUpdateSolrServer.java:135)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
 
> Date: Thu, 21 Jan 2010 15:33:50 -0800
> Subject: Re: commit fails on weblogic
> From: goks...@gmail.com
> To: solr-user@lucene.apache.org
> 
> There might be a limit in Weblogic on the number or length of
> parameters allowed in a POST.
> 
> On Thu, Jan 21, 2010 at 7:37 AM, Joe Kessel  wrote:
> >
> > Using Solr 1.4 and the StreamingUpdateSolrServer on Weblogic 10.3 and get 
> > the following error on commit.  The data seems to load fine, and the same 
> > code works fine with Tomcat.  On the client side an Internal Server Error 
> > is reported.
> >
> >
> >
> > Thanks,
> >
> > Joe
> >
> >
> >
> > weblogic.utils.NestedRuntimeException: Cannot parse POST parameters of 
> > request: '/martini-solr-1.4.0-SP2/CORE_1_0_01/update'
> >  at 
> > weblogic.servlet.internal.ServletRequestImpl$RequestParameters.mergePostParams(ServletRequestImpl.java:2021)
> >  at 
> > weblogic.servlet.internal.ServletRequestImpl$RequestParameters.parseQueryParams(ServletRequestImpl.java:1901)
> >  at 
> > weblogic.servlet.internal.ServletRequestImpl$RequestParameters.peekParameter(ServletRequestImpl.java:2047)
> >  at 
> > weblogic.servlet.internal.ServletRequestImpl$SessionHelper.initSessionInfoWithContext(ServletRequestImpl.java:2602)
> >  at 
> > weblogic.servlet.internal.ServletRequestImpl$SessionHelper.initSessionInfo(ServletRequestImpl.java:2506)
> >  at 
> > weblogic.servlet.internal.ServletRequestImpl$SessionHelper.getSessionInternal(ServletRequestImpl.java:2281)
> >  at 
> > weblogic.servlet.internal.ServletRequestImpl$SessionHelper.getSession(ServletRequestImpl.java:2271)
> >  at 
> > weblogic.servlet.internal.ServletRequestImpl.getSession(ServletRequestImpl.java:1245)
> >  at 
> > weblogic.servlet.security.internal.SecurityModule$SessionRetrievalAction.run(SecurityModule.java:591)
> >  at 
> > weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
> >  at weblogic.security.service.SecurityManager.runAs(Unknown Source)
> >  at 
> > weblogic.servlet.security.internal.SecurityModule.getUserSession(SecurityModule.java:482)
> >  at 
> > weblogic.servlet.security.internal.ServletSecurityManager.checkAccess(ServletSecurityManager.java:81)
> >  at 
> > weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2116)
> >  at 
> > weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2086)
> >  at 
> > weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1406)
> >  at weblogic.work.ExecuteThread.execute(ExecuteThread.java:201)
> >  at weblogic.work.ExecuteThread.run(ExecuteThread.java:173)
> > java.net.SocketTimeoutException: Read timed out
> >  at java.net.SocketInputStream.socketRead0(Native Method)
> >  at java.net.SocketInputStream.read(SocketInputStream.java:129)
> >  at weblogic.servlet.internal.PostInputStream.read(PostInputStream.java:142)
> >  at 
> > weblogic.utils.http.HttpChunkInputStream.readChunkSize(HttpChunkInputStream.java:109)
> >  at 
> > weblogic.utils.http.HttpChunkInput

Re: commit fails on weblogic

2010-01-21 Thread Lance Norskog
There might be a limit in Weblogic on the number or length of
parameters allowed in a POST.

On Thu, Jan 21, 2010 at 7:37 AM, Joe Kessel  wrote:
>
> Using Solr 1.4 and the StreamingUpdateSolrServer on Weblogic 10.3, I get the 
> following error on commit.  The data seems to load fine, and the same code 
> works fine with Tomcat.  On the client side an Internal Server Error is 
> reported.
>
>
>
> Thanks,
>
> Joe
>
>
>
> weblogic.utils.NestedRuntimeException: Cannot parse POST parameters of 
> request: '/martini-solr-1.4.0-SP2/CORE_1_0_01/update'
>  at 
> weblogic.servlet.internal.ServletRequestImpl$RequestParameters.mergePostParams(ServletRequestImpl.java:2021)
>  at 
> weblogic.servlet.internal.ServletRequestImpl$RequestParameters.parseQueryParams(ServletRequestImpl.java:1901)
>  at 
> weblogic.servlet.internal.ServletRequestImpl$RequestParameters.peekParameter(ServletRequestImpl.java:2047)
>  at 
> weblogic.servlet.internal.ServletRequestImpl$SessionHelper.initSessionInfoWithContext(ServletRequestImpl.java:2602)
>  at 
> weblogic.servlet.internal.ServletRequestImpl$SessionHelper.initSessionInfo(ServletRequestImpl.java:2506)
>  at 
> weblogic.servlet.internal.ServletRequestImpl$SessionHelper.getSessionInternal(ServletRequestImpl.java:2281)
>  at 
> weblogic.servlet.internal.ServletRequestImpl$SessionHelper.getSession(ServletRequestImpl.java:2271)
>  at 
> weblogic.servlet.internal.ServletRequestImpl.getSession(ServletRequestImpl.java:1245)
>  at 
> weblogic.servlet.security.internal.SecurityModule$SessionRetrievalAction.run(SecurityModule.java:591)
>  at 
> weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
>  at weblogic.security.service.SecurityManager.runAs(Unknown Source)
>  at 
> weblogic.servlet.security.internal.SecurityModule.getUserSession(SecurityModule.java:482)
>  at 
> weblogic.servlet.security.internal.ServletSecurityManager.checkAccess(ServletSecurityManager.java:81)
>  at 
> weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2116)
>  at 
> weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2086)
>  at 
> weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1406)
>  at weblogic.work.ExecuteThread.execute(ExecuteThread.java:201)
>  at weblogic.work.ExecuteThread.run(ExecuteThread.java:173)
> java.net.SocketTimeoutException: Read timed out
>  at java.net.SocketInputStream.socketRead0(Native Method)
>  at java.net.SocketInputStream.read(SocketInputStream.java:129)
>  at weblogic.servlet.internal.PostInputStream.read(PostInputStream.java:142)
>  at 
> weblogic.utils.http.HttpChunkInputStream.readChunkSize(HttpChunkInputStream.java:109)
>  at 
> weblogic.utils.http.HttpChunkInputStream.initChunk(HttpChunkInputStream.java:71)
>  at 
> weblogic.utils.http.HttpChunkInputStream.read(HttpChunkInputStream.java:142)
>  at 
> weblogic.utils.http.HttpChunkInputStream.read(HttpChunkInputStream.java:182)
>  at 
> weblogic.servlet.internal.ServletInputStreamImpl.read(ServletInputStreamImpl.java:222)
>  at 
> weblogic.servlet.internal.ServletRequestImpl$RequestParameters.mergePostParams(ServletRequestImpl.java:1995)
>  at 
> weblogic.servlet.internal.ServletRequestImpl$RequestParameters.parseQueryParams(ServletRequestImpl.java:1901)
>  at 
> weblogic.servlet.internal.ServletRequestImpl$RequestParameters.peekParameter(ServletRequestImpl.java:2047)
>  at 
> weblogic.servlet.internal.ServletRequestImpl$SessionHelper.initSessionInfoWithContext(ServletRequestImpl.java:2602)
>  at 
> weblogic.servlet.internal.ServletRequestImpl$SessionHelper.initSessionInfo(ServletRequestImpl.java:2506)
>  at 
> weblogic.servlet.internal.ServletRequestImpl$SessionHelper.getSessionInternal(ServletRequestImpl.java:2281)
>  at 
> weblogic.servlet.internal.ServletRequestImpl$SessionHelper.getSession(ServletRequestImpl.java:2271)
>  at 
> weblogic.servlet.internal.ServletRequestImpl.getSession(ServletRequestImpl.java:1245)
>  at 
> weblogic.servlet.security.internal.SecurityModule$SessionRetrievalAction.run(SecurityModule.java:591)
>  at 
> weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
>  at weblogic.security.service.SecurityManager.runAs(Unknown Source)
>  at 
> weblogic.servlet.security.internal.SecurityModule.getUserSession(SecurityModule.java:482)
>  at 
> weblogic.servlet.security.internal.ServletSecurityManager.checkAccess(ServletSecurityManager.java:81)
>  at 
> weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2116)
>  at 
> weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2086)
>  at 
> weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1406)
>  at weblogic.work.ExecuteThread.execute(ExecuteThread.java:201)
>  at weblogic.work.ExecuteThread.run(ExecuteThread.java:173)
>>
>  
> <[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default 
> (self-tuning)'> <> <>
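
One possible workaround for this class of failure (a minimal sketch, assuming the
SolrJ 1.4 API; the URL, queue size, and thread count below are placeholders) is to
keep StreamingUpdateSolrServer for the bulk adds but send the commit through a
plain CommonsHttpSolrServer, so the commit is a small buffered POST that the HTTP
client can safely retry:

    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class CommitWorkaround {
      public static void main(String[] args) throws Exception {
        String url = "http://localhost:8983/solr/core";  // placeholder URL
        // streaming server for high-throughput adds (queue of 20, 4 threads)
        StreamingUpdateSolrServer adds = new StreamingUpdateSolrServer(url, 20, 4);
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "1");
        adds.add(doc);
        adds.blockUntilFinished();  // drain the queued add requests first
        // plain HTTP server for the commit: small request, retryable
        new CommonsHttpSolrServer(url).commit();
      }
    }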

Re: Commit error

2009-11-11 Thread Licinio Fernández Maurelo
Thanks Israel, I've done a successful import using optimize=false

2009/11/11 Israel Ekpo 

> 2009/11/11 Licinio Fernández Maurelo 
>
> > Hi folks,
> >
> > I'm getting this error while committing after a dataimport of only 12 docs
> > !!!
> >
> > Exception while solr commit.
> > java.io.IOException: background merge hit exception: _3kta:C2329239
> > _3ktb:c11->_3ktb into _3ktc [optimize] [mergeDocStores]
> > at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2829)
> > at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2750)
> > at
> >
> >
> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:401)
> > at
> >
> >
> org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:85)
> > at
> >
> >
> org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:138)
> > at
> >
> >
> org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:66)
> > at
> > org.apache.solr.handler.dataimport.SolrWriter.commit(SolrWriter.java:170)
> > at
> > org.apache.solr.handler.dataimport.DocBuilder.finish(DocBuilder.java:208)
> > at
> >
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:185)
> > at
> >
> >
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:333)
> > at
> >
> >
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:393)
> > at
> >
> >
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:372)
> > Caused by: java.io.IOException: No hay espacio libre en el dispositivo [Spanish: "No space left on device"]
> > at java.io.RandomAccessFile.writeBytes(Native Method)
> > at java.io.RandomAccessFile.write(RandomAccessFile.java:499)
> > at
> >
> >
> org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexOutput.flushBuffer(SimpleFSDirectory.java:191)
> > at
> >
> >
> org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:96)
> > at
> >
> >
> org.apache.lucene.store.BufferedIndexOutput.flush(BufferedIndexOutput.java:85)
> > at
> >
> >
> org.apache.lucene.store.BufferedIndexOutput.writeBytes(BufferedIndexOutput.java:75)
> > at org.apache.lucene.store.IndexOutput.writeBytes(IndexOutput.java:45)
> > at
> >
> >
> org.apache.lucene.index.CompoundFileWriter.copyFile(CompoundFileWriter.java:229)
> > at
> >
> >
> org.apache.lucene.index.CompoundFileWriter.close(CompoundFileWriter.java:184)
> > at
> >
> >
> org.apache.lucene.index.SegmentMerger.createCompoundFile(SegmentMerger.java:217)
> > at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5089)
> > at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4589)
> > at
> >
> >
> org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:235)
> > at
> >
> >
> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:291)
> >
> > Index info: 2.600.000 docs | 11G size
> > System info: 15GB free disk space
> >
> > When attempting to commit, the disk usage increases until solr breaks ... it
> > looks like 15 GB is not enough space to do the merge | optimize
> >
> > Any advice?
> >
> > --
> > Lici
> >
>
>
> Hi Licinio,
>
> During the optimization process, the index size would be approximately
> double what it was originally and the remaining space on disk may not be
> enough for the task.
>
> What you are describing is exactly what could be going on.
> --
> "Good Enough" is not good enough.
> To give anything less than your best is to sacrifice the gift.
> Quality First. Measure Twice. Cut Once.
>



-- 
Lici
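
As background for the fix above (a minimal SolrJ sketch; the URL is a
placeholder): commit() only flushes pending documents and opens a new searcher,
while optimize() rewrites the index into a single segment and can transiently
need free disk space on the order of the current index size, which is what ran
out here. With the DataImportHandler the same effect comes from passing
optimize=false on the full-import command.

    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

    public class CommitWithoutOptimize {
      public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer server =
            new CommonsHttpSolrServer("http://localhost:8983/solr");  // placeholder
        server.commit();      // flush pending docs, open a new searcher
        // server.optimize(); // skip until free disk >= current index size
      }
    }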


Re: Commit error

2009-11-11 Thread Israel Ekpo
2009/11/11 Licinio Fernández Maurelo 

> Hi folks,
>
> I'm getting this error while committing after a dataimport of only 12 docs
> !!!
>
> Exception while solr commit.
> java.io.IOException: background merge hit exception: _3kta:C2329239
> _3ktb:c11->_3ktb into _3ktc [optimize] [mergeDocStores]
> at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2829)
> at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2750)
> at
>
> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:401)
> at
>
> org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:85)
> at
>
> org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:138)
> at
>
> org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:66)
> at
> org.apache.solr.handler.dataimport.SolrWriter.commit(SolrWriter.java:170)
> at
> org.apache.solr.handler.dataimport.DocBuilder.finish(DocBuilder.java:208)
> at
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:185)
> at
>
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:333)
> at
>
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:393)
> at
>
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:372)
> Caused by: java.io.IOException: No hay espacio libre en el dispositivo [Spanish: "No space left on device"]
> at java.io.RandomAccessFile.writeBytes(Native Method)
> at java.io.RandomAccessFile.write(RandomAccessFile.java:499)
> at
>
> org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexOutput.flushBuffer(SimpleFSDirectory.java:191)
> at
>
> org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:96)
> at
>
> org.apache.lucene.store.BufferedIndexOutput.flush(BufferedIndexOutput.java:85)
> at
>
> org.apache.lucene.store.BufferedIndexOutput.writeBytes(BufferedIndexOutput.java:75)
> at org.apache.lucene.store.IndexOutput.writeBytes(IndexOutput.java:45)
> at
>
> org.apache.lucene.index.CompoundFileWriter.copyFile(CompoundFileWriter.java:229)
> at
>
> org.apache.lucene.index.CompoundFileWriter.close(CompoundFileWriter.java:184)
> at
>
> org.apache.lucene.index.SegmentMerger.createCompoundFile(SegmentMerger.java:217)
> at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5089)
> at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4589)
> at
>
> org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:235)
> at
>
> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:291)
>
> Index info: 2.600.000 docs | 11G size
> System info: 15GB free disk space
>
> When attempting to commit, the disk usage increases until solr breaks ... it
> looks like 15 GB is not enough space to do the merge | optimize
>
> Any advice?
>
> --
> Lici
>


Hi Licinio,

During the optimization process, the index size would be approximately
double what it was originally and the remaining space on disk may not be
enough for the task.

What you are describing is exactly what could be going on.
-- 
"Good Enough" is not good enough.
To give anything less than your best is to sacrifice the gift.
Quality First. Measure Twice. Cut Once.


Re: commit question

2009-05-26 Thread Anshuman Manur
Hey Ashish,

If commit fails, the documents won't be indexed! You can look at your index
by pointing Luke at your data folder (a Solr index is a Lucene index) or hit:

http://host:port/solr/admin/luke/

to get an xml reply of what your index looks like.

You can commit again and there won't be any document duplicates as long as
you are using a unique ID to identify each document. Committing a document
with the same ID causes Solr to overwrite/update the existing document.

But I'm not sure what happens if a commit fails halfway through a large
document set! Would the docs that have already been committed stay, or is
commit an atomic op?

Anshu

On Wed, May 27, 2009 at 8:43 AM, Ashish P  wrote:

>
> Hi,
> Any idea whether documents are cleared from the Solr server if a commit
> fails, or whether I can retry the commit after some time?
> Thanks,
> Ashish
>
>
> Ashish P wrote:
> >
> > If I add 10 documents to solrServer as in solrServer.addIndex(docs) (using
> > Embedded) and then I commit and the commit fails for some reason, can I
> > retry this commit, let's say after some time, or are the added documents
> > lost?
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/commit-question-tp23717415p23735301.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>
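
Since the buffered documents are not discarded when a commit fails, a simple
retry loop is enough (a minimal sketch, assuming SolrJ; the attempt count and
delay are arbitrary placeholders):

    import org.apache.solr.client.solrj.SolrServer;

    public class RetryCommit {
      // retry a failed commit a few times before giving up
      static void commitWithRetry(SolrServer server, int attempts) throws Exception {
        for (int i = 0; i < attempts; i++) {
          try {
            server.commit();  // blocks until the new searcher is registered
            return;
          } catch (Exception e) {
            if (i == attempts - 1) throw e;  // out of retries, rethrow
            Thread.sleep(5000);              // wait before retrying
          }
        }
      }
    }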


Re: commit question

2009-05-26 Thread Ashish P

Hi,
Any idea whether documents are cleared from the Solr server if a commit fails,
or whether I can retry the commit after some time?
Thanks,
Ashish


Ashish P wrote:
> 
> If I add 10 documents to solrServer as in solrServer.addIndex(docs) (using
> Embedded) and then I commit and the commit fails for some reason, can I retry
> this commit, let's say after some time, or are the added documents lost?
> 
> 

-- 
View this message in context: 
http://www.nabble.com/commit-question-tp23717415p23735301.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: commit / new searcher delay?

2009-04-14 Thread sunnyfr

Hi Hossman,

I would love to know how you manage this as well.

thanks,


Shalin Shekhar Mangar wrote:
> 
> On Fri, Mar 6, 2009 at 8:47 AM, Steve Conover  wrote:
> 
>> That's exactly what I'm doing, but I'm explicitly replicating, and
>> committing.  Even under these circumstances, what could explain the
>> delay after commit before the new index becomes available?
>>
> 
> How are you explicitly replicating? I mean, how do you make sure that the
> slave has actually finished replication and the new index is available
> now?
> Are you using the script based replication or the new java based one?
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/commit---new-searcher-delay--tp22342916p23036207.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Commit is taking very long time

2009-03-24 Thread Chris Hostetter

: My application is in prod and quite frequently getting NullPointerException.
...
: java.lang.NullPointerException
: at 
com.fm.search.incrementalindex.service.AuctionCollectionServiceImpl.indexData(AuctionCollectionServiceImpl.java:251)
: at 
com.fm.search.incrementalindex.service.AuctionCollectionServiceImpl.process(AuctionCollectionServiceImpl.java:135)
: at 
com.fm.search.job.SearchIndexingJob.executeInternal(SearchIndexingJob.java:68)
: at 
org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:86)
: at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
: at 
org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:529)

That stack trace doesn't suggest anything remotely related to Solr.  None 
of those classes are in the Solr code base -- without having any idea what 
the code on line 251 of your AuctionCollectionServiceImpl class looks 
like, no one could even begin to speculate what is causing the NPE.  Even 
if we knew what line 251 looks like, understanding why some reference on 
that line is null would probably require knowing a whole lot more about 
your application.




-Hoss


Re: Commit is taking very long time

2009-03-19 Thread mahendra mahendra
Hi,
 
Sorry for the delay in replying!
 
My application is in prod and quite frequently getting NullPointerException.
Initially I thought this was happening because of a memory issue, so I reduced 
mergeFactor to 5 and reduced the number of documents per commit to 2000. After 
these changes it stopped getting NullPointerException for some time.
 
One of my prod boxes then started getting NullPointerException again. It's 
taking more time at commit and throwing an exception. It's trying to commit 
only 15 records when it throws the exception. I deleted the index directory 
manually and reindexed the same data, and it completed in just a few seconds.
 
How can I overcome these problems? Thanks in advance!!
 
Log
 
INFO: [EnglishAuction2-0] webapp=/solr path=/admin/ping 
params={wt=javabin&version=2.2} hits=0 status=0 QTime=47 
19-Mar-2009 18:55:03 org.apache.solr.core.SolrCore execute
INFO: [EnglishAuction2-0] webapp=/solr path=/admin/ping 
params={wt=javabin&version=2.2} status=0 QTime=47 
19-Mar-2009 18:57:06 org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[27812087, 27812088, 27812089, 27812090, 27812091, 27812079, 
27812080, 27812081, ...(15 more)]} 0 122952
19-Mar-2009 18:57:06 org.apache.solr.core.SolrCore execute
INFO: [EnglishAuction2-0] webapp=/solr path=/update 
params={wt=javabin&version=2.2} status=0 QTime=122952 
19-Mar-2009 18:59:03 org.apache.solr.core.SolrCore execute
INFO: [EnglishAuction2-0] webapp=/solr path=/admin/ping 
params={wt=javabin&version=2.2} hits=0 status=0 QTime=16 
19-Mar-2009 18:59:03 org.apache.solr.core.SolrCore execute
INFO: [EnglishAuction2-0] webapp=/solr path=/admin/ping 
params={wt=javabin&version=2.2} status=0 QTime=16 
19-Mar-2009 18:59:03 org.apache.solr.update.processor.LogUpdateProcessor finish
INFO: {add=[27812087, 27812088, 27812089, 27812090, 27812091, 27812079, 
27812080, 27812081, ...(15 more)]} 0 141
19-Mar-2009 18:59:03 org.apache.solr.core.SolrCore execute
INFO: [EnglishAuction2-0] webapp=/solr path=/update 
params={wt=javabin&version=2.2} status=0 QTime=141 
19-Mar-2009 18:59:03 org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=false,waitFlush=false,waitSearcher=false)
 
StackTrace
 
java.lang.NullPointerException
at 
com.fm.search.incrementalindex.service.AuctionCollectionServiceImpl.indexData(AuctionCollectionServiceImpl.java:251)
at 
com.fm.search.incrementalindex.service.AuctionCollectionServiceImpl.process(AuctionCollectionServiceImpl.java:135)
at 
com.fm.search.job.SearchIndexingJob.executeInternal(SearchIndexingJob.java:68)
at 
org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:86)
at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:529)
 
Thanks,
Mahendra

--- On Sat, 3/14/09, Yonik Seeley  wrote:

From: Yonik Seeley 
Subject: Re: Commit is taking very long time
To: solr-user@lucene.apache.org
Date: Saturday, March 14, 2009, 6:59 AM

From your logs, it looks like the time is spent in closing the index.
There may be some pending deletes buffered, but they shouldn't take too
long.
There could also be a merge triggered... but this would only happen
sometimes, not every time you commit.

One more relatively recent change in Lucene is to sync the index files
for safety.
Are you perhaps running on Linux with the ext3 filesystem?

Not sure what's causing the null pointer exception... do you have a stack
trace?

-Yonik
http://www.lucidimagination.com


On Fri, Mar 13, 2009 at 9:05 PM, mahendra mahendra
 wrote:
> Hello,
>
> I am experiencing strange problems while doing commit. I am indexing every
> 10 min to update the index with database values. The commit is taking 7 to 10
> min approximately and my indexing is failing due to a null pointer exception.
> If the first thread has not completed in 10 min, the second thread starts
> indexing data.
> I changed wait=false for the listener in the solrconfig.xml file. It stopped
> getting the NullPointerException but the commit is taking 7 to 10 min. I have
> approximately 70 to 90 kb of data every time.
>   <listener event="postCommit" class="solr.RunExecutableListener">
>     <str name="exe">solr/bin/snapshooter</str>
>     <str name="dir">.</str>
>     <bool name="wait">false</bool>
>     <arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
>     <arr name="env"> <str>MYVAR=val1</str> </arr>
>   </listener>
> I kept all default parameter values in solrconfig.xml except the
> ramBufferSize, which I set to 512.
> Could you please tell me how I can overcome these problems? Also, sometimes
> I see "INFO: Failed to unregister mbean: partitioned because it was not
> registered
> Mar 13, 2009 11:49:16 AM org.apache.solr.core.JmxMonitoredMap unregister"
> in my log files.
>
> Log file
>
> Mar 13, 2009 1:28:40 PM org.apache.solr.core.SolrCore execute
> INFO: [EnglishAuction1-0] webapp=/solr path=/update
params={wt=javabin&waitFlush=true&commit=true&waitSearcher=true&version=2.2}
status=0 QTime=247232
> Mar 13, 2009 1:30:32 PM
org.apache.solr.update.processor.LogUpdateProcessor finish
> INFO: {a

Re: Commit is taking very long time

2009-03-13 Thread Yonik Seeley
From your logs, it looks like the time is spent in closing the index.
There may be some pending deletes buffered, but they shouldn't take too long.
There could also be a merge triggered... but this would only happen
sometimes, not every time you commit.

One more relatively recent change in Lucene is to sync the index files
for safety.
Are you perhaps running on Linux with the ext3 filesystem?

Not sure what's causing the null pointer exception... do you have a stack trace?

-Yonik
http://www.lucidimagination.com


On Fri, Mar 13, 2009 at 9:05 PM, mahendra mahendra
 wrote:
> Hello,
>
> I am experiencing strange problems while doing commit. I am indexing every
> 10 min to update the index with database values. The commit is taking 7 to
> 10 min approximately and my indexing is failing due to a null pointer
> exception. If the first thread has not completed in 10 min, the second
> thread starts indexing data.
> I changed wait=false for the listener in the solrconfig.xml file. It stopped
> getting the NullPointerException but the commit is taking 7 to 10 min. I have
> approximately 70 to 90 kb of data every time.
>   <listener event="postCommit" class="solr.RunExecutableListener">
>     <str name="exe">solr/bin/snapshooter</str>
>     <str name="dir">.</str>
>     <bool name="wait">false</bool>
>     <arr name="args"> <str>arg1</str> <str>arg2</str> </arr>
>     <arr name="env"> <str>MYVAR=val1</str> </arr>
>   </listener>
> I kept all default parameter values in solrconfig.xml except the
> ramBufferSize, which I set to 512.
> Could you please tell me how I can overcome these problems? Also, sometimes I
> see "INFO: Failed to unregister mbean: partitioned because it was not
> registered
> Mar 13, 2009 11:49:16 AM org.apache.solr.core.JmxMonitoredMap unregister" in
> my log files.
>
> Log file
>
> Mar 13, 2009 1:28:40 PM org.apache.solr.core.SolrCore execute
> INFO: [EnglishAuction1-0] webapp=/solr path=/update 
> params={wt=javabin&waitFlush=true&commit=true&waitSearcher=true&version=2.2} 
> status=0 QTime=247232
> Mar 13, 2009 1:30:32 PM org.apache.solr.update.processor.LogUpdateProcessor 
> finish
> INFO: {add=[79827482, 79845504, 79850902, 79850913, 79850697, 79850833, 
> 79850901, 79798207, ...(93 more)]} 0 62578
> Mar 13, 2009 1:30:32 PM org.apache.solr.core.SolrCore execute
> INFO: [EnglishAuction1-0] webapp=/solr path=/update 
> params={wt=javabin&version=2.2} status=0 QTime=62578
> Mar 13, 2009 1:30:32 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: start commit(optimize=false,waitFlush=true,waitSearcher=true)
> Mar 13, 2009 1:34:38 PM org.apache.solr.search.SolrIndexSearcher 
> INFO: Opening searc...@1ba5edf main
> Mar 13, 2009 1:34:38 PM org.apache.solr.update.DirectUpdateHandler2 commit
> INFO: end_commit_flush
> Mar 13, 2009 1:34:38 PM org.apache.solr.search.SolrIndexSearcher warm
> INFO: autowarming searc...@1ba5edf main from searc...@81f25 main
>  filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> Mar 13, 2009 1:34:38 PM org.apache.solr.search.SolrIndexSearcher warm
> INFO: autowarming result for searc...@1ba5edf main
>  filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> Mar 13, 2009 1:34:38 PM org.apache.solr.search.SolrIndexSearcher warm
> INFO: autowarming searc...@1ba5edf main from searc...@81f25 main
>  queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=3,evictions=0,size=3,warmupTime=63,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> Mar 13, 2009 1:34:38 PM org.apache.solr.search.SolrIndexSearcher warm
> INFO: autowarming result for searc...@1ba5edf main
>  queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=3,evictions=0,size=3,warmupTime=94,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> Mar 13, 2009 1:34:38 PM org.apache.solr.search.SolrIndexSearcher warm
> INFO: autowarming searc...@1ba5edf main from searc...@81f25 main
>  documentCache{lookups=0,hits=0,hitratio=0.00,inserts=20,evictions=0,size=20,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> Mar 13, 2009 1:34:38 PM org.apache.solr.search.SolrIndexSearcher warm
> INFO: autowarming result for searc...@1ba5edf main
>  documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
> Mar 13, 2009 1:34:38 PM org.apache.solr.core.QuerySenderListener newSearcher
> INFO: QuerySenderListener sending requests to searc...@1ba5edf main
> Mar 13, 2009 1:34:38 PM org.apache.solr.core.SolrCore execute
> INFO: [EnglishAuction1-0] webapp=null path=null 
> params={rows=10&start=0&q=solr} hits=0 status=0 QTime=0
> Mar 13, 2009 1:34:38 PM org.apache.solr.core.QuerySenderListener newSearcher
> INFO: QuerySenderListener don

Re: commit / new searcher delay?

2009-03-06 Thread Shalin Shekhar Mangar
On Fri, Mar 6, 2009 at 8:47 AM, Steve Conover  wrote:

> That's exactly what I'm doing, but I'm explicitly replicating, and
> committing.  Even under these circumstances, what could explain the
> delay after commit before the new index becomes available?
>

How are you explicitly replicating? I mean, how do you make sure that the
slave has actually finished replication and the new index is available now?
Are you using the script based replication or the new java based one?

-- 
Regards,
Shalin Shekhar Mangar.


Re: commit / new searcher delay?

2009-03-05 Thread Chris Hostetter
: I suspect this has something to do with waiting for the searcher to
: warm and switch over (?).  Though, I'm confused because when I print
: out /solr/admin/registry.jsp, the hashcode of the Searcher changes
: immediately (as the commit docs say, the commit operation blocks by
: default until a new searcher is in place).  I've tried turning off all
: caching, to no effect.

Off the top of my head I don't remember if registry.jsp will start to list 
the new searcher even if it's not swapped in to become the current 
searcher.

What you should check is when the slave logs the end of the commit, and 
how long that is after the start of the commit (and of course whether it logs 
any warming stats -- you might have overlooked a cache).

: Anyone have any idea what could be going on here?  Ideally, commit
: would be an operation that blocks until the exact moment when the new
: searcher is in place and is actually serving based on the new index

That's how it works if you use waitSearcher=true ... are you sure you're 
not seeing the effects of some caching (HTTP?) in between solr and your 
client?  Did you try setting never304=true in the <httpCaching> section of 
solrconfig.xml?



-Hoss



Re: commit / new searcher delay?

2009-03-05 Thread Steve Conover
That's exactly what I'm doing, but I'm explicitly replicating, and
committing.  Even under these circumstances, what could explain the
delay after commit before the new index becomes available?

On Thu, Mar 5, 2009 at 10:55 AM, Shalin Shekhar Mangar
 wrote:
> On Thu, Mar 5, 2009 at 10:30 PM, Steve Conover  wrote:
>
>> Yep, I notice the default is true/true, but I explicitly specified
>> both those things too and there's no difference in behavior.
>>
>
> Perhaps you are indexing on the master and then searching on the slaves? It
> may be the delay introduced by replication.
> --
> Regards,
> Shalin Shekhar Mangar.
>


Re: commit / new searcher delay?

2009-03-05 Thread Shalin Shekhar Mangar
On Thu, Mar 5, 2009 at 10:30 PM, Steve Conover  wrote:

> Yep, I notice the default is true/true, but I explicitly specified
> both those things too and there's no difference in behavior.
>

Perhaps you are indexing on the master and then searching on the slaves? It
may be the delay introduced by replication.
-- 
Regards,
Shalin Shekhar Mangar.


Re: commit / new searcher delay?

2009-03-05 Thread Steve Conover
Yep, I notice the default is true/true, but I explicitly specified
both those things too and there's no difference in behavior.

On Wed, Mar 4, 2009 at 7:39 PM, Shalin Shekhar Mangar
 wrote:
> On Thu, Mar 5, 2009 at 6:06 AM, Steve Conover  wrote:
>
>> I'm doing some testing of a solr master/slave config and find that,
>> after syncing my slave, I need to sleep for about 400ms after commit
>> to "see" the new index state.  i.e. if I don't sleep, and I execute a
>> query, I get results that reflect the prior state of the index.
>>
>
> How are you sending the commit? You should use commit with waitSearcher=true
> and waitFlush=true so that it blocks until the new searcher becomes
> available for querying.
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>


Re: commit / new searcher delay?

2009-03-04 Thread Shalin Shekhar Mangar
On Thu, Mar 5, 2009 at 6:06 AM, Steve Conover  wrote:

> I'm doing some testing of a solr master/slave config and find that,
> after syncing my slave, I need to sleep for about 400ms after commit
> to "see" the new index state.  i.e. if I don't sleep, and I execute a
> query, I get results that reflect the prior state of the index.
>

How are you sending the commit? You should use commit with waitSearcher=true
and waitFlush=true so that it blocks until the new searcher becomes
available for querying.


-- 
Regards,
Shalin Shekhar Mangar.
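
For reference, the SolrJ call that makes the commit block until the warmed
searcher is live is commit(waitFlush, waitSearcher); over HTTP the same flags
are request parameters, e.g. /update?commit=true&waitFlush=true&waitSearcher=true.
A minimal sketch (the URL is a placeholder):

    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

    public class BlockingCommit {
      public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer server =
            new CommonsHttpSolrServer("http://localhost:8983/solr");  // placeholder
        // true/true blocks until the new searcher is warmed and registered,
        // so a query issued right after this line sees the committed docs
        server.commit(true, true);
      }
    }

Note that on a master/slave setup this only guarantees visibility on the server
that received the commit; a slave still has to finish replication first, as
discussed in this thread.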


Re: commit error which kill my dataimport.properties file

2009-02-13 Thread sunnyfr

It's actually the space, sorry.
But yes, my snapshots look huge, around 3G every 20 minutes, so should I clean
them up more often, like every 4 hours?



sunnyfr wrote:
> 
> Hi, 
> 
> > Last night I got an error during the import and I don't understand what it
> > means. It even killed my dataimport.properties (left as an empty file), so
> > nothing was written to that file and the delta-import started importing
> > from the very beginning, I guess.
> 
> Thanks a lot for your help,
> I wish you guys a lovely day,
> 
> 
> 
> there is the error:
> 
> 
> 2009/02/12 23:45:01 commit request to Solr at
> http://books.com:8180/solr/books/update failed:
> 2009/02/12 23:45:01 Apache Tomcat/5.5 - Error
> report
> HTTP Status 500 - No space left on device
> java.io.IOException: No space left on device at
> java.io.RandomAccessFile.writeBytes(Native Method) at
> java.io.RandomAccessFile.write(RandomAccessFile.java:466) at
> org.apache.lucene.store.FSDirectory$FSIndexOutput.flushBuffer(FSDirectory.java:679)
> at
> org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:96)
> at
> org.apache.lucene.store.BufferedIndexOutput.flush(BufferedIndexOutput.java:85)
> at
> org.apache.lucene.store.BufferedIndexOutput.close(BufferedIndexOutput.java:109)
> at
> org.apache.lucene.store.FSDirectory$FSIndexOutput.close(FSDirectory.java:686)
> at org.apache.lucene.index.FieldsWriter.close(FieldsWriter.java:145) at
> org.apache.lucene.index.StoredFieldsWriter.closeDocStore(StoredFieldsWriter.java:83)
> at
> org.apache.lucene.index.DocFieldConsumers.closeDocStore(DocFieldConsumers.java:83)
> at
> org.apache.lucene.index.DocFieldProcessor.closeDocStore(DocFieldProcessor.java:47)
> at
> org.apache.lucene.index.DocumentsWriter.closeDocStore(DocumentsWriter.java:373)
> at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:562)
> at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3803) at
> org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3712) at
> org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1752)
> at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1716) at
> org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1687) at
> org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:214) at
> org.apache.solr.update.DirectUpdateHandler2.closeWriter(DirectUpdateHandler2.java:172)
> at
> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:341)
> at
> org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:78)
> at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:168) at
> org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69) at
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
> at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1313) at
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
> at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
> at
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
> at
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
> at
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
> at
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
> at
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
> at
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151)
> at
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:874)
> at
> org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
> at
> org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
> at
> org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
> at
> org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
> at java.lang.Thread.run(Threa

Re: commit very long ?? solr

2009-02-12 Thread sunnyfr


Batch committing is always a better option than committing for each
document. An optimize automatically commits. Note that you may not need to
optimize very frequently. For a lot of cases, optimizing once per day works
fine.

Yes, but committing once per day won't show updated data until the next day.
If I need to show them almost straight away, what can I do? How can I check
why my commit takes that long?
elapsed time: 337 sec

-- Regards - Sunny
-- 
View this message in context: 
http://www.nabble.com/commit-very-longsolr-tp21848973p21979442.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: commit looks stuck ?

2009-02-12 Thread sunnyfr

Hi 

Yes, I saw that afterward, so I decreased it from 5000 to 4500.

Sunny


Grant Ingersoll-6 wrote:
> 
> It looks like you are running out of memory.  What is your heap size?
> 
> On Feb 11, 2009, at 4:09 AM, sunnyfr wrote:
> 
>>
>> Hi
>>
>> Do you have any idea why, after a night with solr running with just a
>> commit every five minutes,
>> it looks like the processes never shut down?
>>
>>
>> root 29428  0.0  0.0  53988  2648 ?S01:05   0:00 curl
>> http://localhost:8180/solr/book/update -s -H Content-type:text/xml;
>> charset=utf-8 -d 
>> root 29829  0.0  0.0   3944   560 ?Ss   01:10   0:00 / 
>> bin/sh -c
>> /data/solr/book/bin/commit
>> root 29830  0.0  0.0   8936  1256 ?S01:10   0:00 / 
>> bin/bash
>> /data/solr/book/bin/commit
>> root 29852  0.0  0.0  53988  2640 ?S01:10   0:00 curl
>> http://localhost:8180/solr/book/update -s -H Content-type:text/xml;
>> charset=utf-8 -d 
>> root 30286  0.0  0.0   3944   564 ?Ss   01:15   0:00 / 
>> bin/sh -c
>> /data/solr/book/bin/commit
>> root 30287  0.0  0.0   8936  1256 ?S01:15   0:00 / 
>> bin/bash
>> /data/solr/book/bin/commit
>> root 30309  0.0  0.0  53988  2644 ?S01:15   0:00 curl
>> http://localhost:8180/solr/book/update -s -H Content-type:text/xml;
>> charset=utf-8 -d 
>> root 30715  0.0  0.0   3944   560 ?Ss   01:20   0:00 / 
>> bin/sh -c
>> /data/solr/book/bin/commit
>> root 30716  0.0  0.0   8936  1252 ?S01:20   0:00 / 
>> bin/bash
>> /data/solr/book/bin/commit
>> root 30738  0.0  0.0  53988  2644 ?S01:20   0:00 curl
>> http://localhost:8180/solr/book/update -s -H Content-type:text/xml;
>> charset=utf-8 -d 
>> root 31172  0.0  0.0   3944   564 ?Ss   01:25   0:00 / 
>> bin/sh -c
>> /data/solr/book/bin/commit
>> root 31173  0.0  0.0   8936  1252 ?S01:25   0:00 / 
>> bin/bash
>> /data/solr/book/bin/commit
>> root 31195  0.0  0.0  53988  2644 ?S01:25   0:00 curl
>> http://localhost:8180/solr/book/update -s -H Content-type:text/xml;
>> charset=utf-8 -d 
>> root 31606  0.0  0.0   3944   564 ?Ss   01:30   0:00 / 
>> bin/sh -c
>> /data/solr/book/bin/commit
>> root 31607  0.0  0.0   8936  1256 ?S01:30   0:00 / 
>> bin/bash
>> /data/solr/book/bin/commit
>> root 31629  0.0  0.0  53988  2648 ?S01:30   0:00 curl
>> http://localhost:8180/solr/book/update -s -H Content-type:text/xml;
>> charset=utf-8 -d 
>> root 32063  0.0  0.0   3944   560 ?Ss   01:35   0:00 / 
>> bin/sh -c
>> /data/solr/book/bin/commit
>> root 32064  0.0  0.0   8936  1256 ?S01:35   0:00 / 
>> bin/bash
>> /data/solr/book/bin/commit
>> root 32086  0.0  0.0  53988  2640 ?S01:35   0:00 curl
>> http://localhost:8180/solr/book/update -s -H Content-type:text/xml;
>> charset=utf-8 -d 
>> root 32499  0.0  0.0   3944   564 ?Ss   01:40   0:00 / 
>> bin/sh -c
>> /data/solr/book/bin/commit
>> root 32500  0.0  0.0   8936  1252 ?S01:40   0:00 / 
>> bin/bash
>> /data/solr/book/bin/commit
>> root 32522  0.0  0.0  53988  2648 ?S01:40   0:00 curl
>> http://localhost:8180/solr/book/update -s -H Content-type:text/xml;
>> charset=utf-8 -d 
>>
>> My logs have a huge error; I don't know where it comes from.
>>
>> 2009/02/10 19:29:37 Apache Tomcat/5.5 - Error
>> report

Re: commit looks stuck ?

2009-02-11 Thread Grant Ingersoll

It looks like you are running out of memory.  What is your heap size?

On Feb 11, 2009, at 4:09 AM, sunnyfr wrote:



Hi

Do you have any idea why, after a night with solr running with just a
commit every five minutes,
it looks like the processes never shut down?


root 29428  0.0  0.0  53988  2648 ?S01:05   0:00 curl
http://localhost:8180/solr/book/update -s -H Content-type:text/xml;
charset=utf-8 -d 
root 29829  0.0  0.0   3944   560 ?Ss   01:10   0:00 / 
bin/sh -c

/data/solr/book/bin/commit
root 29830  0.0  0.0   8936  1256 ?S01:10   0:00 / 
bin/bash

/data/solr/book/bin/commit
root 29852  0.0  0.0  53988  2640 ?S01:10   0:00 curl
http://localhost:8180/solr/book/update -s -H Content-type:text/xml;
charset=utf-8 -d 
root 30286  0.0  0.0   3944   564 ?Ss   01:15   0:00 / 
bin/sh -c

/data/solr/book/bin/commit
root 30287  0.0  0.0   8936  1256 ?S01:15   0:00 / 
bin/bash

/data/solr/book/bin/commit
root 30309  0.0  0.0  53988  2644 ?S01:15   0:00 curl
http://localhost:8180/solr/book/update -s -H Content-type:text/xml;
charset=utf-8 -d 
root 30715  0.0  0.0   3944   560 ?Ss   01:20   0:00 / 
bin/sh -c

/data/solr/book/bin/commit
root 30716  0.0  0.0   8936  1252 ?S01:20   0:00 / 
bin/bash

/data/solr/book/bin/commit
root 30738  0.0  0.0  53988  2644 ?S01:20   0:00 curl
http://localhost:8180/solr/book/update -s -H Content-type:text/xml;
charset=utf-8 -d 
root 31172  0.0  0.0   3944   564 ?Ss   01:25   0:00 / 
bin/sh -c

/data/solr/book/bin/commit
root 31173  0.0  0.0   8936  1252 ?S01:25   0:00 / 
bin/bash

/data/solr/book/bin/commit
root 31195  0.0  0.0  53988  2644 ?S01:25   0:00 curl
http://localhost:8180/solr/book/update -s -H Content-type:text/xml;
charset=utf-8 -d 
root 31606  0.0  0.0   3944   564 ?Ss   01:30   0:00 / 
bin/sh -c

/data/solr/book/bin/commit
root 31607  0.0  0.0   8936  1256 ?S01:30   0:00 / 
bin/bash

/data/solr/book/bin/commit
root 31629  0.0  0.0  53988  2648 ?S01:30   0:00 curl
http://localhost:8180/solr/book/update -s -H Content-type:text/xml;
charset=utf-8 -d 
root 32063  0.0  0.0   3944   560 ?Ss   01:35   0:00 / 
bin/sh -c

/data/solr/book/bin/commit
root 32064  0.0  0.0   8936  1256 ?S01:35   0:00 / 
bin/bash

/data/solr/book/bin/commit
root 32086  0.0  0.0  53988  2640 ?S01:35   0:00 curl
http://localhost:8180/solr/book/update -s -H Content-type:text/xml;
charset=utf-8 -d 
root 32499  0.0  0.0   3944   564 ?Ss   01:40   0:00 / 
bin/sh -c

/data/solr/book/bin/commit
root 32500  0.0  0.0   8936  1252 ?S01:40   0:00 / 
bin/bash

/data/solr/book/bin/commit
root 32522  0.0  0.0  53988  2648 ?S01:40   0:00 curl
http://localhost:8180/solr/book/update -s -H Content-type:text/xml;
charset=utf-8 -d 

My logs have a huge error; I don't know where it comes from.

2009/02/10 19:29:37 Apache Tomcat/5.5 - Error
report
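
To check what heap the JVM actually got (a minimal sketch; how -Xmx is passed
in depends on the container startup script), Runtime exposes the configured
maximum:

    public class HeapInfo {
      public static void main(String[] args) {
        // maxMemory() approximately reflects the -Xmx setting
        long maxMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("max heap = " + maxMb + " MB");
      }
    }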

Re: commit very long ?? solr

2009-02-05 Thread Shalin Shekhar Mangar
On Thu, Feb 5, 2009 at 7:07 PM, Gert Brinkmann  wrote:

> sunnyfr wrote:
> > Yes, the average is 12 docs updated per second.
>
> In our case with indexing normal web-pages on a normal workstation we
> have about 10 docs per second (updating + committing). This feels quite
> long. But if this is normal... ok.
>
> > I actually reduced warmup and cache, and it works fine now. I will see
> > whether it impacts the request time a lot or not.
>
> Is a warmup and cache refill done on every commit? Might it be possible
> to do this on optimization only and not on every commit?


Yes, autowarming, if enabled, is done on each commit. It is not possible to
do it only for optimize because the caches contain Lucene's internal document
ids, which change on commits.


>
>
> Or does optimize automatically trigger a commit? In this case I could
> turn off committing every 100 documents and only do an optimize at the end
> of indexing.
>

Batch committing is always a better option than committing for each
document. An optimize automatically commits. Note that you may not need to
optimize very frequently. For a lot of cases, optimizing once per day works
fine.

-- 
Regards,
Shalin Shekhar Mangar.
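
The batching pattern looks like this (a minimal SolrJ sketch; the URL and field
names are placeholders): collect documents into one add request and commit once
at the end, so autowarming runs once instead of per document.

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class BatchCommit {
      public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer server =
            new CommonsHttpSolrServer("http://localhost:8983/solr");  // placeholder
        List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
        for (int i = 0; i < 100; i++) {
          SolrInputDocument doc = new SolrInputDocument();
          doc.addField("id", Integer.toString(i));  // placeholder field
          batch.add(doc);
        }
        server.add(batch);  // one request for the whole batch
        server.commit();    // single commit: caches invalidate and warm once
      }
    }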


Re: commit very long ?? solr

2009-02-05 Thread Gert Brinkmann
sunnyfr wrote:
> Yes, the average is 12 docs updated per second.

In our case with indexing normal web-pages on a normal workstation we
have about 10 docs per second (updating + committing). This feels quite
long. But if this is normal... ok.

> I actually reduced warmup and cache, and it works fine now. I will see
> whether it impacts the request time a lot or not.

Is a warmup and cache refill done on every commit? Might it be possible
to do this on optimization only and not on every commit?

Or does optimize automatically trigger a commit? In this case I could
turn off committing every 100 documents and only do an optimize at the end
of indexing.

Thanks,
Gert


Re: commit very long ?? solr

2009-02-05 Thread sunnyfr

Yes, the average is 12 docs updated per second.
I have 8.5M documents and I try to update every 5 minutes, so I guess with 8G
of RAM I have no choice but to keep warmup and cache almost at zero. My data
folder is about 5.8G.

What would you recommend?

I actually reduced warmup and cache, and it works fine now. I will see whether
it impacts the request time a lot or not.

Thanks Shalin,



Shalin Shekhar Mangar wrote:
> 
> 12275 documents in 422 seconds = 29 docs/second. How fast do you want it
> to
> complete?
> 
> How much time do the queries take to create a document? We don't know the
> size of the documents.
> 
> On Thu, Feb 5, 2009 at 4:11 PM, sunnyfr  wrote:
> 
>>
>> Hi,
>>
>> Sorry, but I don't know where the problem is.
>> Don't you think it's a bit long?
>>
>> 2009/02/05 11:30:01 started by root
>> 2009/02/05 11:30:01 command: /data/solr/book/bin/commit
>> 2009/02/05 11:37:03 ended (elapsed time: 422 sec)
>>
>> idle
>> 
>> -
>> 
>> 85926
>> 43347
>> 0
>> 2009-02-05 11:28:19
>> 2009-02-05 11:28:19
>> 2009-02-05 11:28:50
>> 2009-02-05 11:28:50
>> 12275
>> -
>> 
>> Indexing completed. Added/Updated: 12275 documents. Deleted 0 documents.
>> 
>> 2009-02-05 11:37:03
>> 2009-02-05 11:37:03
>> 0:8:43.711
>>
>> Once it has added/updated the docs it stays busy for a long time ...
>> the delta-import was started around 11:28:30
>>
>> Thanks a lot for your help,
>> Sunny
>>
>> --
>> View this message in context:
>> http://www.nabble.com/commit-very-longsolr-tp21848973p21848973.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/commit-very-longsolr-tp21848973p21851246.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: commit very long ?? solr

2009-02-05 Thread Shalin Shekhar Mangar
12275 documents in 422 seconds = 29 docs/second. How fast do you want it to
complete?

How much time do the queries take to create a document? We don't know the
size of the documents.

On Thu, Feb 5, 2009 at 4:11 PM, sunnyfr  wrote:

>
> Hi,
>
> Sorry, but I don't know where the problem is.
> Don't you think it's a bit long?
>
> 2009/02/05 11:30:01 started by root
> 2009/02/05 11:30:01 command: /data/solr/book/bin/commit
> 2009/02/05 11:37:03 ended (elapsed time: 422 sec)
>
> idle
> 
> -
> 
> 85926
> 43347
> 0
> 2009-02-05 11:28:19
> 2009-02-05 11:28:19
> 2009-02-05 11:28:50
> 2009-02-05 11:28:50
> 12275
> -
> 
> Indexing completed. Added/Updated: 12275 documents. Deleted 0 documents.
> 
> 2009-02-05 11:37:03
> 2009-02-05 11:37:03
> 0:8:43.711
>
> Once it has added/updated the docs it stays busy for a long time ...
> the delta-import was started around 11:28:30
>
> Thanks a lot for your help,
> Sunny
>
> --
> View this message in context:
> http://www.nabble.com/commit-very-longsolr-tp21848973p21848973.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
Regards,
Shalin Shekhar Mangar.


Re: Commit in solr 1.3 can take up to 5 minutes

2008-10-13 Thread Uwe Klosa
Hi

I have now recreated the whole index with new index files and all is back to
normal again. I think something had happened to our old index files.

Many thanks to you who tried to help.

Uwe

On Mon, Oct 6, 2008 at 5:39 PM, Uwe Klosa <[EMAIL PROTECTED]> wrote:

> I already had the chance to setup a new server for testing. Before
> deploying my application I checked my solrconfig against the solrconfig from
> 1.3. And removed the deprecated parameters. I started updating the new
> index. I ingest 100 documents at a time and then I do a commit(). With 2000
> ingested documents the commit time is 1-3 seconds. I'll get back tomorrow
> with more results.
>
> Uwe
>
>
>
> On Sun, Oct 5, 2008 at 2:52 PM, Uwe Klosa <[EMAIL PROTECTED]> wrote:
>
>> It's a live server with many search queries. I will set up a test server
>> next week or the week after and index the same amount of documents. I will
>> get back with the results.
>>
>> Uwe
>>
>>
>> On Sat, Oct 4, 2008 at 8:18 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
>>
>>> On Sat, Oct 4, 2008 at 11:55 AM, Uwe Klosa <[EMAIL PROTECTED]> wrote:
>>> > A "Opening Server" is always happening directly after "start commit"
>>> with no
>>> > delay.
>>>
>>> Ah, so it doesn't look like it's the close of the IndexWriter then!
>>> When do you see the "end_commit_flush"?
>>> Could you post everything in your log between when the commit begins
>>> and when it ends?
>>> Is this a live server (is query traffic continuing to come in while
>>> the commit is happening?)  If so, it would be interesting to see (and
>>> easier to debug) if it happened on a server with no query traffic.
>>>
>>> > But I can see many {commit=} with QTime around 280.000 (4 and a half
>>> > minutes)
>>>
>>> > One difference I could see to your logging is that I have
>>> waitFlush=true.
>>> > Could that have this impact?
>>>
>>> These parameters (waitFlush/waitSearcher) won't affect how long it
>>> takes to get the new searcher registered, but does affect at what
>>> point control is returned to the caller (and hence when you see the
>>> response).  If waitSearcher==false, then you see the response before
>>> searcher warming, otherwise it blocks until after.  waitFlush==false
>>> is not currently supported (it will always act as true), so your
>>> change of that doesn't matter.
>>>
>>> -Yonik
>>>
>>
>>
>


Re: Commit in solr 1.3 can take up to 5 minutes

2008-10-06 Thread Uwe Klosa
I already had the chance to setup a new server for testing. Before deploying
my application I checked my solrconfig against the solrconfig from 1.3. And
removed the deprecated parameters. I started updating the new index. I
ingest 100 documents at a time and then I do a commit(). With 2000 ingested
documents the commit time is 1-3 seconds. I'll get back tomorrow with more
results.

Uwe


On Sun, Oct 5, 2008 at 2:52 PM, Uwe Klosa <[EMAIL PROTECTED]> wrote:

> It's a live server with many search queries. I will set up a test server
> next week or the week after and index the same amount of documents. I will
> get back with the results.
>
> Uwe
>
>
> On Sat, Oct 4, 2008 at 8:18 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
>
>> On Sat, Oct 4, 2008 at 11:55 AM, Uwe Klosa <[EMAIL PROTECTED]> wrote:
>> > A "Opening Server" is always happening directly after "start commit"
>> with no
>> > delay.
>>
>> Ah, so it doesn't look like it's the close of the IndexWriter then!
>> When do you see the "end_commit_flush"?
>> Could you post everything in your log between when the commit begins
>> and when it ends?
>> Is this a live server (is query traffic continuing to come in while
>> the commit is happening?)  If so, it would be interesting to see (and
>> easier to debug) if it happened on a server with no query traffic.
>>
>> > But I can see many {commit=} with QTime around 280.000 (4 and a half
>> > minutes)
>>
>> > One difference I could see to your logging is that I have
>> waitFlush=true.
>> > Could that have this impact?
>>
>> These parameters (waitFlush/waitSearcher) won't affect how long it
>> takes to get the new searcher registered, but does affect at what
>> point control is returned to the caller (and hence when you see the
>> response).  If waitSearcher==false, then you see the response before
>> searcher warming, otherwise it blocks until after.  waitFlush==false
>> is not currently supported (it will always act as true), so your
>> change of that doesn't matter.
>>
>> -Yonik
>>
>
>


Re: Commit in solr 1.3 can take up to 5 minutes

2008-10-05 Thread Uwe Klosa
It's a live server with many search queries. I will set up a test server
next week or the week after and index the same amount of documents. I will
get back with the results.

Uwe

On Sat, Oct 4, 2008 at 8:18 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:

> On Sat, Oct 4, 2008 at 11:55 AM, Uwe Klosa <[EMAIL PROTECTED]> wrote:
> > A "Opening Server" is always happening directly after "start commit" with
> no
> > delay.
>
> Ah, so it doesn't look like it's the close of the IndexWriter then!
> When do you see the "end_commit_flush"?
> Could you post everything in your log between when the commit begins
> and when it ends?
> Is this a live server (is query traffic continuing to come in while
> the commit is happening?)  If so, it would be interesting to see (and
> easier to debug) if it happened on a server with no query traffic.
>
> > But I can see many {commit=} with QTime around 280.000 (4 and a half
> > minutes)
>
> > One difference I could see to your logging is that I have waitFlush=true.
> > Could that have this impact?
>
> These parameters (waitFlush/waitSearcher) won't affect how long it
> takes to get the new searcher registered, but does affect at what
> point control is returned to the caller (and hence when you see the
> response).  If waitSearcher==false, then you see the response before
> searcher warming, otherwise it blocks until after.  waitFlush==false
> is not currently supported (it will always act as true), so your
> change of that doesn't matter.
>
> -Yonik
>


Re: Commit in solr 1.3 can take up to 5 minutes

2008-10-04 Thread Yonik Seeley
On Sat, Oct 4, 2008 at 11:55 AM, Uwe Klosa <[EMAIL PROTECTED]> wrote:
> A "Opening Server" is always happening directly after "start commit" with no
> delay.

Ah, so it doesn't look like it's the close of the IndexWriter then!
When do you see the "end_commit_flush"?
Could you post everything in your log between when the commit begins
and when it ends?
Is this a live server (is query traffic continuing to come in while
the commit is happening?)  If so, it would be interesting to see (and
easier to debug) if it happened on a server with no query traffic.

> But I can see many {commit=} with QTime around 280.000 (4 and a half
> minutes)

> One difference I could see to your logging is that I have waitFlush=true.
> Could that have this impact?

These parameters (waitFlush/waitSearcher) won't affect how long it
takes to get the new searcher registered, but does affect at what
point control is returned to the caller (and hence when you see the
response).  If waitSearcher==false, then you see the response before
searcher warming, otherwise it blocks until after.  waitFlush==false
is not currently supported (it will always act as true), so your
change of that doesn't matter.

-Yonik


Re: Commit in solr 1.3 can take up to 5 minutes

2008-10-04 Thread Uwe Klosa
A "Opening Server" is always happening directly after "start commit" with no
delay. But I can see many {commit=} with QTime around 280.000 (4 and a half
minutes)

One difference I could see to your logging is that I have waitFlush=true.
Could that have this impact?

Uwe

On Sat, Oct 4, 2008 at 4:36 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:

> On Fri, Oct 3, 2008 at 2:28 PM, Michael McCandless
> <[EMAIL PROTECTED]> wrote:
> > Yonik, when Solr commits what does it actually do?
>
> Less than it used to (Solr now uses Lucene to handle deletes).
> A solr-level commit closes the IndexWriter, calls some configured
> callbacks, opens a new IndexSearcher, warms it, and registers it.
>
> We can tell where the time is taken by looking at the timestamps in
> the log entries.  Here is what the log output should look like for a
> commit:
>
> INFO: start commit(optimize=false,waitFlush=false,waitSearcher=true)
>  // close the index writer
>  // call any configured post-commit callbacks (to take a snapshot of
> the index, etc).
>  // open a new IndexSearcher (uses IndexReader.reopen() of the last
> opened reader)
> INFO: Opening [EMAIL PROTECTED] main
> INFO: end_commit_flush
>  // in a different thread, warming of the new IndexSearcher will be done.
>  // by default, the solr-level commit will wait for warming to be
> done and the new searcher
>  // to be registered (i.e. any new searches will see the committed changes)
> INFO: autowarming [EMAIL PROTECTED] main from [EMAIL PROTECTED] main [...]
>  // there will be multiple autowarming statements, and some could
> appear before the
>  // end_commit_flush log entry because it's being done in another thread.
> INFO: [] Registered new searcher [EMAIL PROTECTED] main
> INFO: Closing [EMAIL PROTECTED] main
> INFO: {commit=} 0 547
> INFO: [] webapp=/solr path=/update params={} status=0 QTime=547
>
> Uwe, can you verify that the bulk of the time is between "start
> commit" and "Opening Searcher"?
>
> -Yonik
>


Re: Commit in solr 1.3 can take up to 5 minutes

2008-10-04 Thread Yonik Seeley
On Sat, Oct 4, 2008 at 9:35 AM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
> So it seems like fsync with ZFS can be very slow?

The other user that appears to have a commit issue is on Win64.

http://www.nabble.com/*Very*-slow-Commit-after-upgrading-to-solr-1.3-td19720792.html#a19720792

-Yonik


Re: Commit in solr 1.3 can take up to 5 minutes

2008-10-04 Thread Yonik Seeley
On Fri, Oct 3, 2008 at 2:28 PM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
> Yonik, when Solr commits what does it actually do?

Less than it used to (Solr now uses Lucene to handle deletes).
A solr-level commit closes the IndexWriter, calls some configured
callbacks, opens a new IndexSearcher, warms it, and registers it.

We can tell where the time is taken by looking at the timestamps in
the log entries.  Here is what the log output should look like for a
commit:

INFO: start commit(optimize=false,waitFlush=false,waitSearcher=true)
  // close the index writer
  // call any configured post-commit callbacks (to take a snapshot of
the index, etc).
  // open a new IndexSearcher (uses IndexReader.reopen() of the last
opened reader)
INFO: Opening [EMAIL PROTECTED] main
INFO: end_commit_flush
  // in a different thread, warming of the new IndexSearcher will be done.
  // by default, the solr-level commit will wait for warming to be
done and the new searcher
  // to be registered (i.e. any new searches will see the committed changes)
INFO: autowarming [EMAIL PROTECTED] main from [EMAIL PROTECTED] main [...]
  // there will be multiple autowarming statements, and some could
appear before the
  // end_commit_flush log entry because it's being done in another thread.
INFO: [] Registered new searcher [EMAIL PROTECTED] main
INFO: Closing [EMAIL PROTECTED] main
INFO: {commit=} 0 547
INFO: [] webapp=/solr path=/update params={} status=0 QTime=547

Uwe, can you verify that the bulk of the time is between "start
commit" and "Opening Searcher"?

-Yonik
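
Expressed as raw Lucene 2.x calls, that sequence is roughly the following
(a sketch for orientation only, not Solr's actual code; the class and
method names here are invented):

import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;

class CommitSketch {
  // rough outline of the solr-level commit steps described above
  static IndexReader solrStyleCommit(IndexWriter writer, IndexReader current)
      throws IOException {
    writer.close();                        // flush segments, fsync the files
    // ... any configured post-commit callbacks run here (snapshots etc.) ...
    IndexReader fresh = current.reopen();  // reopens only changed segments
    // ... the new searcher is warmed in another thread, then registered ...
    if (fresh != current) current.close(); // retire the old reader after swap
    return fresh;
  }
}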


Re: Commit in solr 1.3 can take up to 5 minutes

2008-10-04 Thread Michael McCandless


Oh OK, phew.  I misunderstood your answer too!

So it seems like fsync with ZFS can be very slow?

Mike

Uwe Klosa wrote:

Oh, you meant index files. I misunderstood your question. Sorry, now that I
read it again I see what you meant. There are only 136 index files. So no
problem there.

Uwe

On Sat, Oct 4, 2008 at 1:59 PM, Michael McCandless <
[EMAIL PROTECTED]> wrote:

Yikes!  That's way too many files.  Have you changed mergeFactor?  Or
implemented a custom DeletionPolicy or MergePolicy?

Or... does anyone know of something else in Solr's configuration that could
lead to such an insane number of files?

Mike

Uwe Klosa wrote:

There are around 35.000 files in the index. When I started indexing 5 weeks
ago with only 2000 documents I did not see this issue. I have seen it the
first time with around 10.000 documents.

Before that I have been using the same instance on a Linux machine with up
to 17.000 documents and I haven't seen this issue at all. The original plan
has always been to use Solr on Linux, but I'm still waiting for the new
server.

Uwe

On Sat, Oct 4, 2008 at 12:06 PM, Michael McCandless <
[EMAIL PROTECTED]> wrote:

Hmm OK that seems like a possible explanation then.  Still it's spooky that
it's taking 5 minutes.  How many files are in the index at the time you
call commit?

I wonder if you were to simply pause for say 30 seconds, before issuing the
commit, whether you'd then see the commit go faster?  On Windows at least
such a silly trick does seem to improve performance, I think because it
allows the OS to move the bytes from its write cache onto stable storage
"on its own schedule" whereas when we commit we are demanding the OS move
the bytes on our [arbitrary] schedule.

I really wish OSs would add an API that would just block & return once the
file has made it to stable storage (letting the OS sync on its own optimal
schedule), rather than demanding the file be fsync'd immediately.

I really haven't explored the performance of fsync on different
filesystems.  I think I've read that ReiserFS may have issues, though it
could have been addressed by now.  I *believe* ext3 is OK (at least, it
didn't show the strange "sleep to get better performance" issue above, in
my limited testing).

Mike

Uwe Klosa wrote:

Thanks Mike

The use of fsync() might be the answer to my problem, because I have
installed Solr for lack of other possibilities in a zone on Solaris with
ZFS, which slows down when many fsync() calls are made. This will be fixed
in an upcoming release of Solaris, but I will move as soon as possible the
Solr instances to another server with a different file system. Would the
use of a different file system than ext3 boost the performance?

Uwe

On Fri, Oct 3, 2008 at 8:28 PM, Michael McCandless <
[EMAIL PROTECTED]> wrote:

Yonik Seeley wrote:

On Fri, Oct 3, 2008 at 1:56 PM, Uwe Klosa <[EMAIL PROTECTED]> wrote:

I have a big problem with one of my solr instances. A commit can take up
to 5 minutes. This time does not depend on the number of documents which
are updated. The difference for 1 or 100 updated documents is only a few
seconds.

Since Solr's commit logic really hasn't changed, I wonder if this
could be lucene related somehow.

Lucene's commit logic has changed: we now fsync() each file in the index
to ensure all bytes are on stable storage, before returning.

But I can't imagine that taking 5 minutes, unless there are somehow a
great many files added to the index?

Uwe, what filesystem are you using?

Yonik, when Solr commits what does it actually do?

Mike











Re: Commit in solr 1.3 can take up to 5 minutes

2008-10-04 Thread Uwe Klosa
Oh, you meant index files. I misunderstood your question. Sorry, now that I
read it again I see what you meant. There are only 136 index files. So no
problem there.

Uwe

On Sat, Oct 4, 2008 at 1:59 PM, Michael McCandless <
[EMAIL PROTECTED]> wrote:

>
> Yikes!  That's way too many files.  Have you changed mergeFactor?  Or
> implemented a custom DeletionPolicy or MergePolicy?
>
> Or... does anyone know of something else in Solr's configuration that could
> lead to such an insane number of files?
>
> Mike
>
>
> Uwe Klosa wrote:
>
>  There are around 35.000 files in the index. When I started indexing 5
>> weeks
>> ago with only 2000 documents I did not see this issue. I have seen it the
>> first
>> time with around 10.000 documents.
>>
>> Before that I have been using the same instance on a Linux machine with up
>> to 17.000 documents and I haven't seen this issue at all. The original
>> plan
>> has always been to use Solr on Linux, but I'm still waiting for the new
>> server.
>>
>> Uwe
>>
>> On Sat, Oct 4, 2008 at 12:06 PM, Michael McCandless <
>> [EMAIL PROTECTED]> wrote:
>>
>>
>>> Hmm OK that seems like a possible explanation then.  Still it's spooky
>>> that
>>> it's taking 5 minutes.  How many files are in the index at the time you
>>> call
>>> commit?
>>>
>>> I wonder if you were to simply pause for say 30 seconds, before issuing
>>> the
>>> commit, whether you'd then see the commit go faster?  On Windows at least
>>> such a silly trick does seem to improve performance, I think because it
>>> allows the OS to move the bytes from its write cache onto stable storage
>>> "on
>>> its own schedule" whereas when we commit we are demanding the OS move the
>>> bytes on our [arbitrary] schedule.
>>>
>>> I really wish OSs would add an API that would just block & return once
>>> the
>>> file has made it to stable storage (letting the OS sync on its own
>>> optimal
>>> schedule), rather than demanding the file be fsync'd immediately.
>>>
>>> I really haven't explored the performance of fsync on different
>>> filesystems.  I think I've read that ReiserFS may have issues, though it
>>> could have been addressed by now.  I *believe* ext3 is OK (at least, it
>>> didn't show the strange "sleep to get better performance" issue above, in
>>> my
>>> limited testing).
>>>
>>> Mike
>>>
>>>
>>> Uwe Klosa wrote:
>>>
>>> Thanks Mike
>>>

 The use of fsync() might be the answer to my problem, because I have
 installed Solr for lack of other possibilities in a zone on Solaris with
 ZFS, which slows down when many fsync() calls are made. This will be fixed
 in an upcoming release of Solaris, but I will move as soon as possible the
 Solr instances to another server with a different file system. Would the
 use of a different file system than ext3 boost the performance?

 Uwe

 On Fri, Oct 3, 2008 at 8:28 PM, Michael McCandless <
 [EMAIL PROTECTED]> wrote:


  Yonik Seeley wrote:
>
> On Fri, Oct 3, 2008 at 1:56 PM, Uwe Klosa <[EMAIL PROTECTED]> wrote:
>
>
>> I have a big problem with one of my solr instances. A commit can take
>>
>>> up
>>> to
>>> 5 minutes. This time does not depend on the number of documents which
>>> are
>>> updated. The difference for 1 or 100 updated documents is only a few
>>> seconds.
>>>
>>>
>>>  Since Solr's commit logic really hasn't changed, I wonder if this
>> could be lucene related somehow.
>>
>>
>>  Lucene's commit logic has changed: we now fsync() each file in the
> index
> to
> ensure all bytes are on stable storage, before returning.
>
> But I can't imagine that taking 5 minutes, unless there are somehow a
> great
> many files added to the index?
>
> Uwe, what filesystem are you using?
>
> Yonik, when Solr commits what does it actually do?
>
> Mike
>
>
>
>>>
>


Re: Commit in solr 1.3 can take up to 5 minutes

2008-10-04 Thread Michael McCandless


Yikes!  That's way too many files.  Have you changed mergeFactor?  Or
implemented a custom DeletionPolicy or MergePolicy?

Or... does anyone know of something else in Solr's configuration that
could lead to such an insane number of files?

Mike

Uwe Klosa wrote:

There are around 35.000 files in the index. When I started indexing 5 weeks
ago with only 2000 documents I did not see this issue. I have seen it the
first time with around 10.000 documents.

Before that I have been using the same instance on a Linux machine with up
to 17.000 documents and I haven't seen this issue at all. The original plan
has always been to use Solr on Linux, but I'm still waiting for the new
server.

Uwe

On Sat, Oct 4, 2008 at 12:06 PM, Michael McCandless <
[EMAIL PROTECTED]> wrote:

Hmm OK that seems like a possible explanation then.  Still it's spooky that
it's taking 5 minutes.  How many files are in the index at the time you
call commit?

I wonder if you were to simply pause for say 30 seconds, before issuing the
commit, whether you'd then see the commit go faster?  On Windows at least
such a silly trick does seem to improve performance, I think because it
allows the OS to move the bytes from its write cache onto stable storage
"on its own schedule" whereas when we commit we are demanding the OS move
the bytes on our [arbitrary] schedule.

I really wish OSs would add an API that would just block & return once the
file has made it to stable storage (letting the OS sync on its own optimal
schedule), rather than demanding the file be fsync'd immediately.

I really haven't explored the performance of fsync on different
filesystems.  I think I've read that ReiserFS may have issues, though it
could have been addressed by now.  I *believe* ext3 is OK (at least, it
didn't show the strange "sleep to get better performance" issue above, in
my limited testing).

Mike

Uwe Klosa wrote:

Thanks Mike

The use of fsync() might be the answer to my problem, because I have
installed Solr for lack of other possibilities in a zone on Solaris with
ZFS, which slows down when many fsync() calls are made. This will be fixed
in an upcoming release of Solaris, but I will move as soon as possible the
Solr instances to another server with a different file system. Would the
use of a different file system than ext3 boost the performance?

Uwe

On Fri, Oct 3, 2008 at 8:28 PM, Michael McCandless <
[EMAIL PROTECTED]> wrote:

Yonik Seeley wrote:

On Fri, Oct 3, 2008 at 1:56 PM, Uwe Klosa <[EMAIL PROTECTED]> wrote:

I have a big problem with one of my solr instances. A commit can take up
to 5 minutes. This time does not depend on the number of documents which
are updated. The difference for 1 or 100 updated documents is only a few
seconds.

Since Solr's commit logic really hasn't changed, I wonder if this
could be lucene related somehow.

Lucene's commit logic has changed: we now fsync() each file in the index
to ensure all bytes are on stable storage, before returning.

But I can't imagine that taking 5 minutes, unless there are somehow a
great many files added to the index?

Uwe, what filesystem are you using?

Yonik, when Solr commits what does it actually do?

Mike
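
For reference, the two knobs that normally bound file counts, shown here
through the raw Lucene 2.x API (a hedged sketch: mergeFactor is the one
Mike asks about, the compound-file setting is an extra knob not mentioned
in this thread, and the index path is a placeholder; in Solr these would
be set in solrconfig.xml rather than in code):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

class FileCountKnobs {
  public static void main(String[] args) throws Exception {
    IndexWriter w = new IndexWriter(
        FSDirectory.getDirectory("/path/to/index"),  // placeholder path
        new StandardAnalyzer(),
        IndexWriter.MaxFieldLength.UNLIMITED);
    w.setMergeFactor(10);        // merge once ~10 segments pile up per level
    w.setUseCompoundFile(true);  // one .cfs per segment instead of ~8 files
    w.close();
  }
}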








Re: Commit in solr 1.3 can take up to 5 minutes

2008-10-04 Thread Uwe Klosa
There are around 35.000 files in the index. When I started indexing 5 weeks
ago with only 2000 documents I did not see this issue. I have seen it the first
time with around 10.000 documents.

Before that I have been using the same instance on a Linux machine with up
to 17.000 documents and I haven't seen this issue at all. The original plan
has always been to use Solr on Linux, but I'm still waiting for the new
server.

Uwe

On Sat, Oct 4, 2008 at 12:06 PM, Michael McCandless <
[EMAIL PROTECTED]> wrote:

>
> Hmm OK that seems like a possible explanation then.  Still it's spooky that
> it's taking 5 minutes.  How many files are in the index at the time you call
> commit?
>
> I wonder if you were to simply pause for say 30 seconds, before issuing the
> commit, whether you'd then see the commit go faster?  On Windows at least
> such a silly trick does seem to improve performance, I think because it
> allows the OS to move the bytes from its write cache onto stable storage "on
> its own schedule" whereas when we commit we are demanding the OS move the
> bytes on our [arbitrary] schedule.
>
> I really wish OSs would add an API that would just block & return once the
> file has made it to stable storage (letting the OS sync on its own optimal
> schedule), rather than demanding the file be fsync'd immediately.
>
> I really haven't explored the performance of fsync on different
> filesystems.  I think I've read that ReiserFS may have issues, though it
> could have been addressed by now.  I *believe* ext3 is OK (at least, it
> didn't show the strange "sleep to get better performance" issue above, in my
> limited testing).
>
> Mike
>
>
> Uwe Klosa wrote:
>
>  Thanks Mike
>>
>> The use of fsync() might be the answer to my problem, because I have
>> installed Solr for lack of other possibilities in a zone on Solaris with
>> ZFS
>> which slows down when many fsync() calls are made. This will be fixed in an
>> upcoming release of Solaris, but I will move as soon as possible the Solr
>> instances to another server with a different file system. Would the use of
>> a
>> different file system than ext3 boost the performance?
>>
>> Uwe
>>
>> On Fri, Oct 3, 2008 at 8:28 PM, Michael McCandless <
>> [EMAIL PROTECTED]> wrote:
>>
>>
>>> Yonik Seeley wrote:
>>>
>>> On Fri, Oct 3, 2008 at 1:56 PM, Uwe Klosa <[EMAIL PROTECTED]> wrote:
>>>

  I have a big problem with one of my solr instances. A commit can take
> up
> to
> 5 minutes. This time does not depend on the number of documents which
> are
> updated. The difference for 1 or 100 updated documents is only a few
> seconds.
>
>
 Since Solr's commit logic really hasn't changed, I wonder if this
 could be lucene related somehow.


>>> Lucene's commit logic has changed: we now fsync() each file in the index
>>> to
>>> ensure all bytes are on stable storage, before returning.
>>>
>>> But I can't imagine that taking 5 minutes, unless there are somehow a
>>> great
>>> many files added to the index?
>>>
>>> Uwe, what filesystem are you using?
>>>
>>> Yonik, when Solr commits what does it actually do?
>>>
>>> Mike
>>>
>>>
>


Re: Commit in solr 1.3 can take up to 5 minutes

2008-10-04 Thread Michael McCandless


Hmm OK that seems like a possible explanation then.  Still it's spooky
that it's taking 5 minutes.  How many files are in the index at the
time you call commit?

I wonder if you were to simply pause for say 30 seconds, before
issuing the commit, whether you'd then see the commit go faster?  On
Windows at least such a silly trick does seem to improve performance,
I think because it allows the OS to move the bytes from its write
cache onto stable storage "on its own schedule" whereas when we commit
we are demanding the OS move the bytes on our [arbitrary] schedule.

I really wish OSs would add an API that would just block & return once
the file has made it to stable storage (letting the OS sync on its own
optimal schedule), rather than demanding the file be fsync'd
immediately.

I really haven't explored the performance of fsync on different
filesystems.  I think I've read that ReiserFS may have issues, though
it could have been addressed by now.  I *believe* ext3 is OK (at
least, it didn't show the strange "sleep to get better performance"
issue above, in my limited testing).

Mike

Uwe Klosa wrote:

Thanks Mike

The use of fsync() might be the answer to my problem, because I have
installed Solr for lack of other possibilities in a zone on Solaris
with ZFS, which slows down when many fsync() calls are made. This will
be fixed in an upcoming release of Solaris, but I will move as soon as
possible the Solr instances to another server with a different file
system. Would the use of a different file system than ext3 boost the
performance?

Uwe

On Fri, Oct 3, 2008 at 8:28 PM, Michael McCandless <
[EMAIL PROTECTED]> wrote:

Yonik Seeley wrote:

On Fri, Oct 3, 2008 at 1:56 PM, Uwe Klosa <[EMAIL PROTECTED]> wrote:

I have a big problem with one of my solr instances. A commit can
take up to 5 minutes. This time does not depend on the number of
documents which are updated. The difference for 1 or 100 updated
documents is only a few seconds.

Since Solr's commit logic really hasn't changed, I wonder if this
could be lucene related somehow.

Lucene's commit logic has changed: we now fsync() each file in the
index to ensure all bytes are on stable storage, before returning.

But I can't imagine that taking 5 minutes, unless there are somehow
a great many files added to the index?

Uwe, what filesystem are you using?

Yonik, when Solr commits what does it actually do?

Mike
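
Mike's pause-before-commit experiment, as a throwaway SolrJ sketch (the
URL and the 30-second figure are just the ones discussed in this thread):

import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

class PausedCommit {
  public static void main(String[] args) throws Exception {
    CommonsHttpSolrServer solr =
        new CommonsHttpSolrServer("http://localhost:8983/solr");
    // ... send all pending updates first ...
    Thread.sleep(30 * 1000);  // let the OS drain its write cache on its own
    long start = System.currentTimeMillis();
    solr.commit();            // fsync should now find fewer dirty pages
    System.out.println("commit took "
        + (System.currentTimeMillis() - start) + " ms");
  }
}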





Re: Commit in solr 1.3 can take up to 5 minutes

2008-10-04 Thread Uwe Klosa
Thanks Mike

The use of fsync() might be the answer to my problem, because I have
installed Solr for lack of other possibilities in a zone on Solaris with ZFS
which slows down when many fsync() calls are made. This will be fixed in an
upcoming release of Solaris, but I will move as soon as possible the Solr
instances to another server with a different file system. Would the use of a
different file system than ext3 boost the performance?

Uwe

On Fri, Oct 3, 2008 at 8:28 PM, Michael McCandless <
[EMAIL PROTECTED]> wrote:

>
> Yonik Seeley wrote:
>
>  On Fri, Oct 3, 2008 at 1:56 PM, Uwe Klosa <[EMAIL PROTECTED]> wrote:
>>
>>> I have a big problem with one of my solr instances. A commit can take up
>>> to
>>> 5 minutes. This time does not depend on the number of documents which are
>>> updated. The difference for 1 or 100 updated documents is only a few
>>> seconds.
>>>
>>
>> Since Solr's commit logic really hasn't changed, I wonder if this
>> could be lucene related somehow.
>>
>
> Lucene's commit logic has changed: we now fsync() each file in the index to
> ensure all bytes are on stable storage, before returning.
>
> But I can't imagine that taking 5 minutes, unless there are somehow a great
> many files added to the index?
>
> Uwe, what filesystem are you using?
>
> Yonik, when Solr commits what does it actually do?
>
> Mike
>
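
For anyone curious what "fsync() each file" amounts to at the JDK level,
a minimal sketch (Lucene's FSDirectory wraps this in retry logic, so treat
it as illustrative only, not Lucene's actual code):

import java.io.IOException;
import java.io.RandomAccessFile;

class FsyncSketch {
  // block until the OS reports the file's bytes are on stable storage
  static void fsync(String path) throws IOException {
    RandomAccessFile f = new RandomAccessFile(path, "rw");
    try {
      f.getFD().sync();  // the fsync() call discussed in this thread
    } finally {
      f.close();
    }
  }
}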

