[jira] [Updated] (SOLR-4722) Highlighter which generates a list of query term position(s) for each item in a list of documents, or returns null if highlighting is disabled.

2017-05-21 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-4722:
--
Attachment: PositionsSolrHighlighter.java

Thanks a lot for the patch!
I made some modifications to the original patch to meet our project's specific 
needs. The modified version returns each term's text along with its position and 
offsets.
We do not need Solr to do any highlighting, only to return the positions and 
offsets. So in schema.xml our field is not stored and only has 
termVectors="true" termPositions="true" termOffsets="true". 
Just sharing it here.
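
In case it is useful, here is a rough, untested sketch of my own (not the attached 
file) showing how the same information can be read back through the plain Lucene 
term-vector API, assuming the field is indexed with termVectors="true" 
termPositions="true" termOffsets="true" and using a hypothetical field name "content":

import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.PostingsEnum;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.util.BytesRef;

public class TermVectorPositionsDump {
  // Prints each term's text, position and character offsets for one document's field.
  public static void dump(IndexReader reader, int docId, String field) throws IOException {
    Terms vector = reader.getTermVector(docId, field);  // null if no term vector was indexed
    if (vector == null) {
      return;
    }
    TermsEnum termsEnum = vector.iterator();
    BytesRef term;
    while ((term = termsEnum.next()) != null) {
      // PostingsEnum.ALL asks for frequencies, positions, offsets and payloads.
      PostingsEnum postings = termsEnum.postings(null, PostingsEnum.ALL);
      postings.nextDoc();  // a term vector contains only this one document
      int freq = postings.freq();
      for (int i = 0; i < freq; i++) {
        int position = postings.nextPosition();
        System.out.println(term.utf8ToString()
            + " position=" + position
            + " startOffset=" + postings.startOffset()
            + " endOffset=" + postings.endOffset());
      }
    }
  }
}

This is only meant to show which Lucene calls are involved, e.g. 
dump(searcher.getIndexReader(), docId, "content"); the attached file wires the 
same kind of lookup into the Solr highlighting response instead.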

> Highlighter which generates a list of query term position(s) for each item in 
> a list of documents, or returns null if highlighting is disabled.
> ---
>
> Key: SOLR-4722
> URL: https://issues.apache.org/jira/browse/SOLR-4722
> Project: Solr
>  Issue Type: New Feature
>  Components: highlighter
>Affects Versions: 4.3, 6.0
>Reporter: Tricia Jenkins
>Priority: Minor
> Attachments: PositionsSolrHighlighter.java, SOLR-4722.patch, 
> SOLR-4722.patch, solr-positionshighlighter.jar
>
>
> As an alternative to returning snippets, this highlighter provides the (term) 
> position for query matches.  One use case for this is to reconcile the term 
> position from the Solr index with 'word' coordinates provided by an OCR 
> process.  In this way we are able to 'highlight' an image, like a page from a 
> book or an article from a newspaper, in the locations that match the user's 
> query.
> This is based on the FastVectorHighlighter and requires that termVectors, 
> termOffsets and termPositions be stored.






[jira] [Commented] (SOLR-9829) Solr cannot provide index service after a large GC pause but core state in ZK is still active

2016-12-08 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732316#comment-15732316
 ] 

Forest Soup commented on SOLR-9829:
---

Thanks, all!

I have a mail thread tracking this issue: 
http://lucene.472066.n3.nabble.com/Solr-cannot-provide-index-service-after-a-large-GC-pause-but-core-state-in-ZK-is-still-active-td4308942.html

Could you please comment on the questions in it? Thanks!

@Mark and Varun, are you sure this issue is a duplicate of 
https://issues.apache.org/jira/browse/SOLR-7956 ? 
If so, I'll try to backport that fix to 5.3.2. 
I also see that Daisy created a similar JIRA: 
https://issues.apache.org/jira/browse/SOLR-9830 . Although her root cause is 
too many open files, could you confirm whether it is also a duplicate of SOLR-7956? 

> Solr cannot provide index service after a large GC pause but core state in ZK 
> is still active
> -
>
> Key: SOLR-9829
> URL: https://issues.apache.org/jira/browse/SOLR-9829
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: update
>Affects Versions: 5.3.2
> Environment: Redhat enterprise server 64bit 
>Reporter: Forest Soup
>
> When Solr hits a large GC pause like 
> https://issues.apache.org/jira/browse/SOLR-9828 , the collections on it 
> cannot provide service and never come back until a restart. 
> But in ZooKeeper, the cores on that server still show as active, and the 
> server is still in live_nodes. 
> Some /update requests got HTTP 500 due to "IndexWriter is closed". Some got 
> HTTP 400 due to "possible analysis error", whose root cause is also 
> "IndexWriter is closed"; we think those should return 500 instead 
> (documented in https://issues.apache.org/jira/browse/SOLR-9825).
> Our questions in this JIRA are:
> 1. Should Solr mark the cores as down in ZK when it cannot provide index service?
> 2. Is it possible for Solr to re-open the IndexWriter so it can provide index service again?
> solr log snippets:
> 2016-11-22 20:47:37.274 ERROR (qtp2011912080-76) [c:collection12 s:shard1 
> r:core_node1 x:collection12_shard1_replica1] o.a.s.c.SolrCore 
> org.apache.solr.common.SolrException: Exception writing document id 
> Q049dXMxYjMtbWFpbDg4L089bGxuX3VzMQ==20841350!270CE4F9C032EC26002580730061473C 
> to the index; possible analysis error.
>   at 
> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:167)
>   at 
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
>   at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
>   at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>   at 
> org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.processAdd(LanguageIdentifierUpdateProcessor.java:207)
>   at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>   at 
> org.apache.solr.update.processor.CloneFieldUpdateProcessorFactory$1.processAdd(CloneFieldUpdateProcessorFactory.java:231)
>   at 
> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:143)
>   at 
> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:113)
>   at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:76)
>   at 
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98)
>   at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
>   at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:672)
>   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:463)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:235)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:199)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
>   at 
> 

[jira] [Commented] (SOLR-9828) Very long young generation stop the world GC pause

2016-12-08 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732274#comment-15732274
 ] 

Forest Soup commented on SOLR-9828:
---

The mail thread:
http://lucene.472066.n3.nabble.com/Very-long-young-generation-stop-the-world-GC-pause-td4308911.html

> Very long young generation stop the world GC pause 
> ---
>
> Key: SOLR-9828
> URL: https://issues.apache.org/jira/browse/SOLR-9828
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.3.2
> Environment: Linux Redhat 64bit
>Reporter: Forest Soup
>
> We are using Oracle JDK 8u92 (64-bit).
> The JVM memory-related options are:
> -Xms32768m 
> -Xmx32768m 
> -XX:+HeapDumpOnOutOfMemoryError 
> -XX:HeapDumpPath=/mnt/solrdata1/log 
> -XX:+UseG1GC 
> -XX:+PerfDisableSharedMem 
> -XX:+ParallelRefProcEnabled 
> -XX:G1HeapRegionSize=8m 
> -XX:MaxGCPauseMillis=100 
> -XX:InitiatingHeapOccupancyPercent=35 
> -XX:+AggressiveOpts 
> -XX:+AlwaysPreTouch 
> -XX:ConcGCThreads=16 
> -XX:ParallelGCThreads=18 
> -XX:+HeapDumpOnOutOfMemoryError 
> -XX:HeapDumpPath=/mnt/solrdata1/log 
> -verbose:gc 
> -XX:+PrintHeapAtGC 
> -XX:+PrintGCDetails 
> -XX:+PrintGCDateStamps 
> -XX:+PrintGCTimeStamps 
> -XX:+PrintTenuringDistribution 
> -XX:+PrintGCApplicationStoppedTime 
> -Xloggc:/mnt/solrdata1/log/solr_gc.log
> It usually works fine, but recently we hit very long stop-the-world 
> young-generation GC pauses. Some snippets of the GC log are below:
> 2016-11-22T20:43:16.436+: 2942054.483: Total time for which application 
> threads were stopped: 0.0005510 seconds, Stopping threads took: 0.894 
> seconds
> 2016-11-22T20:43:16.463+: 2942054.509: Total time for which application 
> threads were stopped: 0.0029195 seconds, Stopping threads took: 0.804 
> seconds
> {Heap before GC invocations=2246 (full 0):
>  garbage-first heap   total 26673152K, used 4683965K [0x7f0c1000, 
> 0x7f0c108065c0, 0x7f141000)
>   region size 8192K, 162 young (1327104K), 17 survivors (139264K)
>  Metaspace   used 56487K, capacity 57092K, committed 58368K, reserved 
> 59392K
> 2016-11-22T20:43:16.555+: 2942054.602: [GC pause (G1 Evacuation Pause) 
> (young)
> Desired survivor size 88080384 bytes, new threshold 15 (max 15)
> - age   1:   28176280 bytes,   28176280 total
> - age   2:5632480 bytes,   33808760 total
> - age   3:9719072 bytes,   43527832 total
> - age   4:6219408 bytes,   49747240 total
> - age   5:4465544 bytes,   54212784 total
> - age   6:3417168 bytes,   57629952 total
> - age   7:5343072 bytes,   62973024 total
> - age   8:2784808 bytes,   65757832 total
> - age   9:6538056 bytes,   72295888 total
> - age  10:6368016 bytes,   78663904 total
> - age  11: 695216 bytes,   79359120 total
> , 97.2044320 secs]
>[Parallel Time: 19.8 ms, GC Workers: 18]
>   [GC Worker Start (ms): Min: 2942054602.1, Avg: 2942054604.6, Max: 
> 2942054612.7, Diff: 10.6]
>   [Ext Root Scanning (ms): Min: 0.0, Avg: 2.4, Max: 6.7, Diff: 6.7, Sum: 
> 43.5]
>   [Update RS (ms): Min: 0.0, Avg: 3.0, Max: 15.9, Diff: 15.9, Sum: 54.0]
>  [Processed Buffers: Min: 0, Avg: 10.7, Max: 39, Diff: 39, Sum: 192]
>   [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.6]
>   [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 
> 0.0]
>   [Object Copy (ms): Min: 0.1, Avg: 9.2, Max: 13.4, Diff: 13.3, Sum: 
> 165.9]
>   [Termination (ms): Min: 0.0, Avg: 2.5, Max: 2.7, Diff: 2.7, Sum: 44.1]
>  [Termination Attempts: Min: 1, Avg: 1.5, Max: 3, Diff: 2, Sum: 27]
>   [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.0, Sum: 
> 0.6]
>   [GC Worker Total (ms): Min: 9.0, Avg: 17.1, Max: 19.7, Diff: 10.6, Sum: 
> 308.7]
>   [GC Worker End (ms): Min: 2942054621.8, Avg: 2942054621.8, Max: 
> 2942054621.8, Diff: 0.0]
>[Code Root Fixup: 0.1 ms]
>[Code Root Purge: 0.0 ms]
>[Clear CT: 0.2 ms]
>[Other: 97184.3 ms]
>   [Choose CSet: 0.0 ms]
>   [Ref Proc: 8.5 ms]
>   [Ref Enq: 0.2 ms]
>   [Redirty Cards: 0.2 ms]
>   [Humongous Register: 0.1 ms]
>   [Humongous Reclaim: 0.1 ms]
>   [Free CSet: 0.4 ms]
>[Eden: 1160.0M(1160.0M)->0.0B(1200.0M) Survivors: 136.0M->168.0M Heap: 
> 4574.2M(25.4G)->3450.8M(26.8G)]
> Heap after GC invocations=2247 (full 0):
>  garbage-first heap   total 28049408K, used 3533601K [0x7f0c1000, 
> 0x7f0c10806b00, 0x7f141000)
>   region size 8192K, 21 young (172032K), 21 survivors (172032K)
>  Metaspace   used 56487K, capacity 57092K, committed 58368K, reserved 
> 59392K
> }
>  [Times: user=0.00 sys=94.28, real=97.19 secs] 
> 2016-11-22T20:44:53.760+: 2942151.806: Total time for which application 
> 

[jira] [Commented] (SOLR-9828) Very long young generation stop the world GC pause

2016-12-08 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15732269#comment-15732269
 ] 

Forest Soup commented on SOLR-9828:
---

Thanks, Shawn.

I'll use the mail thread to discuss this instead of this JIRA. 

Could you please comment on the questions in the mail thread? Thanks!

1. As you can see in the GC log, the long GC pause is not a full GC; it is a 
young-generation GC.
In our case full GC is fast, while young GC hit some long stop-the-world pauses. 
Do you have any comments on that? We usually expect full GC to cause the longer 
pauses, while young-generation GC should be fine. 

2. Would these JVM options make it better? 
-XX:+UnlockExperimentalVMOptions -XX:G1NewSizePercent=10 

2016-11-22T20:43:16.463+: 2942054.509: Total time for which application 
threads were stopped: 0.0029195 seconds, Stopping threads took: 0.804 
seconds 
{Heap before GC invocations=2246 (full 0): 
 garbage-first heap   total 26673152K, used 4683965K [0x7f0c1000, 
0x7f0c108065c0, 0x7f141000) 
  region size 8192K, 162 young (1327104K), 17 survivors (139264K) 
 Metaspace   used 56487K, capacity 57092K, committed 58368K, reserved 
59392K 
2016-11-22T20:43:16.555+: 2942054.602: [GC pause (G1 Evacuation Pause) 
(young) 
Desired survivor size 88080384 bytes, new threshold 15 (max 15) 
- age   1:   28176280 bytes,   28176280 total 
- age   2:5632480 bytes,   33808760 total 
- age   3:9719072 bytes,   43527832 total 
- age   4:6219408 bytes,   49747240 total 
- age   5:4465544 bytes,   54212784 total 
- age   6:3417168 bytes,   57629952 total 
- age   7:5343072 bytes,   62973024 total 
- age   8:2784808 bytes,   65757832 total 
- age   9:6538056 bytes,   72295888 total 
- age  10:6368016 bytes,   78663904 total 
- age  11: 695216 bytes,   79359120 total 
, 97.2044320 secs] 
   [Parallel Time: 19.8 ms, GC Workers: 18] 
  [GC Worker Start (ms): Min: 2942054602.1, Avg: 2942054604.6, Max: 
2942054612.7, Diff: 10.6] 
  [Ext Root Scanning (ms): Min: 0.0, Avg: 2.4, Max: 6.7, Diff: 6.7, Sum: 
43.5] 
  [Update RS (ms): Min: 0.0, Avg: 3.0, Max: 15.9, Diff: 15.9, Sum: 54.0] 
 [Processed Buffers: Min: 0, Avg: 10.7, Max: 39, Diff: 39, Sum: 192] 
  [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.6] 
  [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 
0.0] 
  [Object Copy (ms): Min: 0.1, Avg: 9.2, Max: 13.4, Diff: 13.3, Sum: 165.9] 
  [Termination (ms): Min: 0.0, Avg: 2.5, Max: 2.7, Diff: 2.7, Sum: 44.1] 
 [Termination Attempts: Min: 1, Avg: 1.5, Max: 3, Diff: 2, Sum: 27] 
  [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.0, Sum: 0.6] 
  [GC Worker Total (ms): Min: 9.0, Avg: 17.1, Max: 19.7, Diff: 10.6, Sum: 
308.7] 
  [GC Worker End (ms): Min: 2942054621.8, Avg: 2942054621.8, Max: 
2942054621.8, Diff: 0.0] 
   [Code Root Fixup: 0.1 ms] 
   [Code Root Purge: 0.0 ms] 
   [Clear CT: 0.2 ms] 
   [Other: 97184.3 ms] 
  [Choose CSet: 0.0 ms] 
  [Ref Proc: 8.5 ms] 
  [Ref Enq: 0.2 ms] 
  [Redirty Cards: 0.2 ms] 
  [Humongous Register: 0.1 ms] 
  [Humongous Reclaim: 0.1 ms] 
  [Free CSet: 0.4 ms] 
   [Eden: 1160.0M(1160.0M)->0.0B(1200.0M) Survivors: 136.0M->168.0M Heap: 
4574.2M(25.4G)->3450.8M(26.8G)] 
Heap after GC invocations=2247 (full 0): 
 garbage-first heap   total 28049408K, used 3533601K [0x7f0c1000, 
0x7f0c10806b00, 0x7f141000) 
  region size 8192K, 21 young (172032K), 21 survivors (172032K) 
 Metaspace   used 56487K, capacity 57092K, committed 58368K, reserved 
59392K 
} 
 [Times: user=0.00 sys=94.28, real=97.19 secs] 
2016-11-22T20:44:53.760+: 2942151.806: Total time for which application 
threads were stopped: 97.2053747 seconds, Stopping threads took: 0.0001373 
seconds

> Very long young generation stop the world GC pause 
> ---
>
> Key: SOLR-9828
> URL: https://issues.apache.org/jira/browse/SOLR-9828
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.3.2
> Environment: Linux Redhat 64bit
>Reporter: Forest Soup
>
> We are using Oracle JDK 8u92 (64-bit).
> The JVM memory-related options are:
> -Xms32768m 
> -Xmx32768m 
> -XX:+HeapDumpOnOutOfMemoryError 
> -XX:HeapDumpPath=/mnt/solrdata1/log 
> -XX:+UseG1GC 
> -XX:+PerfDisableSharedMem 
> -XX:+ParallelRefProcEnabled 
> -XX:G1HeapRegionSize=8m 
> -XX:MaxGCPauseMillis=100 
> -XX:InitiatingHeapOccupancyPercent=35 
> -XX:+AggressiveOpts 
> -XX:+AlwaysPreTouch 
> -XX:ConcGCThreads=16 
> -XX:ParallelGCThreads=18 
> -XX:+HeapDumpOnOutOfMemoryError 
> -XX:HeapDumpPath=/mnt/solrdata1/log 
> -verbose:gc 
> -XX:+PrintHeapAtGC 
> -XX:+PrintGCDetails 
> -XX:+PrintGCDateStamps 
> 

[jira] [Updated] (SOLR-9829) Solr cannot provide index service after a large GC pause but core state in ZK is still active

2016-12-08 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-9829:
--
Description: 
When Solr hits a large GC pause like 
https://issues.apache.org/jira/browse/SOLR-9828 , the collections on it cannot 
provide service and never come back until a restart. 

But in ZooKeeper, the cores on that server still show as active, and the server 
is still in live_nodes. 

Some /update requests got HTTP 500 due to "IndexWriter is closed". Some got 
HTTP 400 due to "possible analysis error", whose root cause is also 
"IndexWriter is closed"; we think those should return 500 instead (documented 
in https://issues.apache.org/jira/browse/SOLR-9825).

Our questions in this JIRA are:
1. Should Solr mark the cores as down in ZK when it cannot provide index service?
2. Is it possible for Solr to re-open the IndexWriter so it can provide index service again?

solr log snippets:
2016-11-22 20:47:37.274 ERROR (qtp2011912080-76) [c:collection12 s:shard1 
r:core_node1 x:collection12_shard1_replica1] o.a.s.c.SolrCore 
org.apache.solr.common.SolrException: Exception writing document id 
Q049dXMxYjMtbWFpbDg4L089bGxuX3VzMQ==20841350!270CE4F9C032EC26002580730061473C 
to the index; possible analysis error.
at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:167)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.processAdd(LanguageIdentifierUpdateProcessor.java:207)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.CloneFieldUpdateProcessorFactory$1.processAdd(CloneFieldUpdateProcessorFactory.java:231)
at 
org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:143)
at 
org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:113)
at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:76)
at 
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:672)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:463)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:235)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:199)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:499)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at 

[jira] [Commented] (SOLR-9829) Solr cannot provide index service after a large GC pause but core state in ZK is still active

2016-12-07 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15731274#comment-15731274
 ] 

Forest Soup commented on SOLR-9829:
---

Hi Erick, 

I'm sure the Solr node is still in the live_nodes list. 
The logs are from the Solr log, and the root cause I can see there is that the 
IndexWriter is closed.
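
In case it helps anyone reproduce the mismatch, below is a small, untested SolrJ 
sketch of my own (for illustration only) that prints what ZK reports for a 
collection's replicas versus the live_nodes list. The zkHost string is a 
placeholder; "collection12" is the collection from the log above.

import java.util.Set;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.cloud.ClusterState;
import org.apache.solr.common.cloud.DocCollection;
import org.apache.solr.common.cloud.Replica;
import org.apache.solr.common.cloud.Slice;

public class ZkStateCheck {
  public static void main(String[] args) throws Exception {
    // Placeholder ZooKeeper connect string; replace with the real ensemble.
    try (CloudSolrClient client = new CloudSolrClient("zk1:2181,zk2:2181,zk3:2181/solr")) {
      client.connect();
      ClusterState clusterState = client.getZkStateReader().getClusterState();
      Set<String> liveNodes = clusterState.getLiveNodes();
      System.out.println("live_nodes: " + liveNodes);
      DocCollection collection = clusterState.getCollection("collection12");
      for (Slice slice : collection.getSlices()) {
        for (Replica replica : slice.getReplicas()) {
          // The state printed here comes from ZK only; in our case it still
          // says "active" even though the IndexWriter on that node is closed.
          System.out.println(slice.getName() + "/" + replica.getName()
              + " state=" + replica.getState()
              + " live=" + liveNodes.contains(replica.getNodeName()));
        }
      }
    }
  }
}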

> Solr cannot provide index service after a large GC pause but core state in ZK 
> is still active
> -
>
> Key: SOLR-9829
> URL: https://issues.apache.org/jira/browse/SOLR-9829
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: update
>Affects Versions: 5.3.2
> Environment: Redhat enterprise server 64bit 
>Reporter: Forest Soup
>
> When Solr hits a large GC pause like 
> https://issues.apache.org/jira/browse/SOLR-9828 , the collections on it 
> cannot provide service and never come back until a restart. 
> But in ZooKeeper, the cores on that server still show as active. 
> Some /update requests got HTTP 500 due to "IndexWriter is closed". Some got 
> HTTP 400 due to "possible analysis error", whose root cause is also 
> "IndexWriter is closed"; we think those should return 500 instead 
> (documented in https://issues.apache.org/jira/browse/SOLR-9825).
> Our questions in this JIRA are:
> 1. Should Solr mark the cores as down in ZK when it cannot provide index service?
> 2. Is it possible for Solr to re-open the IndexWriter so it can provide index service again?
> solr log snippets:
> 2016-11-22 20:47:37.274 ERROR (qtp2011912080-76) [c:collection12 s:shard1 
> r:core_node1 x:collection12_shard1_replica1] o.a.s.c.SolrCore 
> org.apache.solr.common.SolrException: Exception writing document id 
> Q049dXMxYjMtbWFpbDg4L089bGxuX3VzMQ==20841350!270CE4F9C032EC26002580730061473C 
> to the index; possible analysis error.
>   at 
> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:167)
>   at 
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
>   at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
>   at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>   at 
> org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.processAdd(LanguageIdentifierUpdateProcessor.java:207)
>   at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>   at 
> org.apache.solr.update.processor.CloneFieldUpdateProcessorFactory$1.processAdd(CloneFieldUpdateProcessorFactory.java:231)
>   at 
> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:143)
>   at 
> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:113)
>   at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:76)
>   at 
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98)
>   at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
>   at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:672)
>   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:463)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:235)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:199)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
>   at 
> 

[jira] [Updated] (SOLR-9829) Solr cannot provide index service after a large GC pause but state in ZK is still active

2016-12-06 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-9829:
--
Summary: Solr cannot provide index service after a large GC pause but state 
in ZK is still active  (was: Solr cannot provide index service after a large GC 
pause)

> Solr cannot provide index service after a large GC pause but state in ZK is 
> still active
> 
>
> Key: SOLR-9829
> URL: https://issues.apache.org/jira/browse/SOLR-9829
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: update
>Affects Versions: 5.3.2
> Environment: Redhat enterprise server 64bit 
>Reporter: Forest Soup
>
> When Solr hits a large GC pause like 
> https://issues.apache.org/jira/browse/SOLR-9828 , the collections on it 
> cannot provide service and never come back until a restart. 
> But in ZooKeeper, the cores on that server still show as active. 
> Some /update requests got HTTP 500 due to "IndexWriter is closed". Some got 
> HTTP 400 due to "possible analysis error", whose root cause is also 
> "IndexWriter is closed"; we think those should return 500 instead 
> (documented in https://issues.apache.org/jira/browse/SOLR-9825).
> Our questions in this JIRA are:
> 1. Should Solr mark it as down when it cannot provide index service?
> 2. Is it possible for Solr to re-open the IndexWriter so it can provide index service again?
> solr log snippets:
> 2016-11-22 20:47:37.274 ERROR (qtp2011912080-76) [c:collection12 s:shard1 
> r:core_node1 x:collection12_shard1_replica1] o.a.s.c.SolrCore 
> org.apache.solr.common.SolrException: Exception writing document id 
> Q049dXMxYjMtbWFpbDg4L089bGxuX3VzMQ==20841350!270CE4F9C032EC26002580730061473C 
> to the index; possible analysis error.
>   at 
> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:167)
>   at 
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
>   at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
>   at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>   at 
> org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.processAdd(LanguageIdentifierUpdateProcessor.java:207)
>   at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>   at 
> org.apache.solr.update.processor.CloneFieldUpdateProcessorFactory$1.processAdd(CloneFieldUpdateProcessorFactory.java:231)
>   at 
> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:143)
>   at 
> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:113)
>   at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:76)
>   at 
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98)
>   at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
>   at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:672)
>   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:463)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:235)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:199)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> 

[jira] [Updated] (SOLR-9829) Solr cannot provide index service after a large GC pause but core state in ZK is still active

2016-12-06 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-9829:
--
Description: 
When Solr hits a large GC pause like 
https://issues.apache.org/jira/browse/SOLR-9828 , the collections on it cannot 
provide service and never come back until a restart. 

But in ZooKeeper, the cores on that server still show as active. 

Some /update requests got HTTP 500 due to "IndexWriter is closed". Some got 
HTTP 400 due to "possible analysis error", whose root cause is also 
"IndexWriter is closed"; we think those should return 500 instead (documented 
in https://issues.apache.org/jira/browse/SOLR-9825).

Our questions in this JIRA are:
1. Should Solr mark the cores as down when it cannot provide index service?
2. Is it possible for Solr to re-open the IndexWriter so it can provide index service again?

solr log snippets:
2016-11-22 20:47:37.274 ERROR (qtp2011912080-76) [c:collection12 s:shard1 
r:core_node1 x:collection12_shard1_replica1] o.a.s.c.SolrCore 
org.apache.solr.common.SolrException: Exception writing document id 
Q049dXMxYjMtbWFpbDg4L089bGxuX3VzMQ==20841350!270CE4F9C032EC26002580730061473C 
to the index; possible analysis error.
at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:167)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.processAdd(LanguageIdentifierUpdateProcessor.java:207)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.CloneFieldUpdateProcessorFactory$1.processAdd(CloneFieldUpdateProcessorFactory.java:231)
at 
org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:143)
at 
org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:113)
at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:76)
at 
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:672)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:463)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:235)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:199)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:499)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at 
org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at 

[jira] [Updated] (SOLR-9829) Solr cannot provide index service after a large GC pause but core state in ZK is still active

2016-12-06 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-9829:
--
Description: 
When Solr hits a large GC pause like 
https://issues.apache.org/jira/browse/SOLR-9828 , the collections on it cannot 
provide service and never come back until a restart. 

But in ZooKeeper, the cores on that server still show as active. 

Some /update requests got HTTP 500 due to "IndexWriter is closed". Some got 
HTTP 400 due to "possible analysis error", whose root cause is also 
"IndexWriter is closed"; we think those should return 500 instead (documented 
in https://issues.apache.org/jira/browse/SOLR-9825).

Our questions in this JIRA are:
1. Should Solr mark the cores as down in ZK when it cannot provide index service?
2. Is it possible for Solr to re-open the IndexWriter so it can provide index service again?

solr log snippets:
2016-11-22 20:47:37.274 ERROR (qtp2011912080-76) [c:collection12 s:shard1 
r:core_node1 x:collection12_shard1_replica1] o.a.s.c.SolrCore 
org.apache.solr.common.SolrException: Exception writing document id 
Q049dXMxYjMtbWFpbDg4L089bGxuX3VzMQ==20841350!270CE4F9C032EC26002580730061473C 
to the index; possible analysis error.
at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:167)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.processAdd(LanguageIdentifierUpdateProcessor.java:207)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.CloneFieldUpdateProcessorFactory$1.processAdd(CloneFieldUpdateProcessorFactory.java:231)
at 
org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:143)
at 
org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:113)
at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:76)
at 
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:672)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:463)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:235)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:199)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:499)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at 
org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at 

[jira] [Updated] (SOLR-9829) Solr cannot provide index service after a large GC pause but core state in ZK is still active

2016-12-06 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-9829:
--
Summary: Solr cannot provide index service after a large GC pause but core 
state in ZK is still active  (was: Solr cannot provide index service after a 
large GC pause but state in ZK is still active)

> Solr cannot provide index service after a large GC pause but core state in ZK 
> is still active
> -
>
> Key: SOLR-9829
> URL: https://issues.apache.org/jira/browse/SOLR-9829
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: update
>Affects Versions: 5.3.2
> Environment: Redhat enterprise server 64bit 
>Reporter: Forest Soup
>
> When Solr hits a large GC pause like 
> https://issues.apache.org/jira/browse/SOLR-9828 , the collections on it 
> cannot provide service and never come back until a restart. 
> But in ZooKeeper, the cores on that server still show as active. 
> Some /update requests got HTTP 500 due to "IndexWriter is closed". Some got 
> HTTP 400 due to "possible analysis error", whose root cause is also 
> "IndexWriter is closed"; we think those should return 500 instead 
> (documented in https://issues.apache.org/jira/browse/SOLR-9825).
> Our questions in this JIRA are:
> 1. Should Solr mark it as down when it cannot provide index service?
> 2. Is it possible for Solr to re-open the IndexWriter so it can provide index service again?
> solr log snippets:
> 2016-11-22 20:47:37.274 ERROR (qtp2011912080-76) [c:collection12 s:shard1 
> r:core_node1 x:collection12_shard1_replica1] o.a.s.c.SolrCore 
> org.apache.solr.common.SolrException: Exception writing document id 
> Q049dXMxYjMtbWFpbDg4L089bGxuX3VzMQ==20841350!270CE4F9C032EC26002580730061473C 
> to the index; possible analysis error.
>   at 
> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:167)
>   at 
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
>   at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
>   at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>   at 
> org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.processAdd(LanguageIdentifierUpdateProcessor.java:207)
>   at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>   at 
> org.apache.solr.update.processor.CloneFieldUpdateProcessorFactory$1.processAdd(CloneFieldUpdateProcessorFactory.java:231)
>   at 
> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:143)
>   at 
> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:113)
>   at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:76)
>   at 
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98)
>   at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
>   at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:672)
>   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:463)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:235)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:199)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
>   at 
> 

[jira] [Created] (SOLR-9829) Solr cannot provide index service after a large GC pause

2016-12-05 Thread Forest Soup (JIRA)
Forest Soup created SOLR-9829:
-

 Summary: Solr cannot provide index service after a large GC pause
 Key: SOLR-9829
 URL: https://issues.apache.org/jira/browse/SOLR-9829
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
  Components: update
Affects Versions: 5.3.2
 Environment: Redhat enterprise server 64bit 
Reporter: Forest Soup


When Solr hits a large GC pause like 
https://issues.apache.org/jira/browse/SOLR-9828 , the collections on it cannot 
provide service and never come back until a restart. 

But in ZooKeeper, the cores on that server still show as active. 

Some /update requests got HTTP 500 due to "IndexWriter is closed". Some got 
HTTP 400 due to "possible analysis error", whose root cause is also 
"IndexWriter is closed"; we think those should return 500 instead (documented 
in https://issues.apache.org/jira/browse/SOLR-9825).

Our questions in this JIRA are:
1. Should Solr mark it as down when it cannot provide index service?
2. Is it possible for Solr to re-open the IndexWriter so it can provide index service again?

solr log snippets:
2016-11-22 20:47:37.274 ERROR (qtp2011912080-76) [c:collection12 s:shard1 
r:core_node1 x:collection12_shard1_replica1] o.a.s.c.SolrCore 
org.apache.solr.common.SolrException: Exception writing document id 
Q049dXMxYjMtbWFpbDg4L089bGxuX3VzMQ==20841350!270CE4F9C032EC26002580730061473C 
to the index; possible analysis error.
at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:167)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.processAdd(LanguageIdentifierUpdateProcessor.java:207)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.CloneFieldUpdateProcessorFactory$1.processAdd(CloneFieldUpdateProcessorFactory.java:231)
at 
org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:143)
at 
org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:113)
at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:76)
at 
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:672)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:463)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:235)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:199)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at 

[jira] [Updated] (SOLR-9825) Solr should not return HTTP 400 for some cases

2016-12-05 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-9825:
--
Affects Version/s: (was: 5.3)
   5.3.2

> Solr should not return HTTP 400 for some cases
> --
>
> Key: SOLR-9825
> URL: https://issues.apache.org/jira/browse/SOLR-9825
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.3.2
>Reporter: Forest Soup
>
> In some cases when Solr is handling a request, it should not always return 
> HTTP 400. We have hit several such cases; here are the two most recent:
> Case 1: When adding a doc, if a runtime error happens, even when it is a 
> Solr-internal issue, Solr returns HTTP 400, which confuses the client. The 
> request itself is fine; the real problem is that the IndexWriter is closed. 
> The exception stack is:
> 2016-11-22 21:23:32.858 ERROR (qtp2011912080-83) [c:collection12 s:shard1 
> r:core_node1 x:collection12_shard1_replica1] o.a.s.c.SolrCore 
> org.apache.solr.common.SolrException: Exception writing document id 
> Q049dXMxYjMtbWFpbDg4L089bGxuX3VzMQ==20824042!8918AB024CF638F685257DDC00074D78 
> to the index; possible analysis error.
>   at 
> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:167)
>   at 
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
>   at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
>   at 
> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
>   at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>   at 
> org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.processAdd(LanguageIdentifierUpdateProcessor.java:207)
>   at 
> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
>   at 
> org.apache.solr.update.processor.CloneFieldUpdateProcessorFactory$1.processAdd(CloneFieldUpdateProcessorFactory.java:231)
>   at 
> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:143)
>   at 
> org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:113)
>   at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:76)
>   at 
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98)
>   at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>   at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
>   at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:672)
>   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:463)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:235)
>   at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:199)
>   at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>   at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
>   at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>   at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
>   at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>   at 
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
>   at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
>   at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
>   at org.eclipse.jetty.server.Server.handle(Server.java:499)
>   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
>   at 
> 

[jira] [Updated] (SOLR-9828) Very long young generation stop the world GC pause

2016-12-05 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-9828:
--
Description: 
We are using Oracle JDK 8u92 (64-bit).
The JVM memory-related options are:
-Xms32768m 
-Xmx32768m 
-XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/mnt/solrdata1/log 
-XX:+UseG1GC 
-XX:+PerfDisableSharedMem 
-XX:+ParallelRefProcEnabled 
-XX:G1HeapRegionSize=8m 
-XX:MaxGCPauseMillis=100 
-XX:InitiatingHeapOccupancyPercent=35 
-XX:+AggressiveOpts 
-XX:+AlwaysPreTouch 
-XX:ConcGCThreads=16 
-XX:ParallelGCThreads=18 
-XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/mnt/solrdata1/log 
-verbose:gc 
-XX:+PrintHeapAtGC 
-XX:+PrintGCDetails 
-XX:+PrintGCDateStamps 
-XX:+PrintGCTimeStamps 
-XX:+PrintTenuringDistribution 
-XX:+PrintGCApplicationStoppedTime 
-Xloggc:/mnt/solrdata1/log/solr_gc.log

It usually works fine, but recently we hit very long stop-the-world 
young-generation GC pauses. Some snippets of the GC log are below:
2016-11-22T20:43:16.436+: 2942054.483: Total time for which application 
threads were stopped: 0.0005510 seconds, Stopping threads took: 0.894 
seconds
2016-11-22T20:43:16.463+: 2942054.509: Total time for which application 
threads were stopped: 0.0029195 seconds, Stopping threads took: 0.804 
seconds
{Heap before GC invocations=2246 (full 0):
 garbage-first heap   total 26673152K, used 4683965K [0x7f0c1000, 
0x7f0c108065c0, 0x7f141000)
  region size 8192K, 162 young (1327104K), 17 survivors (139264K)
 Metaspace   used 56487K, capacity 57092K, committed 58368K, reserved 59392K
2016-11-22T20:43:16.555+: 2942054.602: [GC pause (G1 Evacuation Pause) 
(young)
Desired survivor size 88080384 bytes, new threshold 15 (max 15)
- age   1:   28176280 bytes,   28176280 total
- age   2:5632480 bytes,   33808760 total
- age   3:9719072 bytes,   43527832 total
- age   4:6219408 bytes,   49747240 total
- age   5:4465544 bytes,   54212784 total
- age   6:3417168 bytes,   57629952 total
- age   7:5343072 bytes,   62973024 total
- age   8:2784808 bytes,   65757832 total
- age   9:6538056 bytes,   72295888 total
- age  10:6368016 bytes,   78663904 total
- age  11: 695216 bytes,   79359120 total
, 97.2044320 secs]
   [Parallel Time: 19.8 ms, GC Workers: 18]
  [GC Worker Start (ms): Min: 2942054602.1, Avg: 2942054604.6, Max: 
2942054612.7, Diff: 10.6]
  [Ext Root Scanning (ms): Min: 0.0, Avg: 2.4, Max: 6.7, Diff: 6.7, Sum: 
43.5]
  [Update RS (ms): Min: 0.0, Avg: 3.0, Max: 15.9, Diff: 15.9, Sum: 54.0]
 [Processed Buffers: Min: 0, Avg: 10.7, Max: 39, Diff: 39, Sum: 192]
  [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.6]
  [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 
0.0]
  [Object Copy (ms): Min: 0.1, Avg: 9.2, Max: 13.4, Diff: 13.3, Sum: 165.9]
  [Termination (ms): Min: 0.0, Avg: 2.5, Max: 2.7, Diff: 2.7, Sum: 44.1]
 [Termination Attempts: Min: 1, Avg: 1.5, Max: 3, Diff: 2, Sum: 27]
  [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.0, Sum: 0.6]
  [GC Worker Total (ms): Min: 9.0, Avg: 17.1, Max: 19.7, Diff: 10.6, Sum: 
308.7]
  [GC Worker End (ms): Min: 2942054621.8, Avg: 2942054621.8, Max: 
2942054621.8, Diff: 0.0]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 0.2 ms]
   [Other: 97184.3 ms]
  [Choose CSet: 0.0 ms]
  [Ref Proc: 8.5 ms]
  [Ref Enq: 0.2 ms]
  [Redirty Cards: 0.2 ms]
  [Humongous Register: 0.1 ms]
  [Humongous Reclaim: 0.1 ms]
  [Free CSet: 0.4 ms]
   [Eden: 1160.0M(1160.0M)->0.0B(1200.0M) Survivors: 136.0M->168.0M Heap: 
4574.2M(25.4G)->3450.8M(26.8G)]
Heap after GC invocations=2247 (full 0):
 garbage-first heap   total 28049408K, used 3533601K [0x7f0c1000, 
0x7f0c10806b00, 0x7f141000)
  region size 8192K, 21 young (172032K), 21 survivors (172032K)
 Metaspace   used 56487K, capacity 57092K, committed 58368K, reserved 59392K
}
 [Times: user=0.00 sys=94.28, real=97.19 secs] 
2016-11-22T20:44:53.760+: 2942151.806: Total time for which application 
threads were stopped: 97.2053747 seconds, Stopping threads took: 0.0001373 
seconds
2016-11-22T20:44:53.762+: 2942151.809: Total time for which application 
threads were stopped: 0.0008138 seconds, Stopping threads took: 0.0001258 
seconds

CPU reached nearly 100% during the GC.
The load was normal at that time according to the stats of the Solr 
update/select/delete handlers and the Jetty request log.
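As a side note, here is a minimal sketch (plain Java, assuming the 
-XX:+PrintGCApplicationStoppedTime log format shown above and an arbitrary 10-second 
threshold) that scans solr_gc.log for long safepoint pauses like the 97-second one above:
{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;

/** Scans a GC log for long "application threads were stopped" entries. */
public class GcPauseScanner {
    // Matches e.g. "Total time for which application threads were stopped: 97.2053747 seconds"
    private static final Pattern STOPPED = Pattern.compile(
        "Total time for which application threads were stopped: ([0-9.]+) seconds");

    public static void main(String[] args) throws IOException {
        String log = args.length > 0 ? args[0] : "/mnt/solrdata1/log/solr_gc.log";
        double thresholdSec = 10.0; // arbitrary reporting threshold
        try (Stream<String> lines = Files.lines(Paths.get(log))) {
            lines.forEach(line -> {
                Matcher m = STOPPED.matcher(line);
                if (m.find() && Double.parseDouble(m.group(1)) > thresholdSec) {
                    System.out.println(line); // long safepoint pause, e.g. the 97s pause above
                }
            });
        }
    }
}
{code}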



  was:
We are using oracle jdk8u92 64bit.
The jvm memory related options:
-Xms32768m 
-Xmx32768m 
-XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/mnt/solrdata1/log 
-XX:+UseG1GC 
-XX:+PerfDisableSharedMem 
-XX:+ParallelRefProcEnabled 
-XX:G1HeapRegionSize=8m 
-XX:MaxGCPauseMillis=100 
-XX:InitiatingHeapOccupancyPercent=35 
-XX:+AggressiveOpts 
-XX:+AlwaysPreTouch 

[jira] [Created] (SOLR-9828) Very long young generation stop the world GC pause

2016-12-05 Thread Forest Soup (JIRA)
Forest Soup created SOLR-9828:
-

 Summary: Very long young generation stop the world GC pause 
 Key: SOLR-9828
 URL: https://issues.apache.org/jira/browse/SOLR-9828
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 5.3.2
 Environment: Linux Redhat 64bit
Reporter: Forest Soup


We are using oracle jdk8u92 64bit.
The jvm memory related options:
-Xms32768m 
-Xmx32768m 
-XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/mnt/solrdata1/log 
-XX:+UseG1GC 
-XX:+PerfDisableSharedMem 
-XX:+ParallelRefProcEnabled 
-XX:G1HeapRegionSize=8m 
-XX:MaxGCPauseMillis=100 
-XX:InitiatingHeapOccupancyPercent=35 
-XX:+AggressiveOpts 
-XX:+AlwaysPreTouch 
-XX:ConcGCThreads=16 
-XX:ParallelGCThreads=18 
-XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/mnt/solrdata1/log 
-verbose:gc 
-XX:+PrintHeapAtGC 
-XX:+PrintGCDetails 
-XX:+PrintGCDateStamps 
-XX:+PrintGCTimeStamps 
-XX:+PrintTenuringDistribution 
-XX:+PrintGCApplicationStoppedTime 
-Xloggc:/mnt/solrdata1/log/solr_gc.log

It usually works fine, but recently we have hit very long stop-the-world young-generation 
GC pauses. Some snippets of the GC log are below:
2016-11-22T20:43:16.436+: 2942054.483: Total time for which application 
threads were stopped: 0.0005510 seconds, Stopping threads took: 0.894 
seconds
2016-11-22T20:43:16.463+: 2942054.509: Total time for which application 
threads were stopped: 0.0029195 seconds, Stopping threads took: 0.804 
seconds
{Heap before GC invocations=2246 (full 0):
 garbage-first heap   total 26673152K, used 4683965K [0x7f0c1000, 
0x7f0c108065c0, 0x7f141000)
  region size 8192K, 162 young (1327104K), 17 survivors (139264K)
 Metaspace   used 56487K, capacity 57092K, committed 58368K, reserved 59392K
2016-11-22T20:43:16.555+: 2942054.602: [GC pause (G1 Evacuation Pause) 
(young)
Desired survivor size 88080384 bytes, new threshold 15 (max 15)
- age   1:   28176280 bytes,   28176280 total
- age   2:5632480 bytes,   33808760 total
- age   3:9719072 bytes,   43527832 total
- age   4:6219408 bytes,   49747240 total
- age   5:4465544 bytes,   54212784 total
- age   6:3417168 bytes,   57629952 total
- age   7:5343072 bytes,   62973024 total
- age   8:2784808 bytes,   65757832 total
- age   9:6538056 bytes,   72295888 total
- age  10:6368016 bytes,   78663904 total
- age  11: 695216 bytes,   79359120 total
, 97.2044320 secs]
   [Parallel Time: 19.8 ms, GC Workers: 18]
  [GC Worker Start (ms): Min: 2942054602.1, Avg: 2942054604.6, Max: 
2942054612.7, Diff: 10.6]
  [Ext Root Scanning (ms): Min: 0.0, Avg: 2.4, Max: 6.7, Diff: 6.7, Sum: 
43.5]
  [Update RS (ms): Min: 0.0, Avg: 3.0, Max: 15.9, Diff: 15.9, Sum: 54.0]
 [Processed Buffers: Min: 0, Avg: 10.7, Max: 39, Diff: 39, Sum: 192]
  [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.6]
  [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 
0.0]
  [Object Copy (ms): Min: 0.1, Avg: 9.2, Max: 13.4, Diff: 13.3, Sum: 165.9]
  [Termination (ms): Min: 0.0, Avg: 2.5, Max: 2.7, Diff: 2.7, Sum: 44.1]
 [Termination Attempts: Min: 1, Avg: 1.5, Max: 3, Diff: 2, Sum: 27]
  [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.0, Sum: 0.6]
  [GC Worker Total (ms): Min: 9.0, Avg: 17.1, Max: 19.7, Diff: 10.6, Sum: 
308.7]
  [GC Worker End (ms): Min: 2942054621.8, Avg: 2942054621.8, Max: 
2942054621.8, Diff: 0.0]
   [Code Root Fixup: 0.1 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 0.2 ms]
   [Other: 97184.3 ms]
  [Choose CSet: 0.0 ms]
  [Ref Proc: 8.5 ms]
  [Ref Enq: 0.2 ms]
  [Redirty Cards: 0.2 ms]
  [Humongous Register: 0.1 ms]
  [Humongous Reclaim: 0.1 ms]
  [Free CSet: 0.4 ms]
   [Eden: 1160.0M(1160.0M)->0.0B(1200.0M) Survivors: 136.0M->168.0M Heap: 
4574.2M(25.4G)->3450.8M(26.8G)]
Heap after GC invocations=2247 (full 0):
 garbage-first heap   total 28049408K, used 3533601K [0x7f0c1000, 
0x7f0c10806b00, 0x7f141000)
  region size 8192K, 21 young (172032K), 21 survivors (172032K)
 Metaspace   used 56487K, capacity 57092K, committed 58368K, reserved 59392K
}
 [Times: user=0.00 sys=94.28, real=97.19 secs] 
2016-11-22T20:44:53.760+: 2942151.806: Total time for which application 
threads were stopped: 97.2053747 seconds, Stopping threads took: 0.0001373 
seconds
2016-11-22T20:44:53.762+: 2942151.809: Total time for which application 
threads were stopped: 0.0008138 seconds, Stopping threads took: 0.0001258 
seconds

CPU reached nearly 100% during the GC.
The load was not visibly high at that time.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For 

[jira] [Created] (SOLR-9825) Solr should not return HTTP 400 for some cases

2016-12-04 Thread Forest Soup (JIRA)
Forest Soup created SOLR-9825:
-

 Summary: Solr should not return HTTP 400 for some cases
 Key: SOLR-9825
 URL: https://issues.apache.org/jira/browse/SOLR-9825
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 5.3
Reporter: Forest Soup


In some cases, when Solr handles a request, it should not return HTTP 400. We have met 
several such cases; here are the two most recent:

Case 1: When adding a doc and a runtime error happens, even when it is a Solr-internal 
issue, Solr returns HTTP 400, which confuses the client. The request itself is fine; it 
fails because the IndexWriter is closed.
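For illustration, a minimal client-side sketch (recent SolrJ; URL, collection and field 
names are placeholders) showing that the only signal the client gets is the HTTP status 
on the SolrException, so a 400 caused by a closed IndexWriter cannot be told apart from a 
genuinely bad request; the server-side exception stack for this case follows the sketch:
{code}
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrException;
import org.apache.solr.common.SolrInputDocument;

public class AddWithStatusCheck {
    public static void main(String[] args) throws Exception {
        String url = "http://localhost:8983/solr/collection12"; // placeholder core URL
        try (SolrClient client = new HttpSolrClient.Builder(url).build()) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "example-id"); // placeholder document
            try {
                client.add(doc);
                client.commit();
            } catch (SolrException e) {
                // 4xx is supposed to mean "the request is bad", 5xx "the server failed".
                // With the behaviour described above, a closed IndexWriter also surfaces
                // as 400, so the client cannot safely treat 400 as "do not retry".
                System.err.println("add failed with HTTP " + e.code() + ": " + e.getMessage());
            }
        }
    }
}
{code}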

The exception stack is:
2016-11-22 21:23:32.858 ERROR (qtp2011912080-83) [c:collection12 s:shard1 
r:core_node1 x:collection12_shard1_replica1] o.a.s.c.SolrCore 
org.apache.solr.common.SolrException: Exception writing document id 
Q049dXMxYjMtbWFpbDg4L089bGxuX3VzMQ==20824042!8918AB024CF638F685257DDC00074D78 
to the index; possible analysis error.
at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:167)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:955)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1110)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:706)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.processAdd(LanguageIdentifierUpdateProcessor.java:207)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.CloneFieldUpdateProcessorFactory$1.processAdd(CloneFieldUpdateProcessorFactory.java:231)
at 
org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:143)
at 
org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:113)
at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:76)
at 
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:672)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:463)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:235)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:199)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:499)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at 
org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at 

[jira] [Updated] (SOLR-9741) Solr has a CPU% spike when indexing a batch of data

2016-11-08 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-9741:
--
Description: 
When we run a batch of index and search operations against SolrCloud v5.3.2, we usually 
see a CPU% spike lasting about 10 minutes.
We have 5 physical servers, each running 2 Solr instances on different ports (8983 and 
8984); all 8983 instances form one SolrCloud and all 8984 instances form another.

You can see the chart in the attached file screenshot-1.png.
The thread dumps are in the attached file threads.zip.

During the spike, the thread dump shows most of the threads with the call stack below:
"qtp634210724-4759" #4759 prio=5 os_prio=0 tid=0x7fb32803e000 nid=0x64e7 
runnable [0x7fb3ef1ef000]
   java.lang.Thread.State: RUNNABLE
at 
java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss(ThreadLocal.java:444)
at java.lang.ThreadLocal$ThreadLocalMap.getEntry(ThreadLocal.java:419)
at java.lang.ThreadLocal$ThreadLocalMap.access$000(ThreadLocal.java:298)
at java.lang.ThreadLocal.get(ThreadLocal.java:163)
at 
org.apache.solr.search.SolrQueryTimeoutImpl.get(SolrQueryTimeoutImpl.java:49)
at 
org.apache.solr.search.SolrQueryTimeoutImpl.shouldExit(SolrQueryTimeoutImpl.java:57)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.checkAndThrow(ExitableDirectoryReader.java:165)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.(ExitableDirectoryReader.java:157)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTerms.iterator(ExitableDirectoryReader.java:141)
at org.apache.lucene.index.TermContext.build(TermContext.java:93)
at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:192)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at 
org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:838)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
at 
org.apache.solr.search.Grouping.searchWithTimeLimiter(Grouping.java:456)
at org.apache.solr.search.Grouping.execute(Grouping.java:370)
at 
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:496)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:277)
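For context, the SolrQueryTimeoutImpl frames above amount to a per-thread deadline kept 
in a ThreadLocal that is polled from the terms-enumeration hot loop; a minimal sketch of 
that pattern (not the actual Solr source) is:
{code}
/**
 * Minimal sketch of a ThreadLocal-based query deadline, in the spirit of the
 * SolrQueryTimeoutImpl frames in the stack above (not the actual Solr source).
 */
public class QueryDeadline {
    // One absolute deadline (nanoTime) per request thread; null means "no timeout set".
    private static final ThreadLocal<Long> DEADLINE_NANOS = new ThreadLocal<>();

    public static void start(long timeoutMs) {
        DEADLINE_NANOS.set(System.nanoTime() + timeoutMs * 1_000_000L);
    }

    public static void reset() {
        DEADLINE_NANOS.remove();
    }

    /** Called from hot loops such as terms enumeration; every call pays a ThreadLocal lookup. */
    public static boolean shouldExit() {
        Long deadline = DEADLINE_NANOS.get();
        return deadline != null && System.nanoTime() > deadline;
    }
}
{code}
Because shouldExit() is called from tight loops on every query thread, the ThreadLocal 
lookup itself can show up prominently in thread dumps, which is consistent with the 
getEntryAfterMiss frames above.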
 

  was:
When we doing a batch of index and search operations to SolrCloud v5.3.2, we 
usually met a CPU% spike lasting about 10 min. 
You can see the chart in the attach file screenshot-1.png.

During the spike, the thread dump shows most of the threads are with the call 
stacks below:
"qtp634210724-4759" #4759 prio=5 os_prio=0 tid=0x7fb32803e000 nid=0x64e7 
runnable [0x7fb3ef1ef000]
   java.lang.Thread.State: RUNNABLE
at 
java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss(ThreadLocal.java:444)
at java.lang.ThreadLocal$ThreadLocalMap.getEntry(ThreadLocal.java:419)
at java.lang.ThreadLocal$ThreadLocalMap.access$000(ThreadLocal.java:298)
at java.lang.ThreadLocal.get(ThreadLocal.java:163)
at 
org.apache.solr.search.SolrQueryTimeoutImpl.get(SolrQueryTimeoutImpl.java:49)
at 
org.apache.solr.search.SolrQueryTimeoutImpl.shouldExit(SolrQueryTimeoutImpl.java:57)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.checkAndThrow(ExitableDirectoryReader.java:165)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.(ExitableDirectoryReader.java:157)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTerms.iterator(ExitableDirectoryReader.java:141)
at org.apache.lucene.index.TermContext.build(TermContext.java:93)
at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:192)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
  

[jira] [Updated] (SOLR-9741) Solr has a CPU% spike when indexing a batch of data

2016-11-08 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-9741:
--
Attachment: threads.zip

> Solr has a CPU% spike when indexing a batch of data
> ---
>
> Key: SOLR-9741
> URL: https://issues.apache.org/jira/browse/SOLR-9741
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.3.2
> Environment: Linux 64bit
>Reporter: Forest Soup
> Attachments: screenshot-1.png, threads.zip
>
>
> When we doing a batch of index and search operations to SolrCloud v5.3.2, we 
> usually met a CPU% spike lasting about 10 min. 
> You can see the chart in the attach file screenshot-1.png.
> During the spike, the thread dump shows most of the threads are with the call 
> stacks below:
> "qtp634210724-4759" #4759 prio=5 os_prio=0 tid=0x7fb32803e000 nid=0x64e7 
> runnable [0x7fb3ef1ef000]
>java.lang.Thread.State: RUNNABLE
> at 
> java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss(ThreadLocal.java:444)
> at java.lang.ThreadLocal$ThreadLocalMap.getEntry(ThreadLocal.java:419)
> at 
> java.lang.ThreadLocal$ThreadLocalMap.access$000(ThreadLocal.java:298)
> at java.lang.ThreadLocal.get(ThreadLocal.java:163)
> at 
> org.apache.solr.search.SolrQueryTimeoutImpl.get(SolrQueryTimeoutImpl.java:49)
> at 
> org.apache.solr.search.SolrQueryTimeoutImpl.shouldExit(SolrQueryTimeoutImpl.java:57)
> at 
> org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.checkAndThrow(ExitableDirectoryReader.java:165)
> at 
> org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.(ExitableDirectoryReader.java:157)
> at 
> org.apache.lucene.index.ExitableDirectoryReader$ExitableTerms.iterator(ExitableDirectoryReader.java:141)
> at org.apache.lucene.index.TermContext.build(TermContext.java:93)
> at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:192)
> at 
> org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
> at 
> org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
> at 
> org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
> at 
> org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
> at 
> org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
> at 
> org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
> at 
> org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
> at 
> org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
> at 
> org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
> at 
> org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
> at 
> org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:838)
> at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
> at 
> org.apache.solr.search.Grouping.searchWithTimeLimiter(Grouping.java:456)
> at org.apache.solr.search.Grouping.execute(Grouping.java:370)
> at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:496)
> at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:277)
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-9741) Solr has a CPU% spike when indexing a batch of data

2016-11-08 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-9741:
--
Description: 
When we run a batch of index and search operations against SolrCloud v5.3.2, we usually 
see a CPU% spike lasting about 10 minutes.
You can see the chart in the attached file screenshot-1.png.

During the spike, the thread dump shows most of the threads with the call stack below:
"qtp634210724-4759" #4759 prio=5 os_prio=0 tid=0x7fb32803e000 nid=0x64e7 
runnable [0x7fb3ef1ef000]
   java.lang.Thread.State: RUNNABLE
at 
java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss(ThreadLocal.java:444)
at java.lang.ThreadLocal$ThreadLocalMap.getEntry(ThreadLocal.java:419)
at java.lang.ThreadLocal$ThreadLocalMap.access$000(ThreadLocal.java:298)
at java.lang.ThreadLocal.get(ThreadLocal.java:163)
at 
org.apache.solr.search.SolrQueryTimeoutImpl.get(SolrQueryTimeoutImpl.java:49)
at 
org.apache.solr.search.SolrQueryTimeoutImpl.shouldExit(SolrQueryTimeoutImpl.java:57)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.checkAndThrow(ExitableDirectoryReader.java:165)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.(ExitableDirectoryReader.java:157)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTerms.iterator(ExitableDirectoryReader.java:141)
at org.apache.lucene.index.TermContext.build(TermContext.java:93)
at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:192)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at 
org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:838)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
at 
org.apache.solr.search.Grouping.searchWithTimeLimiter(Grouping.java:456)
at org.apache.solr.search.Grouping.execute(Grouping.java:370)
at 
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:496)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:277)
 

  was:
When we doing a batch of index and search operations to SolrCloud v5.3.2, we 
usually met a CPU% spike lasting about 10 min. 
You can see the chart in the attach file 

During the spike, the thread dump shows most of the threads are with the call 
stacks below:
"qtp634210724-4759" #4759 prio=5 os_prio=0 tid=0x7fb32803e000 nid=0x64e7 
runnable [0x7fb3ef1ef000]
   java.lang.Thread.State: RUNNABLE
at 
java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss(ThreadLocal.java:444)
at java.lang.ThreadLocal$ThreadLocalMap.getEntry(ThreadLocal.java:419)
at java.lang.ThreadLocal$ThreadLocalMap.access$000(ThreadLocal.java:298)
at java.lang.ThreadLocal.get(ThreadLocal.java:163)
at 
org.apache.solr.search.SolrQueryTimeoutImpl.get(SolrQueryTimeoutImpl.java:49)
at 
org.apache.solr.search.SolrQueryTimeoutImpl.shouldExit(SolrQueryTimeoutImpl.java:57)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.checkAndThrow(ExitableDirectoryReader.java:165)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.(ExitableDirectoryReader.java:157)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTerms.iterator(ExitableDirectoryReader.java:141)
at org.apache.lucene.index.TermContext.build(TermContext.java:93)
at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:192)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 

[jira] [Updated] (SOLR-9741) Solr has a CPU% spike when indexing a batch of data

2016-11-08 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-9741:
--
Attachment: screenshot-1.png

> Solr has a CPU% spike when indexing a batch of data
> ---
>
> Key: SOLR-9741
> URL: https://issues.apache.org/jira/browse/SOLR-9741
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 5.3.2
> Environment: Linux 64bit
>Reporter: Forest Soup
> Attachments: screenshot-1.png
>
>
> When we doing a batch of index and search operations to SolrCloud v5.3.2, we 
> usually met a CPU% spike lasting about 10 min. 
> You can see the chart in the attach file 
> During the spike, the thread dump shows most of the threads are with the call 
> stacks below:
> "qtp634210724-4759" #4759 prio=5 os_prio=0 tid=0x7fb32803e000 nid=0x64e7 
> runnable [0x7fb3ef1ef000]
>java.lang.Thread.State: RUNNABLE
> at 
> java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss(ThreadLocal.java:444)
> at java.lang.ThreadLocal$ThreadLocalMap.getEntry(ThreadLocal.java:419)
> at 
> java.lang.ThreadLocal$ThreadLocalMap.access$000(ThreadLocal.java:298)
> at java.lang.ThreadLocal.get(ThreadLocal.java:163)
> at 
> org.apache.solr.search.SolrQueryTimeoutImpl.get(SolrQueryTimeoutImpl.java:49)
> at 
> org.apache.solr.search.SolrQueryTimeoutImpl.shouldExit(SolrQueryTimeoutImpl.java:57)
> at 
> org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.checkAndThrow(ExitableDirectoryReader.java:165)
> at 
> org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.(ExitableDirectoryReader.java:157)
> at 
> org.apache.lucene.index.ExitableDirectoryReader$ExitableTerms.iterator(ExitableDirectoryReader.java:141)
> at org.apache.lucene.index.TermContext.build(TermContext.java:93)
> at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:192)
> at 
> org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
> at 
> org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
> at 
> org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
> at 
> org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
> at 
> org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
> at 
> org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
> at 
> org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
> at 
> org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
> at 
> org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
> at 
> org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
> at 
> org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:838)
> at 
> org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
> at 
> org.apache.solr.search.Grouping.searchWithTimeLimiter(Grouping.java:456)
> at org.apache.solr.search.Grouping.execute(Grouping.java:370)
> at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:496)
> at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:277)
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-9741) Solr has a CPU% spike when indexing a batch of data

2016-11-08 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-9741:
--
Description: 
When we run a batch of index and search operations against SolrCloud v5.3.2, we usually 
see a CPU% spike lasting about 10 minutes.
You can see the chart in the attached file.

During the spike, the thread dump shows most of the threads with the call stack below:
"qtp634210724-4759" #4759 prio=5 os_prio=0 tid=0x7fb32803e000 nid=0x64e7 
runnable [0x7fb3ef1ef000]
   java.lang.Thread.State: RUNNABLE
at 
java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss(ThreadLocal.java:444)
at java.lang.ThreadLocal$ThreadLocalMap.getEntry(ThreadLocal.java:419)
at java.lang.ThreadLocal$ThreadLocalMap.access$000(ThreadLocal.java:298)
at java.lang.ThreadLocal.get(ThreadLocal.java:163)
at 
org.apache.solr.search.SolrQueryTimeoutImpl.get(SolrQueryTimeoutImpl.java:49)
at 
org.apache.solr.search.SolrQueryTimeoutImpl.shouldExit(SolrQueryTimeoutImpl.java:57)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.checkAndThrow(ExitableDirectoryReader.java:165)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.(ExitableDirectoryReader.java:157)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTerms.iterator(ExitableDirectoryReader.java:141)
at org.apache.lucene.index.TermContext.build(TermContext.java:93)
at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:192)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at 
org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:838)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
at 
org.apache.solr.search.Grouping.searchWithTimeLimiter(Grouping.java:456)
at org.apache.solr.search.Grouping.execute(Grouping.java:370)
at 
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:496)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:277)
 

  was:
When we doing index a batch of data to SolrCloud v5.3.2, we usually met a CPU% 
spike lasting about 10 min. 
You can see the chart in the attach file.
During the spike, the thread dump shows most of the threads are with the call 
stacks below:
"qtp634210724-4759" #4759 prio=5 os_prio=0 tid=0x7fb32803e000 nid=0x64e7 
runnable [0x7fb3ef1ef000]
   java.lang.Thread.State: RUNNABLE
at 
java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss(ThreadLocal.java:444)
at java.lang.ThreadLocal$ThreadLocalMap.getEntry(ThreadLocal.java:419)
at java.lang.ThreadLocal$ThreadLocalMap.access$000(ThreadLocal.java:298)
at java.lang.ThreadLocal.get(ThreadLocal.java:163)
at 
org.apache.solr.search.SolrQueryTimeoutImpl.get(SolrQueryTimeoutImpl.java:49)
at 
org.apache.solr.search.SolrQueryTimeoutImpl.shouldExit(SolrQueryTimeoutImpl.java:57)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.checkAndThrow(ExitableDirectoryReader.java:165)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.(ExitableDirectoryReader.java:157)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTerms.iterator(ExitableDirectoryReader.java:141)
at org.apache.lucene.index.TermContext.build(TermContext.java:93)
at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:192)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at 

[jira] [Updated] (SOLR-9741) Solr has a CPU% spike when indexing a batch of data

2016-11-08 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-9741:
--
Description: 
When we index a batch of data to SolrCloud v5.3.2, we usually see a CPU% spike lasting 
about 10 minutes.
You can see the chart in the attached file.
During the spike, the thread dump shows most of the threads with the call stack below:
"qtp634210724-4759" #4759 prio=5 os_prio=0 tid=0x7fb32803e000 nid=0x64e7 
runnable [0x7fb3ef1ef000]
   java.lang.Thread.State: RUNNABLE
at 
java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss(ThreadLocal.java:444)
at java.lang.ThreadLocal$ThreadLocalMap.getEntry(ThreadLocal.java:419)
at java.lang.ThreadLocal$ThreadLocalMap.access$000(ThreadLocal.java:298)
at java.lang.ThreadLocal.get(ThreadLocal.java:163)
at 
org.apache.solr.search.SolrQueryTimeoutImpl.get(SolrQueryTimeoutImpl.java:49)
at 
org.apache.solr.search.SolrQueryTimeoutImpl.shouldExit(SolrQueryTimeoutImpl.java:57)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.checkAndThrow(ExitableDirectoryReader.java:165)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.(ExitableDirectoryReader.java:157)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTerms.iterator(ExitableDirectoryReader.java:141)
at org.apache.lucene.index.TermContext.build(TermContext.java:93)
at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:192)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at 
org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:838)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
at 
org.apache.solr.search.Grouping.searchWithTimeLimiter(Grouping.java:456)
at org.apache.solr.search.Grouping.execute(Grouping.java:370)
at 
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:496)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:277)
 

  was:
When we doing index a batch of data to SolrCloud v5.3.2, we usually met a CPU% 
spike lasting about 10 min. 
During the spike, the thread dump shows most of the threads are with the call 
stacks below:
"qtp634210724-4759" #4759 prio=5 os_prio=0 tid=0x7fb32803e000 nid=0x64e7 
runnable [0x7fb3ef1ef000]
   java.lang.Thread.State: RUNNABLE
at 
java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss(ThreadLocal.java:444)
at java.lang.ThreadLocal$ThreadLocalMap.getEntry(ThreadLocal.java:419)
at java.lang.ThreadLocal$ThreadLocalMap.access$000(ThreadLocal.java:298)
at java.lang.ThreadLocal.get(ThreadLocal.java:163)
at 
org.apache.solr.search.SolrQueryTimeoutImpl.get(SolrQueryTimeoutImpl.java:49)
at 
org.apache.solr.search.SolrQueryTimeoutImpl.shouldExit(SolrQueryTimeoutImpl.java:57)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.checkAndThrow(ExitableDirectoryReader.java:165)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.(ExitableDirectoryReader.java:157)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTerms.iterator(ExitableDirectoryReader.java:141)
at org.apache.lucene.index.TermContext.build(TermContext.java:93)
at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:192)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 

[jira] [Created] (SOLR-9741) Solr has a CPU% spike when indexing a batch of data

2016-11-08 Thread Forest Soup (JIRA)
Forest Soup created SOLR-9741:
-

 Summary: Solr has a CPU% spike when indexing a batch of data
 Key: SOLR-9741
 URL: https://issues.apache.org/jira/browse/SOLR-9741
 Project: Solr
  Issue Type: Bug
  Security Level: Public (Default Security Level. Issues are Public)
Affects Versions: 5.3.2
 Environment: Linux 64bit
Reporter: Forest Soup


When we index a batch of data to SolrCloud v5.3.2, we usually see a CPU% spike lasting 
about 10 minutes.
During the spike, the thread dump shows most of the threads with the call stack below:
"qtp634210724-4759" #4759 prio=5 os_prio=0 tid=0x7fb32803e000 nid=0x64e7 
runnable [0x7fb3ef1ef000]
   java.lang.Thread.State: RUNNABLE
at 
java.lang.ThreadLocal$ThreadLocalMap.getEntryAfterMiss(ThreadLocal.java:444)
at java.lang.ThreadLocal$ThreadLocalMap.getEntry(ThreadLocal.java:419)
at java.lang.ThreadLocal$ThreadLocalMap.access$000(ThreadLocal.java:298)
at java.lang.ThreadLocal.get(ThreadLocal.java:163)
at 
org.apache.solr.search.SolrQueryTimeoutImpl.get(SolrQueryTimeoutImpl.java:49)
at 
org.apache.solr.search.SolrQueryTimeoutImpl.shouldExit(SolrQueryTimeoutImpl.java:57)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.checkAndThrow(ExitableDirectoryReader.java:165)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTermsEnum.(ExitableDirectoryReader.java:157)
at 
org.apache.lucene.index.ExitableDirectoryReader$ExitableTerms.iterator(ExitableDirectoryReader.java:141)
at org.apache.lucene.index.TermContext.build(TermContext.java:93)
at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:192)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at org.apache.lucene.search.BooleanWeight.(BooleanWeight.java:56)
at 
org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:203)
at 
org.apache.lucene.search.IndexSearcher.createWeight(IndexSearcher.java:855)
at 
org.apache.lucene.search.IndexSearcher.createNormalizedWeight(IndexSearcher.java:838)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:486)
at 
org.apache.solr.search.Grouping.searchWithTimeLimiter(Grouping.java:456)
at org.apache.solr.search.Grouping.execute(Grouping.java:370)
at 
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:496)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:277)
 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5724) Two node, one shard solr instance intermittently going offline

2016-08-24 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434430#comment-15434430
 ] 

Forest Soup commented on SOLR-5724:
---

I found a similar issue in Solr v5.3.2.
We have a SolrCloud with 3 Solr nodes; 80 collections are created on them with 
replicationFactor=1 and numShards=1 for each collection.

After the collections are created, all cores are active and we start the first batch of 
indexing with a SolrJ client. But we then see failures on all collections of one of the 
3 Solr nodes, and indexing fails with HTTP 503:

2016-08-16 20:02:05.660 ERROR (qtp208437930-70) [c:collection4 s:shard1 
r:core_node1 x:collection4_shard1_replica1] 
o.a.s.u.p.DistributedUpdateProcessor ClusterState says we are the leader, but 
locally we don't think so
2016-08-16 20:02:05.667 ERROR (qtp208437930-70) [c:collection4 s:shard1 
r:core_node1 x:collection4_shard1_replica1] o.a.s.c.SolrCore 
org.apache.solr.common.SolrException: ClusterState says we are the leader 
(https://host1.domain1:8983/solr/collection4_shard1_replica1), but locally we 
don't think so. Request came from null
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:619)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:381)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:314)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:665)
at 
org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:143)
at 
org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:113)
at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:76)
at 
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:672)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:463)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:235)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:199)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:499)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at 
org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:745)

The collections on the other 2 Solr nodes work fine and indexing succeeded there.
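For reference, the layout described above (80 single-shard, single-replica collections) 
can be created with a SolrJ sketch roughly like the following (ZooKeeper hosts and config 
name are placeholders; the exact builder/factory names vary between SolrJ versions):
{code}
import java.io.IOException;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class CreateCollections {
    public static void main(String[] args) throws SolrServerException, IOException {
        // Placeholder ZooKeeper ensemble for this SolrCloud
        try (CloudSolrClient client = new CloudSolrClient.Builder()
                .withZkHost("zk1:2181,zk2:2181,zk3:2181")
                .build()) {
            for (int i = 1; i <= 80; i++) {
                // numShards=1, replicationFactor=1, as in the setup described above
                CollectionAdminRequest
                    .createCollection("collection" + i, "myconfig", 1, 1)
                    .process(client);
            }
        }
    }
}
{code}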

> Two node, one shard solr instance intermittently going offline 
> ---
>
> Key: SOLR-5724
> URL: https://issues.apache.org/jira/browse/SOLR-5724
> Project: Solr
>  Issue Type: Bug
>Affects Versions: 4.6.1
> Environment: Ubuntu 12.04.3 LTS, 64 bit,  java version "1.6.0_45"
> Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
> Java HotSpot(TM) 64-Bit 

[jira] [Commented] (SOLR-7021) Leader will not publish core as active without recovering first, but never recovers

2016-06-06 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316424#comment-15316424
 ] 

Forest Soup commented on SOLR-7021:
---

Is there any plan to fix this? We see the same log messages in a v5.3.2 SolrCloud.

> Leader will not publish core as active without recovering first, but never 
> recovers
> ---
>
> Key: SOLR-7021
> URL: https://issues.apache.org/jira/browse/SOLR-7021
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Affects Versions: 4.10
>Reporter: James Hardwick
>Priority: Critical
>  Labels: recovery, solrcloud, zookeeper
>
> A little background: 1 core solr-cloud cluster across 3 nodes, each with its 
> own shard and each shard with a single replica hence each replica is itself a 
> leader. 
> For reasons we won't get into, we witnessed a shard go down in our cluster. 
> We restarted the cluster but our core/shards still did not come back up. 
> After inspecting the logs, we found this:
> {code}
> 015-01-21 15:51:56,494 [coreZkRegister-1-thread-2] INFO  cloud.ZkController  
> - We are http://xxx.xxx.xxx.35:8081/solr/xyzcore/ and leader is 
> http://xxx.xxx.xxx.35:8081/solr/xyzcore/
> 2015-01-21 15:51:56,496 [coreZkRegister-1-thread-2] INFO  cloud.ZkController  
> - No LogReplay needed for core=xyzcore baseURL=http://xxx.xxx.xxx.35:8081/solr
> 2015-01-21 15:51:56,496 [coreZkRegister-1-thread-2] INFO  cloud.ZkController  
> - I am the leader, no recovery necessary
> 2015-01-21 15:51:56,496 [coreZkRegister-1-thread-2] INFO  cloud.ZkController  
> - publishing core=xyzcore state=active collection=xyzcore
> 2015-01-21 15:51:56,497 [coreZkRegister-1-thread-2] INFO  cloud.ZkController  
> - numShards not found on descriptor - reading it from system property
> 2015-01-21 15:51:56,498 [coreZkRegister-1-thread-2] INFO  cloud.ZkController  
> - publishing core=xyzcore state=down collection=xyzcore
> 2015-01-21 15:51:56,498 [coreZkRegister-1-thread-2] INFO  cloud.ZkController  
> - numShards not found on descriptor - reading it from system property
> 2015-01-21 15:51:56,501 [coreZkRegister-1-thread-2] ERROR core.ZkContainer  - 
> :org.apache.solr.common.SolrException: Cannot publish state of core 'xyzcore' 
> as active without recovering first!
>   at org.apache.solr.cloud.ZkController.publish(ZkController.java:1075)
> {code}
> And at this point the necessary shards never recover correctly and hence our 
> core never returns to a functional state. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-9173) NullPointerException during recovery

2016-05-30 Thread Forest Soup (JIRA)
Forest Soup created SOLR-9173:
-

 Summary: NullPointerException during recovery
 Key: SOLR-9173
 URL: https://issues.apache.org/jira/browse/SOLR-9173
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 5.3.2
 Environment: Linux 64bit
Reporter: Forest Soup


We have a SolrCloud, and one server crashed. After restart, during core recovery, there 
is one error:
2016-05-03 18:30:17.200 WARN  
(recoveryExecutor-80-thread-1-processing-n:lltcl5solr05.swg.usma.ibm.com:8983_solr
 x:collection3_shard1_replica1 s:shard1 c:collection3 r:core_node2) 
[c:collection3 s:shard1 r:core_node2 x:collection3_shard1_replica1] 
o.a.s.u.UpdateLog Starting log replay 
tlog{file=/mnt/solrdata1/solr/home/collection3_shard1_replica1/data/tlog/tlog.879
 refcount=2} active=false starting pos=0
2016-05-03 18:30:18.377 ERROR (qtp562345204-56) [c:collection3 s:shard1 
r:core_node2 x:collection3_shard1_replica1] o.a.s.c.SolrCore 
java.lang.NullPointerException
at org.apache.solr.update.UpdateLog.lookup(UpdateLog.java:735)
at 
org.apache.solr.handler.component.RealTimeGetComponent.process(RealTimeGetComponent.java:165)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:277)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:672)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:463)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:215)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:179)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:499)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at 
org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:745)

2016-05-03 18:30:18.378 ERROR (qtp562345204-56) [c:collection3 s:shard1 
r:core_node2 x:collection3_shard1_replica1] o.a.s.s.SolrDispatchFilter 
null:java.lang.NullPointerException
at org.apache.solr.update.UpdateLog.lookup(UpdateLog.java:735)
at 
org.apache.solr.handler.component.RealTimeGetComponent.process(RealTimeGetComponent.java:165)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:277)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:672)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:463)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:215)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:179)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 

[jira] [Updated] (SOLR-8756) Need 4 config "zkDigestUsername"/"zkDigestPassword"/ solr.xml

2016-02-28 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-8756:
--
Summary: Need 4 config "zkDigestUsername"/"zkDigestPassword"/  solr.xml  
(was: Need config "zkDigestUsername" and "zkDigestPassword" in 
/solr.xml)

> Need 4 config "zkDigestUsername"/"zkDigestPassword"/  solr.xml
> --
>
> Key: SOLR-8756
> URL: https://issues.apache.org/jira/browse/SOLR-8756
> Project: Solr
>  Issue Type: Bug
>  Components: security, SolrCloud
>Affects Versions: 5.3.1
> Environment: Linux 64bit
>Reporter: Forest Soup
>  Labels: security
>
> Need 4 config in /solr.xml instead of -D parameter in solr.in.sh.
> like below:
> 
>   
> zkusername
> zkpassword
> zkreadonlyusername
>  name="zkDigestReadonlyUsername">readonlypassword
> ...
> Otherwise, any user can use the linux "ps" command showing the full command 
> line including the plain text zookeeper username and password. If we use file 
> store them, we can control the access of the file not to leak the 
> username/password.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8756) Need 4 config "zkDigestUsername"/"zkDigestPassword"/"zkDigestReadonlyUsername"/"zkDigestReadonlyUsername" in solr.xml

2016-02-28 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-8756:
--
Summary: Need 4 config 
"zkDigestUsername"/"zkDigestPassword"/"zkDigestReadonlyUsername"/"zkDigestReadonlyUsername"
 in solr.xml  (was: Need 4 config 
"zkDigestUsername"/"zkDigestPassword"/"zkDigestReadonlyUsername"/  solr.xml)

> Need 4 config 
> "zkDigestUsername"/"zkDigestPassword"/"zkDigestReadonlyUsername"/"zkDigestReadonlyUsername"
>  in solr.xml
> -
>
> Key: SOLR-8756
> URL: https://issues.apache.org/jira/browse/SOLR-8756
> Project: Solr
>  Issue Type: Bug
>  Components: security, SolrCloud
>Affects Versions: 5.3.1
> Environment: Linux 64bit
>Reporter: Forest Soup
>  Labels: security
>
> Need 4 config in /solr.xml instead of -D parameter in solr.in.sh.
> like below:
> 
>   
> zkusername
> zkpassword
> zkreadonlyusername
>  name="zkDigestReadonlyUsername">readonlypassword
> ...
> Otherwise, any user can use the linux "ps" command showing the full command 
> line including the plain text zookeeper username and password. If we use file 
> store them, we can control the access of the file not to leak the 
> username/password.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8756) Need 4 config "zkDigestUsername"/"zkDigestPassword"/"zkDigestReadonlyUsername"/ solr.xml

2016-02-28 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-8756:
--
Summary: Need 4 config 
"zkDigestUsername"/"zkDigestPassword"/"zkDigestReadonlyUsername"/  solr.xml  
(was: Need 4 config "zkDigestUsername"/"zkDigestPassword"/  solr.xml)

> Need 4 config 
> "zkDigestUsername"/"zkDigestPassword"/"zkDigestReadonlyUsername"/  solr.xml
> -
>
> Key: SOLR-8756
> URL: https://issues.apache.org/jira/browse/SOLR-8756
> Project: Solr
>  Issue Type: Bug
>  Components: security, SolrCloud
>Affects Versions: 5.3.1
> Environment: Linux 64bit
>Reporter: Forest Soup
>  Labels: security
>
> Need 4 configs in /solr.xml instead of -D parameters in solr.in.sh,
> like below:
> <solr>
>   <solrcloud>
>     <str name="zkDigestUsername">zkusername</str>
>     <str name="zkDigestPassword">zkpassword</str>
>     <str name="zkDigestReadonlyUsername">zkreadonlyusername</str>
>     <str name="zkDigestReadonlyPassword">readonlypassword</str>
> ...
> Otherwise, any user can use the Linux "ps" command to show the full command
> line, including the plain-text ZooKeeper username and password. If we store
> them in a file, we can control access to that file so that the
> username/password are not leaked.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-8756) Need config "zkDigestUsername" and "zkDigestPassword" in /solr.xml

2016-02-28 Thread Forest Soup (JIRA)
Forest Soup created SOLR-8756:
-

 Summary: Need config "zkDigestUsername" and "zkDigestPassword" in 
/solr.xml
 Key: SOLR-8756
 URL: https://issues.apache.org/jira/browse/SOLR-8756
 Project: Solr
  Issue Type: Bug
  Components: security, SolrCloud
Affects Versions: 5.3.1
 Environment: Linux 64bit
Reporter: Forest Soup


Need 2 configs in /solr.xml instead of -D parameters in solr.in.sh,

like below:
<solr>
  <solrcloud>
    <str name="zkDigestUsername">zkusername</str>
    <str name="zkDigestPassword">zkpassword</str>
    <str name="zkDigestReadonlyUsername">zkreadonlyusername</str>
    <str name="zkDigestReadonlyPassword">readonlypassword</str>
...

Otherwise, any user can use the Linux "ps" command to show the full command
line, including the plain-text ZooKeeper username and password. If we store
them in a file, we can control access to that file so that the
username/password are not leaked.
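
To illustrate the file-based alternative, here is a minimal plain-Java sketch (not Solr's actual implementation) that loads the digest credentials from a permission-restricted properties file and attaches them to the ZooKeeper session with addAuthInfo, so nothing sensitive appears on the command line. The file path and the property keys zk.digest.username / zk.digest.password are assumptions for illustration only.

{code}
import java.io.FileInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

import org.apache.zookeeper.ZooKeeper;

public class FileBasedZkDigestAuth {
    public static void main(String[] args) throws Exception {
        // Hypothetical credentials file, e.g. /etc/solr/zkcredentials.properties (chmod 600):
        //   zk.digest.username=zkusername
        //   zk.digest.password=zkpassword
        Properties props = new Properties();
        try (FileInputStream in = new FileInputStream(args[0])) {
            props.load(in);
        }
        String user = props.getProperty("zk.digest.username");
        String pass = props.getProperty("zk.digest.password");

        // Attach the digest credentials to the ZooKeeper session instead of
        // passing them as -D options that show up in "ps".
        ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> { });
        zk.addAuthInfo("digest", (user + ":" + pass).getBytes(StandardCharsets.UTF_8));

        // ... use the authenticated session here ...
        zk.close();
    }
}
{code}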



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-8756) Need config "zkDigestUsername" and "zkDigestPassword" in /solr.xml

2016-02-28 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-8756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-8756:
--
Description: 
Need 4 configs in /solr.xml instead of -D parameters in solr.in.sh,

like below:
<solr>
  <solrcloud>
    <str name="zkDigestUsername">zkusername</str>
    <str name="zkDigestPassword">zkpassword</str>
    <str name="zkDigestReadonlyUsername">zkreadonlyusername</str>
    <str name="zkDigestReadonlyPassword">readonlypassword</str>
...

Otherwise, any user can use the Linux "ps" command to show the full command
line, including the plain-text ZooKeeper username and password. If we store
them in a file, we can control access to that file so that the
username/password are not leaked.

  was:
Need 2 configs in /solr.xml instead of -D parameters in solr.in.sh,

like below:
<solr>
  <solrcloud>
    <str name="zkDigestUsername">zkusername</str>
    <str name="zkDigestPassword">zkpassword</str>
    <str name="zkDigestReadonlyUsername">zkreadonlyusername</str>
    <str name="zkDigestReadonlyPassword">readonlypassword</str>
...

Otherwise, any user can use the Linux "ps" command to show the full command
line, including the plain-text ZooKeeper username and password. If we store
them in a file, we can control access to that file so that the
username/password are not leaked.


> Need config "zkDigestUsername" and "zkDigestPassword" in /solr.xml
> 
>
> Key: SOLR-8756
> URL: https://issues.apache.org/jira/browse/SOLR-8756
> Project: Solr
>  Issue Type: Bug
>  Components: security, SolrCloud
>Affects Versions: 5.3.1
> Environment: Linux 64bit
>Reporter: Forest Soup
>  Labels: security
>
> Need 4 configs in /solr.xml instead of -D parameters in solr.in.sh,
> like below:
> <solr>
>   <solrcloud>
>     <str name="zkDigestUsername">zkusername</str>
>     <str name="zkDigestPassword">zkpassword</str>
>     <str name="zkDigestReadonlyUsername">zkreadonlyusername</str>
>     <str name="zkDigestReadonlyPassword">readonlypassword</str>
> ...
> Otherwise, any user can use the Linux "ps" command to show the full command
> line, including the plain-text ZooKeeper username and password. If we store
> them in a file, we can control access to that file so that the
> username/password are not leaked.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-7982) SolrCloud: collection creation: There are duplicate coreNodeName in core.properties in a same collection.

2015-08-27 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-7982:
--
Description: 
We have a SolrCloud with 3 ZooKeeper nodes and 5 Solr servers.
We created collection1 and collection2 in the cloud, each with 80 shards and a
replicationFactor of 2.
But after creation, we found that within a single collection some cores share
the same coreNodeName in their core.properties files. For example:
[tanglin@solr64 home]$ ll collection1_shard13_replica2/core.properties
-rw-r--r-- 1 solr solr 173 Jul 29 11:52 
collection1_shard13_replica2/core.properties
[tanglin@solr64 home]$ ll collection1_shard66_replica1/core.properties
-rw-r--r-- 1 solr solr 173 Jul 29 11:52 
collection1_shard66_replica1/core.properties
[tanglin@solr64 home]$ cat collection1_shard66_replica1/core.properties
#Written by CorePropertiesLocator
#Wed Jul 29 11:52:54 UTC 2015
numShards=80
name=collection1_shard66_replica1
shard=shard66
collection=collection1
coreNodeName=core_node19
[tanglin@solr64 home]$ cat collection1_shard13_replica2/core.properties
#Written by CorePropertiesLocator
#Wed Jul 29 11:52:53 UTC 2015
numShards=80
name=collection1_shard13_replica2
shard=shard13
collection=collection1
coreNodeName=core_node19
[tanglin@solr64 home]$

The consequence of the issue is that clusterstate.json in ZooKeeper also carries
the wrong core_node numbers, and updating the state of one core sometimes changes
the state of another core in another shard.

Snippet from clusterstate:
  "shard13":{
    "range":"a666-a998",
    "state":"active",
    "replicas":{
      "core_node33":{
        "state":"active",
        "base_url":"https://solr65.somesite.com:8443/solr",
        "core":"collection1_shard13_replica1",
        "node_name":"solr65.somesite.com:8443_solr"},
      "core_node19":{
        "state":"active",
        "base_url":"https://solr64.somesite.com:8443/solr",
        "core":"collection1_shard13_replica2",
        "node_name":"solr64.somesite.com:8443_solr",
        "leader":"true"}}},
...
  "shard66":{
    "range":"5000-5332",
    "state":"active",
    "replicas":{
      "core_node105":{
        "state":"active",
        "base_url":"https://solr63.somesite.com:8443/solr",
        "core":"collection1_shard66_replica2",
        "node_name":"solr63.somesite.com:8443_solr",
        "leader":"true"},
      "core_node19":{
        "state":"active",
        "base_url":"https://solr64.somesite.com:8443/solr",
        "core":"collection1_shard66_replica1",
        "node_name":"solr64.somesite.com:8443_solr"}}},

  was:
We have a SolrCloud with 3 ZooKeeper nodes and 5 Solr servers.
We created collection1 and collection2 in the cloud, each with 80 shards and a
replicationFactor of 2.
But after creation, we found that within a single collection some cores share
the same coreNodeName in their core.properties files. For example:
[tanglin@solr64 home]$ ll collection1_shard13_replica2/core.properties
-rw-r--r-- 1 solr solr 173 Jul 29 11:52 
collection1_shard13_replica2/core.properties
[tanglin@solr64 home]$ ll collection1_shard66_replica1/core.properties
-rw-r--r-- 1 solr solr 173 Jul 29 11:52 
collection1_shard66_replica1/core.properties
[tanglin@solr64 home]$ cat collection1_shard66_replica1/core.properties
#Written by CorePropertiesLocator
#Wed Jul 29 11:52:54 UTC 2015
numShards=80
name=collection1_shard66_replica1
shard=shard66
collection=collection1
coreNodeName=core_node19
[tanglin@solr64 home]$ cat collection1_shard13_replica2/core.properties
#Written by CorePropertiesLocator
#Wed Jul 29 11:52:53 UTC 2015
numShards=80
name=collection1_shard13_replica2
shard=shard13
collection=collection1
coreNodeName=core_node19
[tanglin@solr64 home]$

The consequence of the issue is that clusterstate.json in ZooKeeper also carries
the wrong core_node numbers, and updating the state of one core sometimes changes
the state of another core in another shard.

Snippet from clusterstate:
  "shard13":{
    "range":"a666-a998",
    "state":"active",
    "replicas":{
      "core_node33":{
        "state":"active",
        "base_url":"https://us1a3-solr65.a3.dal06.isc4sb.com:8443/solr",
        "core":"collection1_shard13_replica1",
        "node_name":"us1a3-solr65.a3.dal06.isc4sb.com:8443_solr"},
      "core_node19":{
        "state":"active",
        "base_url":"https://us1a3-solr64.a3.dal06.isc4sb.com:8443/solr",
        "core":"collection1_shard13_replica2",
        "node_name":"us1a3-solr64.a3.dal06.isc4sb.com:8443_solr",
        "leader":"true"}}},
...
  "shard66":{
    "range":"5000-5332",
    "state":"active",
    "replicas":{
      "core_node105":{
        "state":"active",
        "base_url":"https://us1a3-solr63.a3.dal06.isc4sb.com:8443/solr",
        "core":"collection1_shard66_replica2",
        "node_name":"us1a3-solr63.a3.dal06.isc4sb.com:8443_solr",
        "leader":"true"},
      "core_node19":{
        "state":"active",

[jira] [Updated] (SOLR-7982) SolrCloud: collection creation: There are duplicate coreNodeName in core.properties in a same collection.

2015-08-27 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-7982:
--
Summary: SolrCloud: collection creation: There are duplicate coreNodeName 
in core.properties in a same collection.  (was: SolrCloud: There are duplicate 
coreNodeName in core.properties in a same collection after the collection is 
created.)

 SolrCloud: collection creation: There are duplicate coreNodeName in 
 core.properties in a same collection.
 -

 Key: SOLR-7982
 URL: https://issues.apache.org/jira/browse/SOLR-7982
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: linux redhat enterprise server 5.9 64bit
Reporter: Forest Soup

 We have a SolrCloud with 3 ZooKeeper nodes and 5 Solr servers.
 We created collection1 and collection2 in the cloud, each with 80 shards and a
 replicationFactor of 2.
 But after creation, we found that within a single collection some cores share
 the same coreNodeName in their core.properties files. For example:
 [tanglin@solr64 home]$ ll collection1_shard13_replica2/core.properties
 -rw-r--r-- 1 solr solr 173 Jul 29 11:52 
 collection1_shard13_replica2/core.properties
 [tanglin@solr64 home]$ ll collection1_shard66_replica1/core.properties
 -rw-r--r-- 1 solr solr 173 Jul 29 11:52 
 collection1_shard66_replica1/core.properties
 [tanglin@solr64 home]$ cat collection1_shard66_replica1/core.properties
 #Written by CorePropertiesLocator
 #Wed Jul 29 11:52:54 UTC 2015
 numShards=80
 name=collection1_shard66_replica1
 shard=shard66
 collection=collection1
 coreNodeName=core_node19
 [tanglin@solr64 home]$ cat collection1_shard13_replica2/core.properties
 #Written by CorePropertiesLocator
 #Wed Jul 29 11:52:53 UTC 2015
 numShards=80
 name=collection1_shard13_replica2
 shard=shard13
 collection=collection1
 coreNodeName=core_node19
 [tanglin@solr64 home]$
 The consequence of the issue is that clusterstate.json in ZooKeeper also carries
 the wrong core_node numbers, and updating the state of one core sometimes changes
 the state of another core in another shard.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-7982) SolrCloud: There are duplicate coreNodeName in core.properties in a same collection after the collection is created.

2015-08-27 Thread Forest Soup (JIRA)
Forest Soup created SOLR-7982:
-

 Summary: SolrCloud: There are duplicate coreNodeName in 
core.properties in a same collection after the collection is created.
 Key: SOLR-7982
 URL: https://issues.apache.org/jira/browse/SOLR-7982
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: linux redhat enterprise server 5.9 64bit
Reporter: Forest Soup


We have a SolrCloud with 3 ZooKeeper nodes and 5 Solr servers.
We created collection1 and collection2 in the cloud, each with 80 shards and a
replicationFactor of 2.
But after creation, we found that within a single collection some cores share
the same coreNodeName in their core.properties files. For example:
[tanglin@solr64 home]$ ll collection1_shard13_replica2/core.properties
-rw-r--r-- 1 solr solr 173 Jul 29 11:52 
collection1_shard13_replica2/core.properties
[tanglin@solr64 home]$ ll collection1_shard66_replica1/core.properties
-rw-r--r-- 1 solr solr 173 Jul 29 11:52 
collection1_shard66_replica1/core.properties
[tanglin@solr64 home]$ cat collection1_shard66_replica1/core.properties
#Written by CorePropertiesLocator
#Wed Jul 29 11:52:54 UTC 2015
numShards=80
name=collection1_shard66_replica1
shard=shard66
collection=collection1
coreNodeName=core_node19
[tanglin@solr64 home]$ cat collection1_shard13_replica2/core.properties
#Written by CorePropertiesLocator
#Wed Jul 29 11:52:53 UTC 2015
numShards=80
name=collection1_shard13_replica2
shard=shard13
collection=collection1
coreNodeName=core_node19
[tanglin@solr64 home]$

The consequence of the issue is that clusterstate.json in ZooKeeper also carries
the wrong core_node numbers, and updating the state of one core sometimes changes
the state of another core in another shard.
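
As a quick way to spot the condition above, the following rough diagnostic sketch (not part of Solr) walks a core root directory, loads every core.properties, and prints any coreNodeName claimed by more than one core of the same collection. The directory layout is assumed to match the listing above.

{code}
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DuplicateCoreNodeNameCheck {
    public static void main(String[] args) throws IOException {
        Path coreRoot = Paths.get(args.length > 0 ? args[0] : ".");
        // collection -> coreNodeName -> core directories that claim it
        Map<String, Map<String, List<Path>>> seen = new HashMap<>();

        List<Path> coreProps;
        try (Stream<Path> paths = Files.walk(coreRoot, 2)) {
            coreProps = paths
                .filter(f -> f.getFileName().toString().equals("core.properties"))
                .collect(Collectors.toList());
        }
        for (Path p : coreProps) {
            Properties props = new Properties();
            try (InputStream in = Files.newInputStream(p)) {
                props.load(in);
            }
            seen.computeIfAbsent(props.getProperty("collection", "?"), c -> new HashMap<>())
                .computeIfAbsent(props.getProperty("coreNodeName", "?"), n -> new ArrayList<>())
                .add(p.getParent());
        }

        // Report duplicates such as core_node19 being claimed by two shards of collection1.
        seen.forEach((collection, byName) -> byName.forEach((name, dirs) -> {
            if (dirs.size() > 1) {
                System.out.println(collection + ": " + name + " duplicated in " + dirs);
            }
        }));
    }
}
{code}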



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-7982) SolrCloud: collection creation: There are duplicate coreNodeName in core.properties in a same collection.

2015-08-27 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-7982:
--
Description: 
We have a SolrCloud with 3 ZooKeeper nodes and 5 Solr servers.
We created collection1 and collection2 in the cloud, each with 80 shards and a
replicationFactor of 2.
But after creation, we found that within a single collection some cores share
the same coreNodeName in their core.properties files. For example:
[tanglin@solr64 home]$ ll collection1_shard13_replica2/core.properties
-rw-r--r-- 1 solr solr 173 Jul 29 11:52 
collection1_shard13_replica2/core.properties
[tanglin@solr64 home]$ ll collection1_shard66_replica1/core.properties
-rw-r--r-- 1 solr solr 173 Jul 29 11:52 
collection1_shard66_replica1/core.properties
[tanglin@solr64 home]$ cat collection1_shard66_replica1/core.properties
#Written by CorePropertiesLocator
#Wed Jul 29 11:52:54 UTC 2015
numShards=80
name=collection1_shard66_replica1
shard=shard66
collection=collection1
coreNodeName=core_node19
[tanglin@solr64 home]$ cat collection1_shard13_replica2/core.properties
#Written by CorePropertiesLocator
#Wed Jul 29 11:52:53 UTC 2015
numShards=80
name=collection1_shard13_replica2
shard=shard13
collection=collection1
coreNodeName=core_node19
[tanglin@solr64 home]$

The consequence of the issue is that clusterstate.json in ZooKeeper also carries
the wrong core_node numbers, and updating the state of one core sometimes changes
the state of another core in another shard.

Snippet from clusterstate:
  "shard13":{
    "range":"a666-a998",
    "state":"active",
    "replicas":{
      "core_node33":{
        "state":"active",
        "base_url":"https://us1a3-solr65.a3.dal06.isc4sb.com:8443/solr",
        "core":"collection1_shard13_replica1",
        "node_name":"us1a3-solr65.a3.dal06.isc4sb.com:8443_solr"},
      "core_node19":{
        "state":"active",
        "base_url":"https://us1a3-solr64.a3.dal06.isc4sb.com:8443/solr",
        "core":"collection1_shard13_replica2",
        "node_name":"us1a3-solr64.a3.dal06.isc4sb.com:8443_solr",
        "leader":"true"}}},
...
  "shard66":{
    "range":"5000-5332",
    "state":"active",
    "replicas":{
      "core_node105":{
        "state":"active",
        "base_url":"https://us1a3-solr63.a3.dal06.isc4sb.com:8443/solr",
        "core":"collection1_shard66_replica2",
        "node_name":"us1a3-solr63.a3.dal06.isc4sb.com:8443_solr",
        "leader":"true"},
      "core_node19":{
        "state":"active",
        "base_url":"https://us1a3-solr64.a3.dal06.isc4sb.com:8443/solr",
        "core":"collection1_shard66_replica1",
        "node_name":"us1a3-solr64.a3.dal06.isc4sb.com:8443_solr"}}},

  was:
We have a SolrCloud with 3 ZooKeeper nodes and 5 Solr servers.
We created collection1 and collection2 in the cloud, each with 80 shards and a
replicationFactor of 2.
But after creation, we found that within a single collection some cores share
the same coreNodeName in their core.properties files. For example:
[tanglin@solr64 home]$ ll collection1_shard13_replica2/core.properties
-rw-r--r-- 1 solr solr 173 Jul 29 11:52 
collection1_shard13_replica2/core.properties
[tanglin@solr64 home]$ ll collection1_shard66_replica1/core.properties
-rw-r--r-- 1 solr solr 173 Jul 29 11:52 
collection1_shard66_replica1/core.properties
[tanglin@solr64 home]$ cat collection1_shard66_replica1/core.properties
#Written by CorePropertiesLocator
#Wed Jul 29 11:52:54 UTC 2015
numShards=80
name=collection1_shard66_replica1
shard=shard66
collection=collection1
coreNodeName=core_node19
[tanglin@solr64 home]$ cat collection1_shard13_replica2/core.properties
#Written by CorePropertiesLocator
#Wed Jul 29 11:52:53 UTC 2015
numShards=80
name=collection1_shard13_replica2
shard=shard13
collection=collection1
coreNodeName=core_node19
[tanglin@solr64 home]$

The consequence of the issue is that clusterstate.json in ZooKeeper also carries
the wrong core_node numbers, and updating the state of one core sometimes changes
the state of another core in another shard.


 SolrCloud: collection creation: There are duplicate coreNodeName in 
 core.properties in a same collection.
 -

 Key: SOLR-7982
 URL: https://issues.apache.org/jira/browse/SOLR-7982
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: linux redhat enterprise server 5.9 64bit
Reporter: Forest Soup

 We have a SolrCloud with 3 ZooKeeper nodes and 5 Solr servers.
 We created collection1 and collection2 in the cloud, each with 80 shards and a
 replicationFactor of 2.
 But after creation, we found that within a single collection some cores share
 the same coreNodeName in their core.properties files. For example:
 [tanglin@solr64 home]$ ll collection1_shard13_replica2/core.properties
 -rw-r--r-- 1 solr solr 173 Jul 29 

[jira] [Updated] (SOLR-7947) SolrCloud: /live_nodes in ZK shows the server is there, but all cores are down in /clusterstate.json.

2015-08-19 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-7947:
--
Description: 
A SolrCloud with 2 Solr nodes running in Tomcat on 2 VM servers. After restarting
one Solr node, its cores turn to the down state and the logs show the errors
below.

Logs are in the attachment.

ERROR - 2015-07-24 09:40:34.887; org.apache.solr.common.SolrException; 
null:org.apache.solr.common.SolrException: Unable to create core: 
collection1_shard1_replica1
at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:989)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:606)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:258)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:250)
at java.util.concurrent.FutureTask.run(FutureTask.java:273)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:482)
at java.util.concurrent.FutureTask.run(FutureTask.java:273)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
at java.lang.Thread.run(Thread.java:804)
Caused by: org.apache.solr.common.SolrException
at org.apache.solr.core.SolrCore.init(SolrCore.java:844)
at org.apache.solr.core.SolrCore.init(SolrCore.java:630)
at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:244)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:595)
... 8 more
Caused by: java.nio.channels.OverlappingFileLockException
at sun.nio.ch.SharedFileLockTable.checkList(FileLockTable.java:267)
at sun.nio.ch.SharedFileLockTable.add(FileLockTable.java:164)
at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:1078)
at java.nio.channels.FileChannel.tryLock(FileChannel.java:1165)
at org.apache.lucene.store.NativeFSLock.obtain(NativeFSLockFactory.java:217)
at org.apache.lucene.store.NativeFSLock.isLocked(NativeFSLockFactory.java:319)
at org.apache.lucene.index.IndexWriter.isLocked(IndexWriter.java:4510)
at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:485)
at org.apache.solr.core.SolrCore.init(SolrCore.java:761)
... 11 more

  was:
A SolrCloud with 2 Solr nodes running in Tomcat on 2 VM servers. After restarting
one Solr node, its cores turn to the down state and the logs show the errors
below.

ERROR - 2015-07-24 09:40:34.887; org.apache.solr.common.SolrException; 
null:org.apache.solr.common.SolrException: Unable to create core: 
collection1_shard1_replica1
at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:989)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:606)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:258)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:250)
at java.util.concurrent.FutureTask.run(FutureTask.java:273)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:482)
at java.util.concurrent.FutureTask.run(FutureTask.java:273)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
at java.lang.Thread.run(Thread.java:804)
Caused by: org.apache.solr.common.SolrException
at org.apache.solr.core.SolrCore.init(SolrCore.java:844)
at org.apache.solr.core.SolrCore.init(SolrCore.java:630)
at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:244)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:595)
... 8 more
Caused by: java.nio.channels.OverlappingFileLockException
at sun.nio.ch.SharedFileLockTable.checkList(FileLockTable.java:267)
at sun.nio.ch.SharedFileLockTable.add(FileLockTable.java:164)
at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:1078)
at java.nio.channels.FileChannel.tryLock(FileChannel.java:1165)
at org.apache.lucene.store.NativeFSLock.obtain(NativeFSLockFactory.java:217)
at org.apache.lucene.store.NativeFSLock.isLocked(NativeFSLockFactory.java:319)
at org.apache.lucene.index.IndexWriter.isLocked(IndexWriter.java:4510)
at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:485)
at org.apache.solr.core.SolrCore.init(SolrCore.java:761)
... 11 more


 SolrCloud: /live_nodes in ZK shows the server is there, but all cores are 
 down in /clusterstate.json.
 -

 Key: SOLR-7947
 URL: https://issues.apache.org/jira/browse/SOLR-7947
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: Redhat Linux Enterprise Server 5.9 64bit
Reporter: Forest Soup

 A SolrCloud with 2 Solr nodes running in Tomcat on 2 VM servers. After restarting
 one Solr node, its cores turn to the down state and the logs show the errors
 below.
 Logs are in the attachment.
 ERROR - 2015-07-24 09:40:34.887; 

[jira] [Updated] (SOLR-7947) SolrCloud: /live_nodes in ZK shows the server is there, but all cores are down in /clusterstate.json.

2015-08-19 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-7947:
--
Summary: SolrCloud: /live_nodes in ZK shows the server is there, but all 
cores are down in /clusterstate.json.  (was: ZooKeeper /live_nodes shows the 
server is there, but all cores are down in /clusterstate.json.)

 SolrCloud: /live_nodes in ZK shows the server is there, but all cores are 
 down in /clusterstate.json.
 -

 Key: SOLR-7947
 URL: https://issues.apache.org/jira/browse/SOLR-7947
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: Redhat Linux Enterprise Server 5.9 64bit
Reporter: Forest Soup

 A SolrCloud with 2 Solr nodes running in Tomcat on 2 VM servers. After restarting
 one Solr node, its cores turn to the down state and the logs show the errors
 below.
 ERROR - 2015-07-24 09:40:34.887; org.apache.solr.common.SolrException; 
 null:org.apache.solr.common.SolrException: Unable to create core: 
 collection1_shard1_replica1
 at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:989)
 at org.apache.solr.core.CoreContainer.create(CoreContainer.java:606)
 at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:258)
 at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:250)
 at java.util.concurrent.FutureTask.run(FutureTask.java:273)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:482)
 at java.util.concurrent.FutureTask.run(FutureTask.java:273)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
 at java.lang.Thread.run(Thread.java:804)
 Caused by: org.apache.solr.common.SolrException
 at org.apache.solr.core.SolrCore.init(SolrCore.java:844)
 at org.apache.solr.core.SolrCore.init(SolrCore.java:630)
 at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:244)
 at org.apache.solr.core.CoreContainer.create(CoreContainer.java:595)
 ... 8 more
 Caused by: java.nio.channels.OverlappingFileLockException
 at sun.nio.ch.SharedFileLockTable.checkList(FileLockTable.java:267)
 at sun.nio.ch.SharedFileLockTable.add(FileLockTable.java:164)
 at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:1078)
 at java.nio.channels.FileChannel.tryLock(FileChannel.java:1165)
 at org.apache.lucene.store.NativeFSLock.obtain(NativeFSLockFactory.java:217)
 at org.apache.lucene.store.NativeFSLock.isLocked(NativeFSLockFactory.java:319)
 at org.apache.lucene.index.IndexWriter.isLocked(IndexWriter.java:4510)
 at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:485)
 at org.apache.solr.core.SolrCore.init(SolrCore.java:761)
 ... 11 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-7947) ZooKeeper /live_nodes shows the server is there, but all cores are down in /clusterstate.json.

2015-08-19 Thread Forest Soup (JIRA)
Forest Soup created SOLR-7947:
-

 Summary: ZooKeeper /live_nodes shows the server is there, but all 
cores are down in /clusterstate.json.
 Key: SOLR-7947
 URL: https://issues.apache.org/jira/browse/SOLR-7947
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: Redhat Linux Enterprise Server 5.9 64bit
Reporter: Forest Soup


A SolrCloud with 2 Solr nodes running in Tomcat on 2 VM servers. After restarting
one Solr node, its cores turn to the down state and the logs show the errors
below.

ERROR - 2015-07-24 09:40:34.887; org.apache.solr.common.SolrException; 
null:org.apache.solr.common.SolrException: Unable to create core: 
collection1_shard1_replica1
at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:989)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:606)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:258)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:250)
at java.util.concurrent.FutureTask.run(FutureTask.java:273)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:482)
at java.util.concurrent.FutureTask.run(FutureTask.java:273)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
at java.lang.Thread.run(Thread.java:804)
Caused by: org.apache.solr.common.SolrException
at org.apache.solr.core.SolrCore.init(SolrCore.java:844)
at org.apache.solr.core.SolrCore.init(SolrCore.java:630)
at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:244)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:595)
... 8 more
Caused by: java.nio.channels.OverlappingFileLockException
at sun.nio.ch.SharedFileLockTable.checkList(FileLockTable.java:267)
at sun.nio.ch.SharedFileLockTable.add(FileLockTable.java:164)
at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:1078)
at java.nio.channels.FileChannel.tryLock(FileChannel.java:1165)
at org.apache.lucene.store.NativeFSLock.obtain(NativeFSLockFactory.java:217)
at org.apache.lucene.store.NativeFSLock.isLocked(NativeFSLockFactory.java:319)
at org.apache.lucene.index.IndexWriter.isLocked(IndexWriter.java:4510)
at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:485)
at org.apache.solr.core.SolrCore.init(SolrCore.java:761)
... 11 more
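
For context on the root cause above: java.nio.channels.OverlappingFileLockException is raised by the JVM's own lock bookkeeping (sun.nio.ch.SharedFileLockTable in the trace) when the same JVM tries to lock a file region it already holds, so this is an in-process conflict over the index directory's write lock rather than another process holding it. A minimal, self-contained sketch (unrelated to Solr's locking code) that reproduces the exception:

{code}
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class OverlappingLockDemo {
    public static void main(String[] args) throws Exception {
        Path lockFile = Files.createTempFile("write", ".lock");

        try (FileChannel first = FileChannel.open(lockFile, StandardOpenOption.WRITE);
             FileChannel second = FileChannel.open(lockFile, StandardOpenOption.WRITE)) {
            FileLock held = first.tryLock();   // first acquisition in this JVM succeeds
            System.out.println("first lock acquired: " + held.isValid());
            try {
                second.tryLock();              // same JVM, same file: not a cross-process check
            } catch (OverlappingFileLockException e) {
                // Same exception as in the core-creation stack trace: the JVM still has the
                // earlier lock registered, so a second attempt fails until it is released.
                System.out.println("second attempt failed: " + e);
            }
            held.release();
        }
    }
}
{code}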



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-7947) SolrCloud: after a solr node restarted, all cores in the node are down in /clusterstate.json due to java.nio.channels.OverlappingFileLockException.

2015-08-19 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-7947:
--
Summary: SolrCloud: after a solr node restarted, all cores in the node are 
down in /clusterstate.json due to 
java.nio.channels.OverlappingFileLockException.  (was: SolrCloud: /live_nodes 
in ZK shows the server is there, but all cores are down in /clusterstate.json.)

 SolrCloud: after a solr node restarted, all cores in the node are down in 
 /clusterstate.json due to java.nio.channels.OverlappingFileLockException.
 ---

 Key: SOLR-7947
 URL: https://issues.apache.org/jira/browse/SOLR-7947
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: Redhat Linux Enterprise Server 5.9 64bit
Reporter: Forest Soup
 Attachments: solr.zip


 A SolrCloud with 2 Solr nodes running in Tomcat on 2 VM servers. After restarting
 one Solr node, its cores turn to the down state and the logs show the errors
 below.
 Logs are in the attachment.
 ERROR - 2015-07-24 09:40:34.887; org.apache.solr.common.SolrException; 
 null:org.apache.solr.common.SolrException: Unable to create core: 
 collection1_shard1_replica1
 at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:989)
 at org.apache.solr.core.CoreContainer.create(CoreContainer.java:606)
 at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:258)
 at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:250)
 at java.util.concurrent.FutureTask.run(FutureTask.java:273)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:482)
 at java.util.concurrent.FutureTask.run(FutureTask.java:273)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
 at java.lang.Thread.run(Thread.java:804)
 Caused by: org.apache.solr.common.SolrException
 at org.apache.solr.core.SolrCore.init(SolrCore.java:844)
 at org.apache.solr.core.SolrCore.init(SolrCore.java:630)
 at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:244)
 at org.apache.solr.core.CoreContainer.create(CoreContainer.java:595)
 ... 8 more
 Caused by: java.nio.channels.OverlappingFileLockException
 at sun.nio.ch.SharedFileLockTable.checkList(FileLockTable.java:267)
 at sun.nio.ch.SharedFileLockTable.add(FileLockTable.java:164)
 at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:1078)
 at java.nio.channels.FileChannel.tryLock(FileChannel.java:1165)
 at org.apache.lucene.store.NativeFSLock.obtain(NativeFSLockFactory.java:217)
 at org.apache.lucene.store.NativeFSLock.isLocked(NativeFSLockFactory.java:319)
 at org.apache.lucene.index.IndexWriter.isLocked(IndexWriter.java:4510)
 at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:485)
 at org.apache.solr.core.SolrCore.init(SolrCore.java:761)
 ... 11 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-7947) SolrCloud: /live_nodes in ZK shows the server is there, but all cores are down in /clusterstate.json.

2015-08-19 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-7947:
--
Attachment: solr.zip

 SolrCloud: /live_nodes in ZK shows the server is there, but all cores are 
 down in /clusterstate.json.
 -

 Key: SOLR-7947
 URL: https://issues.apache.org/jira/browse/SOLR-7947
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: Redhat Linux Enterprise Server 5.9 64bit
Reporter: Forest Soup
 Attachments: solr.zip


 A SolrCloud with 2 Solr nodes running in Tomcat on 2 VM servers. After restarting
 one Solr node, its cores turn to the down state and the logs show the errors
 below.
 Logs are in the attachment.
 ERROR - 2015-07-24 09:40:34.887; org.apache.solr.common.SolrException; 
 null:org.apache.solr.common.SolrException: Unable to create core: 
 collection1_shard1_replica1
 at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:989)
 at org.apache.solr.core.CoreContainer.create(CoreContainer.java:606)
 at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:258)
 at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:250)
 at java.util.concurrent.FutureTask.run(FutureTask.java:273)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:482)
 at java.util.concurrent.FutureTask.run(FutureTask.java:273)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
 at java.lang.Thread.run(Thread.java:804)
 Caused by: org.apache.solr.common.SolrException
 at org.apache.solr.core.SolrCore.init(SolrCore.java:844)
 at org.apache.solr.core.SolrCore.init(SolrCore.java:630)
 at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:244)
 at org.apache.solr.core.CoreContainer.create(CoreContainer.java:595)
 ... 8 more
 Caused by: java.nio.channels.OverlappingFileLockException
 at sun.nio.ch.SharedFileLockTable.checkList(FileLockTable.java:267)
 at sun.nio.ch.SharedFileLockTable.add(FileLockTable.java:164)
 at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:1078)
 at java.nio.channels.FileChannel.tryLock(FileChannel.java:1165)
 at org.apache.lucene.store.NativeFSLock.obtain(NativeFSLockFactory.java:217)
 at org.apache.lucene.store.NativeFSLock.isLocked(NativeFSLockFactory.java:319)
 at org.apache.lucene.index.IndexWriter.isLocked(IndexWriter.java:4510)
 at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:485)
 at org.apache.solr.core.SolrCore.init(SolrCore.java:761)
 ... 11 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5692) StackOverflowError during SolrCloud leader election process

2015-05-12 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541343#comment-14541343
 ] 

Forest Soup commented on SOLR-5692:
---

I met the same issue with Solr 4.7.0.
There are too many recursive calls through the lines below:
at 
org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:399)
at 
org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:259)
at 
org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:164)
at 
org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:108)
at 
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:289)
at 
org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:399)
at 
org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:259)
at 
org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:164)
at 
org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:108)
at 
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:289)

 StackOverflowError during SolrCloud leader election process
 ---

 Key: SOLR-5692
 URL: https://issues.apache.org/jira/browse/SOLR-5692
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.6.1
Reporter: Bojan Smid
  Labels: difficulty-hard, impact-medium
 Attachments: recovery-stackoverflow.txt


 I have SolrCloud cluster with 7 nodes, each with few 1000 cores. I got this 
 StackOverflow few times when starting one of the nodes (just a piece of stack 
 trace, the rest repeats, leader election process obviously got stuck in 
 infinite repetition of steps):
 [2/4/14 3:42:43 PM] Bojan: 2014-02-04 15:18:01,947 
 [localhost-startStop-1-EventThread] ERROR org.apache.zookeeper.ClientCnxn- 
 Error while calling watcher 
 java.lang.StackOverflowError
 at java.security.AccessController.doPrivileged(Native Method)
 at java.io.PrintWriter.init(PrintWriter.java:116)
 at java.io.PrintWriter.init(PrintWriter.java:100)
 at org.apache.solr.common.SolrException.toStr(SolrException.java:138)
 at org.apache.solr.common.SolrException.log(SolrException.java:113)
 [2/4/14 3:42:58 PM] Bojan: at 
 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:377)
 at 
 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
 at 
 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
 at 
 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
 at 
 org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:272)
 at 
 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:380)
 at 
 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
  at 
 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
 at 
 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
 at 
 org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:272)
 at 
 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:380)
 at 
 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
 at 
 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
 at 
 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
 at 
 org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:272)
 at 
 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:380)
 at 
 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
 at 
 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
 at 
 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
 at 
 org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:272)
 at 
 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:380)
 at 
 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:184)
 at 
 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:162)
 at 
 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:106)
 at 
 

[jira] [Commented] (SOLR-6213) StackOverflowException in Solr cloud's leader election

2015-05-12 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541349#comment-14541349
 ] 

Forest Soup commented on SOLR-6213:
---

Can we set a maximum retry count instead of retrying forever until a stack
overflow occurs?
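
To make the suggestion concrete, here is a rough sketch of the general pattern (not the actual LeaderElector code): replace the mutually recursive rejoin calls with an iterative loop that stops after a configurable number of attempts, so a node that repeatedly fails to take leadership logs an error instead of overflowing the stack. The names joinElection/tryToBecomeLeader below are placeholders.

{code}
// Sketch only: the real logic lives in LeaderElector/ElectionContext and is more involved.
public class BoundedLeaderElectionSketch {

    interface ElectionStep {
        /** One attempt to take leadership; returns true once leadership is acquired. */
        boolean tryToBecomeLeader() throws Exception;
    }

    /**
     * Instead of rejoinLeaderElection() calling joinElection() recursively, loop with
     * an upper bound so a persistent failure cannot keep growing the call stack.
     */
    static boolean joinElection(ElectionStep step, int maxRetries) throws Exception {
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            if (step.tryToBecomeLeader()) {
                return true;                                  // became leader
            }
            Thread.sleep(Math.min(1000L * attempt, 10_000L)); // back off before rejoining
        }
        System.err.println("Giving up leader election after " + maxRetries + " attempts");
        return false;                                         // surface the failure instead
    }
}
{code}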

 StackOverflowException in Solr cloud's leader election
 --

 Key: SOLR-6213
 URL: https://issues.apache.org/jira/browse/SOLR-6213
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.10, Trunk
Reporter: Dawid Weiss
Priority: Critical

 This is what's causing test hangs (at least on FreeBSD, LUCENE-5786), 
 possibly on other machines too. The problem is stack overflow from looped 
 calls in:
 {code}
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)

[jira] [Updated] (SOLR-6213) StackOverflowException in Solr cloud's leader election

2015-05-12 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-6213:
--
Attachment: stackoverflow.txt

The stackoverflow exception.

 StackOverflowException in Solr cloud's leader election
 --

 Key: SOLR-6213
 URL: https://issues.apache.org/jira/browse/SOLR-6213
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.10, Trunk
Reporter: Dawid Weiss
Priority: Critical
 Attachments: stackoverflow.txt


 This is what's causing test hangs (at least on FreeBSD, LUCENE-5786), 
 possibly on other machines too. The problem is stack overflow from looped 
 calls in:
 {code}
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)

[jira] [Comment Edited] (SOLR-6213) StackOverflowException in Solr cloud's leader election

2015-05-12 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541359#comment-14541359
 ] 

Forest Soup edited comment on SOLR-6213 at 5/13/15 5:27 AM:


The stackoverflow exception is in the attachment.


was (Author: forest_soup):
The stackoverflow exception.

 StackOverflowException in Solr cloud's leader election
 --

 Key: SOLR-6213
 URL: https://issues.apache.org/jira/browse/SOLR-6213
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.10, Trunk
Reporter: Dawid Weiss
Priority: Critical
 Attachments: stackoverflow.txt


 This is what's causing test hangs (at least on FreeBSD, LUCENE-5786), 
 possibly on other machines too. The problem is stack overflow from looped 
 calls in:
 {code}
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 

[jira] [Commented] (SOLR-6213) StackOverflowException in Solr cloud's leader election

2015-05-12 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14541350#comment-14541350
 ] 

Forest Soup commented on SOLR-6213:
---

I met the same issue with Solr 4.7.0.
There are too many recursive calls through the lines below:
at 
org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:399)
at 
org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:259)
at 
org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:164)
at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:108)
at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:289)
at 
org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:399)
at 
org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:259)
at 
org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:164)
at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:108)
at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:289)

 StackOverflowException in Solr cloud's leader election
 --

 Key: SOLR-6213
 URL: https://issues.apache.org/jira/browse/SOLR-6213
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.10, Trunk
Reporter: Dawid Weiss
Priority: Critical

 This is what's causing test hangs (at least on FreeBSD, LUCENE-5786), 
 possibly on other machines too. The problem is stack overflow from looped 
 calls in:
 {code}
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:448)

 org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:212)

 org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:163)

 org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:125)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:313)
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:221)

 

[jira] [Commented] (SOLR-6156) Exception while using group with timeAllowed on SolrCloud.

2015-04-22 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506695#comment-14506695
 ] 

Forest Soup commented on SOLR-6156:
---

We have the same issue when we issue a request like the one below (only the XML-format 
response header is pasted here, with some values replaced):
<lst name="responseHeader">
  <int name="status">500</int>
  <int name="QTime">11</int>
  <lst name="params">
    <str name="_route_">Q049Y2RsMi1tYWlsMDcvTz1zY24y12345678!</str>
    <str name="facet">true</str>
    <str name="facet.mincount">1</str>
    <str name="facet.limit">13</str>
    <str name="facet.range">date</str>
    <str name="facet.range.end">NOW/DAY+1DAY</str>
    <str name="facet.range.gap">+1DAY</str>
    <str name="wt">xml</str>
    <str name="rows">0</str>
    <str name="df">body</str>
    <str name="start">0</str>
    <str name="q">
      ((owner:12345678) AND (servername:mail07)) AND
      (((funid:38D46BF5E8F08834852564B50129B2C)) (softdeletion:0))
    </str>
    <str name="facet.range.start">NOW/DAY-31DAY</str>
    <str name="q.op">AND</str>
    <str name="timeAllowed">6</str>
    <str name="group.field">tua0</str>
    <str name="group.sort">date desc</str>
    <str name="group">true</str>
    <arr name="facet.field">
      <str>strinetfrom</str>
      <str>funid</str>
    </arr>
  </lst>
</lst>

If we remove timeAllowed=6, the issue does not occur.

We have all cores active according to /clusterstates.json and /live_nodes, both in 
the Admin UI and in ZooKeeper.
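
For reference, a minimal SolrJ 4.x sketch that builds the same request (the ZooKeeper 
address is an assumption; the collection name and parameter values are taken from the 
request header above):
{code}
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class GroupWithTimeAllowedRepro {
    public static void main(String[] args) throws Exception {
        // Assumed ZooKeeper ensemble address; collection1 is the collection from the error above.
        CloudSolrServer server = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
        server.setDefaultCollection("collection1");

        SolrQuery q = new SolrQuery(
            "((owner:12345678) AND (servername:mail07)) AND "
          + "(((funid:38D46BF5E8F08834852564B50129B2C)) (softdeletion:0))");
        q.set("q.op", "AND");
        q.set("df", "body");
        q.set("_route_", "Q049Y2RsMi1tYWlsMDcvTz1zY24y12345678!");
        q.setStart(0);
        q.setRows(0);

        // Grouping combined with timeAllowed is what triggers the failure.
        q.set("group", true);
        q.set("group.field", "tua0");
        q.set("group.sort", "date desc");
        q.setTimeAllowed(6);

        q.setFacet(true);
        q.set("facet.mincount", 1);
        q.set("facet.limit", 13);
        q.addFacetField("strinetfrom", "funid");
        q.set("facet.range", "date");
        q.set("facet.range.start", "NOW/DAY-31DAY");
        q.set("facet.range.end", "NOW/DAY+1DAY");
        q.set("facet.range.gap", "+1DAY");

        // Fails with the "No live SolrServers available" error shown below.
        QueryResponse rsp = server.query(q);
        System.out.println(rsp.getResponseHeader());
        server.shutdown();
    }
}
{code}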

We have the response:
{
  error: {
msg: org.apache.solr.client.solrj.SolrServerException: No live 
SolrServers available to handle this 
request:[https://hij2-solr1.fen.def2.cn.abc.com:8443/solr/collection1_shard2_replica2,
 https://hij2-solr2.fen.def2.cn.abc.com:8443/solr/collection1_shard2_replica1];,
trace: org.apache.solr.common.SolrException: 
org.apache.solr.client.solrj.SolrServerException: No live SolrServers available 
to handle this 
request:[https://hij2-solr1.fen.def2.cn.abc.com:8443/solr/collection1_shard2_replica2,
 
https://hij2-solr2.fen.def2.cn.abc.com:8443/solr/collection1_shard2_replica1]\n\tat
 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:308)\n\tat
 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)\n\tat
 org.apache.solr.core.SolrCore.execute(SolrCore.java:1916)\n\tat 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:780)\n\tat
 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)\n\tat
 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)\n\tat
 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)\n\tat
 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)\n\tat
 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)\n\tat
 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)\n\tat
 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)\n\tat
 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)\n\tat
 
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)\n\tat 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)\n\tat
 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)\n\tat
 
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1040)\n\tat
 
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)\n\tat
 
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)\n\tat
 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)\n\tat
 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)\n\tat
 
org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)\n\tat
 java.lang.Thread.run(Thread.java:804)\nCaused by: 
org.apache.solr.client.solrj.SolrServerException: No live SolrServers available 
to handle this 
request:[https://hij2-solr1.fen.def2.cn.abc.com:8443/solr/collection1_shard2_replica2,
 
https://hij2-solr2.fen.def2.cn.abc.com:8443/solr/collection1_shard2_replica1]\n\tat
 
org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:387)\n\tat
 
org.apache.solr.handler.component.HttpShardHandlerFactory.makeLoadBalancedRequest(HttpShardHandlerFactory.java:205)\n\tat
 
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:161)\n\tat
 
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:118)\n\tat
 java.util.concurrent.FutureTask.run(FutureTask.java:273)\n\tat 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:482)\n\tat 
java.util.concurrent.FutureTask.run(FutureTask.java:273)\n\tat 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)\n\tat
 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)\n\t...
 1 more\nCaused by: 

[jira] [Comment Edited] (SOLR-6156) Exception while using group with timeAllowed on SolrCloud.

2015-04-22 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506695#comment-14506695
 ] 

Forest Soup edited comment on SOLR-6156 at 4/22/15 9:20 AM:


We have the same issue on Solr 4.7 when we issue a request like the one below (only the 
XML-format response header is pasted here, with some values replaced):
<lst name="responseHeader">
  <int name="status">500</int>
  <int name="QTime">11</int>
  <lst name="params">
    <str name="_route_">Q049Y2RsMi1tYWlsMDcvTz1zY24y12345678!</str>
    <str name="facet">true</str>
    <str name="facet.mincount">1</str>
    <str name="facet.limit">13</str>
    <str name="facet.range">date</str>
    <str name="facet.range.end">NOW/DAY+1DAY</str>
    <str name="facet.range.gap">+1DAY</str>
    <str name="wt">xml</str>
    <str name="rows">0</str>
    <str name="df">body</str>
    <str name="start">0</str>
    <str name="q">
      ((owner:12345678) AND (servername:mail07)) AND
      (((funid:38D46BF5E8F08834852564B50129B2C)) (softdeletion:0))
    </str>
    <str name="facet.range.start">NOW/DAY-31DAY</str>
    <str name="q.op">AND</str>
    <str name="timeAllowed">6</str>
    <str name="group.field">tua0</str>
    <str name="group.sort">date desc</str>
    <str name="group">true</str>
    <arr name="facet.field">
      <str>strinetfrom</str>
      <str>funid</str>
    </arr>
  </lst>
</lst>

If we remove timeAllowed=6, the issue does not occur.

We have all cores active according to /clusterstates.json and /live_nodes, both in 
the Admin UI and in ZooKeeper.

We have the response:
{
  error: {
msg: org.apache.solr.client.solrj.SolrServerException: No live 
SolrServers available to handle this 
request:[https://hij2-solr1.fen.def2.cn.abc.com:8443/solr/collection1_shard2_replica2,
 https://hij2-solr2.fen.def2.cn.abc.com:8443/solr/collection1_shard2_replica1];,
trace: org.apache.solr.common.SolrException: 
org.apache.solr.client.solrj.SolrServerException: No live SolrServers available 
to handle this 
request:[https://hij2-solr1.fen.def2.cn.abc.com:8443/solr/collection1_shard2_replica2,
 
https://hij2-solr2.fen.def2.cn.abc.com:8443/solr/collection1_shard2_replica1]\n\tat
 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:308)\n\tat
 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)\n\tat
 org.apache.solr.core.SolrCore.execute(SolrCore.java:1916)\n\tat 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:780)\n\tat
 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)\n\tat
 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)\n\tat
 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)\n\tat
 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)\n\tat
 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)\n\tat
 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)\n\tat
 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)\n\tat
 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)\n\tat
 
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)\n\tat 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)\n\tat
 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)\n\tat
 
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1040)\n\tat
 
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)\n\tat
 
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)\n\tat
 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)\n\tat
 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)\n\tat
 
org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)\n\tat
 java.lang.Thread.run(Thread.java:804)\nCaused by: 
org.apache.solr.client.solrj.SolrServerException: No live SolrServers available 
to handle this 
request:[https://hij2-solr1.fen.def2.cn.abc.com:8443/solr/collection1_shard2_replica2,
 
https://hij2-solr2.fen.def2.cn.abc.com:8443/solr/collection1_shard2_replica1]\n\tat
 
org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:387)\n\tat
 
org.apache.solr.handler.component.HttpShardHandlerFactory.makeLoadBalancedRequest(HttpShardHandlerFactory.java:205)\n\tat
 
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:161)\n\tat
 
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:118)\n\tat
 java.util.concurrent.FutureTask.run(FutureTask.java:273)\n\tat 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:482)\n\tat 
java.util.concurrent.FutureTask.run(FutureTask.java:273)\n\tat 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)\n\tat
 

[jira] [Updated] (SOLR-7434) Adding coreName to each log entry

2015-04-21 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-7434:
--
Affects Version/s: 5.1

 Adding coreName to each log entry
 -

 Key: SOLR-7434
 URL: https://issues.apache.org/jira/browse/SOLR-7434
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.7
Reporter: Forest Soup

 Could you please add [core name] to each log entry? Thanks!
 For example, when there are many cores on a Solr node, it is hard for us to tell 
 which core had a given issue and in what order the events happened.
 This line is a good example:
 2015-04-16 13:12:07.244; org.apache.solr.core.SolrCore; 
 [collection3_shard5_replica2] PERFORMANCE WARNING: Overlapping 
 onDeckSearchers=2
 This is a bad example:
 WARN  - 2015-04-16 13:12:11.136; 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher; Error in fetching 
 packets 
 java.io.EOFException
   at 
 org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:154)
   at 
 org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:146)
   at 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchPackets(SnapPuller.java:1211)
   at 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1174)
   at 
 org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:771)
   at 
 org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:421)
   at 
 org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:322)
   at 
 org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:155)
   at 
 org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:437)
   at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:247)
 WARN  - 2015-04-16 13:12:11.287; 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher; Error in fetching 
 packets 
 java.io.EOFException
   at 
 org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:154)
   at 
 org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:146)
   at 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchPackets(SnapPuller.java:1211)
   at 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1174)
   at 
 org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:771)
   at 
 org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:421)
   at 
 org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:322)
   at 
 org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:155)
   at 
 org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:437)
   at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:247)
 WARN  - 2015-04-16 13:12:11.465; 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher; Error in fetching 
 packets 
 java.io.EOFException
   at 
 org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:154)
   at 
 org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:146)
   at 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchPackets(SnapPuller.java:1211)
   at 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1174)
   at 
 org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:771)
   at 
 org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:421)
   at 
 org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:322)
   at 
 org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:155)
   at 
 org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:437)
   at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:247)
 WARN  - 2015-04-16 13:12:11.586; 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher; Error in fetching 
 packets 
 java.io.EOFException
   at 
 org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:154)
   at 
 org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:146)
   at 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchPackets(SnapPuller.java:1211)
   at 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1174)
   at 
 org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:771)
   at 
 org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:421)
   at 
 org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:322)
   at 
 org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:155)
   at 
 

[jira] [Updated] (SOLR-7434) Adding coreName to each log entry

2015-04-21 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-7434:
--
Affects Version/s: (was: 5.1)
   4.7

 Adding coreName to each log entry
 -

 Key: SOLR-7434
 URL: https://issues.apache.org/jira/browse/SOLR-7434
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.7
Reporter: Forest Soup

 Could you please add [core name] to each log entry? Thanks!
 For example, when there are many cores on a Solr node, it is hard for us to tell 
 which core had a given issue and in what order the events happened.
 This line is a good example:
 2015-04-16 13:12:07.244; org.apache.solr.core.SolrCore; 
 [collection3_shard5_replica2] PERFORMANCE WARNING: Overlapping 
 onDeckSearchers=2
 This is a bad example:
 WARN  - 2015-04-16 13:12:11.136; 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher; Error in fetching 
 packets 
 java.io.EOFException
   at 
 org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:154)
   at 
 org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:146)
   at 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchPackets(SnapPuller.java:1211)
   at 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1174)
   at 
 org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:771)
   at 
 org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:421)
   at 
 org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:322)
   at 
 org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:155)
   at 
 org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:437)
   at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:247)
 WARN  - 2015-04-16 13:12:11.287; 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher; Error in fetching 
 packets 
 java.io.EOFException
   at 
 org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:154)
   at 
 org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:146)
   at 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchPackets(SnapPuller.java:1211)
   at 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1174)
   at 
 org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:771)
   at 
 org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:421)
   at 
 org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:322)
   at 
 org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:155)
   at 
 org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:437)
   at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:247)
 WARN  - 2015-04-16 13:12:11.465; 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher; Error in fetching 
 packets 
 java.io.EOFException
   at 
 org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:154)
   at 
 org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:146)
   at 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchPackets(SnapPuller.java:1211)
   at 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1174)
   at 
 org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:771)
   at 
 org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:421)
   at 
 org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:322)
   at 
 org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:155)
   at 
 org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:437)
   at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:247)
 WARN  - 2015-04-16 13:12:11.586; 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher; Error in fetching 
 packets 
 java.io.EOFException
   at 
 org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:154)
   at 
 org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:146)
   at 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchPackets(SnapPuller.java:1211)
   at 
 org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1174)
   at 
 org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:771)
   at 
 org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:421)
   at 
 org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:322)
   at 
 

[jira] [Created] (SOLR-7434) Adding coreName to each log entry

2015-04-21 Thread Forest Soup (JIRA)
Forest Soup created SOLR-7434:
-

 Summary: Adding coreName to each log entry
 Key: SOLR-7434
 URL: https://issues.apache.org/jira/browse/SOLR-7434
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Forest Soup


Could you please add [core name] to each log entry? Thanks!
For example, when there are many cores on a Solr node, it is hard for us to tell 
which core had a given issue and in what order the events happened. (A minimal 
sketch of one possible approach is appended after the examples below.)

This line is a good example:
2015-04-16 13:12:07.244; org.apache.solr.core.SolrCore; 
[collection3_shard5_replica2] PERFORMANCE WARNING: Overlapping onDeckSearchers=2

This is a bad example:
WARN  - 2015-04-16 13:12:11.136; 
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher; Error in fetching 
packets 
java.io.EOFException
at 
org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:154)
at 
org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:146)
at 
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchPackets(SnapPuller.java:1211)
at 
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1174)
at 
org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:771)
at 
org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:421)
at 
org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:322)
at 
org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:155)
at 
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:437)
at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:247)
WARN  - 2015-04-16 13:12:11.287; 
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher; Error in fetching 
packets 
java.io.EOFException
at 
org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:154)
at 
org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:146)
at 
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchPackets(SnapPuller.java:1211)
at 
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1174)
at 
org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:771)
at 
org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:421)
at 
org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:322)
at 
org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:155)
at 
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:437)
at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:247)
WARN  - 2015-04-16 13:12:11.465; 
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher; Error in fetching 
packets 
java.io.EOFException
at 
org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:154)
at 
org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:146)
at 
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchPackets(SnapPuller.java:1211)
at 
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1174)
at 
org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:771)
at 
org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:421)
at 
org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:322)
at 
org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:155)
at 
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:437)
at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:247)
WARN  - 2015-04-16 13:12:11.586; 
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher; Error in fetching 
packets 
java.io.EOFException
at 
org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:154)
at 
org.apache.solr.common.util.FastInputStream.readFully(FastInputStream.java:146)
at 
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchPackets(SnapPuller.java:1211)
at 
org.apache.solr.handler.SnapPuller$DirectoryFileFetcher.fetchFile(SnapPuller.java:1174)
at 
org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:771)
at 
org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:421)
at 
org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:322)
at 
org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:155)
at 
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:437)
at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:247)
WARN  - 2015-04-16 13:12:11.768; 
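
For illustration, a minimal sketch of one possible way to attach the core name to every 
log entry, using SLF4J's MDC together with a %X{core} conversion in the logging pattern. 
The MDC key name and the pattern are assumptions for this sketch, not Solr's actual 
implementation:
{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class CoreNameLoggingSketch {
    private static final Logger log = LoggerFactory.getLogger(CoreNameLoggingSketch.class);

    // Hypothetical request-handling entry point; "core" is an assumed MDC key.
    void handleRequestForCore(String coreName) {
        MDC.put("core", coreName);
        try {
            // With a layout such as  %-5p - %d; %c; [%X{core}] %m%n  this renders as:
            // WARN  - 2015-04-16 ...; org.apache.solr...; [collection3_shard5_replica2] Error in fetching packets
            log.warn("Error in fetching packets", new java.io.EOFException());
        } finally {
            MDC.remove("core");   // avoid leaking the core name into unrelated log lines
        }
    }
}
{code}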

[jira] [Commented] (SOLR-6359) Allow customization of the number of records and logs kept by UpdateLog

2015-03-26 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381798#comment-14381798
 ] 

Forest Soup commented on SOLR-6359:
---

We have a SolrCloud of 5 Solr 4.7.0 servers, with one collection of 80 shards (2 
replicas per shard) across those 5 servers. We made a patch by merging the code of 
this fix into the 4.7.0 stream. After applying the patch to our servers and uploading 
the changed config to ZooKeeper, we restarted one of the 5 Solr servers and hit some 
issues on that server. The details are below.
The updateLog section of solrconfig.xml that we changed:
<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
  <int name="numRecordsToKeep">1</int>
  <int name="maxNumLogsToKeep">100</int>
</updateLog>

After we restarted one Solr server while the other 4 servers were not running, we saw 
the exceptions below on the restarted one:
ERROR - 2015-03-16 20:48:48.214; org.apache.solr.common.SolrException; 
org.apache.solr.common.SolrException: Exception writing document id 
Q049bGx0bWFpbDIxL089bGxwX3VzMQ==41703656!B68BF5EC5A4A650D85257E0A00724A3B to 
the index; possible analysis error.
at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:164)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:703)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:857)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:556)
at 
org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:96)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:166)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
at 
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:225)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:121)
at 
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:190)
at 
org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:116)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:173)
at 
org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:106)
at 
org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:58)
at 
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1916)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:780)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
at 
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1040)
at 
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
at 
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:314)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
at 
org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:804)
Caused by: 

[jira] [Commented] (SOLR-6359) Allow customization of the number of records and logs kept by UpdateLog

2015-03-26 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381799#comment-14381799
 ] 

Forest Soup commented on SOLR-6359:
---

It looks like https://issues.apache.org/jira/browse/SOLR-4605, but I guess it's 
not the case...

 Allow customization of the number of records and logs kept by UpdateLog
 ---

 Key: SOLR-6359
 URL: https://issues.apache.org/jira/browse/SOLR-6359
 Project: Solr
  Issue Type: Improvement
Reporter: Ramkumar Aiyengar
Assignee: Ramkumar Aiyengar
Priority: Minor
 Fix For: Trunk, 5.1

 Attachments: SOLR-6359.patch


 Currently {{UpdateLog}} hardcodes the number of logs and records it keeps, 
 and the hardcoded numbers (100 records, 10 logs) can be quite low (esp. the 
 records) in a heavily indexing setup, leading to full recovery even if Solr 
 was just stopped and restarted.
 These values should be customizable (even if only present as expert options).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-6359) Allow customization of the number of records and logs kept by UpdateLog

2015-03-26 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14381798#comment-14381798
 ] 

Forest Soup edited comment on SOLR-6359 at 3/26/15 2:21 PM:


We have a SolrCloud of 5 Solr 4.7.0 servers, with one collection of 80 shards (2 
replicas per shard) across those 5 servers. We made a patch by merging the code of 
this fix into the 4.7.0 stream. After applying the patch to our servers and uploading 
the changed config to ZooKeeper, we restarted one of the 5 Solr servers and hit some 
issues on that server. The details are below.
The updateLog section of solrconfig.xml that we changed:
<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
  <int name="numRecordsToKeep">1</int>
  <int name="maxNumLogsToKeep">100</int>
</updateLog>

After we restarted one Solr server while the other 4 servers were not running, we saw 
the exceptions below on the restarted one:
ERROR - 2015-03-16 20:48:48.214; org.apache.solr.common.SolrException; 
org.apache.solr.common.SolrException: Exception writing document id 
Q049bGx0bWFpbDIxL089bGxwX3VzMQ==41703656!B68BF5EC5A4A650D85257E0A00724A3B to 
the index; possible analysis error.
at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:164)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:703)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:857)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:556)
at 
org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:96)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:166)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
at 
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:225)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:121)
at 
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:190)
at 
org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:116)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:173)
at 
org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:106)
at 
org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:58)
at 
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1916)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:780)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
at 
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1040)
at 
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
at 
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:314)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
at 
org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:804)

[jira] [Issue Comment Deleted] (SOLR-6359) Allow customization of the number of records and logs kept by UpdateLog

2015-03-26 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-6359:
--
Comment: was deleted

(was: It looks like https://issues.apache.org/jira/browse/SOLR-4605, but I 
guess it's not the case...)

 Allow customization of the number of records and logs kept by UpdateLog
 ---

 Key: SOLR-6359
 URL: https://issues.apache.org/jira/browse/SOLR-6359
 Project: Solr
  Issue Type: Improvement
Reporter: Ramkumar Aiyengar
Assignee: Ramkumar Aiyengar
Priority: Minor
 Fix For: Trunk, 5.1

 Attachments: SOLR-6359.patch


 Currently {{UpdateLog}} hardcodes the number of logs and records it keeps, 
 and the hardcoded numbers (100 records, 10 logs) can be quite low (esp. the 
 records) in a heavily indexing setup, leading to full recovery even if Solr 
 was just stopped and restarted.
 These values should be customizable (even if only present as expert options).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-7292) OutOfMemory happened in Solr, but /clusterstates.json shows cores active

2015-03-23 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-7292:
--
Attachment: OOM.txt
failure.txt

 OutOfMemory happened in Solr, but /clusterstates.json shows cores active
 --

 Key: SOLR-7292
 URL: https://issues.apache.org/jira/browse/SOLR-7292
 Project: Solr
  Issue Type: Bug
  Components: contrib - Clustering
Affects Versions: 4.7
 Environment: Redhat Linux 6.3 64bit
Reporter: Forest Soup
  Labels: performance
 Attachments: OOM.txt, failure.txt


 One of our 5 Solr servers got an OOM, but in /clusterstates.json in ZK it is still 
 shown as active. The OOM exceptions are in the attached OOM.txt.
 Updates and commits to the collection that has cores on that Solr server fail. 
 The logs are in the attached failure.txt.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-7292) When there is OutOfMemory happened in Solr and Solr cannot do update, but the /clusterstates.json on ZooKeeper still shows it is active

2015-03-23 Thread Forest Soup (JIRA)
Forest Soup created SOLR-7292:
-

 Summary: When there is OutOfMemory happened in Solr and Solr 
cannot do update, but the /clusterstates.json on ZooKeeper still shows it is 
active
 Key: SOLR-7292
 URL: https://issues.apache.org/jira/browse/SOLR-7292
 Project: Solr
  Issue Type: Bug
  Components: contrib - Clustering
Affects Versions: 4.7
 Environment: Redhat Linux 6.3 64bit

Reporter: Forest Soup


One of our 5 Solr servers got an OOM, but in /clusterstates.json in ZK it is still 
shown as active. The OOM exceptions are in the attached OOM.txt.

Updates and commits to the collection that has cores on that Solr server fail. The 
logs are in the attached failure.txt.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-7292) OutOfMemory happened in Solr, but /clusterstates.json shows cores active

2015-03-23 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-7292:
--
Summary: OutOfMemory happened in Solr, but /clusterstates.json shows cores 
active  (was: When there is OutOfMemory happened in Solr and Solr cannot do 
update, but the /clusterstates.json on ZooKeeper still shows it is active)

 OutOfMemory happened in Solr, but /clusterstates.json shows cores active
 --

 Key: SOLR-7292
 URL: https://issues.apache.org/jira/browse/SOLR-7292
 Project: Solr
  Issue Type: Bug
  Components: contrib - Clustering
Affects Versions: 4.7
 Environment: Redhat Linux 6.3 64bit
Reporter: Forest Soup
  Labels: performance

 One of our 5 Solr servers got an OOM, but in /clusterstates.json in ZK it is still 
 shown as active. The OOM exceptions are in the attached OOM.txt.
 Updates and commits to the collection that has cores on that Solr server fail. 
 The logs are in the attached failure.txt.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7292) OutOfMemory happened in Solr, but /clusterstates.json shows cores active

2015-03-23 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14377084#comment-14377084
 ] 

Forest Soup commented on SOLR-7292:
---

Thank you all! 
Will consider your suggestion!

 OutOfMemory happened in Solr, but /clusterstates.json shows cores active
 --

 Key: SOLR-7292
 URL: https://issues.apache.org/jira/browse/SOLR-7292
 Project: Solr
  Issue Type: Bug
  Components: contrib - Clustering
Affects Versions: 4.7
 Environment: Redhat Linux 6.3 64bit
Reporter: Forest Soup
  Labels: performance
 Attachments: OOM.txt, failure.txt


 One of our 5 Solr servers got an OOM, but in /clusterstates.json in ZK it is still 
 shown as active. The OOM exceptions are in the attached OOM.txt.
 Updates and commits to the collection that has cores on that Solr server fail. 
 The logs are in the attached failure.txt.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-7069) A down core(shard replica) on an active node cannot failover the query to its good peer

2015-02-02 Thread Forest Soup (JIRA)
Forest Soup created SOLR-7069:
-

 Summary: A down core(shard replica) on an active node cannot 
failover the query to its good peer
 Key: SOLR-7069
 URL: https://issues.apache.org/jira/browse/SOLR-7069
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
Reporter: Forest Soup






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-7069) A down core(shard replica) on an active node cannot failover the query to its good peer

2015-02-02 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-7069:
--
Description: When querying a collection that has a core in the down state, if we send 
the request to the server containing the down core while that server is active, the 
request does not fail over to the good replica of the same shard on another server.

 A down core(shard replica) on an active node cannot failover the query to its 
 good peer
 ---

 Key: SOLR-7069
 URL: https://issues.apache.org/jira/browse/SOLR-7069
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
Reporter: Forest Soup

 When querying a collection that has a core in the down state, if we send the 
 request to the server containing the down core while that server is active, the 
 request does not fail over to the good replica of the same shard on another server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-7069) A down core(shard replica) on an active node cannot failover the query to its good peer

2015-02-02 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-7069:
--
Description: 
When querying a collection that has a core in the down state, if we send the request 
to the server containing the down core while that server is active, the request does 
not fail over to the good replica of the same shard on another server.

The steps to put a core into the down state on an active server are:
1. Delete the contents of the core's data folder.
2. Restart the Solr server the core is located on.
The core then shows as down while the other cores on the same server are still 
active. See the attached picture; a minimal SolrJ sketch of the failing request path follows.
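
For reference, a minimal SolrJ 4.x sketch of the request path described here, querying 
the collection directly on the node that hosts the down core (the host name and port 
are assumptions; the collection and core names are taken from the error below):
{code}
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class DownCoreQuerySketch {
    public static void main(String[] args) throws Exception {
        // Assumed base URL of the node that hosts the down replica collection5_shard1_replica2.
        HttpSolrServer node = new HttpSolrServer("http://solr-node-1:8983/solr/collection5");
        try {
            // Expected: the request is served by the healthy replica on another node.
            // Observed: HTTP 500 "SolrCore 'collection5_shard1_replica2' is not available
            // due to init failure", as shown below.
            QueryResponse rsp = node.query(new SolrQuery("*:*"));
            System.out.println("numFound=" + rsp.getResults().getNumFound());
        } finally {
            node.shutdown();
        }
    }
}
{code}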

When we issue a query to the collection and send the request to the server containing 
the down core, we receive the errors below:
HTTP Status 500 - {msg=SolrCore 'collection5_shard1_replica2' is not available 
due to init failure: Error opening new 
searcher,trace=org.apache.solr.common.SolrException: SolrCore 
'collection5_shard1_replica2' is not available due to init failure: Error 
opening new searcher at 
org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:827) at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:309)
 at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
 at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
 at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
 at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
 at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
 at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171) 
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) 
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950) at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
 at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408) 
at 
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1040)
 at 
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
 at 
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626) 
at 
org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
 at java.lang.Thread.run(Thread.java:804) Caused by: 
org.apache.solr.common.SolrException: Error opening new searcher at 
org.apache.solr.core.SolrCore.init(SolrCore.java:844) at 
org.apache.solr.core.SolrCore.init(SolrCore.java:630) at 
org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:244) at 
org.apache.solr.core.CoreContainer.create(CoreContainer.java:595) at 
org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:258) at 
org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:250) at 
java.util.concurrent.FutureTask.run(FutureTask.java:273) at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:482) at 
java.util.concurrent.FutureTask.run(FutureTask.java:273) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626) 
... 1 more Caused by: org.apache.solr.common.SolrException: Error opening new 
searcher at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1521) 
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1633) at 
org.apache.solr.core.SolrCore.init(SolrCore.java:827) ... 11 more Caused by: 
java.io.FileNotFoundException: 
/mnt/solrdata1/solr/home/collection5_shard1_replica2/data/index/_12x.si (No 
such file or directory) at 
java.io.RandomAccessFile.init(RandomAccessFile.java:252) at 
org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:193) at 
org.apache.lucene.store.NRTCachingDirectory.openInput(NRTCachingDirectory.java:233)
 at 
org.apache.lucene.codecs.lucene46.Lucene46SegmentInfoReader.read(Lucene46SegmentInfoReader.java:49)
 at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:340) at 
org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:404) at 
org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:843)
 at 
org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:694)
 at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:400) at 
org.apache.lucene.index.IndexWriter.init(IndexWriter.java:741) at 
org.apache.solr.update.SolrIndexWriter.init(SolrIndexWriter.java:77) at 
org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64) at 

[jira] [Updated] (SOLR-7069) A down core(shard replica) on an active node cannot failover the query to its good peer

2015-02-02 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-7069:
--
Attachment: Untitled.png

A down core on an active node.

 A down core(shard replica) on an active node cannot failover the query to its 
 good peer
 ---

 Key: SOLR-7069
 URL: https://issues.apache.org/jira/browse/SOLR-7069
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
Reporter: Forest Soup
 Attachments: Untitled.png


 When querying a collection that has a core in the down state, if we send the 
 request to the server containing the down core while that server is active, the 
 request does not fail over to the good replica of the same shard on another server.
 The steps to put a core into the down state on an active server are:
 1. Delete the contents of the core's data folder.
 2. Restart the Solr server the core is located on.
 The core then shows as down while the other cores on the same server are still 
 active. See the attached picture.
 When we issue a query to the collection and send the request to the server 
 containing the down core, we receive the errors below:
 HTTP Status 500 - {msg=SolrCore 'collection5_shard1_replica2' is not 
 available due to init failure: Error opening new 
 searcher,trace=org.apache.solr.common.SolrException: SolrCore 
 'collection5_shard1_replica2' is not available due to init failure: Error 
 opening new searcher at 
 org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:827) at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:309)
  at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
  at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
  at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
  at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
  at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
  at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171) 
 at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) 
 at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950) 
 at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
  at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408) 
 at 
 org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1040)
  at 
 org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
  at 
 org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)
  at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
  at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
  at 
 org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
  at java.lang.Thread.run(Thread.java:804) Caused by: 
 org.apache.solr.common.SolrException: Error opening new searcher at 
 org.apache.solr.core.SolrCore.init(SolrCore.java:844) at 
 org.apache.solr.core.SolrCore.init(SolrCore.java:630) at 
 org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:244) at 
 org.apache.solr.core.CoreContainer.create(CoreContainer.java:595) at 
 org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:258) at 
 org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:250) at 
 java.util.concurrent.FutureTask.run(FutureTask.java:273) at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:482) at 
 java.util.concurrent.FutureTask.run(FutureTask.java:273) at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
  at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
  ... 1 more Caused by: org.apache.solr.common.SolrException: Error opening 
 new searcher at 
 org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1521) at 
 org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1633) at 
 org.apache.solr.core.SolrCore.init(SolrCore.java:827) ... 11 more Caused 
 by: java.io.FileNotFoundException: 
 /mnt/solrdata1/solr/home/collection5_shard1_replica2/data/index/_12x.si (No 
 such file or directory) at 
 java.io.RandomAccessFile.init(RandomAccessFile.java:252) at 
 org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:193) at 
 org.apache.lucene.store.NRTCachingDirectory.openInput(NRTCachingDirectory.java:233)
  at 
 org.apache.lucene.codecs.lucene46.Lucene46SegmentInfoReader.read(Lucene46SegmentInfoReader.java:49)
  at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:340) at 
 

[jira] [Updated] (SOLR-7069) A down core(shard replica) on an active node cannot failover the query to its good peer on another server

2015-02-02 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-7069:
--
Summary: A down core(shard replica) on an active node cannot failover the 
query to its good peer on another server  (was: A down core(shard replica) on 
an active node cannot failover the query to its good peer)

 A down core(shard replica) on an active node cannot failover the query to its 
 good peer on another server
 -

 Key: SOLR-7069
 URL: https://issues.apache.org/jira/browse/SOLR-7069
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
Reporter: Forest Soup
 Attachments: Untitled.png


 When querying a collection that has a core in the down state, if we send the 
 request to the server containing the down core while that server is active, the 
 request does not fail over to the good replica of the same shard on another server.
 The steps to put a core into the down state on an active server are:
 1. Delete the contents of the core's data folder.
 2. Restart the Solr server the core is located on.
 The core then shows as down while the other cores on the same server are still 
 active. See the attached picture.
 When we issue a query to the collection and send the request to the server 
 containing the down core, we receive the errors below:
 HTTP Status 500 - {msg=SolrCore 'collection5_shard1_replica2' is not available due to init failure: Error opening new searcher,trace=org.apache.solr.common.SolrException: SolrCore 'collection5_shard1_replica2' is not available due to init failure: Error opening new searcher
   at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:827)
   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:309)
   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
   at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
   at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
   at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
   at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
   at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
   at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
   at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
   at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1040)
   at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
   at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
   at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
   at java.lang.Thread.run(Thread.java:804)
 Caused by: org.apache.solr.common.SolrException: Error opening new searcher
   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:844)
   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:630)
   at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:244)
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:595)
   at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:258)
   at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:250)
   at java.util.concurrent.FutureTask.run(FutureTask.java:273)
   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:482)
   at java.util.concurrent.FutureTask.run(FutureTask.java:273)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
   ... 1 more
 Caused by: org.apache.solr.common.SolrException: Error opening new searcher
   at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1521)
   at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1633)
   at org.apache.solr.core.SolrCore.<init>(SolrCore.java:827)
   ... 11 more
 Caused by: java.io.FileNotFoundException: /mnt/solrdata1/solr/home/collection5_shard1_replica2/data/index/_12x.si (No such file or directory)
   at java.io.RandomAccessFile.<init>(RandomAccessFile.java:252)
   at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:193)
   at org.apache.lucene.store.NRTCachingDirectory.openInput(NRTCachingDirectory.java:233)
   at
 
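For illustration only (this is not part of the report above): a SolrCloud-aware SolrJ client reads the cluster state from ZooKeeper and load-balances requests only across replicas that are marked active, so it can route around a replica that is down even when the hosting node is still up, and shards.tolerant=true additionally allows partial results when a whole shard is unreachable. A minimal sketch against the 4.x SolrJ API; the ZooKeeper address and collection name are placeholders.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class TolerantQueryExample {
    public static void main(String[] args) throws Exception {
        // Placeholders: ZooKeeper ensemble address and collection name.
        CloudSolrServer client = new CloudSolrServer("zkhost1:2181,zkhost2:2181,zkhost3:2181");
        client.setDefaultCollection("collection5");
        try {
            SolrQuery query = new SolrQuery("*:*");
            // Return partial results instead of failing the whole request
            // when a shard has no reachable replica.
            query.set("shards.tolerant", true);
            QueryResponse response = client.query(query);
            System.out.println("numFound=" + response.getResults().getNumFound());
        } finally {
            client.shutdown();
        }
    }
}

Whether this avoids the 500 above depends on the replica actually being marked down in the cluster state, as it is in the reproduction steps.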

[jira] [Issue Comment Deleted] (SOLR-6675) Solr webapp deployment is very slow with <jmx/> in solrconfig.xml

2015-01-22 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-6675:
--
Comment: was deleted

(was: We agree it's the suggester part. Thanks!)

 Solr webapp deployment is very slow with <jmx/> in solrconfig.xml
 -

 Key: SOLR-6675
 URL: https://issues.apache.org/jira/browse/SOLR-6675
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: Linux Redhat 64bit
Reporter: Forest Soup
Priority: Critical
  Labels: performance
 Attachments: 1014.zip, callstack.png


 We have a SolrCloud deployment on Solr 4.7 with Tomcat 7, and our Solr cores are large (50~100 GB each).
 When we start up Tomcat, deployment of the Solr webapp is very slow: Tomcat's catalina log shows that it takes about 10 minutes every time.
 After analyzing a Java core dump, we noticed that loading cannot finish until the MBean statistics calculation for the large index is done.
  
 So we removed <jmx/> from solrconfig.xml; after that, loading the Solr webapp takes only about 1 minute, so we are sure the MBean calculation for the large index is the root cause.
 Could you please point me to an asynchronous way to do statistics monitoring without <jmx/> in solrconfig.xml, or a way to defer the calculation until after deployment? Thanks!
 The callstack.png file in the attachment is the call stack of the long-blocking thread that is doing the statistics calculation.
 The catalina log of tomcat:
 INFO: Starting Servlet Engine: Apache Tomcat/7.0.54
 Oct 13, 2014 2:00:29 AM org.apache.catalina.startup.HostConfig deployWAR
 INFO: Deploying web application archive 
 /opt/ibm/solrsearch/tomcat/webapps/solr.war
 Oct 13, 2014 2:10:23 AM org.apache.catalina.startup.HostConfig deployWAR
 INFO: Deployment of web application archive 
 /opt/ibm/solrsearch/tomcat/webapps/solr.war has finished in 594,325 ms 
  Time taken for solr app Deployment is about 10 minutes 
 ---
 Oct 13, 2014 2:10:23 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/manager
 Oct 13, 2014 2:10:26 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/manager has finished in 2,035 ms
 Oct 13, 2014 2:10:26 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/examples
 Oct 13, 2014 2:10:27 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/examples has finished in 1,789 ms
 Oct 13, 2014 2:10:27 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/docs
 Oct 13, 2014 2:10:28 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/docs has finished in 1,037 ms
 Oct 13, 2014 2:10:28 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/ROOT
 Oct 13, 2014 2:10:29 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/ROOT has finished in 948 ms
 Oct 13, 2014 2:10:29 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/host-manager
 Oct 13, 2014 2:10:30 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/host-manager has finished in 951 ms
 Oct 13, 2014 2:10:31 AM org.apache.coyote.AbstractProtocol start
 INFO: Starting ProtocolHandler [http-bio-8080]
 Oct 13, 2014 2:10:31 AM org.apache.coyote.AbstractProtocol start
 INFO: Starting ProtocolHandler [ajp-bio-8009]
 Oct 13, 2014 2:10:31 AM org.apache.catalina.startup.Catalina start
 INFO: Server startup in 601506 ms



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6675) Solr webapp deployment is very slow with <jmx/> in solrconfig.xml

2015-01-22 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288613#comment-14288613
 ] 

Forest Soup commented on SOLR-6675:
---

We agree it's the suggester part. Thanks!

 Solr webapp deployment is very slow with <jmx/> in solrconfig.xml
 -

 Key: SOLR-6675
 URL: https://issues.apache.org/jira/browse/SOLR-6675
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: Linux Redhat 64bit
Reporter: Forest Soup
Priority: Critical
  Labels: performance
 Attachments: 1014.zip, callstack.png


 We have a SolrCloud deployment on Solr 4.7 with Tomcat 7, and our Solr cores are large (50~100 GB each).
 When we start up Tomcat, deployment of the Solr webapp is very slow: Tomcat's catalina log shows that it takes about 10 minutes every time.
 After analyzing a Java core dump, we noticed that loading cannot finish until the MBean statistics calculation for the large index is done.
  
 So we removed <jmx/> from solrconfig.xml; after that, loading the Solr webapp takes only about 1 minute, so we are sure the MBean calculation for the large index is the root cause.
 Could you please point me to an asynchronous way to do statistics monitoring without <jmx/> in solrconfig.xml, or a way to defer the calculation until after deployment? Thanks!
 The callstack.png file in the attachment is the call stack of the long-blocking thread that is doing the statistics calculation.
 The catalina log of tomcat:
 INFO: Starting Servlet Engine: Apache Tomcat/7.0.54
 Oct 13, 2014 2:00:29 AM org.apache.catalina.startup.HostConfig deployWAR
 INFO: Deploying web application archive 
 /opt/ibm/solrsearch/tomcat/webapps/solr.war
 Oct 13, 2014 2:10:23 AM org.apache.catalina.startup.HostConfig deployWAR
 INFO: Deployment of web application archive 
 /opt/ibm/solrsearch/tomcat/webapps/solr.war has finished in 594,325 ms 
  Time taken for solr app Deployment is about 10 minutes 
 ---
 Oct 13, 2014 2:10:23 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/manager
 Oct 13, 2014 2:10:26 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/manager has finished in 2,035 ms
 Oct 13, 2014 2:10:26 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/examples
 Oct 13, 2014 2:10:27 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/examples has finished in 1,789 ms
 Oct 13, 2014 2:10:27 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/docs
 Oct 13, 2014 2:10:28 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/docs has finished in 1,037 ms
 Oct 13, 2014 2:10:28 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/ROOT
 Oct 13, 2014 2:10:29 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/ROOT has finished in 948 ms
 Oct 13, 2014 2:10:29 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/host-manager
 Oct 13, 2014 2:10:30 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/host-manager has finished in 951 ms
 Oct 13, 2014 2:10:31 AM org.apache.coyote.AbstractProtocol start
 INFO: Starting ProtocolHandler [http-bio-8080]
 Oct 13, 2014 2:10:31 AM org.apache.coyote.AbstractProtocol start
 INFO: Starting ProtocolHandler [ajp-bio-8009]
 Oct 13, 2014 2:10:31 AM org.apache.catalina.startup.Catalina start
 INFO: Server startup in 601506 ms



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6675) Solr webapp deployment is very slow with <jmx/> in solrconfig.xml

2015-01-22 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288614#comment-14288614
 ] 

Forest Soup commented on SOLR-6675:
---

We agree it's the suggester part. Thanks!

 Solr webapp deployment is very slow with <jmx/> in solrconfig.xml
 -

 Key: SOLR-6675
 URL: https://issues.apache.org/jira/browse/SOLR-6675
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: Linux Redhat 64bit
Reporter: Forest Soup
Priority: Critical
  Labels: performance
 Attachments: 1014.zip, callstack.png


 We have a SolrCloud deployment on Solr 4.7 with Tomcat 7, and our Solr cores are large (50~100 GB each).
 When we start up Tomcat, deployment of the Solr webapp is very slow: Tomcat's catalina log shows that it takes about 10 minutes every time.
 After analyzing a Java core dump, we noticed that loading cannot finish until the MBean statistics calculation for the large index is done.
  
 So we removed <jmx/> from solrconfig.xml; after that, loading the Solr webapp takes only about 1 minute, so we are sure the MBean calculation for the large index is the root cause.
 Could you please point me to an asynchronous way to do statistics monitoring without <jmx/> in solrconfig.xml, or a way to defer the calculation until after deployment? Thanks!
 The callstack.png file in the attachment is the call stack of the long-blocking thread that is doing the statistics calculation.
 The catalina log of tomcat:
 INFO: Starting Servlet Engine: Apache Tomcat/7.0.54
 Oct 13, 2014 2:00:29 AM org.apache.catalina.startup.HostConfig deployWAR
 INFO: Deploying web application archive 
 /opt/ibm/solrsearch/tomcat/webapps/solr.war
 Oct 13, 2014 2:10:23 AM org.apache.catalina.startup.HostConfig deployWAR
 INFO: Deployment of web application archive 
 /opt/ibm/solrsearch/tomcat/webapps/solr.war has finished in 594,325 ms 
  Time taken for solr app Deployment is about 10 minutes 
 ---
 Oct 13, 2014 2:10:23 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/manager
 Oct 13, 2014 2:10:26 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/manager has finished in 2,035 ms
 Oct 13, 2014 2:10:26 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/examples
 Oct 13, 2014 2:10:27 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/examples has finished in 1,789 ms
 Oct 13, 2014 2:10:27 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/docs
 Oct 13, 2014 2:10:28 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/docs has finished in 1,037 ms
 Oct 13, 2014 2:10:28 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/ROOT
 Oct 13, 2014 2:10:29 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/ROOT has finished in 948 ms
 Oct 13, 2014 2:10:29 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/host-manager
 Oct 13, 2014 2:10:30 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/host-manager has finished in 951 ms
 Oct 13, 2014 2:10:31 AM org.apache.coyote.AbstractProtocol start
 INFO: Starting ProtocolHandler [http-bio-8080]
 Oct 13, 2014 2:10:31 AM org.apache.coyote.AbstractProtocol start
 INFO: Starting ProtocolHandler [ajp-bio-8009]
 Oct 13, 2014 2:10:31 AM org.apache.catalina.startup.Catalina start
 INFO: Server startup in 601506 ms



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Issue Comment Deleted] (SOLR-6359) Allow customization of the number of records and logs kept by UpdateLog

2015-01-04 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-6359:
--
Comment: was deleted

(was: Thanks. But could this case happen?
After a snapshot recovery of core A is done, its tlog is still out of date, without any new records from the recovery, and it is not cleared. If the just-recovered core (core A) then takes the leader role and another core (core C) tries to recover from it, A's tlog contains only the old entries without the newest ones. Will core C then do a peersync with only the old records and miss the newest ones?

And I think the snapshot recovery happens because there is too much difference between the two cores, so the tlog gap is also too large. So the out-of-date tlog is no longer needed for peersync.

Our testing shows that the snapshot recovery does not clean the tlog, with the steps below:
1, Core A and core B are two replicas of a shard.
2, Core A goes down, and core B takes the leader role. B takes some updates and records them in its tlog.
3, After A comes up, it recovers from B, and because the difference is too large, A does a snapshot pull recovery. During the snapshot pull recovery, no other updates come in. After the snapshot pull recovery, the tlog of A is not updated; it still does NOT contain any of the most recent entries from B.
* And the tlog is still out of date, although the index of A is already updated. *
4, Core A goes down again, core B still retains the leader role, and it takes some other updates and records them in its tlog.
5, After A comes up again, it recovers from B. But it finds that its tlog is still too old, so it does a snapshot recovery again, which is not necessary.

Do you agree? Thanks!)

 Allow customization of the number of records and logs kept by UpdateLog
 ---

 Key: SOLR-6359
 URL: https://issues.apache.org/jira/browse/SOLR-6359
 Project: Solr
  Issue Type: Improvement
Reporter: Ramkumar Aiyengar
Assignee: Mark Miller
Priority: Minor
 Fix For: 5.0, Trunk


 Currently {{UpdateLog}} hardcodes the number of logs and records it keeps, 
 and the hardcoded numbers (100 records, 10 logs) can be quite low (esp. the 
 records) in a heavily indexing setup, leading to full recovery even if Solr 
 was just stopped and restarted.
 These values should be customizable (even if only present as expert options).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6683) Need a configurable parameter to control the doc number between peersync and the snapshot pull recovery

2015-01-04 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263806#comment-14263806
 ] 

Forest Soup commented on SOLR-6683:
---

The snapshot recovery does not clear the tlog of the core being recovered. Is that an issue?

 Need a configurable parameter to control the doc number between peersync and 
 the snapshot pull recovery
 ---

 Key: SOLR-6683
 URL: https://issues.apache.org/jira/browse/SOLR-6683
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 4.7
 Environment: Redhat Linux 64bit
Reporter: Forest Soup
Priority: Critical
  Labels: performance

 If there is a 100-doc gap between the recovering node and the good node, Solr will do a snap pull recovery instead of a peersync.
 Can the 100 docs be made configurable? For example, there could be a 1, 1000, or 10 doc gap between the good node and the node to recover.
 With only 100 docs, a regular restart of a Solr node will trigger a full recovery, which has a huge impact on the performance of the running systems.
 Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6359) Allow customization of the number of records and logs kept by UpdateLog

2015-01-04 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263807#comment-14263807
 ] 

Forest Soup commented on SOLR-6359:
---

The snapshot recovery does not clear the tlog of the core being recovered. Is that an issue?

 Allow customization of the number of records and logs kept by UpdateLog
 ---

 Key: SOLR-6359
 URL: https://issues.apache.org/jira/browse/SOLR-6359
 Project: Solr
  Issue Type: Improvement
Reporter: Ramkumar Aiyengar
Assignee: Mark Miller
Priority: Minor
 Fix For: 5.0, Trunk


 Currently {{UpdateLog}} hardcodes the number of logs and records it keeps, 
 and the hardcoded numbers (100 records, 10 logs) can be quite low (esp. the 
 records) in a heavily indexing setup, leading to full recovery even if Solr 
 was just stopped and restarted.
 These values should be customizable (even if only present as expert options).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6359) Allow customization of the number of records and logs kept by UpdateLog

2015-01-04 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14264276#comment-14264276
 ] 

Forest Soup commented on SOLR-6359:
---

Thanks. But could this case happen?
After a snapshot recovery of core A is done, its tlog is still out of date, without any new records from the recovery, and it is not cleared. If the just-recovered core (core A) then takes the leader role and another core (core C) tries to recover from it, A's tlog contains only the old entries without the newest ones. Will core C then do a peersync with only the old records and miss the newest ones?

And I think the snapshot recovery happens because there is too much difference between the two cores, so the tlog gap is also too large. So the out-of-date tlog is no longer needed for peersync.

Our testing shows that the snapshot recovery does not clean the tlog, with the steps below:
1, Core A and core B are two replicas of a shard.
2, Core A goes down, and core B takes the leader role. B takes some updates and records them in its tlog.
3, After A comes up, it recovers from B, and because the difference is too large, A does a snapshot pull recovery. During the snapshot pull recovery, no other updates come in. After the snapshot pull recovery, the tlog of A is not updated; it still does NOT contain any of the most recent entries from B.
* And the tlog is still out of date, although the index of A is already updated. *
4, Core A goes down again, core B still retains the leader role, and it takes some other updates and records them in its tlog.
5, After A comes up again, it recovers from B. But it finds that its tlog is still too old, so it does a snapshot recovery again, which is not necessary.

Do you agree? Thanks!
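One way to check this observation on disk (an illustration added here, not from the original comment) is to list the core's tlog directory before and after the recovery; if the newest tlog file and its timestamp do not advance, the transaction log was not refreshed by the snapshot pull. A small sketch, assuming the usual layout where the update log lives under the core's data/tlog directory; the path below is a placeholder.

import java.io.File;
import java.util.Arrays;

public class TlogLister {
    public static void main(String[] args) {
        // Placeholder path: point this at the tlog directory of core A.
        File tlogDir = new File("/mnt/solrdata1/solr/home/collection5_shard1_replica2/data/tlog");
        File[] logs = tlogDir.listFiles();
        if (logs == null || logs.length == 0) {
            System.out.println("No tlog files under " + tlogDir);
            return;
        }
        // Transaction log files carry a zero-padded sequence number, so
        // lexicographic order is also chronological order.
        Arrays.sort(logs);
        for (File log : logs) {
            System.out.println(log.getName() + "  " + log.length() + " bytes  lastModified=" + log.lastModified());
        }
    }
}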

 Allow customization of the number of records and logs kept by UpdateLog
 ---

 Key: SOLR-6359
 URL: https://issues.apache.org/jira/browse/SOLR-6359
 Project: Solr
  Issue Type: Improvement
Reporter: Ramkumar Aiyengar
Assignee: Mark Miller
Priority: Minor
 Fix For: 5.0, Trunk


 Currently {{UpdateLog}} hardcodes the number of logs and records it keeps, 
 and the hardcoded numbers (100 records, 10 logs) can be quite low (esp. the 
 records) in a heavily indexing setup, leading to full recovery even if Solr 
 was just stopped and restarted.
 These values should be customizable (even if only present as expert options).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6359) Allow customization of the number of records and logs kept by UpdateLog

2015-01-04 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14264277#comment-14264277
 ] 

Forest Soup commented on SOLR-6359:
---

Thanks. But could this case happen?
After a snapshot recovery of core A is done, its tlog is still out of date, without any new records from the recovery, and it is not cleared. If the just-recovered core (core A) then takes the leader role and another core (core C) tries to recover from it, A's tlog contains only the old entries without the newest ones. Will core C then do a peersync with only the old records and miss the newest ones?

And I think the snapshot recovery happens because there is too much difference between the two cores, so the tlog gap is also too large. So the out-of-date tlog is no longer needed for peersync.

Our testing shows that the snapshot recovery does not clean the tlog, with the steps below:
1, Core A and core B are two replicas of a shard.
2, Core A goes down, and core B takes the leader role. B takes some updates and records them in its tlog.
3, After A comes up, it recovers from B, and because the difference is too large, A does a snapshot pull recovery. During the snapshot pull recovery, no other updates come in. After the snapshot pull recovery, the tlog of A is not updated; it still does NOT contain any of the most recent entries from B.
* And the tlog is still out of date, although the index of A is already updated. *
4, Core A goes down again, core B still retains the leader role, and it takes some other updates and records them in its tlog.
5, After A comes up again, it recovers from B. But it finds that its tlog is still too old, so it does a snapshot recovery again, which is not necessary.

Do you agree? Thanks!

 Allow customization of the number of records and logs kept by UpdateLog
 ---

 Key: SOLR-6359
 URL: https://issues.apache.org/jira/browse/SOLR-6359
 Project: Solr
  Issue Type: Improvement
Reporter: Ramkumar Aiyengar
Assignee: Mark Miller
Priority: Minor
 Fix For: 5.0, Trunk


 Currently {{UpdateLog}} hardcodes the number of logs and records it keeps, 
 and the hardcoded numbers (100 records, 10 logs) can be quite low (esp. the 
 records) in a heavily indexing setup, leading to full recovery even if Solr 
 was just stopped and restarted.
 These values should be customizable (even if only present as expert options).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-6359) Allow customization of the number of records and logs kept by UpdateLog

2015-01-04 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14264277#comment-14264277
 ] 

Forest Soup edited comment on SOLR-6359 at 1/5/15 7:27 AM:
---

Thanks. But could this case happen?
After a snapshot recovery of core A is done, its tlog is still out of date, without any new records from the recovery, and it is not cleared. If the just-recovered core (core A) then takes the leader role and another core (core C) tries to recover from it, A's tlog contains only the old entries without the newest ones. Will core C then do a peersync with only the old records and miss the newest ones?

And I think the snapshot recovery happens because there is too much difference between the two cores, so the tlog gap is also too large. So the out-of-date tlog is no longer needed for peersync.

Our testing shows that the snapshot recovery does not clean the tlog, with the steps below:
1, Core A and core B are two replicas of a shard.
2, Core A goes down, and core B takes the leader role. B takes some updates and records them in its tlog.
3, After A comes up, it recovers from B, and because the difference is too large, A does a snapshot pull recovery. During the snapshot pull recovery, no other updates come in. After the snapshot pull recovery, the tlog of A is not updated; it still does NOT contain any of the most recent entries from B.
And the tlog is still out of date, although the index of A is already updated.
4, Core A goes down again, core B still retains the leader role, and it takes some other updates and records them in its tlog.
5, After A comes up again, it recovers from B. But it finds that its tlog is still too old, so it does a snapshot recovery again, which is not necessary.

Do you agree? Thanks!


was (Author: forest_soup):
Thanks. But could this case happen?
After a snapshot recovery of core A is done, its tlog is still out of date, without any new records from the recovery, and it is not cleared. If the just-recovered core (core A) then takes the leader role and another core (core C) tries to recover from it, A's tlog contains only the old entries without the newest ones. Will core C then do a peersync with only the old records and miss the newest ones?

And I think the snapshot recovery happens because there is too much difference between the two cores, so the tlog gap is also too large. So the out-of-date tlog is no longer needed for peersync.

Our testing shows that the snapshot recovery does not clean the tlog, with the steps below:
1, Core A and core B are two replicas of a shard.
2, Core A goes down, and core B takes the leader role. B takes some updates and records them in its tlog.
3, After A comes up, it recovers from B, and because the difference is too large, A does a snapshot pull recovery. During the snapshot pull recovery, no other updates come in. After the snapshot pull recovery, the tlog of A is not updated; it still does NOT contain any of the most recent entries from B.
* And the tlog is still out of date, although the index of A is already updated. *
4, Core A goes down again, core B still retains the leader role, and it takes some other updates and records them in its tlog.
5, After A comes up again, it recovers from B. But it finds that its tlog is still too old, so it does a snapshot recovery again, which is not necessary.

Do you agree? Thanks!

 Allow customization of the number of records and logs kept by UpdateLog
 ---

 Key: SOLR-6359
 URL: https://issues.apache.org/jira/browse/SOLR-6359
 Project: Solr
  Issue Type: Improvement
Reporter: Ramkumar Aiyengar
Assignee: Mark Miller
Priority: Minor
 Fix For: 5.0, Trunk


 Currently {{UpdateLog}} hardcodes the number of logs and records it keeps, 
 and the hardcoded numbers (100 records, 10 logs) can be quite low (esp. the 
 records) in a heavily indexing setup, leading to full recovery even if Solr 
 was just stopped and restarted.
 These values should be customizable (even if only present as expert options).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6359) Allow customization of the number of records and logs kept by UpdateLog

2015-01-03 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263763#comment-14263763
 ] 

Forest Soup commented on SOLR-6359:
---

It works, but with a precondition: the 20% newest entries in the existing transaction log of the core being recovered must be newer than the 20% oldest entries in the existing transaction log of the good core.
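For reference, a rough sketch (added for illustration; not the exact Solr implementation) of the overlap test this precondition describes: with each core's recent update versions sorted newest-first, a boundary taken about 20% into the recovering core's list is compared against a boundary taken about 80% into the good core's list, and peersync is refused when the recovering core's newest window is entirely older. The version numbers below are made up.

import java.util.Arrays;
import java.util.List;

public class PeerSyncWindowSketch {
    // Boundary at a fraction of a newest-first sorted version list.
    static long percentile(List<Long> versions, float fraction) {
        return versions.get((int) (versions.size() * fraction));
    }

    public static void main(String[] args) {
        // Made-up update versions, newest first.
        List<Long> ourVersions   = Arrays.asList(1200L, 1190L, 1180L, 1170L, 1160L); // core being recovered
        List<Long> otherVersions = Arrays.asList(2000L, 1900L, 1800L, 1700L, 1600L); // good core

        long ourHighThreshold = percentile(ourVersions, 0.2f);   // boundary of our 20% newest
        long otherLow         = percentile(otherVersions, 0.8f); // boundary of the other core's 20% oldest

        // Mirrors the check quoted from PeerSync.handleVersions elsewhere in this
        // thread: if even our newest kept updates are older than the oldest part
        // of the good core's kept window, peersync is refused.
        if (ourHighThreshold < otherLow) {
            System.out.println("ourHighThreshold=" + ourHighThreshold + " < otherLowThreshold=" + otherLow
                    + " -> full (snapshot) replication");
        } else {
            System.out.println("windows overlap -> peersync is possible");
        }
    }
}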

 Allow customization of the number of records and logs kept by UpdateLog
 ---

 Key: SOLR-6359
 URL: https://issues.apache.org/jira/browse/SOLR-6359
 Project: Solr
  Issue Type: Improvement
Reporter: Ramkumar Aiyengar
Assignee: Mark Miller
Priority: Minor
 Fix For: 5.0, Trunk


 Currently {{UpdateLog}} hardcodes the number of logs and records it keeps, 
 and the hardcoded numbers (100 records, 10 logs) can be quite low (esp. the 
 records) in a heavily indexing setup, leading to full recovery even if Solr 
 was just stopped and restarted.
 These values should be customizable (even if only present as expert options).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6683) Need a configurable parameter to control the doc number between peersync and the snapshot pull recovery

2015-01-03 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263762#comment-14263762
 ] 

Forest Soup commented on SOLR-6683:
---

It works, but with a precondition: the 20% newest entries in the existing transaction log of the core being recovered must be newer than the 20% oldest entries in the existing transaction log of the good core.

 Need a configurable parameter to control the doc number between peersync and 
 the snapshot pull recovery
 ---

 Key: SOLR-6683
 URL: https://issues.apache.org/jira/browse/SOLR-6683
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 4.7
 Environment: Redhat Linux 64bit
Reporter: Forest Soup
Priority: Critical
  Labels: performance

 If there is a 100-doc gap between the recovering node and the good node, Solr will do a snap pull recovery instead of a peersync.
 Can the 100 docs be made configurable? For example, there could be a 1, 1000, or 10 doc gap between the good node and the node to recover.
 With only 100 docs, a regular restart of a Solr node will trigger a full recovery, which has a huge impact on the performance of the running systems.
 Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6683) Need a configurable parameter to control the doc number between peersync and the snapshot pull recovery

2015-01-03 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263765#comment-14263765
 ] 

Forest Soup commented on SOLR-6683:
---

A full snapshot recovery does not clean the tlog of the core being recovered.

 Need a configurable parameter to control the doc number between peersync and 
 the snapshot pull recovery
 ---

 Key: SOLR-6683
 URL: https://issues.apache.org/jira/browse/SOLR-6683
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 4.7
 Environment: Redhat Linux 64bit
Reporter: Forest Soup
Priority: Critical
  Labels: performance

 If there is a 100-doc gap between the recovering node and the good node, Solr will do a snap pull recovery instead of a peersync.
 Can the 100 docs be made configurable? For example, there could be a 1, 1000, or 10 doc gap between the good node and the node to recover.
 With only 100 docs, a regular restart of a Solr node will trigger a full recovery, which has a huge impact on the performance of the running systems.
 Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6359) Allow customization of the number of records and logs kept by UpdateLog

2015-01-03 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14263764#comment-14263764
 ] 

Forest Soup commented on SOLR-6359:
---

A full snapshot recovery does not clean the tlog of the core being recovered.

 Allow customization of the number of records and logs kept by UpdateLog
 ---

 Key: SOLR-6359
 URL: https://issues.apache.org/jira/browse/SOLR-6359
 Project: Solr
  Issue Type: Improvement
Reporter: Ramkumar Aiyengar
Assignee: Mark Miller
Priority: Minor
 Fix For: 5.0, Trunk


 Currently {{UpdateLog}} hardcodes the number of logs and records it keeps, 
 and the hardcoded numbers (100 records, 10 logs) can be quite low (esp. the 
 records) in a heavily indexing setup, leading to full recovery even if Solr 
 was just stopped and restarted.
 These values should be customizable (even if only present as expert options).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-6359) Allow customization of the number of records and logs kept by UpdateLog

2015-01-03 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14249827#comment-14249827
 ] 

Forest Soup edited comment on SOLR-6359 at 1/4/15 6:18 AM:
---

I applied the patch for SOLR-6359 on 4.7 and did some tests, with the config below:
<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
  <int name="numRecordsToKeep">1</int>
  <int name="maxNumLogsToKeep">100</int>
</updateLog>


was (Author: forest_soup):
I applied the patch for SOLR-6359 on 4.7 and did some tests. It does not work as expected.
When I set the config below, it still goes into the SnapPuller code even though I only newly added 800 docs.
<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
  <int name="numRecordsToKeep">1</int>
  <int name="maxNumLogsToKeep">100</int>
</updateLog>
After reading the code, it seems that these lines in org.apache.solr.update.PeerSync.handleVersions(ShardResponse srsp) cause the issue:
if (ourHighThreshold < otherLow) {
  // Small overlap between version windows and ours is older
  // This means that we might miss updates if we attempted to use this method.
  // Since there exists just one replica that is so much newer, we must
  // fail the sync.
  log.info(msg() + " Our versions are too old. ourHighThreshold=" + ourHighThreshold + " otherLowThreshold=" + otherLow);
  return false;
}
Could you please comment? Thanks!

 Allow customization of the number of records and logs kept by UpdateLog
 ---

 Key: SOLR-6359
 URL: https://issues.apache.org/jira/browse/SOLR-6359
 Project: Solr
  Issue Type: Improvement
Reporter: Ramkumar Aiyengar
Assignee: Mark Miller
Priority: Minor
 Fix For: 5.0, Trunk


 Currently {{UpdateLog}} hardcodes the number of logs and records it keeps, 
 and the hardcoded numbers (100 records, 10 logs) can be quite low (esp. the 
 records) in a heavily indexing setup, leading to full recovery even if Solr 
 was just stopped and restarted.
 These values should be customizable (even if only present as expert options).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-6683) Need a configurable parameter to control the doc number between peersync and the snapshot pull recovery

2015-01-03 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14249826#comment-14249826
 ] 

Forest Soup edited comment on SOLR-6683 at 1/4/15 6:18 AM:
---

I applied the patch for SOLR-6359 on 4.7 and did some tests, with the config below:
<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
  <int name="numRecordsToKeep">1</int>
  <int name="maxNumLogsToKeep">100</int>
</updateLog>



was (Author: forest_soup):
I applied the patch for SOLR-6359 on 4.7 and did some tests. It does not work as expected.
When I set the config below, it still goes into the SnapPuller code even though I only newly added 800 docs.
<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
  <int name="numRecordsToKeep">1</int>
  <int name="maxNumLogsToKeep">100</int>
</updateLog>

After reading the code, it seems that these lines in org.apache.solr.update.PeerSync.handleVersions(ShardResponse srsp) cause the issue:
if (ourHighThreshold < otherLow) {
  // Small overlap between version windows and ours is older
  // This means that we might miss updates if we attempted to use this method.
  // Since there exists just one replica that is so much newer, we must
  // fail the sync.
  log.info(msg() + " Our versions are too old. ourHighThreshold=" + ourHighThreshold + " otherLowThreshold=" + otherLow);
  return false;
}

Could you please comment? Thanks!

 Need a configurable parameter to control the doc number between peersync and 
 the snapshot pull recovery
 ---

 Key: SOLR-6683
 URL: https://issues.apache.org/jira/browse/SOLR-6683
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 4.7
 Environment: Redhat Linux 64bit
Reporter: Forest Soup
Priority: Critical
  Labels: performance

 If there is a 100-doc gap between the recovering node and the good node, Solr will do a snap pull recovery instead of a peersync.
 Can the 100 docs be made configurable? For example, there could be a 1, 1000, or 10 doc gap between the good node and the node to recover.
 With only 100 docs, a regular restart of a Solr node will trigger a full recovery, which has a huge impact on the performance of the running systems.
 Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-6359) Allow customization of the number of records and logs kept by UpdateLog

2015-01-03 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14249827#comment-14249827
 ] 

Forest Soup edited comment on SOLR-6359 at 1/4/15 6:19 AM:
---

I applied the patch for SOLR-6359 on 4.7 and did some tests, with the config below:
<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
  <int name="numRecordsToKeep">1</int>
  <int name="maxNumLogsToKeep">100</int>
</updateLog>


was (Author: forest_soup):
I applied the patch for SOLR-6359 on 4.7 and did some tests, with the config below:
<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
  <int name="numRecordsToKeep">1</int>
  <int name="maxNumLogsToKeep">100</int>
</updateLog>

 Allow customization of the number of records and logs kept by UpdateLog
 ---

 Key: SOLR-6359
 URL: https://issues.apache.org/jira/browse/SOLR-6359
 Project: Solr
  Issue Type: Improvement
Reporter: Ramkumar Aiyengar
Assignee: Mark Miller
Priority: Minor
 Fix For: 5.0, Trunk


 Currently {{UpdateLog}} hardcodes the number of logs and records it keeps, 
 and the hardcoded numbers (100 records, 10 logs) can be quite low (esp. the 
 records) in a heavily indexing setup, leading to full recovery even if Solr 
 was just stopped and restarted.
 These values should be customizable (even if only present as expert options).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-6359) Allow customization of the number of records and logs kept by UpdateLog

2014-12-17 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14246618#comment-14246618
 ] 

Forest Soup edited comment on SOLR-6359 at 12/17/14 7:59 AM:
-

The numRecordsToKeep and maxNumLogsToKeep values should be inside the <updateLog> element, like below. Right?
<!-- Enables a transaction log, used for real-time get, durability, and
     and solr cloud replica recovery.  The log can grow as big as
     uncommitted changes to the index, so use of a hard autoCommit
     is recommended (see below).
     "dir" - the target directory for transaction logs, defaults to the
     solr data directory.  -->
<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
  <int name="numRecordsToKeep">1</int>
  <int name="maxNumLogsToKeep">100</int>
</updateLog>


was (Author: forest_soup):
And where should I set the numRecordsToKeep and maxNumLogsToKeep values? 
Thanks!

 Allow customization of the number of records and logs kept by UpdateLog
 ---

 Key: SOLR-6359
 URL: https://issues.apache.org/jira/browse/SOLR-6359
 Project: Solr
  Issue Type: Improvement
Reporter: Ramkumar Aiyengar
Assignee: Mark Miller
Priority: Minor
 Fix For: 5.0, Trunk


 Currently {{UpdateLog}} hardcodes the number of logs and records it keeps, 
 and the hardcoded numbers (100 records, 10 logs) can be quite low (esp. the 
 records) in a heavily indexing setup, leading to full recovery even if Solr 
 was just stopped and restarted.
 These values should be customizable (even if only present as expert options).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-6359) Allow customization of the number of records and logs kept by UpdateLog

2014-12-17 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14246618#comment-14246618
 ] 

Forest Soup edited comment on SOLR-6359 at 12/17/14 10:01 AM:
--

The numRecordsToKeep and maxNumLogsToKeep values should be inside the <updateLog> element, like below.
<!-- Enables a transaction log, used for real-time get, durability, and
     and solr cloud replica recovery.  The log can grow as big as
     uncommitted changes to the index, so use of a hard autoCommit
     is recommended (see below).
     "dir" - the target directory for transaction logs, defaults to the
     solr data directory.  -->
<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
  <int name="numRecordsToKeep">1</int>
  <int name="maxNumLogsToKeep">100</int>
</updateLog>


was (Author: forest_soup):
The numRecordsToKeep and maxNumLogsToKeep values should be inside the <updateLog> element, like below. Right?
<!-- Enables a transaction log, used for real-time get, durability, and
     and solr cloud replica recovery.  The log can grow as big as
     uncommitted changes to the index, so use of a hard autoCommit
     is recommended (see below).
     "dir" - the target directory for transaction logs, defaults to the
     solr data directory.  -->
<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
  <int name="numRecordsToKeep">1</int>
  <int name="maxNumLogsToKeep">100</int>
</updateLog>

 Allow customization of the number of records and logs kept by UpdateLog
 ---

 Key: SOLR-6359
 URL: https://issues.apache.org/jira/browse/SOLR-6359
 Project: Solr
  Issue Type: Improvement
Reporter: Ramkumar Aiyengar
Assignee: Mark Miller
Priority: Minor
 Fix For: 5.0, Trunk


 Currently {{UpdateLog}} hardcodes the number of logs and records it keeps, 
 and the hardcoded numbers (100 records, 10 logs) can be quite low (esp. the 
 records) in a heavily indexing setup, leading to full recovery even if Solr 
 was just stopped and restarted.
 These values should be customizable (even if only present as expert options).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6683) Need a configurable parameter to control the doc number between peersync and the snapshot pull recovery

2014-12-17 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14249826#comment-14249826
 ] 

Forest Soup commented on SOLR-6683:
---

I applied the patch for SOLR-6359 on 4.7 and did some tests. It does not work as expected.
When I set the config below, it still goes into the SnapPuller code even though I only newly added 800 docs.
<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
  <int name="numRecordsToKeep">1</int>
  <int name="maxNumLogsToKeep">100</int>
</updateLog>

After reading the code, it seems that these lines in org.apache.solr.update.PeerSync.handleVersions(ShardResponse srsp) cause the issue:
if (ourHighThreshold < otherLow) {
  // Small overlap between version windows and ours is older
  // This means that we might miss updates if we attempted to use this method.
  // Since there exists just one replica that is so much newer, we must
  // fail the sync.
  log.info(msg() + " Our versions are too old. ourHighThreshold=" + ourHighThreshold + " otherLowThreshold=" + otherLow);
  return false;
}

Could you please comment? Thanks!
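One possible reading of why 800 newly added documents still trigger SnapPuller (an illustration with made-up numbers, under the assumption that the enlarged numRecordsToKeep was not yet in effect on the good replica): otherLow comes from the versions the good core reports from its own update log, so if that core still keeps only its last 100 updates, a node that is 800 updates behind can never overlap with the reported window, whatever its own settings are.

public class WhyPeerSyncFailed {
    public static void main(String[] args) {
        // Illustrative numbers only.
        long newestVersionOnGoodCore = 1800;   // good core has applied 1800 updates
        long recordsKeptOnGoodCore   = 100;    // default UpdateLog window on the good core

        // Rough boundary of the 20% oldest versions the good core still reports.
        long otherLow = newestVersionOnGoodCore - (long) (recordsKeptOnGoodCore * 0.8);

        // The restarted core stopped at update 1000, i.e. 800 updates behind.
        long ourHighThreshold = 1000;

        System.out.println("ourHighThreshold=" + ourHighThreshold + " otherLowThreshold=" + otherLow);
        System.out.println(ourHighThreshold < otherLow
                ? "no overlap -> SnapPuller (full replication)"
                : "overlap -> peersync");
    }
}

If that assumption holds, the larger window would need to be in effect (and the tlogs rolled over) on the replicas being synced from as well, not only on the restarted node.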

 Need a configurable parameter to control the doc number between peersync and 
 the snapshot pull recovery
 ---

 Key: SOLR-6683
 URL: https://issues.apache.org/jira/browse/SOLR-6683
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 4.7
 Environment: Redhat Linux 64bit
Reporter: Forest Soup
Priority: Critical
  Labels: performance

 If there is a 100-doc gap between the recovering node and the good node, Solr will do a snap pull recovery instead of a peersync.
 Can the 100 docs be made configurable? For example, there could be a 1, 1000, or 10 doc gap between the good node and the node to recover.
 With only 100 docs, a regular restart of a Solr node will trigger a full recovery, which has a huge impact on the performance of the running systems.
 Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6359) Allow customization of the number of records and logs kept by UpdateLog

2014-12-17 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14249827#comment-14249827
 ] 

Forest Soup commented on SOLR-6359:
---

I applied the patch for SOLR-6359 on 4.7 and did some tests. It does not work as expected.
When I set the config below, it still goes into the SnapPuller code even though I only newly added 800 docs.
<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
  <int name="numRecordsToKeep">1</int>
  <int name="maxNumLogsToKeep">100</int>
</updateLog>
After reading the code, it seems that these lines in org.apache.solr.update.PeerSync.handleVersions(ShardResponse srsp) cause the issue:
if (ourHighThreshold < otherLow) {
  // Small overlap between version windows and ours is older
  // This means that we might miss updates if we attempted to use this method.
  // Since there exists just one replica that is so much newer, we must
  // fail the sync.
  log.info(msg() + " Our versions are too old. ourHighThreshold=" + ourHighThreshold + " otherLowThreshold=" + otherLow);
  return false;
}
Could you please comment? Thanks!

 Allow customization of the number of records and logs kept by UpdateLog
 ---

 Key: SOLR-6359
 URL: https://issues.apache.org/jira/browse/SOLR-6359
 Project: Solr
  Issue Type: Improvement
Reporter: Ramkumar Aiyengar
Assignee: Mark Miller
Priority: Minor
 Fix For: 5.0, Trunk


 Currently {{UpdateLog}} hardcodes the number of logs and records it keeps, 
 and the hardcoded numbers (100 records, 10 logs) can be quite low (esp. the 
 records) in a heavily indexing setup, leading to full recovery even if Solr 
 was just stopped and restarted.
 These values should be customizable (even if only present as expert options).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6675) Solr webapp deployment is very slow with <jmx/> in solrconfig.xml

2014-12-15 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14246497#comment-14246497
 ] 

Forest Soup commented on SOLR-6675:
---

It looks like threads searcherExecutor-5-thread-1 and searcherExecutor-6-thread-1 are blocking coreLoadExecutor-4-thread-1 and coreLoadExecutor-4-thread-2, and the two searcherExecutor threads appear to be running suggester code.
[~hossman] Could you please help confirm? Thanks!
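One way to confirm which thread holds the lock (an illustration added here, not from the original comment) is to dump all threads via ThreadMXBean and print, for every blocked thread, the owner of the monitor it is waiting on. This has to run inside the same JVM as Solr (for example from a debug hook), since it inspects the current process.

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class BlockedThreadReport {
    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Pass true/true so monitor and synchronizer details are included.
        for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
            if (info.getThreadState() == Thread.State.BLOCKED) {
                System.out.println(info.getThreadName()
                        + " is blocked on " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
    }
}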

 Solr webapp deployment is very slow with <jmx/> in solrconfig.xml
 -

 Key: SOLR-6675
 URL: https://issues.apache.org/jira/browse/SOLR-6675
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: Linux Redhat 64bit
Reporter: Forest Soup
Priority: Critical
  Labels: performance
 Attachments: 1014.zip, callstack.png


 We have a SolrCloud deployment on Solr 4.7 with Tomcat 7, and our Solr cores are large (50~100 GB each).
 When we start up Tomcat, deployment of the Solr webapp is very slow: Tomcat's catalina log shows that it takes about 10 minutes every time.
 After analyzing a Java core dump, we noticed that loading cannot finish until the MBean statistics calculation for the large index is done.
  
 So we removed <jmx/> from solrconfig.xml; after that, loading the Solr webapp takes only about 1 minute, so we are sure the MBean calculation for the large index is the root cause.
 Could you please point me to an asynchronous way to do statistics monitoring without <jmx/> in solrconfig.xml, or a way to defer the calculation until after deployment? Thanks!
 The callstack.png file in the attachment is the call stack of the long-blocking thread that is doing the statistics calculation.
 The catalina log of tomcat:
 INFO: Starting Servlet Engine: Apache Tomcat/7.0.54
 Oct 13, 2014 2:00:29 AM org.apache.catalina.startup.HostConfig deployWAR
 INFO: Deploying web application archive 
 /opt/ibm/solrsearch/tomcat/webapps/solr.war
 Oct 13, 2014 2:10:23 AM org.apache.catalina.startup.HostConfig deployWAR
 INFO: Deployment of web application archive 
 /opt/ibm/solrsearch/tomcat/webapps/solr.war has finished in 594,325 ms 
  Time taken for solr app Deployment is about 10 minutes 
 ---
 Oct 13, 2014 2:10:23 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/manager
 Oct 13, 2014 2:10:26 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/manager has finished in 2,035 ms
 Oct 13, 2014 2:10:26 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/examples
 Oct 13, 2014 2:10:27 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/examples has finished in 1,789 ms
 Oct 13, 2014 2:10:27 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/docs
 Oct 13, 2014 2:10:28 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/docs has finished in 1,037 ms
 Oct 13, 2014 2:10:28 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/ROOT
 Oct 13, 2014 2:10:29 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/ROOT has finished in 948 ms
 Oct 13, 2014 2:10:29 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/host-manager
 Oct 13, 2014 2:10:30 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/host-manager has finished in 951 ms
 Oct 13, 2014 2:10:31 AM org.apache.coyote.AbstractProtocol start
 INFO: Starting ProtocolHandler [http-bio-8080]
 Oct 13, 2014 2:10:31 AM org.apache.coyote.AbstractProtocol start
 INFO: Starting ProtocolHandler [ajp-bio-8009]
 Oct 13, 2014 2:10:31 AM org.apache.catalina.startup.Catalina start
 INFO: Server startup in 601506 ms



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6359) Allow customization of the number of records and logs kept by UpdateLog

2014-12-15 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14246505#comment-14246505
 ] 

Forest Soup commented on SOLR-6359:
---

Is the patch only available for Solr 5.0, or can we also apply it to Solr 4.7? Thanks!

 Allow customization of the number of records and logs kept by UpdateLog
 ---

 Key: SOLR-6359
 URL: https://issues.apache.org/jira/browse/SOLR-6359
 Project: Solr
  Issue Type: Improvement
Reporter: Ramkumar Aiyengar
Assignee: Mark Miller
Priority: Minor
 Fix For: 5.0, Trunk


 Currently {{UpdateLog}} hardcodes the number of logs and records it keeps, 
 and the hardcoded numbers (100 records, 10 logs) can be quite low (esp. the 
 records) in a heavily indexing setup, leading to full recovery even if Solr 
 was just stopped and restarted.
 These values should be customizable (even if only present as expert options).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6359) Allow customization of the number of records and logs kept by UpdateLog

2014-12-15 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14246609#comment-14246609
 ] 

Forest Soup commented on SOLR-6359:
---

When could we get the official build with that patch in 4.x or 5.0?

 Allow customization of the number of records and logs kept by UpdateLog
 ---

 Key: SOLR-6359
 URL: https://issues.apache.org/jira/browse/SOLR-6359
 Project: Solr
  Issue Type: Improvement
Reporter: Ramkumar Aiyengar
Assignee: Mark Miller
Priority: Minor
 Fix For: 5.0, Trunk


 Currently {{UpdateLog}} hardcodes the number of logs and records it keeps, 
 and the hardcoded numbers (100 records, 10 logs) can be quite low (esp. the 
 records) in a heavily indexing setup, leading to full recovery even if Solr 
 was just stopped and restarted.
 These values should be customizable (even if only present as expert options).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6359) Allow customization of the number of records and logs kept by UpdateLog

2014-12-15 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14246618#comment-14246618
 ] 

Forest Soup commented on SOLR-6359:
---

And where should I set the numRecordsToKeep and maxNumLogsToKeep values? 
Thanks!
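
From reading the patch, I assume these go under the updateLog element in 
solrconfig.xml. A minimal sketch of what I mean, assuming the SOLR-6359 patch 
is applied (the numbers are only illustrative, not recommendations):

<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
    <!-- keep more entries in the transaction log so that peersync can cover
         a larger gap before falling back to full replication -->
    <int name="numRecordsToKeep">500</int>
    <int name="maxNumLogsToKeep">20</int>
  </updateLog>
</updateHandler>

Please correct me if that placement is wrong.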

 Allow customization of the number of records and logs kept by UpdateLog
 ---

 Key: SOLR-6359
 URL: https://issues.apache.org/jira/browse/SOLR-6359
 Project: Solr
  Issue Type: Improvement
Reporter: Ramkumar Aiyengar
Assignee: Mark Miller
Priority: Minor
 Fix For: 5.0, Trunk


 Currently {{UpdateLog}} hardcodes the number of logs and records it keeps, 
 and the hardcoded numbers (100 records, 10 logs) can be quite low (esp. the 
 records) in a heavily indexing setup, leading to full recovery even if Solr 
 was just stopped and restarted.
 These values should be customizable (even if only present as expert options).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6675) Solr webapp deployment is very slow with jmx/ in solrconfig.xml

2014-12-14 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-6675:
--
Attachment: 1014.zip

The 0001.txt and 0002.txt files are dumps taken before the solr webapp is 
deployed; 0003.txt is the dump taken after the solr webapp is deployed.

 Solr webapp deployment is very slow with jmx/ in solrconfig.xml
 -

 Key: SOLR-6675
 URL: https://issues.apache.org/jira/browse/SOLR-6675
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: Linux Redhat 64bit
Reporter: Forest Soup
Priority: Critical
  Labels: performance
 Attachments: 1014.zip, callstack.png


 We have a SolrCloud with Solr version 4.7 on Tomcat 7, and our solr 
 indexes (cores) are big, 50~100G each. 
 When we start up tomcat, the solr webapp deployment is very slow. From 
 tomcat's catalina log, it takes about 10 minutes to get deployed every time. 
 After analyzing a java core dump, we noticed that loading cannot finish until 
 the MBean calculation for the large index is done.
  
 So we tried to remove the <jmx/> element from solrconfig.xml; after that, the 
 solr webapp loads in about 1 minute, so we are sure the MBean calculation for 
 the large index is the root cause.
 Could you please point me to an async way to do statistics monitoring without 
 <jmx/> in solrconfig.xml, or let the calculation run after the deployment? 
 Thanks!
 The callstack.png file in the attachment is the call stack of the 
 long-blocking thread that is doing the statistics calculation.
 The catalina log of tomcat:
 INFO: Starting Servlet Engine: Apache Tomcat/7.0.54
 Oct 13, 2014 2:00:29 AM org.apache.catalina.startup.HostConfig deployWAR
 INFO: Deploying web application archive 
 /opt/ibm/solrsearch/tomcat/webapps/solr.war
 Oct 13, 2014 2:10:23 AM org.apache.catalina.startup.HostConfig deployWAR
 INFO: Deployment of web application archive 
 /opt/ibm/solrsearch/tomcat/webapps/solr.war has finished in 594,325 ms 
  Time taken for solr app Deployment is about 10 minutes 
 ---
 Oct 13, 2014 2:10:23 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/manager
 Oct 13, 2014 2:10:26 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/manager has finished in 2,035 ms
 Oct 13, 2014 2:10:26 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/examples
 Oct 13, 2014 2:10:27 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/examples has finished in 1,789 ms
 Oct 13, 2014 2:10:27 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/docs
 Oct 13, 2014 2:10:28 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/docs has finished in 1,037 ms
 Oct 13, 2014 2:10:28 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/ROOT
 Oct 13, 2014 2:10:29 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/ROOT has finished in 948 ms
 Oct 13, 2014 2:10:29 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/host-manager
 Oct 13, 2014 2:10:30 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/host-manager has finished in 951 ms
 Oct 13, 2014 2:10:31 AM org.apache.coyote.AbstractProtocol start
 INFO: Starting ProtocolHandler [http-bio-8080]
 Oct 13, 2014 2:10:31 AM org.apache.coyote.AbstractProtocol start
 INFO: Starting ProtocolHandler [ajp-bio-8009]
 Oct 13, 2014 2:10:31 AM org.apache.catalina.startup.Catalina start
 INFO: Server startup in 601506 ms



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6675) Solr webapp deployment is very slow with jmx/ in solrconfig.xml

2014-12-04 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234034#comment-14234034
 ] 

Forest Soup commented on SOLR-6675:
---

This is our JVM, and we have never tried the latest Solr 4.10.x. Any idea on 
how to resolve or work around it? Thanks!

java version 1.7.0
Java(TM) SE Runtime Environment (build pxa6470sr6-20131015_01(SR6))
IBM J9 VM (build 2.6, JRE 1.7.0 Linux amd64-64 Compressed References 
20131013_170512 (JIT enabled, AOT enabled)
J9VM - R26_Java726_SR6_20131013_1510_B170512
JIT  - r11.b05_20131003_47443
GC   - R26_Java726_SR6_20131013_1510_B170512_CMPRSS
J9CL - 20131013_170512)
JCL - 20131011_01 based on Oracle 7u45-b18


 Solr webapp deployment is very slow with jmx/ in solrconfig.xml
 -

 Key: SOLR-6675
 URL: https://issues.apache.org/jira/browse/SOLR-6675
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: Linux Redhat 64bit
Reporter: Forest Soup
Priority: Critical
  Labels: performance
 Attachments: callstack.png


 We have a SolrCloud with Solr version 4.7 on Tomcat 7, and our solr 
 indexes (cores) are big, 50~100G each. 
 When we start up tomcat, the solr webapp deployment is very slow. From 
 tomcat's catalina log, it takes about 10 minutes to get deployed every time. 
 After analyzing a java core dump, we noticed that loading cannot finish until 
 the MBean calculation for the large index is done.
  
 So we tried to remove the <jmx/> element from solrconfig.xml; after that, the 
 solr webapp loads in about 1 minute, so we are sure the MBean calculation for 
 the large index is the root cause.
 Could you please point me to an async way to do statistics monitoring without 
 <jmx/> in solrconfig.xml, or let the calculation run after the deployment? 
 Thanks!
 The callstack.png file in the attachment is the call stack of the 
 long-blocking thread that is doing the statistics calculation.
 The catalina log of tomcat:
 INFO: Starting Servlet Engine: Apache Tomcat/7.0.54
 Oct 13, 2014 2:00:29 AM org.apache.catalina.startup.HostConfig deployWAR
 INFO: Deploying web application archive 
 /opt/ibm/solrsearch/tomcat/webapps/solr.war
 Oct 13, 2014 2:10:23 AM org.apache.catalina.startup.HostConfig deployWAR
 INFO: Deployment of web application archive 
 /opt/ibm/solrsearch/tomcat/webapps/solr.war has finished in 594,325 ms 
  Time taken for solr app Deployment is about 10 minutes 
 ---
 Oct 13, 2014 2:10:23 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/manager
 Oct 13, 2014 2:10:26 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/manager has finished in 2,035 ms
 Oct 13, 2014 2:10:26 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/examples
 Oct 13, 2014 2:10:27 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/examples has finished in 1,789 ms
 Oct 13, 2014 2:10:27 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/docs
 Oct 13, 2014 2:10:28 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/docs has finished in 1,037 ms
 Oct 13, 2014 2:10:28 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/ROOT
 Oct 13, 2014 2:10:29 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/ROOT has finished in 948 ms
 Oct 13, 2014 2:10:29 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deploying web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/host-manager
 Oct 13, 2014 2:10:30 AM org.apache.catalina.startup.HostConfig deployDirectory
 INFO: Deployment of web application directory 
 /opt/ibm/solrsearch/tomcat/webapps/host-manager has finished in 951 ms
 Oct 13, 2014 2:10:31 AM org.apache.coyote.AbstractProtocol start
 INFO: Starting ProtocolHandler [http-bio-8080]
 Oct 13, 2014 2:10:31 AM org.apache.coyote.AbstractProtocol start
 INFO: Starting ProtocolHandler [ajp-bio-8009]
 Oct 13, 2014 2:10:31 AM org.apache.catalina.startup.Catalina start
 INFO: Server startup in 601506 ms



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (SOLR-4470) Support for basic http auth in internal solr requests

2014-12-04 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234046#comment-14234046
 ] 

Forest Soup commented on SOLR-4470:
---

Does anyone have an idea when this will be released? Thanks!

 Support for basic http auth in internal solr requests
 -

 Key: SOLR-4470
 URL: https://issues.apache.org/jira/browse/SOLR-4470
 Project: Solr
  Issue Type: New Feature
  Components: clients - java, multicore, replication (java), SolrCloud
Affects Versions: 4.0
Reporter: Per Steffensen
Assignee: Jan Høydahl
  Labels: authentication, https, solrclient, solrcloud, ssl
 Fix For: Trunk

 Attachments: SOLR-4470.patch, SOLR-4470.patch, SOLR-4470.patch, 
 SOLR-4470.patch, SOLR-4470.patch, SOLR-4470.patch, SOLR-4470.patch, 
 SOLR-4470.patch, SOLR-4470.patch, SOLR-4470.patch, SOLR-4470.patch, 
 SOLR-4470.patch, SOLR-4470_branch_4x_r1452629.patch, 
 SOLR-4470_branch_4x_r1452629.patch, SOLR-4470_branch_4x_r145.patch, 
 SOLR-4470_trunk_r1568857.patch


 We want to protect any HTTP resource (url). We want to require credentials no 
 matter what kind of HTTP request you make to a Solr-node.
 That can fairly easily be achieved as described on 
 http://wiki.apache.org/solr/SolrSecurity. The problem is that Solr-nodes 
 also make internal requests to other Solr-nodes, and for it to work 
 credentials need to be provided there as well.
 Ideally we would like to forward credentials from a particular request to 
 all the internal sub-requests it triggers, e.g. for search and update 
 requests.
 But there are also internal requests
 * that are only indirectly/asynchronously triggered by outside requests (e.g. 
 shard creation/deletion/etc. based on calls to the Collection API)
 * that have no relation at all to an outside super-request (e.g. 
 replica syncing)
 We would like to aim at a solution where the original credentials are 
 forwarded when a request directly/synchronously triggers a subrequest, with a 
 fallback to configured internal credentials for the 
 asynchronous/non-rooted requests.
 In our solution we would aim at only supporting basic http auth, but we would 
 like to build a framework around it, so that not too much refactoring is 
 needed if you later want to add support for other kinds of auth (e.g. digest).
 We will work on a solution, but we created this JIRA issue early in order to 
 get input/comments from the community as early as possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6683) Need a configurable parameter to control the doc number between peersync and the snapshot pull recovery

2014-12-04 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14234052#comment-14234052
 ] 

Forest Soup commented on SOLR-6683:
---

Thanks, Ramkumar. 

We will try it. Thanks!
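
For anyone else hitting this: a minimal sketch of what we plan to try, assuming 
the suggestion is the SOLR-6359 updateLog setting (the value is illustrative):

<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
  <!-- raising this widens the doc gap that peersync can bridge before a
       full snapshot pull recovery is triggered (the hardcoded default is 100) -->
  <int name="numRecordsToKeep">1000</int>
</updateLog>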

 Need a configurable parameter to control the doc number between peersync and 
 the snapshot pull recovery
 ---

 Key: SOLR-6683
 URL: https://issues.apache.org/jira/browse/SOLR-6683
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 4.7
 Environment: Redhat Linux 64bit
Reporter: Forest Soup
Priority: Critical
  Labels: performance

 If there is a gap of 100 docs between the recovering node and the good node, 
 solr will do a snapshot pull recovery instead of peersync.
 Can the 100-doc threshold be made configurable? For example, to allow a gap of 
 1, 1000, or 10 docs between the good node and the node to recover.
 With only a 100-doc threshold, a regular restart of a solr node will trigger a 
 full recovery, which has a huge impact on the performance of the running 
 systems.
 Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-6683) Need a configurable parameter to control the doc number between peersync and the snapshot pull recovery

2014-10-31 Thread Forest Soup (JIRA)
Forest Soup created SOLR-6683:
-

 Summary: Need a configurable parameter to control the doc number 
between peersync and the snapshot pull recovery
 Key: SOLR-6683
 URL: https://issues.apache.org/jira/browse/SOLR-6683
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 4.7
 Environment: Redhat Linux 64bit
Reporter: Forest Soup
Priority: Critical


If there is a gap of 100 docs between the recovering node and the good node, 
solr will do a snapshot pull recovery instead of peersync.

Can the 100-doc threshold be made configurable? For example, to allow a gap of 
1, 1000, or 10 docs between the good node and the node to recover.

Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6683) Need a configurable parameter to control the doc number between peersync and the snapshot pull recovery

2014-10-31 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-6683:
--
Description: 
If there is a gap of 100 docs between the recovering node and the good node, 
solr will do a snapshot pull recovery instead of peersync.

Can the 100-doc threshold be made configurable? For example, to allow a gap of 
1, 1000, or 10 docs between the good node and the node to recover.

With only a 100-doc threshold, a regular restart of a solr node will trigger a 
full recovery, which has a huge impact on the performance of the running 
systems.

Thanks!

  was:
If there is a gap of 100 docs between the recovering node and the good node, 
solr will do a snapshot pull recovery instead of peersync.

Can the 100-doc threshold be made configurable? For example, to allow a gap of 
1, 1000, or 10 docs between the good node and the node to recover.

Thanks!


 Need a configurable parameter to control the doc number between peersync and 
 the snapshot pull recovery
 ---

 Key: SOLR-6683
 URL: https://issues.apache.org/jira/browse/SOLR-6683
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Affects Versions: 4.7
 Environment: Redhat Linux 64bit
Reporter: Forest Soup
Priority: Critical
  Labels: performance

 If there is a gap of 100 docs between the recovering node and the good node, 
 solr will do a snapshot pull recovery instead of peersync.
 Can the 100-doc threshold be made configurable? For example, to allow a gap of 
 1, 1000, or 10 docs between the good node and the node to recover.
 With only a 100-doc threshold, a regular restart of a solr node will trigger a 
 full recovery, which has a huge impact on the performance of the running 
 systems.
 Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-6674) Solr webapp deployment is very slow with jmx/ in solrconfig.xml

2014-10-30 Thread Forest Soup (JIRA)
Forest Soup created SOLR-6674:
-

 Summary: Solr webapp deployment is very slow with jmx/ in 
solrconfig.xml
 Key: SOLR-6674
 URL: https://issues.apache.org/jira/browse/SOLR-6674
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: Linux Redhat 64bit
Reporter: Forest Soup
Priority: Critical


We have a SolrCloud with Solr version 4.7 on Tomcat 7, and our solr 
indexes (cores) are big, 50~100G each.

When we start up tomcat, the solr webapp deployment is very slow. From tomcat's 
catalina log, it takes about 10 minutes to get deployed every time. After 
analyzing a java core dump, we noticed that loading cannot finish until the 
MBean calculation for the large index is done.

So we tried to remove the <jmx/> element from solrconfig.xml; after that, the 
solr webapp loads in about 1 minute, so we are sure the MBean calculation for 
the large index is the root cause.

Could you please point me to an async way to do statistics monitoring without 
<jmx/> in solrconfig.xml, or let the calculation run after the deployment? 
Thanks!
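
A minimal sketch of the solrconfig.xml change we tried (the surrounding 
elements follow the stock example config and are shown only for context; this 
is just what we did to confirm the root cause, not a recommendation):

<config>
  <!-- ... other settings unchanged ... -->

  <!-- Registering the JMX MBeans makes core loading compute index statistics
       up front, which blocks webapp deployment on large indexes.
       Commenting the element out skips that work at startup. -->
  <!-- <jmx /> -->

  <!-- ... other settings unchanged ... -->
</config>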



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-6675) Solr webapp deployment is very slow with jmx/ in solrconfig.xml

2014-10-30 Thread Forest Soup (JIRA)
Forest Soup created SOLR-6675:
-

 Summary: Solr webapp deployment is very slow with jmx/ in 
solrconfig.xml
 Key: SOLR-6675
 URL: https://issues.apache.org/jira/browse/SOLR-6675
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: Linux Redhat 64bit
Reporter: Forest Soup
Priority: Critical


We have a SolrCloud with Solr version 4.7 on Tomcat 7, and our solr 
indexes (cores) are big, 50~100G each.

When we start up tomcat, the solr webapp deployment is very slow. From tomcat's 
catalina log, it takes about 10 minutes to get deployed every time. After 
analyzing a java core dump, we noticed that loading cannot finish until the 
MBean calculation for the large index is done.

So we tried to remove the <jmx/> element from solrconfig.xml; after that, the 
solr webapp loads in about 1 minute, so we are sure the MBean calculation for 
the large index is the root cause.

Could you please point me to an async way to do statistics monitoring without 
<jmx/> in solrconfig.xml, or let the calculation run after the deployment? 
Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6675) Solr webapp deployment is very slow with jmx/ in solrconfig.xml

2014-10-30 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14189798#comment-14189798
 ] 

Forest Soup commented on SOLR-6675:
---

The catalina log of tomcat:

INFO: Starting Servlet Engine: Apache Tomcat/7.0.54
Oct 13, 2014 2:00:29 AM org.apache.catalina.startup.HostConfig deployWAR
INFO: Deploying web application archive 
/opt/ibm/solrsearch/tomcat/webapps/solr.war
Oct 13, 2014 2:10:23 AM org.apache.catalina.startup.HostConfig deployWAR
INFO: Deployment of web application archive 
/opt/ibm/solrsearch/tomcat/webapps/solr.war has finished in 594,325 ms 
 Time taken for solr app Deployment is about 10 minutes 
---
Oct 13, 2014 2:10:23 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deploying web application directory 
/opt/ibm/solrsearch/tomcat/webapps/manager
Oct 13, 2014 2:10:26 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deployment of web application directory 
/opt/ibm/solrsearch/tomcat/webapps/manager has finished in 2,035 ms
Oct 13, 2014 2:10:26 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deploying web application directory 
/opt/ibm/solrsearch/tomcat/webapps/examples
Oct 13, 2014 2:10:27 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deployment of web application directory 
/opt/ibm/solrsearch/tomcat/webapps/examples has finished in 1,789 ms
Oct 13, 2014 2:10:27 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deploying web application directory 
/opt/ibm/solrsearch/tomcat/webapps/docs
Oct 13, 2014 2:10:28 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deployment of web application directory 
/opt/ibm/solrsearch/tomcat/webapps/docs has finished in 1,037 ms
Oct 13, 2014 2:10:28 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deploying web application directory 
/opt/ibm/solrsearch/tomcat/webapps/ROOT
Oct 13, 2014 2:10:29 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deployment of web application directory 
/opt/ibm/solrsearch/tomcat/webapps/ROOT has finished in 948 ms
Oct 13, 2014 2:10:29 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deploying web application directory 
/opt/ibm/solrsearch/tomcat/webapps/host-manager
Oct 13, 2014 2:10:30 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deployment of web application directory 
/opt/ibm/solrsearch/tomcat/webapps/host-manager has finished in 951 ms
Oct 13, 2014 2:10:31 AM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler [http-bio-8080]
Oct 13, 2014 2:10:31 AM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler [ajp-bio-8009]
Oct 13, 2014 2:10:31 AM org.apache.catalina.startup.Catalina start
INFO: Server startup in 601506 ms   



 Solr webapp deployment is very slow with jmx/ in solrconfig.xml
 -

 Key: SOLR-6675
 URL: https://issues.apache.org/jira/browse/SOLR-6675
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: Linux Redhat 64bit
Reporter: Forest Soup
Priority: Critical
  Labels: performance

 We have a SolrCloud with Solr version 4.7 on Tomcat 7, and our solr 
 indexes (cores) are big, 50~100G each. 
 When we start up tomcat, the solr webapp deployment is very slow. From 
 tomcat's catalina log, it takes about 10 minutes to get deployed every time. 
 After analyzing a java core dump, we noticed that loading cannot finish until 
 the MBean calculation for the large index is done.
  
 So we tried to remove the <jmx/> element from solrconfig.xml; after that, the 
 solr webapp loads in about 1 minute, so we are sure the MBean calculation for 
 the large index is the root cause.
 Could you please point me to an async way to do statistics monitoring without 
 <jmx/> in solrconfig.xml, or let the calculation run after the deployment? 
 Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (SOLR-6675) Solr webapp deployment is very slow with jmx/ in solrconfig.xml

2014-10-30 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14189802#comment-14189802
 ] 

Forest Soup edited comment on SOLR-6675 at 10/30/14 8:33 AM:
-

The callstack.png file in the attachment is the call stack of the long blocking 
thread which is doing statistics calculation.


was (Author: forest_soup):
The call stack of the long blocking thread which is doing statistics 
calculation.

 Solr webapp deployment is very slow with jmx/ in solrconfig.xml
 -

 Key: SOLR-6675
 URL: https://issues.apache.org/jira/browse/SOLR-6675
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: Linux Redhat 64bit
Reporter: Forest Soup
Priority: Critical
  Labels: performance
 Attachments: callstack.png


 We have a SolrCloud with Solr version 4.7 on Tomcat 7, and our solr 
 indexes (cores) are big, 50~100G each. 
 When we start up tomcat, the solr webapp deployment is very slow. From 
 tomcat's catalina log, it takes about 10 minutes to get deployed every time. 
 After analyzing a java core dump, we noticed that loading cannot finish until 
 the MBean calculation for the large index is done.
  
 So we tried to remove the <jmx/> element from solrconfig.xml; after that, the 
 solr webapp loads in about 1 minute, so we are sure the MBean calculation for 
 the large index is the root cause.
 Could you please point me to an async way to do statistics monitoring without 
 <jmx/> in solrconfig.xml, or let the calculation run after the deployment? 
 Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6675) Solr webapp deployment is very slow with jmx/ in solrconfig.xml

2014-10-30 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-6675:
--
Attachment: callstack.png

The call stack of the long blocking thread which is doing statistics 
calculation.

 Solr webapp deployment is very slow with jmx/ in solrconfig.xml
 -

 Key: SOLR-6675
 URL: https://issues.apache.org/jira/browse/SOLR-6675
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: Linux Redhat 64bit
Reporter: Forest Soup
Priority: Critical
  Labels: performance
 Attachments: callstack.png


 We have a SolrCloud with Solr version 4.7 on Tomcat 7, and our solr 
 indexes (cores) are big, 50~100G each. 
 When we start up tomcat, the solr webapp deployment is very slow. From 
 tomcat's catalina log, it takes about 10 minutes to get deployed every time. 
 After analyzing a java core dump, we noticed that loading cannot finish until 
 the MBean calculation for the large index is done.
  
 So we tried to remove the <jmx/> element from solrconfig.xml; after that, the 
 solr webapp loads in about 1 minute, so we are sure the MBean calculation for 
 the large index is the root cause.
 Could you please point me to an async way to do statistics monitoring without 
 <jmx/> in solrconfig.xml, or let the calculation run after the deployment? 
 Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-6675) Solr webapp deployment is very slow with jmx/ in solrconfig.xml

2014-10-30 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-6675:
--
Description: 
We have a SolrCloud with Solr version 4.7 on Tomcat 7, and our solr 
indexes (cores) are big, 50~100G each.

When we start up tomcat, the solr webapp deployment is very slow. From tomcat's 
catalina log, it takes about 10 minutes to get deployed every time. After 
analyzing a java core dump, we noticed that loading cannot finish until the 
MBean calculation for the large index is done.

So we tried to remove the <jmx/> element from solrconfig.xml; after that, the 
solr webapp loads in about 1 minute, so we are sure the MBean calculation for 
the large index is the root cause.

Could you please point me to an async way to do statistics monitoring without 
<jmx/> in solrconfig.xml, or let the calculation run after the deployment? 
Thanks!



The catalina log of tomcat:

INFO: Starting Servlet Engine: Apache Tomcat/7.0.54
Oct 13, 2014 2:00:29 AM org.apache.catalina.startup.HostConfig deployWAR
INFO: Deploying web application archive 
/opt/ibm/solrsearch/tomcat/webapps/solr.war
Oct 13, 2014 2:10:23 AM org.apache.catalina.startup.HostConfig deployWAR
INFO: Deployment of web application archive 
/opt/ibm/solrsearch/tomcat/webapps/solr.war has finished in 594,325 ms 
 Time taken for solr app Deployment is about 10 minutes 
---
Oct 13, 2014 2:10:23 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deploying web application directory 
/opt/ibm/solrsearch/tomcat/webapps/manager
Oct 13, 2014 2:10:26 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deployment of web application directory 
/opt/ibm/solrsearch/tomcat/webapps/manager has finished in 2,035 ms
Oct 13, 2014 2:10:26 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deploying web application directory 
/opt/ibm/solrsearch/tomcat/webapps/examples
Oct 13, 2014 2:10:27 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deployment of web application directory 
/opt/ibm/solrsearch/tomcat/webapps/examples has finished in 1,789 ms
Oct 13, 2014 2:10:27 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deploying web application directory 
/opt/ibm/solrsearch/tomcat/webapps/docs
Oct 13, 2014 2:10:28 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deployment of web application directory 
/opt/ibm/solrsearch/tomcat/webapps/docs has finished in 1,037 ms
Oct 13, 2014 2:10:28 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deploying web application directory 
/opt/ibm/solrsearch/tomcat/webapps/ROOT
Oct 13, 2014 2:10:29 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deployment of web application directory 
/opt/ibm/solrsearch/tomcat/webapps/ROOT has finished in 948 ms
Oct 13, 2014 2:10:29 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deploying web application directory 
/opt/ibm/solrsearch/tomcat/webapps/host-manager
Oct 13, 2014 2:10:30 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deployment of web application directory 
/opt/ibm/solrsearch/tomcat/webapps/host-manager has finished in 951 ms
Oct 13, 2014 2:10:31 AM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler [http-bio-8080]
Oct 13, 2014 2:10:31 AM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler [ajp-bio-8009]
Oct 13, 2014 2:10:31 AM org.apache.catalina.startup.Catalina start
INFO: Server startup in 601506 ms


  was:
We have a SolrCloud with Solr version 4.7 on Tomcat 7, and our solr 
indexes (cores) are big, 50~100G each.

When we start up tomcat, the solr webapp deployment is very slow. From tomcat's 
catalina log, it takes about 10 minutes to get deployed every time. After 
analyzing a java core dump, we noticed that loading cannot finish until the 
MBean calculation for the large index is done.

So we tried to remove the <jmx/> element from solrconfig.xml; after that, the 
solr webapp loads in about 1 minute, so we are sure the MBean calculation for 
the large index is the root cause.

Could you please point me to an async way to do statistics monitoring without 
<jmx/> in solrconfig.xml, or let the calculation run after the deployment? 
Thanks!


 Solr webapp deployment is very slow with jmx/ in solrconfig.xml
 -

 Key: SOLR-6675
 URL: https://issues.apache.org/jira/browse/SOLR-6675
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: Linux Redhat 64bit
Reporter: Forest Soup
Priority: Critical
  Labels: performance
 Attachments: callstack.png


 We have a SolrCloud with Solr version 4.7 with Tomcat 7. And our solr 
 

[jira] [Updated] (SOLR-6675) Solr webapp deployment is very slow with jmx/ in solrconfig.xml

2014-10-30 Thread Forest Soup (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Forest Soup updated SOLR-6675:
--
Description: 
We have a SolrCloud with Solr version 4.7 on Tomcat 7, and our solr 
indexes (cores) are big, 50~100G each.

When we start up tomcat, the solr webapp deployment is very slow. From tomcat's 
catalina log, it takes about 10 minutes to get deployed every time. After 
analyzing a java core dump, we noticed that loading cannot finish until the 
MBean calculation for the large index is done.

So we tried to remove the <jmx/> element from solrconfig.xml; after that, the 
solr webapp loads in about 1 minute, so we are sure the MBean calculation for 
the large index is the root cause.

Could you please point me to an async way to do statistics monitoring without 
<jmx/> in solrconfig.xml, or let the calculation run after the deployment? 
Thanks!

The callstack.png file in the attachment is the call stack of the long-blocking 
thread that is doing the statistics calculation.

The catalina log of tomcat:
INFO: Starting Servlet Engine: Apache Tomcat/7.0.54
Oct 13, 2014 2:00:29 AM org.apache.catalina.startup.HostConfig deployWAR
INFO: Deploying web application archive 
/opt/ibm/solrsearch/tomcat/webapps/solr.war
Oct 13, 2014 2:10:23 AM org.apache.catalina.startup.HostConfig deployWAR
INFO: Deployment of web application archive 
/opt/ibm/solrsearch/tomcat/webapps/solr.war has finished in 594,325 ms 
 Time taken for solr app Deployment is about 10 minutes 
---
Oct 13, 2014 2:10:23 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deploying web application directory 
/opt/ibm/solrsearch/tomcat/webapps/manager
Oct 13, 2014 2:10:26 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deployment of web application directory 
/opt/ibm/solrsearch/tomcat/webapps/manager has finished in 2,035 ms
Oct 13, 2014 2:10:26 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deploying web application directory 
/opt/ibm/solrsearch/tomcat/webapps/examples
Oct 13, 2014 2:10:27 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deployment of web application directory 
/opt/ibm/solrsearch/tomcat/webapps/examples has finished in 1,789 ms
Oct 13, 2014 2:10:27 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deploying web application directory 
/opt/ibm/solrsearch/tomcat/webapps/docs
Oct 13, 2014 2:10:28 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deployment of web application directory 
/opt/ibm/solrsearch/tomcat/webapps/docs has finished in 1,037 ms
Oct 13, 2014 2:10:28 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deploying web application directory 
/opt/ibm/solrsearch/tomcat/webapps/ROOT
Oct 13, 2014 2:10:29 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deployment of web application directory 
/opt/ibm/solrsearch/tomcat/webapps/ROOT has finished in 948 ms
Oct 13, 2014 2:10:29 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deploying web application directory 
/opt/ibm/solrsearch/tomcat/webapps/host-manager
Oct 13, 2014 2:10:30 AM org.apache.catalina.startup.HostConfig deployDirectory
INFO: Deployment of web application directory 
/opt/ibm/solrsearch/tomcat/webapps/host-manager has finished in 951 ms
Oct 13, 2014 2:10:31 AM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler [http-bio-8080]
Oct 13, 2014 2:10:31 AM org.apache.coyote.AbstractProtocol start
INFO: Starting ProtocolHandler [ajp-bio-8009]
Oct 13, 2014 2:10:31 AM org.apache.catalina.startup.Catalina start
INFO: Server startup in 601506 ms


  was:
We have a SolrCloud with Solr version 4.7 on Tomcat 7, and our solr 
indexes (cores) are big, 50~100G each.

When we start up tomcat, the solr webapp deployment is very slow. From tomcat's 
catalina log, it takes about 10 minutes to get deployed every time. After 
analyzing a java core dump, we noticed that loading cannot finish until the 
MBean calculation for the large index is done.

So we tried to remove the <jmx/> element from solrconfig.xml; after that, the 
solr webapp loads in about 1 minute, so we are sure the MBean calculation for 
the large index is the root cause.

Could you please point me to an async way to do statistics monitoring without 
<jmx/> in solrconfig.xml, or let the calculation run after the deployment? 
Thanks!



The catalina log of tomcat:

INFO: Starting Servlet Engine: Apache Tomcat/7.0.54
Oct 13, 2014 2:00:29 AM org.apache.catalina.startup.HostConfig deployWAR
INFO: Deploying web application archive 
/opt/ibm/solrsearch/tomcat/webapps/solr.war
Oct 13, 2014 2:10:23 AM org.apache.catalina.startup.HostConfig deployWAR
INFO: Deployment of web application archive 
/opt/ibm/solrsearch/tomcat/webapps/solr.war has finished in 594,325 ms 
 Time taken for solr app 

[jira] [Commented] (SOLR-6335) org.apache.solr.common.SolrException: no servers hosting shard

2014-08-10 Thread Forest Soup (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14092340#comment-14092340
 ] 

Forest Soup commented on SOLR-6335:
---

Thanks Erick.

Before I opened this JIRA, I searched but found no similar root cause for my 
question. 
I also asked in the link below, but got no response. My ZK connection and 
network are good. 
So could you please help? Thanks!
http://lucene.472066.n3.nabble.com/org-apache-solr-common-SolrException-no-servers-hosting-shard-td4151637.html

 org.apache.solr.common.SolrException: no servers hosting shard
 --

 Key: SOLR-6335
 URL: https://issues.apache.org/jira/browse/SOLR-6335
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.7
 Environment: Red Hat Enterprise Linux Server release 6.4 (Santiago) 
 64bit
Reporter: Forest Soup
 Attachments: solrconfig_perf0804.xml


 http://lucene.472066.n3.nabble.com/org-apache-solr-common-SolrException-no-servers-hosting-shard-td4151637.html
 I have 2 solr nodes (solr1 and solr2) in a SolrCloud. 
 After this issue happened, solr2 was in a recovering state. After it took a 
 long time to finish recovery, the issue occurred again, and the node went back 
 into recovery. This happens again and again. 
 ERROR - 2014-08-04 21:12:27.917; org.apache.solr.common.SolrException; 
 org.apache.solr.common.SolrException: no servers hosting shard: 
 at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:148) 
 at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:118) 
 at java.util.concurrent.FutureTask.run(FutureTask.java:273) 
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:482) 
 at java.util.concurrent.FutureTask.run(FutureTask.java:273) 
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156) 
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626) 
 at java.lang.Thread.run(Thread.java:804) 
 We have these settings in solrconfig.xml that differ from the defaults: 
 <maxIndexingThreads>24</maxIndexingThreads>
 <ramBufferSizeMB>200</ramBufferSizeMB>
 <maxBufferedDocs>1</maxBufferedDocs>
 <autoCommit>
   <maxDocs>1000</maxDocs>
   <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
   <openSearcher>true</openSearcher>
 </autoCommit>
 <autoSoftCommit>
   <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
 </autoSoftCommit>
 <filterCache class="solr.FastLRUCache"
              size="16384"
              initialSize="16384"
              autowarmCount="4096"/>
 <queryResultCache class="solr.LRUCache"
                   size="16384"
                   initialSize="16384"
                   autowarmCount="4096"/>
 <documentCache class="solr.LRUCache"
                size="16384"
                initialSize="16384"
                autowarmCount="4096"/>
 <fieldValueCache class="solr.FastLRUCache"
                  size="16384"
                  autowarmCount="1024"
                  showItems="32"/>
 <queryResultWindowSize>50</queryResultWindowSize>
 The full solrconfig.xml is attached. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org


