[jira] [Created] (SOLR-6552) Core admin API can't unload an invalid core
Raintung Li created SOLR-6552: - Summary: Core admin API can't unload an invalid core Key: SOLR-6552 URL: https://issues.apache.org/jira/browse/SOLR-6552 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.6.1 Reporter: Raintung Li

If a core is invalid I can't actually remove it, especially in SolrCloud: I only see the "down" status on the Cloud node status page, and there is nothing I can do to clean it up. What I actually want is to clean up the cluster status in ZK. What counts as invalid? A core whose config is not valid, so it can't load; a server that was removed but never called the core unload API; a server node that no longer exists in the cloud.

-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
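For context, unloading a core goes through the CoreAdmin API; the issue is that this request is ineffective when the core never loaded. A minimal sketch of building the request (host and core name are hypothetical; deleteDataDir also removes the core's data directory, which is the "clean it up" behaviour asked for here):

```java
// Sketch: the CoreAdmin UNLOAD request this issue wants to work even for
// cores that failed to load. Host and core name are made-up examples.
public class UnloadRequestSketch {
    static String unloadUrl(String host, String core, boolean deleteDataDir) {
        return "http://" + host + "/solr/admin/cores?action=UNLOAD&core=" + core
                + (deleteDataDir ? "&deleteDataDir=true" : "");
    }

    public static void main(String[] args) {
        // e.g. remove a core whose config is broken, data dir and all
        System.out.println(unloadUrl("localhost:8983", "broken_core", true));
    }
}
```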
[jira] [Commented] (SOLR-6552) Core admin API can't unload an invalid core
[ https://issues.apache.org/jira/browse/SOLR-6552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14144583#comment-14144583 ] Raintung Li commented on SOLR-6552: ---

I just looked at the path: it doesn't remove the core from ZooKeeper, so the ZK node should still exist.

Core admin API can't unload an invalid core - Key: SOLR-6552 URL: https://issues.apache.org/jira/browse/SOLR-6552
[jira] [Comment Edited] (SOLR-6552) Core admin API can't unload an invalid core
[ https://issues.apache.org/jira/browse/SOLR-6552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14144583#comment-14144583 ] Raintung Li edited comment on SOLR-6552 at 9/23/14 9:16 AM:

I just looked at the path: it doesn't remove the core from ZooKeeper. The ZK node should still exist even if the core is invalid.

was (Author: raintung.li): I just look the path, it doesn't remove the core from Zookeeper, zk should still exist.

Core admin API can't unload an invalid core - Key: SOLR-6552 URL: https://issues.apache.org/jira/browse/SOLR-6552
[jira] [Updated] (SOLR-6552) Core admin API can't unload an invalid core
[ https://issues.apache.org/jira/browse/SOLR-6552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-6552: -- Attachment: SOLR-6552.txt

A simple patch to unload the invalid core.

Core admin API can't unload an invalid core - Key: SOLR-6552 URL: https://issues.apache.org/jira/browse/SOLR-6552
[jira] [Created] (SOLR-6553) StackOverflowError
Raintung Li created SOLR-6553: - Summary: StackOverflowError Key: SOLR-6553 URL: https://issues.apache.org/jira/browse/SOLR-6553 Project: Solr Issue Type: Bug Components: SolrCloud Environment: One collection, one shard, two replica Reporter: Raintung Li The server log: Error while calling watcher java.lang.StackOverflowError at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3366) at java.util.regex.Pattern$Curly.match(Pattern.java:3737) at java.util.regex.Pattern$GroupHead.match(Pattern.java:4168) at java.util.regex.Pattern$Slice.match(Pattern.java:3482) at java.util.regex.Pattern$Curly.match1(Pattern.java:3797) at java.util.regex.Pattern$Curly.match(Pattern.java:3746) at java.util.regex.Pattern$Ques.match(Pattern.java:3691) at java.util.regex.Pattern$Curly.match1(Pattern.java:3797) at java.util.regex.Pattern$Curly.match(Pattern.java:3746) at java.util.regex.Matcher.match(Matcher.java:1127) at java.util.regex.Matcher.matches(Matcher.java:502) at org.apache.solr.cloud.LeaderElector.getSeq(LeaderElector.java:167) at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:265) at org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:383) at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:173) at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:156) at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:100) at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:266) at org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:383) at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:173) at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:156) at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:100) at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:266) at 
org.apache.solr.cloud.ShardLeaderElectionContext.rejoinLeaderElection(ElectionContext.java:383)

If one replica loses its connection to ZooKeeper, the other one ends up in recovery status (I don't know how this happens). That replica rejoins the leader election and, as the only participant, is elected leader. It then checks shouldIBeLeader, finds itself in recovery status, cancels the election (to do recovery), and rejoins the leader election again. This is an infinite loop that eventually ends in a StackOverflowError, and it also creates many threads to do recovery.
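The cycle visible in the trace (joinElection → checkIfIamLeader → runLeaderProcess → rejoinLeaderElection → joinElection) grows the stack on every pass. A toy sketch (not Solr's actual classes) contrasting that recursive shape with a bounded loop:

```java
// Toy model of the reported loop: a replica stuck in recovery never
// qualifies as leader, so a recursive rejoin never terminates and each
// cycle adds stack frames until StackOverflowError. The same logic as a
// bounded loop cannot overflow and gives up after a retry cap.
public class ElectionLoopSketch {
    // Stand-in for shouldIBeLeader(): a replica stuck in recovery says no.
    static boolean shouldIBeLeader() { return false; }

    // Recursive shape mirroring the stack trace. Never call this while the
    // replica is stuck in recovery -- it recurses without bound.
    static void joinElectionRecursive() {
        if (!shouldIBeLeader()) {
            joinElectionRecursive(); // rejoinLeaderElection -> joinElection -> ...
        }
    }

    // Iterative alternative: retry up to a cap, then stop and stay in recovery.
    static int joinElectionIterative(int maxRetries) {
        int attempts = 0;
        while (attempts < maxRetries && !shouldIBeLeader()) {
            attempts++; // real code would cancel the election and back off here
        }
        return attempts;
    }
}
```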
[jira] [Updated] (SOLR-6553) StackOverflowError
[ https://issues.apache.org/jira/browse/SOLR-6553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-6553: -- Priority: Critical (was: Major)

StackOverflowError - Key: SOLR-6553 URL: https://issues.apache.org/jira/browse/SOLR-6553
[jira] [Updated] (SOLR-6553) StackOverflowError
[ https://issues.apache.org/jira/browse/SOLR-6553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-6553: -- Affects Version/s: 4.6, 4.6.1

StackOverflowError - Key: SOLR-6553 URL: https://issues.apache.org/jira/browse/SOLR-6553
[jira] [Created] (SOLR-6498) LeaderElector sometimes leaves multiple ephemeral nodes in ZooKeeper
Raintung Li created SOLR-6498: - Summary: LeaderElector sometimes leaves multiple ephemeral nodes in ZooKeeper Key: SOLR-6498 URL: https://issues.apache.org/jira/browse/SOLR-6498 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6.1 Environment: linux Reporter: Raintung Li

Sometimes the overseer_elect / collection shard leader election path contains multiple ephemeral nodes for the same core but with different session ids, e.g.:

92427566579253248-core_node1-n_32
92427566579253249-core_node1-n_33

I can't trace how this happens. But when it does, the newly registered node can never be elected leader. We also know the ephemeral node from the old session is invalid, but not why it still exists.

And the other issue, in the joinElection method:

try {
  leaderSeqPath = zkClient.create(shardsElectZkPath + "/" + id + "-n_", null,
      CreateMode.EPHEMERAL_SEQUENTIAL, false);
  context.leaderSeqPath = leaderSeqPath;
  cont = false;
} catch (ConnectionLossException e) {
  // we don't know if we made our node or not...
  List<String> entries = zkClient.getChildren(shardsElectZkPath, null, true);
  boolean foundId = false;
  for (String entry : entries) {
    String nodeId = getNodeId(entry);
    if (id.equals(nodeId)) {
      // we did create our node...
      foundId = true;
      break;
    }
  }
  if (!foundId) {
    cont = true;
    if (tries++ > 20) {
      throw new ZooKeeperException(SolrException.ErrorCode.SERVER_ERROR, "", e);
    }
    try {
      Thread.sleep(50);
    } catch (InterruptedException e2) {
      Thread.currentThread().interrupt();
    }
  }
}

If a ConnectionLossException occurs here, the ephemeral sequential node may end up created twice. My suggestion: I can't trace why two ephemeral nodes get created for the same server, but we can protect against it.
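One way to "protect against it" can be sketched with an in-memory stand-in for the ZooKeeper election directory (the class and method names here are made up for illustration): before creating a new ephemeral sequential node, delete any leftover entry for the same core from an earlier session.

```java
import java.util.Set;
import java.util.TreeSet;

// In-memory stand-in for the election znode directory. Entries look like
// "92427566579253248-core_node1-n_32" (sessionId-coreNodeName-n_sequence).
public class ElectionNodeGuard {
    private final Set<String> entries = new TreeSet<>();

    // Extract the core node name between the session id and sequence suffix.
    static String nodeIdOf(String entry) {
        return entry.split("-")[1];
    }

    // Guard: drop stale entries for this core before registering again, so a
    // retry after ConnectionLossException cannot leave two live-looking nodes.
    public String register(String sessionId, String nodeId, int seq) {
        entries.removeIf(e -> nodeIdOf(e).equals(nodeId));
        String entry = sessionId + "-" + nodeId + "-n_" + seq;
        entries.add(entry);
        return entry;
    }

    public long countFor(String nodeId) {
        return entries.stream().filter(e -> nodeIdOf(e).equals(nodeId)).count();
    }
}
```

With this guard, a second registration for core_node1 under a new session replaces the stale entry instead of accumulating next to it.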
[jira] [Updated] (SOLR-6498) LeaderElector sometimes leaves multiple ephemeral nodes in ZooKeeper
[ https://issues.apache.org/jira/browse/SOLR-6498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-6498: -- Attachment: SOLR-6498.txt

LeaderElector sometimes leaves multiple ephemeral nodes in ZooKeeper - Key: SOLR-6498 URL: https://issues.apache.org/jira/browse/SOLR-6498
[jira] [Commented] (SOLR-6184) Replication fetchLatestIndex always fails, which causes recovery errors.
[ https://issues.apache.org/jira/browse/SOLR-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040673#comment-14040673 ] Raintung Li commented on SOLR-6184: ---

How do you estimate the duration? You would have to keep the updates in memory, and you need to think about how to avoid the OOM case.

Replication fetchLatestIndex always fails, which causes recovery errors. - Key: SOLR-6184 URL: https://issues.apache.org/jira/browse/SOLR-6184
[jira] [Created] (SOLR-6184) Replication fetchLatestIndex always fails, which causes recovery errors.
Raintung Li created SOLR-6184: - Summary: Replication fetchLatestIndex always fails, which causes recovery errors. Key: SOLR-6184 URL: https://issues.apache.org/jira/browse/SOLR-6184 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6.1, 4.6 Environment: the index file size is more than 70G Reporter: Raintung Li

Copying a full 70G index usually needs at least 20 minutes on a 100M read/write network or disk. If even one hard commit happens during those 20 minutes, the full-index snap pull fails and the temp folder is removed, because the pull task failed. In production the index is updated every minute, so the retried pull task always fails because the index is always changing, and the constant retries keep network and disk usage at a high level.

My suggestion: fetchLatestIndex can be retried at some frequency without removing the tmp folder, copying the largest files first. A retried fetchLatestIndex then doesn't download the same biggest files again, only the files committed just now, so the task can finally succeed.
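The suggestion above — keep the tmp folder and skip files already pulled — can be sketched as a simple file-list diff (file names and sizes are illustrative; real replication would also compare checksums, not just sizes):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a resumable snap-pull: given what a previous (failed) attempt
// already downloaded into the tmp folder, only fetch files that are new or
// changed in the latest commit, instead of re-pulling the whole 70G index.
public class ResumableFetchSketch {
    // Returns the files still to download (name -> size): entries missing
    // from tmp, or whose size differs because the segment changed.
    static Map<String, Long> remainingFiles(Map<String, Long> latestCommit,
                                            Map<String, Long> alreadyInTmp) {
        Map<String, Long> todo = new HashMap<>();
        for (Map.Entry<String, Long> f : latestCommit.entrySet()) {
            Long have = alreadyInTmp.get(f.getKey());
            if (have == null || !have.equals(f.getValue())) {
                todo.put(f.getKey(), f.getValue());
            }
        }
        return todo;
    }
}
```

On a retry, the 70G of unchanged segment files already in tmp drop out of the download set, so only the small, freshly committed files remain.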
[jira] [Updated] (SOLR-6184) Replication fetchLatestIndex always fails, which causes recovery errors.
[ https://issues.apache.org/jira/browse/SOLR-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-6184: -- Description: (minor wording edit to the issue description)

Replication fetchLatestIndex always fails, which causes recovery errors. - Key: SOLR-6184 URL: https://issues.apache.org/jira/browse/SOLR-6184
[jira] [Updated] (SOLR-6184) Replication fetchLatestIndex always fails, which causes recovery errors.
[ https://issues.apache.org/jira/browse/SOLR-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-6184: -- Attachment: Solr-6184.txt

Replication fetchLatestIndex always fails, which causes recovery errors. - Key: SOLR-6184 URL: https://issues.apache.org/jira/browse/SOLR-6184
[jira] [Created] (SOLR-6117) Replication command=fetchindex always returns success.
Raintung Li created SOLR-6117: - Summary: Replication command=fetchindex always returns success. Key: SOLR-6117 URL: https://issues.apache.org/jira/browse/SOLR-6117 Project: Solr Issue Type: Bug Components: replication (java) Affects Versions: 4.6 Reporter: Raintung Li

The replication API command=fetchindex does fetch the index, but when an error occurs it still returns a success response. The API should return the real status, especially when the wait parameter is true (synchronous).
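The requested behaviour, sketched: with a synchronous wait the handler knows the real outcome of the pull and should report it, instead of unconditionally answering OK (method and status names here are illustrative, not Solr's actual handler code):

```java
// Sketch: propagate the fetch result instead of always answering "OK".
public class FetchIndexStatusSketch {
    // Stand-in for the actual snap-pull; real code returns whether the
    // index was successfully fetched and installed.
    static boolean doFetch(boolean simulateFailure) {
        return !simulateFailure;
    }

    // With wait=true (synchronous) the caller blocks until the pull ends,
    // so the real outcome is known and should be reported. Without wait,
    // "OK" can only mean "the pull was started".
    static String handleFetchIndex(boolean wait, boolean simulateFailure) {
        if (!wait) {
            return "OK"; // async: only acknowledges that the pull started
        }
        return doFetch(simulateFailure) ? "OK" : "ERROR";
    }
}
```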
[jira] [Updated] (SOLR-6117) Replication command=fetchindex always returns success.
[ https://issues.apache.org/jira/browse/SOLR-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-6117: -- Attachment: SOLR-6117.txt

Replication command=fetchindex always returns success. - Key: SOLR-6117 URL: https://issues.apache.org/jira/browse/SOLR-6117
[jira] [Updated] (SOLR-6117) Replication command=fetchindex always returns success.
[ https://issues.apache.org/jira/browse/SOLR-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-6117: -- Attachment: SOLR-6117.txt

Replication command=fetchindex always returns success. - Key: SOLR-6117 URL: https://issues.apache.org/jira/browse/SOLR-6117
[jira] [Updated] (SOLR-6117) Replication command=fetchindex always returns success.
[ https://issues.apache.org/jira/browse/SOLR-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-6117: -- Attachment: (was: SOLR-6117.txt)

Replication command=fetchindex always returns success. - Key: SOLR-6117 URL: https://issues.apache.org/jira/browse/SOLR-6117
[jira] [Updated] (SOLR-6056) ZooKeeper crash and JVM stack OOM because of recovery strategy
[ https://issues.apache.org/jira/browse/SOLR-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-6056: -- Description:

ZooKeeper crash and JVM stack OOM because of recovery strategy -- Key: SOLR-6056 URL: https://issues.apache.org/jira/browse/SOLR-6056 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6 Environment: Two linux servers, 65G memory, 16 core cpu, 20 collections, every collection has one shard and two replicas, one zookeeper Reporter: Raintung Li Priority: Critical Labels: cluster, crash, recover

Errors like "org.apache.solr.common.SolrException: Error opening new searcher. exceeded limit of maxWarmingSearchers=2, try again later" make DistributedUpdateProcessor trigger the core admin recover process, which means every update request sends a core admin recover request (see DistributedUpdateProcessor.java, doFinish()). The terrible thing is that CoreAdminHandler starts a new thread to publish the recovery status and start recovery. Threads increase very quickly until the stack OOMs, the Overseer can't handle the flood of status updates, and the /overseer/queue/qn-125553 nodes in ZooKeeper grow by more than 40 thousand in two minutes. In the end ZooKeeper crashes. Worse, with so many nodes in the queue the cluster can't publish the right status, because only one Overseer works; I had to start three threads to clear the queue nodes. The cluster didn't work normally for nearly 30 minutes...
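One mitigation for the thread explosion described above would be to coalesce recovery requests per core, so repeated doFinish() failures don't each spawn a new recovery thread. A sketch under that assumption (class and method names are made up):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: at most one in-flight recovery per core. Duplicate requests that
// arrive while a recovery is running are dropped instead of each spawning a
// new thread and a new Overseer status update.
public class RecoveryThrottle {
    private final ConcurrentHashMap<String, Boolean> inFlight = new ConcurrentHashMap<>();
    final AtomicInteger started = new AtomicInteger();

    // Returns true only if this call actually started a recovery.
    public boolean requestRecovery(String coreName) {
        if (inFlight.putIfAbsent(coreName, Boolean.TRUE) != null) {
            return false; // a recovery for this core is already running
        }
        started.incrementAndGet(); // real code would launch the recovery thread here
        return true;
    }

    // Called when the recovery thread finishes, allowing a later retry.
    public void recoveryDone(String coreName) {
        inFlight.remove(coreName);
    }
}
```

With this guard, a storm of failing update requests produces one recovery attempt per core at a time instead of thousands of threads and queue nodes.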
[jira] [Updated] (SOLR-6056) ZooKeeper crash and JVM stack OOM because of recovery strategy
[ https://issues.apache.org/jira/browse/SOLR-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-6056: -- Priority: Critical (was: Major)

ZooKeeper crash and JVM stack OOM because of recovery strategy -- Key: SOLR-6056 URL: https://issues.apache.org/jira/browse/SOLR-6056
[jira] [Updated] (SOLR-6056) ZooKeeper crash and JVM stack OOM because of the recovery strategy
[ https://issues.apache.org/jira/browse/SOLR-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-6056: -- Environment: Two linux servers, 65G memory, 16 core cpu 20 collections, every collection has one shard two replica one zookeeper was: Two linux server, 65G, 16 core cup 20 collections, every collection has one shard two replica one zookeeper
[jira] [Updated] (SOLR-6056) ZooKeeper crash and JVM stack OOM because of the recovery strategy
[ https://issues.apache.org/jira/browse/SOLR-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-6056: -- Attachment: patch-6056.txt
[jira] [Commented] (SOLR-6056) ZooKeeper crash and JVM stack OOM because of the recovery strategy
[ https://issues.apache.org/jira/browse/SOLR-6056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994821#comment-13994821 ] Raintung Li commented on SOLR-6056: --- 1. Move the status reporting from CoreAdminHandler into the doRecovery method, so only one thread reports the status. 2. When a thread is already working on recovery, any other recovery thread quits, unless a parameter is set to force recovery.
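The per-core dedupe described in point 2 can be sketched as a small guard. This is a hand-written illustration of the patch's idea, not Solr's actual code; the class and method names are hypothetical:

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch: at most one recovery thread per core. A later recovery request
// for the same core quits immediately unless "force" is set, so a storm of
// update failures cannot spawn thousands of recovery threads.
public class RecoveryGuard {
    private final ConcurrentHashMap<String, Boolean> recovering = new ConcurrentHashMap<>();

    /** Returns true if the caller won the right to run recovery for this core. */
    public boolean tryStartRecovery(String coreName, boolean force) {
        if (force) {
            // Forced recovery always proceeds and marks the core as recovering.
            recovering.put(coreName, Boolean.TRUE);
            return true;
        }
        // putIfAbsent is atomic: only the first caller sees null and wins.
        return recovering.putIfAbsent(coreName, Boolean.TRUE) == null;
    }

    /** Called when recovery for the core finishes, allowing a new attempt. */
    public void finishRecovery(String coreName) {
        recovering.remove(coreName);
    }
}
```

Because putIfAbsent makes the check-and-claim atomic, two concurrent recovery requests for the same core cannot both win, which is exactly what keeps the thread count (and the /overseer/queue traffic) bounded.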
[jira] [Created] (SOLR-6056) ZooKeeper crash and JVM stack OOM because of the recovery strategy
Raintung Li created SOLR-6056: - Summary: ZooKeeper crash and JVM stack OOM because of the recovery strategy Key: SOLR-6056 URL: https://issues.apache.org/jira/browse/SOLR-6056 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6 Environment: Two linux server, 65G, 16 core cup 20 collections, every collection has one shard two replica one zookeeper Reporter: Raintung Li
[jira] [Created] (SOLR-5938) ConcurrentUpdateSolrServer doesn't parse the response when the response status code isn't 200
Raintung Li created SOLR-5938: - Summary: ConcurrentUpdateSolrServer doesn't parse the response when the response status code isn't 200 Key: SOLR-5938 URL: https://issues.apache.org/jira/browse/SOLR-5938 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6.1 Environment: one cloud has two servers, one shard, one leader and one replica; the index is sent to the replica server, and the replica server forwards it to the leader server. Reporter: Raintung Li ConcurrentUpdateSolrServer only reports that an error occurred; it doesn't parse the response body, so you can't get the error reason from the remote server. For example: you send an index request to one Solr server, and this server forwards it to the leader. The forwarding path goes through ConcurrentUpdateSolrServer.java, so if an error happens you can't get the right error message without checking the leader server, even though the leader actually sent the error message back to the forwarding server.
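The requested behavior could look roughly like this. RemoteErrorReporter and describe are hypothetical names for illustration, not SolrJ API; the point is only that a non-200 response's body should be surfaced to the caller instead of being dropped:

```java
// Sketch: on a non-200 status, read the response body so the caller sees
// the leader's real error message instead of a bare status code.
public class RemoteErrorReporter {
    public static String describe(int statusCode, String responseBody) {
        if (statusCode == 200) {
            return "OK";
        }
        // Fall back to a placeholder only when the remote sent nothing back.
        String detail = (responseBody == null || responseBody.trim().isEmpty())
                ? "(no body returned)"
                : responseBody.trim();
        return "HTTP " + statusCode + " from remote server: " + detail;
    }
}
```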
[jira] [Updated] (SOLR-5938) ConcurrentUpdateSolrServer doesn't parse the response when the response status code isn't 200
[ https://issues.apache.org/jira/browse/SOLR-5938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-5938: -- Attachment: SOLR-5938.txt The patch file.
[jira] [Updated] (SOLR-5938) ConcurrentUpdateSolrServer doesn't parse the response when the response status code isn't 200
[ https://issues.apache.org/jira/browse/SOLR-5938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-5938: -- Attachment: SOLR-5938.txt
[jira] [Updated] (SOLR-5938) ConcurrentUpdateSolrServer doesn't parse the response when the response status code isn't 200
[ https://issues.apache.org/jira/browse/SOLR-5938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-5938: -- Attachment: (was: SOLR-5938.txt)
[jira] [Updated] (SOLR-5842) facet.pivot needs to provide more information and additional functions
[ https://issues.apache.org/jira/browse/SOLR-5842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-5842: -- Attachment: patch-5842-2.txt Fix the bug.

facet.pivot needs to provide more information and additional functions - Key: SOLR-5842 URL: https://issues.apache.org/jira/browse/SOLR-5842 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.6 Reporter: Raintung Li Attachments: patch-5842-2.txt, patch-5842.txt

Because facet supports facet.limit and facet.offset, we can't get the size of the next pivot level for facet.pivot. If you want the next pivot size, you have to set facet.limit to the max integer and then count the array yourself; that returns a lot of terms for the pivot field, which burdens the network and the client. The patch adds some functions to the API. For example:

facet=true&facet.pivot=test,testb,id
facet.pivot.min.field=id -- get the min id value
facet.pivot.max.field=id -- get the max id value
facet.pivot.sum.field=id -- sum the id values
facet.pivot.count=true -- enable the array-size function
facet.pivot.count.field=id -- get the id array size
facet.pivot.count.next=true -- get the next pivot field's array size

Response (abridged):

<lst name="facet_pivot">
  <long name="idSUM">572</long>
  <long name="idMAX">333</long>
  <long name="idMIN">1</long>
  <long name="idArrCount">12</long>
  <arr name="test,testb,id">
    <lst>
      <str name="field">test</str>
      <str name="value">change.me</str>
      <int name="count">5</int>
      <long name="idSUM">91</long>
      <long name="idMAX">33</long>
      <long name="idMIN">1</long>
      <long name="idArrCount">5</long>
      <long name="testbArrCount">2</long>
      <arr name="pivot">
        <lst>
          <str name="field">testb</str>
          <str name="value">test</str>
          <int name="count">1</int>
          <long name="idSUM">3</long>
          <long name="idMAX">3</long>
          <long name="idMIN">3</long>
          <long name="idArrCount">1</long>
          <arr name="pivot">
            <lst>
              <str name="field">id</str>
              <int name="value">3</int>
              <int name="count">1</int>
            </lst>
          </arr>
        </lst>
        <lst>
          <str name="field">testb</str>
          <null name="value"/>
          <int name="count">4</int>
          <long name="idSUM">88</long>
          <long name="idMAX">33</long>
          <long name="idMIN">1</long>
          <long name="idArrCount">4</long>
          <arr name="pivot">
            <lst><str name="field">id</str><int name="value">1</int><int name="count">1</int></lst>
            <lst><str name="field">id</str><int name="value">22</int><int name="count">1</int></lst>
            <lst><str name="field">id</str><int name="value">32</int><int name="count">1</int></lst>
            <lst><str name="field">id</str><int name="value">33</int><int name="count">1</int></lst>
          </arr>
        </lst>
      </arr>
    </lst>
    ...
  </arr>
</lst>
[jira] [Created] (SOLR-5887) Document exception doesn't give core information
Raintung Li created SOLR-5887: - Summary: Document exception doesn't give core information Key: SOLR-5887 URL: https://issues.apache.org/jira/browse/SOLR-5887 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.7, 4.6.1, 4.6 Reporter: Raintung Li ERROR: [doc=7ee72880-4352-402c-a614-cbc73d9c470a] unknown field 'location' Document validation errors need core information. If you have many cores, it is very hard to find which core has the issue.
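The requested fix amounts to prefixing validation errors with the core name. A minimal sketch, assuming a hypothetical coreName parameter (this is not Solr's actual error-building API):

```java
// Sketch: include the core name in document-validation error messages so
// multi-core deployments can tell which core produced the bad document.
public class DocErrors {
    public static String unknownField(String coreName, String docId, String field) {
        return "ERROR: [core=" + coreName + "] [doc=" + docId
                + "] unknown field '" + field + "'";
    }
}
```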
[jira] [Updated] (SOLR-5887) Document exception doesn't give core information
[ https://issues.apache.org/jira/browse/SOLR-5887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-5887: -- Attachment: SOLR-5887.txt
[jira] [Created] (SOLR-5842) facet.pivot needs to provide more information and additional functions
Raintung Li created SOLR-5842: - Summary: facet.pivot needs to provide more information and additional functions Key: SOLR-5842 URL: https://issues.apache.org/jira/browse/SOLR-5842 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.6 Reporter: Raintung Li Because facet supports facet.limit and facet.offset, we can't get the size of the next pivot level for facet.pivot. If you want the next pivot size, you have to set facet.limit to the max integer and then count the array yourself; that returns a lot of terms for the pivot field, which burdens the network and the client.
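The semantics of the proposed per-bucket statistics (idSUM, idMAX, idMIN, and idArrCount, the bucket size) can be illustrated with a tiny sketch over a bucket's id values. This only demonstrates what the numbers mean; it is not Solr's implementation, and PivotStats is a hypothetical name:

```java
import java.util.List;
import java.util.LongSummaryStatistics;

// Sketch: compute the proposed facet.pivot statistics for one bucket's
// "id" values, in the order {SUM, MAX, MIN, ArrCount}.
public class PivotStats {
    public static long[] summarize(List<Long> ids) {
        LongSummaryStatistics s = ids.stream()
                .mapToLong(Long::longValue)
                .summaryStatistics();
        return new long[] { s.getSum(), s.getMax(), s.getMin(), s.getCount() };
    }
}
```

For the test=change.me bucket in the example response, the id values 3, 1, 22, 32, 33 give idSUM=91, idMAX=33, idMIN=1, idArrCount=5, matching the values shown there.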
[jira] [Updated] (SOLR-5842) facet.pivot needs to provide more information and additional functions
[ https://issues.apache.org/jira/browse/SOLR-5842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-5842: -- Description: expanded with the proposed parameters and the example response given in the full description above.
[jira] [Updated] (SOLR-5842) facet.pivot needs to provide more information and additional functions
[ https://issues.apache.org/jira/browse/SOLR-5842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-5842: -- Attachment: patch-5842.txt Update the code for the new functions.
[jira] [Created] (SOLR-5784) Solr create collection can support clone, and alias can support adding a collection
Raintung Li created SOLR-5784: - Summary: Solr create collection can support clone, and alias can support adding a collection Key: SOLR-5784 URL: https://issues.apache.org/jira/browse/SOLR-5784 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.6.1, 4.6 Reporter: Raintung Li Solr API improvement. a. Clone a collection: create a new collection whose configuration is the same as another collection's, excluding the index data. This makes it easy to create a collection knowing only the other collection's name. URL example: http://localhost:8983/solr/admin/collections?action=clone&name=[new collection name]&cloneCollection=[clone name] b. Add one collection to an alias. The alias API currently requires listing every collection just to update an alias, which isn't easy to use: http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=alias&addCollections=collection1 Cases a and b make the alias function easier to use.
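The two proposed calls can be sketched as URL builders. Note that the clone action and the addCollections parameter are the reporter's proposal, not an existing Solr Collections API, and CollectionApiUrls is a hypothetical helper:

```java
// Sketch: building the Collections API URLs proposed in SOLR-5784.
public class CollectionApiUrls {
    /** Proposed: clone an existing collection's configuration (no index data). */
    public static String cloneCollection(String host, String newName, String source) {
        return "http://" + host + "/solr/admin/collections?action=clone"
                + "&name=" + newName + "&cloneCollection=" + source;
    }

    /** Proposed: add a single collection to an alias without listing the rest. */
    public static String addToAlias(String host, String alias, String collection) {
        return "http://" + host + "/solr/admin/collections?action=CREATEALIAS"
                + "&name=" + alias + "&addCollections=" + collection;
    }
}
```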
[jira] [Commented] (SOLR-5747) Updating the config requires manually restarting the cluster servers
[ https://issues.apache.org/jira/browse/SOLR-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909984#comment-13909984 ] Raintung Li commented on SOLR-5747: --- My point is that this is important for operations: Solr Cloud should reload automatically. For the ZK issue, I have moved the reload task to another thread; we can also control the reload event frequency, and even take a lock from ZK to serialize reloads and avoid overloading ZK. Automatic reload works like restarting piece by piece, and I agree it can be made configurable. Manual reload remains the last resort if automatic reload doesn't work. Updating the config requires manually restarting the cluster servers - Key: SOLR-5747 URL: https://issues.apache.org/jira/browse/SOLR-5747 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.5, 4.5.1, 4.6, 4.6.1 Environment: linux, solr cloud Reporter: Raintung Li Labels: collection, config Attachments: patch-5747 Many collections share one config. If I update the config, I need to manually reload the collections one by one. Solr could monitor the config for every collection: on a config update, the monitor notifies and automatically reloads the collection on every Solr core.
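The frequency control the comment mentions can be sketched as a throttle that coalesces config-change notifications so reloads never run more often than a minimum interval. This is an illustration of the idea only; ReloadThrottle is a hypothetical name, not part of the attached patch:

```java
// Sketch: rate-limit automatic collection reloads triggered by config
// change notifications, so a burst of ZK events causes one reload, not many.
public class ReloadThrottle {
    private final long minIntervalMs;
    // Initialized far in the past so the very first notification reloads.
    private long lastReload = Long.MIN_VALUE / 2;

    public ReloadThrottle(long minIntervalMs) {
        this.minIntervalMs = minIntervalMs;
    }

    /** Returns true if a reload should run now for this notification. */
    public synchronized boolean shouldReload(long nowMs) {
        if (nowMs - lastReload >= minIntervalMs) {
            lastReload = nowMs;
            return true;
        }
        return false;
    }
}
```

Notifications that arrive inside the window are simply dropped; since the eventual reload picks up the latest config from ZK anyway, no information is lost.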
[jira] [Created] (SOLR-5767) dataimport configuration file issue in the solr cloud
Raintung Li created SOLR-5767: - Summary: dataimport configuration file issue in the solr cloud Key: SOLR-5767 URL: https://issues.apache.org/jira/browse/SOLR-5767 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler, SolrCloud Affects Versions: 4.6.1, 4.6 Reporter: Raintung Li Many collections can share one config, but the dataimport configuration file should be bundled with the collection, not with the config. The data import module uses SolrResourceLoader to load the config file and writes the result into the dataimport.properties file. Config file path (ZK): /configs/[configname]/data-config.xml, or the classpath. Result path (ZK): /configs/[CollectionName]/dataimport.properties. This looks very confusing; perhaps we can change it to a consistent design.
[jira] [Updated] (SOLR-5767) dataimport configuration file issue in the solr cloud
[ https://issues.apache.org/jira/browse/SOLR-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-5767: -- Description: Many collections can share one config, but the dataimport configuration file should be bundled with the collection, not with the config. The data import module uses SolrResourceLoader to load the config file and writes the result into the dataimport.properties file. Config file path (ZK): /configs/[configname]/data-config.xml, or the classpath. Result path (ZK): /configs/[CollectionName]/dataimport.properties. This looks very confusing; perhaps we can change it to a consistent design, like the one below: /configs/[configname]/dataimport/[CollectionName]/data-config.xml /configs/[configname]/dataimport/[CollectionName]/dataimport.properties was: Many collections can share one config, but the dataimport configuration file should be bundled with the collection, not with the config. The data import module uses SolrResourceLoader to load the config file and writes the result into the dataimport.properties file. Config file path (ZK): /configs/[configname]/data-config.xml, or the classpath. Result path (ZK): /configs/[CollectionName]/dataimport.properties. This looks very confusing; perhaps we can change it to a consistent design. dataimport configuration file issue in the solr cloud - Key: SOLR-5767 URL: https://issues.apache.org/jira/browse/SOLR-5767 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler, SolrCloud Affects Versions: 4.6, 4.6.1 Reporter: Raintung Li Many collections can share one config, but the dataimport configuration file should be bundled with the collection, not with the config. The data import module uses SolrResourceLoader to load the config file and writes the result into the dataimport.properties file.
Config file path (ZK): /configs/[configname]/data-config.xml, or the classpath. Result path (ZK): /configs/[CollectionName]/dataimport.properties. This looks very confusing; perhaps we can change it to a consistent design, like the one below: /configs/[configname]/dataimport/[CollectionName]/data-config.xml /configs/[configname]/dataimport/[CollectionName]/dataimport.properties
[jira] [Updated] (SOLR-5767) dataimport configuration file issue in the solr cloud
[ https://issues.apache.org/jira/browse/SOLR-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-5767: -- Attachment: patch-5767.txt dataimport configuration file issue in the solr cloud - Key: SOLR-5767 URL: https://issues.apache.org/jira/browse/SOLR-5767 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler, SolrCloud Affects Versions: 4.6, 4.6.1 Reporter: Raintung Li Attachments: patch-5767.txt Many collections can share one config, but the dataimport configuration file should be bundled with the collection, not with the config. The data import module uses SolrResourceLoader to load the config file and writes the result into the dataimport.properties file. Config file path (ZK): /configs/[configname]/data-config.xml, or the classpath. Result path (ZK): /configs/[CollectionName]/dataimport.properties. This looks very confusing; perhaps we can change it to a consistent design, like the one below: /configs/[configname]/dataimport/[CollectionName]/data-config.xml /configs/[configname]/dataimport/[CollectionName]/dataimport.properties
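The current and proposed ZK layouts from SOLR-5767 can be sketched side by side; the helper names are illustrative, not Solr code:

```python
def current_paths(configname, collection):
    # Current layout: the config file lives under the config node, but the
    # dataimport.properties result is written under the collection name,
    # which is the inconsistency the issue describes.
    return {
        "data_config": f"/configs/{configname}/data-config.xml",
        "properties": f"/configs/{collection}/dataimport.properties",
    }

def proposed_paths(configname, collection):
    # Proposed layout from the issue: both files bundled per collection
    # under the owning config node.
    base = f"/configs/{configname}/dataimport/{collection}"
    return {
        "data_config": f"{base}/data-config.xml",
        "properties": f"{base}/dataimport.properties",
    }

print(proposed_paths("conf1", "collection1"))
```

With the proposed layout, every path is rooted at the config that owns it, so per-collection dataimport state no longer collides with the shared config tree.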
[jira] [Created] (SOLR-5747) Update the config need manual restart the cluster server.
Raintung Li created SOLR-5747: - Summary: Update the config need manual restart the cluster server. Key: SOLR-5747 URL: https://issues.apache.org/jira/browse/SOLR-5747 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.6.1, 4.6, 4.5.1, 4.5 Environment: linux, solr cloud Reporter: Raintung Li Many collections share one config. If I update the config, I have to reload the collections manually one by one. Solr could monitor the config for every collection: on a config update, the monitor would be notified and automatically reload the collection on every Solr core.
[jira] [Updated] (SOLR-5747) Update the config need manual restart the cluster server.
[ https://issues.apache.org/jira/browse/SOLR-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-5747: -- Attachment: patch-5747 Update the config need manual restart the cluster server. - Key: SOLR-5747 URL: https://issues.apache.org/jira/browse/SOLR-5747 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.5, 4.5.1, 4.6, 4.6.1 Environment: linux, solr cloud Reporter: Raintung Li Labels: collection, config Attachments: patch-5747 Many collections share one config. If I update the config, I have to reload the collections manually one by one. Solr could monitor the config for every collection: on a config update, the monitor would be notified and automatically reload the collection on every Solr core.
[jira] [Commented] (SOLR-5747) Update the config need manual restart the cluster server.
[ https://issues.apache.org/jira/browse/SOLR-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905327#comment-13905327 ] Raintung Li commented on SOLR-5747: --- Add one new file in ZooKeeper under /configs/[configname], and have ZkContainer register a watcher on it. Any update to a config file also updates this file last, so the ZK watcher is notified and reloads the core through the CoreContainer reload method. Update the config need manual restart the cluster server. - Key: SOLR-5747 URL: https://issues.apache.org/jira/browse/SOLR-5747 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.5, 4.5.1, 4.6, 4.6.1 Environment: linux, solr cloud Reporter: Raintung Li Labels: collection, config Attachments: patch-5747 Many collections share one config. If I update the config, I have to reload the collections manually one by one. Solr could monitor the config for every collection: on a config update, the monitor would be notified and automatically reload the collection on every Solr core.
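The comment's scheme can be sketched without ZooKeeper: a marker (here an integer version, standing in for the ZK node touched after every config update) is watched, and when its version changes every core bound to that config is reloaded. Class and method names are illustrative, not Solr's:

```python
class ConfigWatcher:
    """Reload all cores of a config when its version marker changes."""

    def __init__(self, reload_core):
        self.reload_core = reload_core   # callback, e.g. a core-reload hook
        self.seen = {}                   # configname -> last seen version
        self.cores = {}                  # configname -> [core names]

    def register(self, configname, core):
        self.cores.setdefault(configname, []).append(core)

    def notify(self, configname, version):
        # Mimics the ZK watch firing on the marker node: reload only when
        # the version actually changed, which also bounds reload frequency.
        if self.seen.get(configname) != version:
            self.seen[configname] = version
            for core in self.cores.get(configname, []):
                self.reload_core(core)

reloaded = []
w = ConfigWatcher(reloaded.append)
w.register("conf1", "collection1_shard1")
w.register("conf1", "collection2_shard1")
w.notify("conf1", 1)   # first update: both cores reload
w.notify("conf1", 1)   # same version again: no extra reloads
```

Moving the actual reload onto another thread and rate-limiting notifications, as the comment suggests, would bolt onto `notify` without changing this shape.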
[jira] [Created] (SOLR-5674) The rows improvement for QueryComponent
Raintung Li created SOLR-5674: - Summary: The rows improvement for QueryComponent Key: SOLR-5674 URL: https://issues.apache.org/jira/browse/SOLR-5674 Project: Solr Issue Type: Bug Components: contrib - Clustering Affects Versions: 4.6, 4.5.1, 4.3.1 Environment: JVM7 Reporter: Raintung Li Solr rows issues: 1. Solr does not provide an API to fetch the full result set, so users typically set rows to Integer.MAX_VALUE to try to get all results, which causes other problems, e.g. the OOM issue in SOLR-5661 (https://issues.apache.org/jira/browse/SOLR-5661). How about supporting rows=-1 in the API, meaning "return the full result set"? Sometimes the result count is very large and would cause a heap OOM, but we can advise users to call this API only when the result set is known to be small. The point is to avoid making two calls to get the full results: one call to get the total count, and a second with rows set to that total. 2. A small improvement: since every shard returns its results already ordered, the first shard's list can be added to the PriorityQueue without comparisons; only entries with duplicate unique ids need to be filtered. 3. Create the PriorityQueue only after checking the sizes returned by all shards, to avoid unnecessary memory cost, especially for very large rows.
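Point 2 can be sketched as a k-way merge: each shard's list is already sorted, so merging needs only a heap over the shard heads, filtering repeated unique ids. A pure-Python illustration, not Solr's actual ShardFieldSortedHitQueue:

```python
import heapq

def merge_shard_results(shards, rows=None):
    """Merge per-shard (score, unique_id) lists that are already sorted by
    descending score, dropping duplicate ids; rows=None means 'all'
    (the proposed rows=-1 behaviour)."""
    seen, merged = set(), []
    # heapq.merge consumes the pre-sorted inputs without re-sorting them,
    # keeping only one head element per shard in memory.
    for score, doc_id in heapq.merge(*shards, key=lambda d: -d[0]):
        if doc_id in seen:
            continue                     # same doc returned by two shards
        seen.add(doc_id)
        merged.append((score, doc_id))
        if rows is not None and len(merged) == rows:
            break                        # stop early instead of sizing for rows
    return merged

shard_a = [(0.9, "d1"), (0.5, "d3")]
shard_b = [(0.8, "d2"), (0.5, "d3"), (0.1, "d4")]
print(merge_shard_results([shard_a, shard_b]))
```

Stopping as soon as `rows` results are collected is also the spirit of point 3: the merge structure is sized by the shards' heads, never by the raw rows parameter.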
[jira] [Updated] (SOLR-5674) The rows improvement for QueryComponent
[ https://issues.apache.org/jira/browse/SOLR-5674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-5674: -- Attachment: SOLR-5674.txt I don't have an environment to test; please take a look. The rows improvement for QueryComponent -- Key: SOLR-5674 URL: https://issues.apache.org/jira/browse/SOLR-5674 Project: Solr Issue Type: Bug Components: contrib - Clustering Affects Versions: 4.3.1, 4.5.1, 4.6 Environment: JVM7 Reporter: Raintung Li Labels: QueryComponet, rows Attachments: SOLR-5674.txt Solr rows issues: 1. Solr does not provide an API to fetch the full result set, so users typically set rows to Integer.MAX_VALUE to try to get all results, which causes other problems, e.g. the OOM issue in SOLR-5661 (https://issues.apache.org/jira/browse/SOLR-5661). How about supporting rows=-1 in the API, meaning "return the full result set"? Sometimes the result count is very large and would cause a heap OOM, but we can advise users to call this API only when the result set is known to be small. The point is to avoid making two calls to get the full results: one call to get the total count, and a second with rows set to that total. 2. A small improvement: since every shard returns its results already ordered, the first shard's list can be added to the PriorityQueue without comparisons; only entries with duplicate unique ids need to be filtered. 3. Create the PriorityQueue only after checking the sizes returned by all shards, to avoid unnecessary memory cost, especially for very large rows.
[jira] [Commented] (SOLR-5661) PriorityQueue has OOM (Requested array size exceeds VM limit) issue
[ https://issues.apache.org/jira/browse/SOLR-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13885114#comment-13885114 ] Raintung Li commented on SOLR-5661: --- I created the other issue, SOLR-5674, to track the very-large-rows issue separately. PriorityQueue has OOM (Requested array size exceeds VM limit) issue --- Key: SOLR-5661 URL: https://issues.apache.org/jira/browse/SOLR-5661 Project: Solr Issue Type: Bug Components: contrib - Solr Cell (Tika extraction) Affects Versions: 4.3.1, 4.4, 4.5, 4.5.1, 4.6 Environment: JDK 7 Reporter: Raintung Li Assignee: Michael McCandless Fix For: 5.0, 4.7 Attachments: patch-5661.txt It looks like JDK 7 changed the max_array_length logic: it is no longer max_jint but max_jint - header_size(type). If you pass Integer.MAX_VALUE when creating the PriorityQueue and have enough memory, it works on JVM 6 but not on JVM 7: JVM 7 throws an OOM error during array range checking. It appears to be a compatibility issue between JVM 6 and JVM 7. The code should probably protect against this, since throwing OOM looks like a serious problem for customers.
[jira] [Commented] (SOLR-5661) PriorityQueue has OOM (Requested array size exceeds VM limit) issue
[ https://issues.apache.org/jira/browse/SOLR-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13882535#comment-13882535 ] Raintung Li commented on SOLR-5661: --- Yes, I mean very many rows, i.e. rows set to max integer. The root node collects the results from the different shard nodes to combine and merge them, and in this case it directly creates the queue at that size. Fixing the Lucene side is enough for Lucene; for Solr, the issue is how to handle a huge rows value, which needs different logic and cannot directly create a huge queue. PriorityQueue has OOM (Requested array size exceeds VM limit) issue --- Key: SOLR-5661 URL: https://issues.apache.org/jira/browse/SOLR-5661 Project: Solr Issue Type: Bug Components: contrib - Solr Cell (Tika extraction) Affects Versions: 4.3.1, 4.4, 4.5, 4.5.1, 4.6 Environment: JDK 7 Reporter: Raintung Li Assignee: Michael McCandless Fix For: 5.0, 4.7 Attachments: patch-5661.txt It looks like JDK 7 changed the max_array_length logic: it is no longer max_jint but max_jint - header_size(type). If you pass Integer.MAX_VALUE when creating the PriorityQueue and have enough memory, it works on JVM 6 but not on JVM 7: JVM 7 throws an OOM error during array range checking. It appears to be a compatibility issue between JVM 6 and JVM 7. The code should probably protect against this, since throwing OOM looks like a serious problem for customers.
[jira] [Created] (SOLR-5661) PriorityQueue has OOM (Requested array size exceeds VM limit) issue
Raintung Li created SOLR-5661: - Summary: PriorityQueue has OOM (Requested array size exceeds VM limit) issue Key: SOLR-5661 URL: https://issues.apache.org/jira/browse/SOLR-5661 Project: Solr Issue Type: Bug Components: contrib - Solr Cell (Tika extraction) Affects Versions: 4.6, 4.5.1, 4.5, 4.4, 4.3.1 Environment: JDK 7 Reporter: Raintung Li It looks like JDK 7 changed the max_array_length logic: it is no longer max_jint but max_jint - header_size(type). If you pass Integer.MAX_VALUE when creating the PriorityQueue and have enough memory, it works on JVM 6 but not on JVM 7: JVM 7 throws an OOM error during array range checking. It appears to be a compatibility issue between JVM 6 and JVM 7. The code should probably protect against this, since throwing OOM looks like a serious problem for customers.
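The protection the report asks for amounts to a clamp: never size the queue from the raw rows parameter, but from what can actually be held. The ceiling below is illustrative; JDK 7 HotSpot rejects array sizes within a few header words of Integer.MAX_VALUE:

```python
# Illustrative ceiling: a few array-header words below 2^31 - 1,
# mirroring HotSpot's max_jint - header_size(type) limit on JDK 7.
MAX_SAFE_ARRAY_SIZE = 2**31 - 1 - 8

def safe_queue_size(requested_rows, total_hits):
    """Clamp a priority-queue size so huge rows values (Integer.MAX_VALUE,
    or the proposed rows=-1) cannot trigger
    'Requested array size exceeds VM limit'."""
    if requested_rows < 0:            # rows=-1: full result set
        requested_rows = total_hits
    # Never allocate beyond the actual hit count or the VM's array limit.
    return min(requested_rows, total_hits, MAX_SAFE_ARRAY_SIZE)

print(safe_queue_size(2**31 - 1, 1000))   # huge rows, few hits -> 1000
print(safe_queue_size(-1, 1000))          # rows=-1 -> total hits -> 1000
```

This also matches point 3 of SOLR-5674: the queue is created only after the shard sizes are known, so `total_hits` bounds the allocation.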
[jira] [Commented] (SOLR-5661) PriorityQueue has OOM (Requested array size exceeds VM limit) issue
[ https://issues.apache.org/jira/browse/SOLR-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13881610#comment-13881610 ] Raintung Li commented on SOLR-5661: --- If you have multiple shards for one collection, send a query with rows set to max integer; it is easy to reproduce. Caused by: java.lang.OutOfMemoryError: Requested array size exceeds VM limit at org.apache.lucene.util.PriorityQueue.&lt;init&gt;(PriorityQueue.java:64) at org.apache.lucene.util.PriorityQueue.&lt;init&gt;(PriorityQueue.java:37) at org.apache.solr.handler.component.ShardFieldSortedHitQueue.&lt;init&gt;(ShardDoc.java:113) at org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:790) at org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:649) at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:628) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:311) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:710) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:413) PriorityQueue has OOM (Requested array size exceeds VM limit) issue --- Key: SOLR-5661 URL: https://issues.apache.org/jira/browse/SOLR-5661 Project: Solr Issue Type: Bug Components: contrib - Solr Cell (Tika extraction) Affects Versions: 4.3.1, 4.4, 4.5, 4.5.1, 4.6 Environment: JDK 7 Reporter: Raintung Li It looks like JDK 7 changed the max_array_length logic: it is no longer max_jint but max_jint - header_size(type). If you pass Integer.MAX_VALUE when creating the PriorityQueue and have enough memory, it works on JVM 6 but not on JVM 7: JVM 7 throws an OOM error during array range checking. It appears to be a compatibility issue between JVM 6 and JVM 7.
The code should probably protect against this, since throwing OOM looks like a serious problem for customers.
[jira] [Updated] (SOLR-5661) PriorityQueue has OOM (Requested array size exceeds VM limit) issue
[ https://issues.apache.org/jira/browse/SOLR-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-5661: -- Attachment: patch-5661.txt Add the protection logic. PriorityQueue has OOM (Requested array size exceeds VM limit) issue --- Key: SOLR-5661 URL: https://issues.apache.org/jira/browse/SOLR-5661 Project: Solr Issue Type: Bug Components: contrib - Solr Cell (Tika extraction) Affects Versions: 4.3.1, 4.4, 4.5, 4.5.1, 4.6 Environment: JDK 7 Reporter: Raintung Li Attachments: patch-5661.txt It looks like JDK 7 changed the max_array_length logic: it is no longer max_jint but max_jint - header_size(type). If you pass Integer.MAX_VALUE when creating the PriorityQueue and have enough memory, it works on JVM 6 but not on JVM 7: JVM 7 throws an OOM error during array range checking. It appears to be a compatibility issue between JVM 6 and JVM 7. The code should probably protect against this, since throwing OOM looks like a serious problem for customers.
[jira] [Created] (SOLR-5646) SolrEventListener listener should have destroy function
Raintung Li created SOLR-5646: - Summary: SolrEventListener listener should have destroy function Key: SOLR-5646 URL: https://issues.apache.org/jira/browse/SOLR-5646 Project: Solr Issue Type: Improvement Components: multicore Affects Versions: 4.6, 4.5.1, 4.5 Environment: normal multiple core Reporter: Raintung Li Solr supports custom listeners, but a listener is only registered when the core loads; there is no destroy function invoked when the core closes or the server shuts down. SolrEventListener could provide a destroy function, invoked on core close, so the listener can release its resources (threads, caches, or others).
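The proposed lifecycle can be sketched as: the core tracks its registered listeners and calls a destroy hook on each when it closes. The names are illustrative, not the actual SolrEventListener interface:

```python
class EventListener:
    def __init__(self, name):
        self.name = name
        self.destroyed = False

    def destroy(self):
        # Release listener-owned resources: threads, caches, etc.
        self.destroyed = True

class Core:
    def __init__(self):
        self.listeners = []

    def register(self, listener):
        # Today this registration half exists; the destroy half does not.
        self.listeners.append(listener)

    def close(self):
        # The missing hook from the issue: on core close / server shutdown,
        # give every listener a chance to clean up.
        for listener in self.listeners:
            listener.destroy()

core = Core()
listener = EventListener("newSearcherListener")
core.register(listener)
core.close()
print(listener.destroyed)   # True
```

Without the `close`-time loop, any thread or cache the listener created at registration simply leaks when the core is unloaded, which is the issue's complaint.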
[jira] [Updated] (SOLR-5646) SolrEventListener listener should have destroy function
[ https://issues.apache.org/jira/browse/SOLR-5646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-5646: -- Attachment: patch-5646 SolrEventListener listener should have destroy function --- Key: SOLR-5646 URL: https://issues.apache.org/jira/browse/SOLR-5646 Project: Solr Issue Type: Improvement Components: multicore Affects Versions: 4.5, 4.5.1, 4.6 Environment: normal multiple core Reporter: Raintung Li Labels: destroy, listener Attachments: patch-5646 Original Estimate: 12h Remaining Estimate: 12h Solr supports custom listeners, but a listener is only registered when the core loads; there is no destroy function invoked when the core closes or the server shuts down. SolrEventListener could provide a destroy function, invoked on core close, so the listener can release its resources (threads, caches, or others).
[jira] [Updated] (SOLR-5546) Add the aggregate function in the faceted function
[ https://issues.apache.org/jira/browse/SOLR-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-5546: -- Description: Faceting only provides the count for each value of one field; sometimes we also want to aggregate another field within the same facet, not only count it. The API could look like: Date Query: facet=true&facet.date=tTransDate&facet.date.start=NOW/MONTH-12MONTHS&facet.date.end=NOW&facet.date.gap=%2B1MONTH&f.tTransDate.facet.sum.field=amountMoney&f.tTransDate.facet.max.field=amountMoney Query: facet=true&facet.query=amountMoney:[*+TO+500]&facet.query=amountMoney:[500+TO+*]&f.query.facet.sum.field=amountMoney Range: facet.range=amountMoney&f.amountMoney.facet.range.start=0&f.amountMoney.facet.range.end=1000&f.amountMoney.facet.range.gap=100&f.amountMoney.facet.sum.field=amountMoney Field: facet=true&facet.field=amountMoney&f.amountMoney.facet.sum.field=amountMoney facetd.rt=sum(field1),max(field2),count means the response XML would be:
<lst name="count"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
<lst name="sum.field1"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
<lst name="max.field2"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
was: Faceting only provides the count for each value of one field; sometimes we also want to aggregate another field within the same facet, not only count it. The API could use facetd.rt=sum(field1),max(field2),count, meaning the response XML would be:
<lst name="count"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
<lst name="sum(field1)"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
<lst name="max(field2)"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
Add the aggregate function in the faceted function -- Key: SOLR-5546 URL: https://issues.apache.org/jira/browse/SOLR-5546 Project: Solr Issue Type: Improvement Components: contrib - Clustering Affects Versions: 4.5.1 Reporter: Raintung Li Faceting only provides the count for each value of one field; sometimes we also want to aggregate another field within the same facet, not only count it.
The API could look like: Date Query: facet=true&facet.date=tTransDate&facet.date.start=NOW/MONTH-12MONTHS&facet.date.end=NOW&facet.date.gap=%2B1MONTH&f.tTransDate.facet.sum.field=amountMoney&f.tTransDate.facet.max.field=amountMoney Query: facet=true&facet.query=amountMoney:[*+TO+500]&facet.query=amountMoney:[500+TO+*]&f.query.facet.sum.field=amountMoney Range: facet.range=amountMoney&f.amountMoney.facet.range.start=0&f.amountMoney.facet.range.end=1000&f.amountMoney.facet.range.gap=100&f.amountMoney.facet.sum.field=amountMoney Field: facet=true&facet.field=amountMoney&f.amountMoney.facet.sum.field=amountMoney facetd.rt=sum(field1),max(field2),count means the response XML would be:
<lst name="count"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
<lst name="sum.field1"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int>
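The requested behaviour, keeping not only the count but also aggregates of another field per facet bucket, can be sketched as a single pass over the documents. Field names follow the issue's example; this is an illustration, not Solr's faceting code:

```python
from collections import defaultdict

def facet_aggregate(docs, facet_field, agg_field):
    """For each value of facet_field, compute count plus sum and max of
    agg_field -- the sum.field / max.field idea from the issue."""
    buckets = defaultdict(lambda: {"count": 0, "sum": 0, "max": None})
    for doc in docs:
        bucket = buckets[doc[facet_field]]
        value = doc[agg_field]
        bucket["count"] += 1
        bucket["sum"] += value
        bucket["max"] = value if bucket["max"] is None else max(bucket["max"], value)
    return dict(buckets)

docs = [
    {"tTransDate": "2007-08-11", "amountMoney": 100},
    {"tTransDate": "2007-08-11", "amountMoney": 250},
    {"tTransDate": "2007-08-12", "amountMoney": 40},
]
print(facet_aggregate(docs, "tTransDate", "amountMoney"))
```

Each key of the returned dict corresponds to one facet bucket, i.e. one `<int name="...">` entry in the proposed `count`, `sum.field1`, and `max.field2` response lists.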
[jira] [Created] (SOLR-5546) Add the aggregate function in the faceted function
Raintung Li created SOLR-5546: - Summary: Add the aggregate function in the faceted function Key: SOLR-5546 URL: https://issues.apache.org/jira/browse/SOLR-5546 Project: Solr Issue Type: Improvement Components: contrib - Clustering Affects Versions: 4.5.1 Reporter: Raintung Li
[jira] [Updated] (SOLR-5546) Add the aggregate function in the faceted function
[ https://issues.apache.org/jira/browse/SOLR-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-5546: -- Description: Faceting only provides the count for each value of one field; sometimes we also want to aggregate another field within the same facet, not only count it. The API could use facetd.rt=sum(field1),max(field2),count, meaning the response XML would be:
<lst name="count"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
<lst name="sum(field1)"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
<lst name="max(field1)"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
Add the aggregate function in the faceted function -- Key: SOLR-5546 URL: https://issues.apache.org/jira/browse/SOLR-5546 Project: Solr Issue Type: Improvement Components: contrib - Clustering Affects Versions: 4.5.1 Reporter: Raintung Li Faceting only provides the count for each value of one field; sometimes we also want to aggregate another field within the same facet, not only count it.
The API could use facetd.rt=sum(field1),max(field2),count, meaning the response XML would be:
<lst name="count"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
<lst name="sum(field1)"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
<lst name="max(field1)"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
[jira] [Updated] (SOLR-5546) Add the aggregate function in the faceted function
[ https://issues.apache.org/jira/browse/SOLR-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-5546: -- Description: Faceting only provides the count for each value of one field; sometimes we also want to aggregate another field within the same facet, not only count it. The API could use facetd.rt=sum(field1),max(field2),count, meaning the response XML would be:
<lst name="count"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
<lst name="sum(field1)"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
<lst name="max(field2)"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
was: Faceting only provides the count for each value of one field; sometimes we also want to aggregate another field within the same facet, not only count it. The API could use facetd.rt=sum(field1),max(field2),count, meaning the response XML would be:
<lst name="count"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
<lst name="sum(field1)"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
<lst name="max(field1)"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
Add the aggregate function in the faceted function -- Key: SOLR-5546 URL: https://issues.apache.org/jira/browse/SOLR-5546 Project: Solr Issue Type: Improvement Components: contrib - Clustering Affects Versions: 4.5.1 Reporter: Raintung Li Faceting only provides the count for each value of one field; sometimes we also want to aggregate another field within the same facet, not only count it.
The API could use facetd.rt=sum(field1),max(field2),count, meaning the response XML would be:
<lst name="count"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
<lst name="sum(field1)"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
<lst name="max(field2)"><int name="2007-08-11T00:00:00.000Z">1</int><int name="2007-08-12T00:00:00.000Z">5</int><int name="2007-08-13T00:00:00.000Z">3</int><int name="2007-08-14T00:00:00.000Z">7</int><int name="2007-08-15T00:00:00.000Z">2</int><int name="2007-08-16T00:00:00.000Z">16</int></lst>
[jira] [Commented] (SOLR-4408) Server hanging on startup
[ https://issues.apache.org/jira/browse/SOLR-4408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13633638#comment-13633638 ] Raintung Li commented on SOLR-4408: --- It is a different bug from SOLR-4400; the link can be removed.

Server hanging on startup - Key: SOLR-4408 URL: https://issues.apache.org/jira/browse/SOLR-4408 Project: Solr Issue Type: Bug Affects Versions: 4.1 Environment: OpenJDK 64-Bit Server VM (23.2-b09 mixed mode) Tomcat 7.0 Eclipse Juno + WTP Reporter: Francois-Xavier Bonnet Assignee: Erick Erickson Fix For: 4.3 Attachments: patch-4408.txt

While starting, the server hangs indefinitely. Everything works fine when I first start the server with no index created yet, but if I fill the index, then stop and start the server, it hangs. Could it be a lock that is never released? Here is what I get in a full thread dump:

2013-02-06 16:28:52 Full thread dump OpenJDK 64-Bit Server VM (23.2-b09 mixed mode):

"searcherExecutor-4-thread-1" prio=10 tid=0x7fbdfc16a800 nid=0x42c6 in Object.wait() [0x7fbe0ab1]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xc34c1c48> (a java.lang.Object)
        at java.lang.Object.wait(Object.java:503)
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1492)
        - locked <0xc34c1c48> (a java.lang.Object)
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1312)
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1247)
        at org.apache.solr.request.SolrQueryRequestBase.getSearcher(SolrQueryRequestBase.java:94)
        at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:213)
        at org.apache.solr.spelling.SpellCheckCollator.collate(SpellCheckCollator.java:112)
        at org.apache.solr.handler.component.SpellCheckComponent.addCollationsToResponse(SpellCheckComponent.java:203)
        at org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:180)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1816)
        at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:64)
        at org.apache.solr.core.SolrCore$5.call(SolrCore.java:1594)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)

"coreLoadExecutor-3-thread-1" prio=10 tid=0x7fbe04194000 nid=0x42c5 in Object.wait() [0x7fbe0ac11000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xc34c1c48> (a java.lang.Object)
        at java.lang.Object.wait(Object.java:503)
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1492)
        - locked <0xc34c1c48> (a java.lang.Object)
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1312)
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1247)
        at org.apache.solr.handler.ReplicationHandler.getIndexVersion(ReplicationHandler.java:495)
        at org.apache.solr.handler.ReplicationHandler.getStatistics(ReplicationHandler.java:518)
        at org.apache.solr.core.JmxMonitoredMap$SolrDynamicMBean.getMBeanInfo(JmxMonitoredMap.java:232)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getNewMBeanClassName(DefaultMBeanServerInterceptor.java:333)
        at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:319)
        at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:512)
        at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:140)
        at org.apache.solr.core.JmxMonitoredMap.put(JmxMonitoredMap.java:51)
        at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:636)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:809)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:607)
        at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:1003)
        at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1033)
        at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
        at
[jira] [Created] (SOLR-4705) HttpShardHandler null pointer exception
Raintung Li created SOLR-4705: - Summary: HttpShardHandler null pointer exception Key: SOLR-4705 URL: https://issues.apache.org/jira/browse/SOLR-4705 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.2, 4.2.1 Reporter: Raintung Li Priority: Blocker Calling the search URL select?q=test&shards=ip/solr/ makes the checkDistributed method throw a null pointer exception. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-4705) HttpShardHandler null pointer exception
[ https://issues.apache.org/jira/browse/SOLR-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-4705: -- Attachment: patch-4705 Adds null pointer protection. HttpShardHandler null pointer exception - Key: SOLR-4705 URL: https://issues.apache.org/jira/browse/SOLR-4705 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.2, 4.2.1 Reporter: Raintung Li Priority: Blocker Attachments: patch-4705 Calling the search URL select?q=test&shards=ip/solr/ makes the checkDistributed method throw a null pointer exception.
[jira] [Updated] (SOLR-4705) HttpShardHandler null pointer exception
[ https://issues.apache.org/jira/browse/SOLR-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-4705: -- Attachment: (was: patch-4705) HttpShardHandler null pointer exception - Key: SOLR-4705 URL: https://issues.apache.org/jira/browse/SOLR-4705 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.2, 4.2.1 Reporter: Raintung Li Priority: Blocker Attachments: patch-4705.txt Calling the search URL select?q=test&shards=ip/solr/ makes the checkDistributed method throw a null pointer exception.
[jira] [Updated] (SOLR-4705) HttpShardHandler null pointer exception
[ https://issues.apache.org/jira/browse/SOLR-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-4705: -- Attachment: patch-4705.txt HttpShardHandler null pointer exception - Key: SOLR-4705 URL: https://issues.apache.org/jira/browse/SOLR-4705 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.2, 4.2.1 Reporter: Raintung Li Priority: Blocker Attachments: patch-4705.txt Calling the search URL select?q=test&shards=ip/solr/ makes the checkDistributed method throw a null pointer exception.
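The patch above adds null pointer protection when the request's shards parameter is missing or malformed. A minimal sketch of that guard pattern in Python (the `parse_shards` helper and its behavior are hypothetical illustrations, not Solr's actual checkDistributed code):

```python
def parse_shards(params):
    """Return the list of shard addresses from a request's 'shards' parameter,
    guarding against a missing or empty value instead of failing with an NPE."""
    shards = params.get("shards")
    if shards is None:  # request without shards: fall back to an empty list
        return []
    return [s.strip() for s in shards.split(",") if s.strip()]

empty = parse_shards({"q": "test"})                       # no shards given
one = parse_shards({"q": "test", "shards": "ip/solr/"})   # one shard given
```

The key point is that every dereference of the parameter is preceded by an explicit presence check, so an incomplete request degrades gracefully rather than throwing.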
[jira] [Commented] (SOLR-4509) Disable Stale Check - Distributed Search (Performance)
[ https://issues.apache.org/jira/browse/SOLR-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591942#comment-13591942 ] Raintung Li commented on SOLR-4509: --- Markus, Tomcat's default number of worker threads is only 150, and the connection refused errors are caused by the socket listener queue filling up; you can increase the worker threads in the Tomcat configuration. Disable Stale Check - Distributed Search (Performance) -- Key: SOLR-4509 URL: https://issues.apache.org/jira/browse/SOLR-4509 Project: Solr Issue Type: Improvement Components: search Environment: 5 node SmartOS cluster (all nodes living in same global zone - i.e. same physical machine) Reporter: Ryan Zezeski Priority: Minor Attachments: baremetal-stale-nostale-med-latency.dat, baremetal-stale-nostale-med-latency.svg, baremetal-stale-nostale-throughput.dat, baremetal-stale-nostale-throughput.svg, IsStaleTime.java, SOLR-4509.patch By disabling the Apache HTTP Client stale check I've witnessed a 2-4x increase in throughput and a latency reduction of over 100ms. This patch was made in the context of a project I'm leading, called Yokozuna, which relies on distributed search. Here's the patch on Yokozuna: https://github.com/rzezeski/yokozuna/pull/26 Here's a write-up I did on my findings: http://www.zinascii.com/2013/solr-distributed-search-and-the-stale-check.html I'm happy to answer any questions or make changes to the patch to make it acceptable.
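Raising Tomcat's worker thread limit as suggested above is done on the HTTP connector in server.xml. A sketch under the assumption of a standard HTTP/1.1 connector (the values are illustrative, not a recommendation):

```xml
<!-- server.xml: raise maxThreads above the default of 150, and allow a
     deeper accept queue so request bursts are queued instead of refused -->
<Connector port="8080" protocol="HTTP/1.1"
           maxThreads="400"
           acceptCount="200"
           connectionTimeout="20000" />
```

`maxThreads` bounds the number of concurrent request-processing threads; `acceptCount` is the OS-level listen backlog used once all worker threads are busy, which is the queue whose overflow produces the connection refused errors mentioned in the comment.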
[jira] [Commented] (SOLR-4449) Enable backup requests for the internal solr load balancer
[ https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585565#comment-13585565 ] Raintung Li commented on SOLR-4449: --- Let me describe it clearly so we are on the same page.

Normal case: the client sends 1 search request. The servlet has a thread that handles the request; call it the main thread. The main thread starts 3 threads to send the request to the 3 shards (this collection has 3 shards) and blocks waiting for all 3 shard responses. Result: we need 4 threads in the normal case where no second request is sent.

Your case: the client sends 1 search request. The main thread starts 6 threads to send the request to the 3 shards. The main thread blocks waiting for 3 of those threads; the other 3 are started in the LB. Result: we need 7 threads in the normal case where no second request is sent.

My case: the client sends 1 search request. Change: the main thread waits for the 3 shard responses for a fixed time; for whichever shard times out, the main thread itself submits the second request. Result: we need 4 threads in the normal case where no second request is sent.

Enable backup requests for the internal solr load balancer -- Key: SOLR-4449 URL: https://issues.apache.org/jira/browse/SOLR-4449 Project: Solr Issue Type: New Feature Components: SolrCloud Reporter: philip hoy Priority: Minor Attachments: SOLR-4449.patch Add the ability to configure the built-in solr load balancer such that it submits a backup request to the next server in the list if the initial request takes too long. Employing such an algorithm could improve the latency of the 9xth percentile, albeit at the expense of increasing overall load due to additional requests.
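The scheme in "My case" above (the main thread waits with a fixed timeout and resubmits only for shards that time out, so no extra threads are consumed in the normal case) can be sketched with concurrent.futures. The shard callables, the `query_with_backup` helper, and the timeout value are illustrative assumptions, not Solr's HttpShardHandler API:

```python
import concurrent.futures

def query_with_backup(pool, primaries, backups, timeout):
    """Submit one request per shard, wait up to `timeout` for each, and only
    resubmit to the backup server for shards that did not answer in time."""
    futures = {shard: pool.submit(fn) for shard, fn in primaries.items()}
    results = {}
    for shard, fut in futures.items():
        try:
            results[shard] = fut.result(timeout=timeout)
        except concurrent.futures.TimeoutError:
            # Backup request issued from the main thread itself, so the
            # normal (no-timeout) path never needs the extra threads.
            results[shard] = pool.submit(backups[shard]).result()
    return results

pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)
primaries = {"shard1": lambda: "p1", "shard2": lambda: "p2"}
backups = {"shard1": lambda: "b1", "shard2": lambda: "b2"}
out = query_with_backup(pool, primaries, backups, timeout=1.0)
```

With 3 shards this is the 1 + 3 = 4 thread budget from the comment, versus 1 + 3*2 = 7 when backup threads are spawned up front.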
[jira] [Commented] (SOLR-4073) Overseer will miss operations in some cases for OverseerCollectionProcessor
[ https://issues.apache.org/jira/browse/SOLR-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585569#comment-13585569 ] Raintung Li commented on SOLR-4073: --- Yes, I made some of those changes in the other patch. Overseer will miss operations in some cases for OverseerCollectionProcessor Key: SOLR-4073 URL: https://issues.apache.org/jira/browse/SOLR-4073 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0 Environment: Solr cloud Reporter: Raintung Li Assignee: Mark Miller Fix For: 4.2, 5.0 Attachments: patch-4073 Original Estimate: 168h Remaining Estimate: 168h One overseer disconnects from ZooKeeper, but its overseer thread is still handling request (A) from the DistributedQueue. Example: the old overseer thread reconnects to ZooKeeper and tries to remove the top request with workQueue.remove(). Meanwhile, another server takes over the overseer role because the old overseer disconnected. It starts its own overseer thread, handles request (A) again, removes request (A) from the queue, and then tries to get the next top request (B, not yet fetched). At this moment the old overseer reconnects to ZooKeeper and removes the top request from the queue. The top request is now B, so it is the one removed by the old overseer server. The new overseer server never processes request B, because it was deleted by the old overseer server, so request (B)'s operations are lost. A better approach: distributedQueue.peek() could return the request's ID so that workQueue.remove(ID) removes that specific request, not whatever is at the top.
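The fix proposed at the end of the issue (peek exposes the entry's ID so that remove targets exactly the entry that was processed, never a newer head) can be sketched with a toy queue. The `IdQueue` class is an illustration of the pattern, not Solr's DistributedQueue:

```python
class IdQueue:
    """Toy work queue where peek() exposes the entry's ID, so a stale worker
    removes exactly the entry it processed, never a newer head entry."""
    def __init__(self):
        self._entries = []   # list of (id, payload); head at index 0
        self._next_id = 0

    def add(self, payload):
        self._entries.append((self._next_id, payload))
        self._next_id += 1

    def peek(self):
        return self._entries[0] if self._entries else None

    def remove(self, entry_id):
        # Removing by ID is idempotent: if another overseer already removed
        # this entry, nothing else gets deleted by mistake.
        self._entries = [(i, p) for i, p in self._entries if i != entry_id]

q = IdQueue()
q.add("A")
q.add("B")
old_id, payload = q.peek()   # old overseer processed A and remembers its ID
q.remove(old_id)             # removes A specifically...
q.remove(old_id)             # ...and a late duplicate remove cannot eat B
head = q.peek()              # B survives for the new overseer
```

A blind remove-the-head, by contrast, would have deleted B on the second call, which is exactly the lost-operation scenario the issue describes.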
[jira] [Updated] (SOLR-4449) Enable backup requests for the internal solr load balancer
[ https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-4449: -- Attachment: patch-4449.txt It is only a sample; it has not been tested. Enable backup requests for the internal solr load balancer -- Key: SOLR-4449 URL: https://issues.apache.org/jira/browse/SOLR-4449 Project: Solr Issue Type: New Feature Components: SolrCloud Reporter: philip hoy Priority: Minor Attachments: patch-4449.txt, SOLR-4449.patch Add the ability to configure the built-in solr load balancer such that it submits a backup request to the next server in the list if the initial request takes too long. Employing such an algorithm could improve the latency of the 9xth percentile, albeit at the expense of increasing overall load due to additional requests.
[jira] [Comment Edited] (SOLR-4449) Enable backup requests for the internal solr load balancer
[ https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585682#comment-13585682 ] Raintung Li edited comment on SOLR-4449 at 2/25/13 7:25 AM: It is only a sample of my idea; it has not been tested. was (Author: raintung.li): It is only the sample, not test. Enable backup requests for the internal solr load balancer -- Key: SOLR-4449 URL: https://issues.apache.org/jira/browse/SOLR-4449 Project: Solr Issue Type: New Feature Components: SolrCloud Reporter: philip hoy Priority: Minor Attachments: patch-4449.txt, SOLR-4449.patch Add the ability to configure the built-in solr load balancer such that it submits a backup request to the next server in the list if the initial request takes too long. Employing such an algorithm could improve the latency of the 9xth percentile, albeit at the expense of increasing overall load due to additional requests.
[jira] [Commented] (SOLR-4449) Enable backup requests for the internal solr load balancer
[ https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582029#comment-13582029 ] Raintung Li commented on SOLR-4449: --- Hi Philip, I have one suggestion: don't open new threads in BackupRequestLBHttpSolrServer, as that doubles the thread count, especially in the search path. The thread that receives the search request already waits for the responses from multiple shards; that same thread can submit the second request to the shards that time out. For example, for one search request across 3 shards: the first request needs 1+3*2 = 7 threads to handle, and in the bad case for the second request (all 3 shards need a resend) 10 threads. With the change, the first request needs 1+3 = 4 threads, and the bad case is 7 threads. Also, Solr uses a simple random strategy for load balancing; https://blog.heroku.com/archives/2013/2/16/routing_performance_update/ this blog showed how badly the random strategy can perform. Enable backup requests for the internal solr load balancer -- Key: SOLR-4449 URL: https://issues.apache.org/jira/browse/SOLR-4449 Project: Solr Issue Type: New Feature Components: SolrCloud Reporter: philip hoy Priority: Minor Attachments: SOLR-4449.patch Add the ability to configure the built-in solr load balancer such that it submits a backup request to the next server in the list if the initial request takes too long. Employing such an algorithm could improve the latency of the 9xth percentile, albeit at the expense of increasing overall load due to additional requests.
[jira] [Commented] (SOLR-4449) Enable backup requests for the internal solr load balancer
[ https://issues.apache.org/jira/browse/SOLR-4449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582805#comment-13582805 ] Raintung Li commented on SOLR-4449: --- Hi Philip, although all the threads come from the thread pool (no new threads are created), they still occupy threads in the pool. For one search request you use twice as many threads as before on the first request (normal case). If 100 requests come in, that means an additional 300 threads are used for 3 shards. If we instead wait for the response and send the second request inside the HttpShardHandler.submit method, this unnecessary thread cost is avoided. Enable backup requests for the internal solr load balancer -- Key: SOLR-4449 URL: https://issues.apache.org/jira/browse/SOLR-4449 Project: Solr Issue Type: New Feature Components: SolrCloud Reporter: philip hoy Priority: Minor Attachments: SOLR-4449.patch Add the ability to configure the built-in solr load balancer such that it submits a backup request to the next server in the list if the initial request takes too long. Employing such an algorithm could improve the latency of the 9xth percentile, albeit at the expense of increasing overall load due to additional requests.
[jira] [Updated] (SOLR-4408) Server hanging on startup
[ https://issues.apache.org/jira/browse/SOLR-4408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-4408: -- Attachment: patch-4408.txt Updated the patch to fix it. Server hanging on startup - Key: SOLR-4408 URL: https://issues.apache.org/jira/browse/SOLR-4408 Project: Solr Issue Type: Bug Affects Versions: 4.1 Environment: OpenJDK 64-Bit Server VM (23.2-b09 mixed mode) Tomcat 7.0 Eclipse Juno + WTP Reporter: Francois-Xavier Bonnet Assignee: Erick Erickson Fix For: 4.2, 5.0 Attachments: patch-4408.txt While starting, the server hangs indefinitely. Everything works fine when I first start the server with no index created yet, but if I fill the index, then stop and start the server, it hangs. Could it be a lock that is never released?
[jira] [Commented] (SOLR-4408) Server hanging on startup
[ https://issues.apache.org/jira/browse/SOLR-4408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13579760#comment-13579760 ] Raintung Li commented on SOLR-4408: --- It is a short fix; the main issue is that the listeners execute on only one thread. If any listener method blocks, the notifying method registerSearcher() is blocked as well. How about having another thread execute registerSearcher()? Server hanging on startup - Key: SOLR-4408 URL: https://issues.apache.org/jira/browse/SOLR-4408 Project: Solr Issue Type: Bug Affects Versions: 4.1 Environment: OpenJDK 64-Bit Server VM (23.2-b09 mixed mode) Tomcat 7.0 Eclipse Juno + WTP Reporter: Francois-Xavier Bonnet Assignee: Erick Erickson Fix For: 4.2, 5.0 Attachments: patch-4408.txt While starting, the server hangs indefinitely. Everything works fine when I first start the server with no index created yet, but if I fill the index, then stop and start the server, it hangs. Could it be a lock that is never released?
[jira] [Commented] (SOLR-4408) Server hanging on startup
[ https://issues.apache.org/jira/browse/SOLR-4408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13580112#comment-13580112 ] Raintung Li commented on SOLR-4408: --- If the SolrIndexSearcher can only be used after the listeners finish executing one by one, then no listener may block the thread. Otherwise, another thread could monitor for completion of the execution and then notify the waiting getSearcher() method, which would be easier to troubleshoot. And if the listeners do not need to execute in a fixed order, multiple threads could handle the listeners concurrently to improve startup efficiency. Server hanging on startup - Key: SOLR-4408 URL: https://issues.apache.org/jira/browse/SOLR-4408 Project: Solr Issue Type: Bug Affects Versions: 4.1 Environment: OpenJDK 64-Bit Server VM (23.2-b09 mixed mode) Tomcat 7.0 Eclipse Juno + WTP Reporter: Francois-Xavier Bonnet Assignee: Erick Erickson Fix For: 4.2, 5.0 Attachments: patch-4408.txt While starting, the server hangs indefinitely. Everything works fine when I first start the server with no index created yet, but if I fill the index, then stop and start the server, it hangs. Could it be a lock that is never released?
[jira] [Commented] (SOLR-4408) Server hanging on startup
[ https://issues.apache.org/jira/browse/SOLR-4408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13579698#comment-13579698 ] Raintung Li commented on SOLR-4408: --- The main problem is that searcherExecutor is a single-thread executor; if any listener blocks its thread while waiting, registerSearcher() can never run, so notifyAll() is never called on the wait object and the servlet cannot finish initializing. It is a bug. BTW, if you want to avoid this, you can set useColdSearcher=true in solrconfig.xml. Server hanging on startup - Key: SOLR-4408 URL: https://issues.apache.org/jira/browse/SOLR-4408 Project: Solr Issue Type: Bug Affects Versions: 4.1 Environment: OpenJDK 64-Bit Server VM (23.2-b09 mixed mode) Tomcat 7.0 Eclipse Juno + WTP Reporter: Francois-Xavier Bonnet Assignee: Erick Erickson Fix For: 4.2, 5.0 While starting, the server hangs indefinitely. Everything works fine when I first start the server with no index created yet, but if I fill the index and then stop and start the server, it hangs. Could it be a lock that is never released?
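The self-deadlock described in the comment above — a task running on the single-thread searcherExecutor waiting for work that can only run on that same executor — can be reproduced with a minimal standalone sketch. This is an illustrative model, not Solr's actual code; the executor name and the 500 ms timeout are chosen for the demo:

```java
import java.util.concurrent.*;

public class SelfDeadlockDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService searcherExecutor = Executors.newSingleThreadExecutor();
        // The outer task plays the role of a newSearcher listener: it submits
        // more work to the same single-thread executor and waits for it. The
        // inner task can never start, because the only worker thread is busy
        // running the outer task -- a classic self-deadlock.
        Future<?> outer = searcherExecutor.submit(() -> {
            Future<?> inner = searcherExecutor.submit(() -> {
                // would register the searcher and notify waiters
            });
            try {
                inner.get(500, TimeUnit.MILLISECONDS);
            } catch (TimeoutException expected) {
                System.out.println("inner task never ran: self-deadlock");
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
        outer.get();
        searcherExecutor.shutdownNow();
    }
}
```

With a real deadlock there is no timeout, which is why the server hangs indefinitely instead of failing fast; a multi-thread executor (or the useColdSearcher workaround mentioned above) breaks the cycle.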
[jira] [Commented] (SOLR-4043) Add ability to get success/failure responses from Collections API.
[ https://issues.apache.org/jira/browse/SOLR-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556167#comment-13556167 ] Raintung Li commented on SOLR-4043: --- Got it, it is OK. I know you are very busy. :) Add ability to get success/failure responses from Collections API. -- Key: SOLR-4043 URL: https://issues.apache.org/jira/browse/SOLR-4043 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0 Environment: Solr cloud cluster Reporter: Raintung Li Assignee: Mark Miller Fix For: 4.2, 5.0 Attachments: patch-4043.txt, SOLR-4043_brach4.x.txt, SOLR-4043.patch Create/delete/reload collection operations are asynchronous, so the client cannot get the real outcome; it only knows the request has been saved into the OverseerCollectionQueue. The client receives a response immediately, without waiting for the result of the operation, whether or not it succeeds. An easy solution is for the client to wait until the asynchronous process finishes: the create/delete/reload collection thread saves the response into the OverseerCollectionQueue and then notifies the client to fetch it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-4043) Add ability to get success/failure responses from Collections API.
[ https://issues.apache.org/jira/browse/SOLR-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556167#comment-13556167 ] Raintung Li edited comment on SOLR-4043 at 1/17/13 1:46 PM: Got it, it is OK. I know you are very busy. :) And how about the 4.1 release date? was (Author: raintung.li): Got it, it is ok. I know you are very busy. :) Add ability to get success/failure responses from Collections API. -- Key: SOLR-4043 URL: https://issues.apache.org/jira/browse/SOLR-4043 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0 Environment: Solr cloud cluster Reporter: Raintung Li Assignee: Mark Miller Fix For: 4.2, 5.0 Attachments: patch-4043.txt, SOLR-4043_brach4.x.txt, SOLR-4043.patch
[jira] [Commented] (SOLR-4043) Add ability to get success/failure responses from Collections API.
[ https://issues.apache.org/jira/browse/SOLR-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13553424#comment-13553424 ] Raintung Li commented on SOLR-4043: --- Actually it is a bug; the API doesn't return the correct response. Add ability to get success/failure responses from Collections API. -- Key: SOLR-4043 URL: https://issues.apache.org/jira/browse/SOLR-4043 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0 Environment: Solr cloud cluster Reporter: Raintung Li Assignee: Mark Miller Fix For: 4.1, 5.0 Attachments: patch-4043.txt, SOLR-4043_brach4.x.txt, SOLR-4043.patch
[jira] [Created] (SOLR-4236) Commit issue: Can't search while add commit=true in the call URL about insert index
Raintung Li created SOLR-4236: - Summary: Commit issue: Can't search while add commit=true in the call URL about insert index Key: SOLR-4236 URL: https://issues.apache.org/jira/browse/SOLR-4236 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0, 4.0-BETA Environment: one collection, one shard, three servers, one leader, two replicas Reporter: Raintung Li I set up three instances of Solr Cloud for the same collection and shard: one instance is the shard leader and the other two are replicas. Send an index request to one instance with a URL like this: curl "http://localhost:7002/solr/update?commit=true" -H "Content-Type: text/xml" --data-binary '<add><doc><field name="id">test</field></doc></add>' If the request is sent to the leader server, only the leader can search this index; the replicas can't. (I have disabled autoSoftCommit.) If the request is sent to a replica, no server can search this index. The main problem: SolrCmdDistributor's distribAdd method batches requests in a cache, and DistributedUpdateProcessor's processCommit method only triggers sending the distributed add requests after sending the commit request. If the test index request is sent to a replica, the replica dispatches the request to the leader, but in this case the commit command reaches the other servers before the actual index request. The index only becomes searchable after a softCommit or another commit command arrives. A little confusing: why isn't the commit command forwarded from the leader to the replicas, instead of the receiving server sending the commit to all shard servers? It looks like Solr doesn't implement the transaction logic here.
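The ordering problem in the report above can be illustrated with a toy model of the forwarding server's buffer. This is a hypothetical sketch of the race, not Solr's actual SolrCmdDistributor code; the method and list names are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;

public class CommitOrderingSketch {
    // What the leader actually receives, in arrival order.
    static final List<String> leaderLog = new ArrayList<>();
    // Adds buffered by the forwarding server before being distributed.
    static final List<String> batch = new ArrayList<>();

    static void add(String doc) { batch.add("add:" + doc); }

    static void flush() { leaderLog.addAll(batch); batch.clear(); }

    // Order described in the report: the commit is forwarded first,
    // and the buffered adds are only flushed afterwards.
    static void commitBuggy() { leaderLog.add("commit"); flush(); }

    // Correct order: drain the batch first, then forward the commit.
    static void commitFixed() { flush(); leaderLog.add("commit"); }

    public static void main(String[] args) {
        add("test"); commitBuggy();
        System.out.println(leaderLog); // the add arrives after the commit, so it is not yet searchable

        leaderLog.clear();
        add("test"); commitFixed();
        System.out.println(leaderLog); // the add precedes the commit, so it becomes searchable
    }
}
```

In the buggy ordering the log reads [commit, add:test]: the commit makes nothing visible because the document it should cover has not arrived yet, which matches the observed "can't search until the next commit" behavior.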
[jira] [Created] (SOLR-4206) Solr server should have the self start/stop script
Raintung Li created SOLR-4206: - Summary: Solr server should have the self start/stop script Key: SOLR-4206 URL: https://issues.apache.org/jira/browse/SOLR-4206 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.0, 4.0-BETA, 4.0-ALPHA Environment: Linux Reporter: Raintung Li Starting the Solr server currently requires a manual shell command. A start/stop script would make it easy for admins to deploy and control the server, especially to stop it: directly killing the process is not a good choice. A shell script would make start/stop easier, and some parameters could be configured in a properties file.
[jira] [Updated] (SOLR-4206) Solr server should have the self start/stop script
[ https://issues.apache.org/jira/browse/SOLR-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-4206: -- Attachment: solr.sh solr.properties solr.properties - configuration for the Solr start script. solr.sh - shell script to run on Linux. Solr server should have the self start/stop script -- Key: SOLR-4206 URL: https://issues.apache.org/jira/browse/SOLR-4206 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0 Environment: Linux Reporter: Raintung Li Attachments: solr.properties, solr.sh
[jira] [Updated] (SOLR-4206) Solr server should have the self start/stop script
[ https://issues.apache.org/jira/browse/SOLR-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-4206: -- Attachment: (was: solr.properties) Solr server should have the self start/stop script -- Key: SOLR-4206 URL: https://issues.apache.org/jira/browse/SOLR-4206 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0 Environment: Linux Reporter: Raintung Li Attachments: solr.properties, solr.sh
[jira] [Updated] (SOLR-4206) Solr server should have the self start/stop script
[ https://issues.apache.org/jira/browse/SOLR-4206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-4206: -- Attachment: solr.properties Solr server should have the self start/stop script -- Key: SOLR-4206 URL: https://issues.apache.org/jira/browse/SOLR-4206 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0 Environment: Linux Reporter: Raintung Li Attachments: solr.properties, solr.sh
[jira] [Updated] (SOLR-4043) Add ability to get success/failure responses from Collections API.
[ https://issues.apache.org/jira/browse/SOLR-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-4043: -- Attachment: SOLR-4043_brach4.x.txt Merge to branch 4.x Add ability to get success/failure responses from Collections API. -- Key: SOLR-4043 URL: https://issues.apache.org/jira/browse/SOLR-4043 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0 Environment: Solr cloud cluster Reporter: Raintung Li Assignee: Mark Miller Fix For: 4.1, 5.0 Attachments: patch-4043.txt, SOLR-4043_brach4.x.txt
[jira] [Commented] (SOLR-4043) Add ability to get success/failure responses from Collections API.
[ https://issues.apache.org/jira/browse/SOLR-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531905#comment-13531905 ] Raintung Li commented on SOLR-4043: --- OK, I will do it this weekend. Add ability to get success/failure responses from Collections API. -- Key: SOLR-4043 URL: https://issues.apache.org/jira/browse/SOLR-4043 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0 Environment: Solr cloud cluster Reporter: Raintung Li Assignee: Mark Miller Fix For: 4.1, 5.0 Attachments: patch-4043.txt
[jira] [Commented] (SOLR-4043) Add ability to get success/failure responses from Collections API.
[ https://issues.apache.org/jira/browse/SOLR-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530688#comment-13530688 ] Raintung Li commented on SOLR-4043: --- This issue is not only about letting the client get a response without waiting; it is also important to get the right response from the overseer, so the admin can decide whether to redo the call. Add ability to get success/failure responses from Collections API. -- Key: SOLR-4043 URL: https://issues.apache.org/jira/browse/SOLR-4043 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0 Environment: Solr cloud cluster Reporter: Raintung Li Assignee: Mark Miller Fix For: 4.1, 5.0 Attachments: patch-4043.txt
[jira] [Commented] (SOLR-4043) Add ability to get success/failure responses from Collections API.
[ https://issues.apache.org/jira/browse/SOLR-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510206#comment-13510206 ] Raintung Li commented on SOLR-4043: --- Mark, this patch is based on 4.0. Add ability to get success/failure responses from Collections API. -- Key: SOLR-4043 URL: https://issues.apache.org/jira/browse/SOLR-4043 Project: Solr Issue Type: Improvement Components: SolrCloud Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0 Environment: Solr cloud cluster Reporter: Raintung Li Assignee: Mark Miller Fix For: 4.1, 5.0 Attachments: patch-4043.txt
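One way the wait-until-complete behavior proposed in SOLR-4043 could look, as a standalone sketch: the client enqueues an operation, a worker thread (standing in for the overseer) processes it and stores the result, and the client blocks until that result arrives. The queue, map, and id names here are invented for illustration; Solr's real implementation goes through the OverseerCollectionQueue:

```java
import java.util.Map;
import java.util.concurrent.*;

public class AsyncResponseSketch {
    // request id -> future the submitting client blocks on
    static final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();
    static final BlockingQueue<String> workQueue = new LinkedBlockingQueue<>();

    public static void main(String[] args) throws Exception {
        Thread overseer = new Thread(() -> {
            try {
                while (true) {
                    String id = workQueue.take();                 // pick up a queued operation
                    pending.get(id).complete("completed " + id);  // store the result, wake the client
                }
            } catch (InterruptedException ignored) { }
        });
        overseer.setDaemon(true);
        overseer.start();

        // Client side: register interest in the outcome, enqueue the operation,
        // then wait for the real result instead of returning immediately.
        String id = "create-collection-1";
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(id, future);
        workQueue.put(id);
        System.out.println(future.get(5, TimeUnit.SECONDS));
    }
}
```

The timeout on get() matters: if the overseer dies, the client gets a TimeoutException it can report, rather than a misleading early success.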
[jira] [Created] (SOLR-4129) Solr doesn't support log4j
Raintung Li created SOLR-4129: - Summary: Solr doesn't support log4j Key: SOLR-4129 URL: https://issues.apache.org/jira/browse/SOLR-4129 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0, 4.0-BETA, 4.0-ALPHA Reporter: Raintung Li Many projects use log4j. Solr actually uses the SLF4J logging framework, and SLF4J can easily integrate with log4j, but Solr uses log4j-over-slf4j.jar to handle the log4j case. This jar has some issues: a. It ultimately invokes SLF4J to print the log (for Solr that means JDK 1.4 logging). b. It does not implement all log4j functions, e.g. Logger.setLevel(). c. JDK 1.4 logging lacks some features, e.g. thread info and daily rolling. Some dependent projects already use log4j, customers still want to use it, and its configuration files already exist; JDK 1.4 logging differs from log4j in many ways. The bad thing is that log4j-over-slf4j.jar conflicts with log4j, so other projects have to remove log4j. We shouldn't use log4j-over-slf4j.jar; we should keep using log4j if the customer wants it.
[jira] [Created] (SOLR-4099) Suspect zookeeper client thread doesn't call back the watcher, that occur the overseer collection can't work normal.
Raintung Li created SOLR-4099: - Summary: Suspect zookeeper client thread doesn't call back the watcher, that occur the overseer collection can't work normal. Key: SOLR-4099 URL: https://issues.apache.org/jira/browse/SOLR-4099 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0, 4.0-BETA, 4.0-ALPHA Environment: Zookeeper version: 3.2 Reporter: Raintung Li In our test environment, the ZooKeeper version is older than required; we do not use Solr's default 3.3.6 version. The overseer collection processor stops working. Tracing the thread dump, the overseer is waiting in LatchChildWatcher.await. Checking the ZooKeeper path /overseer/collection-queue-work, a lot of collection operations are blocked. Checking the logic, we suspect the ZooKeeper client does not call back the watch event registered on the path /overseer/collection-queue-work; unluckily the relevant log level is debug. This case doesn't happen often, very rarely, but when it happens it is fatal and we have to stop the leader server. We suggest a compensating solution: do not await until notified. Instead, wait only some amount of time, maybe ten minutes or half an hour or some other value, and then recheck the queue. Of course, if the notification does arrive, processing can proceed normally right away.
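The compensating solution suggested above, awaiting with a bounded timeout and rechecking the queue rather than blocking forever on a watch callback, might look like this standalone sketch. The latch stands in for the LatchChildWatcher, and the delayed producer simulates a watch notification that is lost; all names and timings are illustrative, not Solr's actual code:

```java
import java.util.concurrent.*;

public class TimedAwaitSketch {
    public static void main(String[] args) throws Exception {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        // Stands in for the LatchChildWatcher's latch; countDown() is never
        // called here, simulating a ZooKeeper watch event that is lost.
        CountDownLatch watchFired = new CountDownLatch(1);

        // The operation still appears in the queue after a delay, but the
        // waiter is never notified about it.
        new Thread(() -> {
            try {
                Thread.sleep(200);
                queue.put("create-collection");
            } catch (InterruptedException ignored) { }
        }).start();

        String op = queue.poll();
        while (op == null) {
            // Wait a bounded time instead of forever, then recheck the queue.
            // With a plain await() this loop would hang indefinitely.
            watchFired.await(100, TimeUnit.MILLISECONDS);
            op = queue.poll();
        }
        System.out.println("recovered: " + op);
    }
}
```

The timeout trades a little latency (up to one wait interval) for liveness: even with the callback lost, the queued work is eventually picked up.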
[jira] [Updated] (SOLR-4099) Suspect zookeeper client thread doesn't call back the watcher, that occur the overseer collection can't work normal.
[ https://issues.apache.org/jira/browse/SOLR-4099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-4099: -- Description: In test environment, our zookeeper version is old that our requirement version. Not use solr default 3.3.6 version. The overseer collection processor stop work. Trace the dump, the overseer wait for LatchChildWatcher.await. Check the zookeeper /overseer/collection-queue-work, block a lot of operation for collection. Check the logic, suspect the zookeeper client doesn't call back the watchevent that register the path /overseer/collection-queue-work, unlucky the log level is debug. This case doesn't happen often, very little. But if it happen, it is fatal, we have to stop the leader server. Suggest the compensate solution, that doesn't await until notify. Only wait some time that maybe it is ten minutes or a half of hour or other value to recheck the queue again. Of cause if get the notify, that can direct work normal. was: In test environment, our zookeeper version is old that our requirement version. Not use solr default 3.3.6 version. The overseer collection processor stop work. Trace the dump, the overseer wait for LatchChildWatcher.await. Check the zookeeper /overseer/collection-queue-work, block a lot of operation for collection. Check the logic, suspect the zookeeper client doesn't call back the watchevent that register the path /overseer/collection-queue-work, unlucky the log level is debug. This case doesn't happen often, very little. But if it happen, it is fatal, we have to stop the leader server. We suggest the compensate solution, that doesn't await until notify. Only wait some time that maybe it is ten minutes or a half of hour or other value to recheck the queue again. Of cause if get the notify, that can direct work normal. Suspect zookeeper client thread doesn't call back the watcher, that occur the overseer collection can't work normal. 
Key: SOLR-4099 URL: https://issues.apache.org/jira/browse/SOLR-4099 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0 Environment: Zookeeper version: 3.2 Reporter: Raintung Li Original Estimate: 12h Remaining Estimate: 12h
[jira] [Updated] (SOLR-4099) Suspect zookeeper client thread doesn't call back the watcher, that occur the overseer collection can't work normal.
[ https://issues.apache.org/jira/browse/SOLR-4099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-4099: -- Attachment: patch-4099.txt example Suspect zookeeper client thread doesn't call back the watcher, that occur the overseer collection can't work normal. Key: SOLR-4099 URL: https://issues.apache.org/jira/browse/SOLR-4099 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0 Environment: Zookeeper version: 3.2 Reporter: Raintung Li Attachments: patch-4099.txt Original Estimate: 12h Remaining Estimate: 12h
[jira] [Updated] (SOLR-4073) Overseer will miss operations in some cases for OverseerCollectionProcessor
[ https://issues.apache.org/jira/browse/SOLR-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-4073: -- Summary: Overseer will miss operations in some cases for OverseerCollectionProcessor (was: Overseer will miss operations in some cases.) Key: SOLR-4073 URL: https://issues.apache.org/jira/browse/SOLR-4073 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0 Environment: Solr cloud Reporter: Raintung Li Attachments: patch-4073 Original Estimate: 168h Remaining Estimate: 168h An overseer disconnects from ZooKeeper while its overseer thread is still handling request A from the DistributedQueue. Example: the old overseer thread reconnects to ZooKeeper and tries to remove the top request with workQueue.remove(). Meanwhile, another server takes over the overseer role because the old overseer disconnected. The new overseer thread handles request A again, removes request A from the queue, and then goes to fetch the new top request B (not yet taken). At this moment the old overseer's reconnected thread removes the top request from the queue. The top request is now B, so B is removed by the old overseer server. The new overseer never processes request B, because it was deleted by the old overseer; in the end request B's operations are lost. A better approach: distributedQueue.peek should return the request's ID so that the worker can call workQueue.remove(ID) for that specific request, instead of removing whatever is currently at the top.
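The peek-by-ID remedy suggested above can be sketched with a plain in-memory queue. This is an illustration of the idea only; Solr's DistributedQueue is backed by ZooKeeper child nodes, and the class and method names below are invented.

```java
import java.util.LinkedHashMap;

// Sketch: peek() exposes the entry's id so a worker later removes exactly
// the request it processed via remove(id), instead of blindly removing the
// head (which a stale overseer could race on). Names are illustrative.
public class IdAddressedQueue {
    private final LinkedHashMap<String, String> entries = new LinkedHashMap<>();
    private long seq = 0;

    public synchronized String offer(String payload) {
        String id = "qn-" + (seq++);   // id plays the role of the ZK node name
        entries.put(id, payload);
        return id;
    }

    // Id of the head entry, or null when the queue is empty.
    public synchronized String peekId() {
        return entries.isEmpty() ? null : entries.keySet().iterator().next();
    }

    // Removes one specific entry; a no-op if it was already taken elsewhere.
    public synchronized boolean remove(String id) {
        return entries.remove(id) != null;
    }
}
```

With this shape, the stale overseer's late `remove(idOfA)` is harmless once A is gone, and it can never accidentally delete the still-unprocessed request B.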
[jira] [Created] (SOLR-4088) ZkController baseURL only gets the host name, not the address, which can produce the wrong URL for recovery.
Raintung Li created SOLR-4088: - Summary: ZkController baseURL only gets the host name, not the address, which can produce the wrong URL for recovery. Key: SOLR-4088 URL: https://issues.apache.org/jira/browse/SOLR-4088 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0, 4.0-BETA, 4.0-ALPHA Environment: Linux env Reporter: Raintung Li In ZkController.getHostAddress, the host name is used. On Linux the host name comes from /etc/sysconfig/network or /etc/hostname, and it may not be configured to resolve to an IP address. Other servers then cannot reach http://hostname:port/.. correctly, which causes recovery to fail.
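The direction of the fix (a later attachment comment reads "change host name to host address") can be sketched as below. The class and method names are illustrative, not ZkController's actual code; the point is to fall back to the local IP address via InetAddress rather than to a host name that peers may not be able to resolve.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Sketch of building a peer-reachable base URL (illustrative names).
public class BaseUrlBuilder {

    // Prefer an explicitly configured host; otherwise fall back to the
    // local IP address instead of the possibly unresolvable host name.
    public static String resolveHost(String configuredHost) throws UnknownHostException {
        if (configuredHost != null && !configuredHost.isEmpty()) {
            return configuredHost;
        }
        return InetAddress.getLocalHost().getHostAddress();
    }

    // Assembles the base URL other nodes use for recovery requests.
    public static String baseUrl(String host, int port, String context) {
        return "http://" + host + ":" + port + "/" + context;
    }
}
```

With an address-based host, `baseUrl("192.168.1.5", 8983, "solr")` yields a URL any peer can reach, whereas a bare `/etc/hostname` value only works if every other node can resolve that name.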
[jira] [Updated] (SOLR-4088) ZkController baseURL only gets the host name, not the address, which can produce the wrong URL for recovery.
[ https://issues.apache.org/jira/browse/SOLR-4088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-4088: -- Attachment: patch-4088.txt
[jira] [Updated] (SOLR-4088) ZkController baseURL only gets the host name, not the address, which can produce the wrong URL for recovery.
[ https://issues.apache.org/jira/browse/SOLR-4088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-4088: -- Attachment: (was: patch-4088.txt)
[jira] [Updated] (SOLR-4088) ZkController baseURL only gets the host name, not the address, which can produce the wrong URL for recovery.
[ https://issues.apache.org/jira/browse/SOLR-4088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-4088: -- Attachment: patch-4088.txt change host name to host address
[jira] [Created] (SOLR-4073) Overseer will miss operations in some cases.
Raintung Li created SOLR-4073: - Summary: Overseer will miss operations in some cases. Key: SOLR-4073 URL: https://issues.apache.org/jira/browse/SOLR-4073 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0, 4.0-BETA, 4.0-ALPHA Environment: Solr cloud Reporter: Raintung Li
[jira] [Updated] (SOLR-4073) Overseer will miss operations in some cases.
[ https://issues.apache.org/jira/browse/SOLR-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-4073: -- Attachment: patch-4073 Simple
[jira] [Comment Edited] (SOLR-4073) Overseer will miss operations in some cases.
[ https://issues.apache.org/jira/browse/SOLR-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497154#comment-13497154 ] Raintung Li edited comment on SOLR-4073 at 11/14/12 3:22 PM: - example was (Author: raintung.li): Simple
[jira] [Updated] (SOLR-4055) Removing/reloading a collection has a thread-safety issue.
[ https://issues.apache.org/jira/browse/SOLR-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-4055: -- Summary: Remove/Reload the collection has the thread safe issue. (was: Remove/Reload the collection will occur the thread safe issue.) Key: SOLR-4055 URL: https://issues.apache.org/jira/browse/SOLR-4055 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0-ALPHA, 4.0-BETA, 4.0 Environment: Solr cloud Reporter: Raintung Li Attachments: patch-4055 The collectionCmd method of the OverseerCollectionProcessor class has a thread-safety issue. The major problem is that the ModifiableSolrParams params instance is handed to other threads via HttpShardHandler.submit, so modifying the parameters afterwards changes what those threads see. In collectionCmd, the loop changes the value with params.set(CoreAdminParams.CORE, node.getStr(ZkStateReader.CORE_NAME_PROP)), so the thread sending the HTTP request can pick up the wrong core name. The result is that the right core cannot be deleted/reloaded. The easy fix is to clone the ModifiableSolrParams for every request.
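The "clone the ModifiableSolrParams for every request" fix can be sketched as follows. A simple map-backed stand-in is used here instead of Solr's real ModifiableSolrParams; the copy constructor mirrors the defensive-copy pattern the fix relies on.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for ModifiableSolrParams to illustrate per-request cloning.
public class Params {
    private final Map<String, String> values = new HashMap<>();

    public Params() {}

    // Defensive copy: each submitted request owns its own parameter map.
    public Params(Params other) {
        this.values.putAll(other.values);
    }

    public Params set(String key, String value) {
        values.put(key, value);
        return this;
    }

    public String get(String key) {
        return values.get(key);
    }
}
```

In the collectionCmd loop this corresponds to building a fresh copy per node, setting the core name on the copy, and submitting the copy; an in-flight request then never observes a later iteration's core name.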
[jira] [Created] (SOLR-4055) Removing/reloading a collection can cause a thread-safety issue.
Raintung Li created SOLR-4055: - Summary: Remove/Reload the collection will occur the thread safe issue. Key: SOLR-4055 URL: https://issues.apache.org/jira/browse/SOLR-4055 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0, 4.0-BETA, 4.0-ALPHA Environment: Solr cloud Reporter: Raintung Li
[jira] [Updated] (SOLR-4055) Removing/reloading a collection can cause a thread-safety issue.
[ https://issues.apache.org/jira/browse/SOLR-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-4055: -- Attachment: patch-4055 the bug patch
[jira] [Commented] (SOLR-3964) Solr does not return an error even when collection creation fails
[ https://issues.apache.org/jira/browse/SOLR-3964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493048#comment-13493048 ] Raintung Li commented on SOLR-3964: --- Fixed in https://issues.apache.org/jira/browse/SOLR-4043; will anyone review it? Solr does not return an error even when collection creation fails - Key: SOLR-3964 URL: https://issues.apache.org/jira/browse/SOLR-3964 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0 Reporter: milesli Labels: lack, message, response Original Estimate: 6h Remaining Estimate: 6h Solr does not return an error even when a create/delete collection request fails: even when the request URL is incorrect (example: http://127.0.0.1:8983/solr/admin/collections?action=CREATE&name=tenancy_miles&numShards=3&numReplicas=2&collection.configName=myconf), and even when the collection name passed in already exists.
[jira] [Created] (SOLR-4043) Create/delete/reload collections do not return the right response
Raintung Li created SOLR-4043: - Summary: Create/delete/reload collections do not return the right response Key: SOLR-4043 URL: https://issues.apache.org/jira/browse/SOLR-4043 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.0, 4.0-BETA, 4.0-ALPHA Environment: Solr cloud cluster Reporter: Raintung Li Attachments: patch-4043.txt Create/delete/reload collection is an asynchronous process, so the client cannot get the real result; it can only be sure the request has been saved into the OverseerCollectionQueue. The client gets a response immediately, without waiting for the outcome of the create/delete/reload, whether or not it succeeds. The easy solution is for the client to wait until the asynchronous process completes: the create/delete/reload thread saves its response into the OverseerCollectionQueue and then notifies the client to fetch the response.
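The wait-for-completion flow proposed above can be sketched with a future per request. This is an illustrative sketch, not Solr's implementation; in the actual proposal the response would be persisted in the OverseerCollectionQueue and signalled via ZooKeeper rather than held in memory.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: the request handler parks a future keyed by request id; the
// overseer thread completes it with the real outcome, so the client only
// gets its response once the collection operation has actually finished.
public class AsyncCollectionResponses {
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    // Called when the request is enqueued; the client blocks on this future.
    public CompletableFuture<String> submit(String requestId) {
        CompletableFuture<String> f = new CompletableFuture<>();
        pending.put(requestId, f);
        return f;
    }

    // Called by the worker once the create/delete/reload finishes.
    public void complete(String requestId, String outcome) {
        CompletableFuture<String> f = pending.remove(requestId);
        if (f != null) f.complete(outcome);
    }
}
```

A client would call `submit("req-1").get(timeout, unit)` so a success or failure outcome is returned only after the operation completes, instead of immediately after enqueueing.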
[jira] [Updated] (SOLR-4043) Create/delete/reload collections do not return the right response
[ https://issues.apache.org/jira/browse/SOLR-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-4043: -- Attachment: patch-4043.txt The patch fixes this bug