[jira] [Commented] (SOLR-5638) Collection creation partially works, but results in unusable configuration due to missing config in ZK
[ https://issues.apache.org/jira/browse/SOLR-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881127#comment-13881127 ]

Mark Miller commented on SOLR-5638:
---

We should fail fast if the config does not exist and avoid even trying to create those cores. This can probably happen in other scenarios as well, though, so it is probably a good idea to consider Steve's approach too.

Collection creation partially works, but results in unusable configuration due to missing config in ZK
--

Key: SOLR-5638
URL: https://issues.apache.org/jira/browse/SOLR-5638
Project: Solr
Issue Type: Bug
Components: SolrCloud
Affects Versions: 4.6
Reporter: Nathan Neulinger
Attachments: SOLR-5638.patch

Need help properly recovering from 'collection gets created without config being defined'.

Right now, if you submit a collection create and the config is missing, it will proceed with partially creating cores, but then the cores fail to load. This requires manual intervention on the server to fix unless you pick a new collection name.

What's worse - if you retry the create a second time, it will usually try to create the replicas in the opposite order, resulting in TWO broken cores on each box, one for each attempted replica.

beta1-newarch_hive1_v12_shard1_replica1: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Specified config does not exist in ZooKeeper:hivepoint-unknown
beta1-newarch_hive1_v12_shard1_replica2: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Specified config does not exist in ZooKeeper:hivepoint-unknown

I already know how to clear this up manually, but this is a case where Solr allows a condition in an external service to result in a corrupted/partial configuration.

I can see an easy option for resolving this as a workaround - allow a collection CREATE operation to specify reuseCores - i.e. allow it to use an existing core of the proper name if it already exists.

Right now you wind up getting:

org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'beta1-newarch_hive1_v12_shard1_replica1': Could not create a new core in solr/beta1-newarch_hive1_v12_shard1_replica1/as another core is already defined there
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:Error CREATEing SolrCore 'beta1-newarch_hive1_v12_shard1_replica2': Could not create a new core in solr/beta1-newarch_hive1_v12_shard1_replica2/as another core is already defined there

--
This message was sent by Atlassian JIRA (v6.1.5#6160)

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
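The fail-fast idea from Mark's comment could look roughly like the sketch below. It is an illustration only, not the attached patch and not Solr's actual code: `validate_create_request`, `MissingConfigError` and the `znode_exists` callback are hypothetical names, the config name `hivepoint` is made up for the demo, and the `/configs/<name>` path is an assumption about where SolrCloud keeps uploaded config sets in ZooKeeper.

```python
from typing import Callable

# Assumed ZooKeeper layout: uploaded config sets live under /configs/<name>.
CONFIGS_ZNODE = "/configs"


class MissingConfigError(ValueError):
    """Raised when a collection create names a config that is not in ZooKeeper."""


def validate_create_request(config_name: str,
                            znode_exists: Callable[[str], bool]) -> str:
    """Check the config before any core is created and return its znode path.

    `znode_exists` is a hypothetical callback reporting whether a znode is
    present (for example, a thin wrapper around a ZooKeeper client).
    """
    path = f"{CONFIGS_ZNODE}/{config_name}"
    if not znode_exists(path):
        # Reject the request up front, so no half-created cores are left behind.
        raise MissingConfigError(
            f"Specified config does not exist in ZooKeeper: {config_name}")
    return path


if __name__ == "__main__":
    # In-memory stand-in for ZooKeeper, used only for this demonstration.
    known_znodes = {"/configs/hivepoint"}
    print(validate_create_request("hivepoint", known_znodes.__contains__))
    try:
        validate_create_request("hivepoint-unknown", known_znodes.__contains__)
    except MissingConfigError as err:
        print("rejected:", err)
```

The point of the design is ordering: the check runs before any core creation request is sent to a node, so a missing config leaves nothing to clean up.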
[jira] [Commented] (SOLR-5638) Collection creation partially works, but results in unusable configuration due to missing config in ZK
[ https://issues.apache.org/jira/browse/SOLR-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881131#comment-13881131 ]

Nathan Neulinger commented on SOLR-5638:

Will the unload result in the core still being on the disk, just not loaded? In which case, what happens when the create collection is requested again and it decides to lay out the replicas in the other order?
[jira] [Commented] (SOLR-5638) Collection creation partially works, but results in unusable configuration due to missing config in ZK
[ https://issues.apache.org/jira/browse/SOLR-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881249#comment-13881249 ]

Steve Molloy commented on SOLR-5638:

Yes, we should fail fast instead of this, hence the comment about this not being the best solution. But it avoids seeing errors again and again about cores not being able to load.

As for recreating the collection in a different order, I haven't specifically checked, but I assume you'd end up with the new cores properly created and the old ones would remain on disk, unloaded. Again, this isn't a real solution, just a band-aid to reduce the impact.
[jira] [Commented] (SOLR-5638) Collection creation partially works, but results in unusable configuration due to missing config in ZK
[ https://issues.apache.org/jira/browse/SOLR-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13874134#comment-13874134 ]

Nathan Neulinger commented on SOLR-5638:

Alternatively/additionally - Solr really should be checking the validity of the requested create. If you ask for a configName and it doesn't exist, error out at that point instead of proceeding with a create that is guaranteed to fail as a whole.

Procedure to reproduce: do a collection create for a config name that doesn't exist in ZK.
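The reproduction step amounts to a single Collections API CREATE call. The sketch below only builds the request URL and does not send it. The base URL, shard count and replication factor are placeholder assumptions, and the collection name `beta1-newarch_hive1_v12` is inferred from the core names in the reported errors; `hivepoint-unknown` is the missing config named there. `build_create_url` is a hypothetical helper, not part of any Solr client.

```python
from urllib.parse import urlencode


def build_create_url(base_url: str, name: str, config_name: str,
                     num_shards: int = 1, replication_factor: int = 2) -> str:
    """Build a Collections API CREATE request URL (the request is not sent)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "collection.configName": config_name,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # base_url is a placeholder; point it at a SolrCloud node to try the request.
    print(build_create_url("http://localhost:8983/solr",
                           "beta1-newarch_hive1_v12", "hivepoint-unknown"))
```

Requesting that URL against a cluster where the named config has not been uploaded should, per the report, leave behind cores that fail to load.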