Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-02-22 Thread Hendrik Haddorp

I'm also not really an HDFS expert but I believe it is slightly different:

The HDFS data is replicated, let's say 3 times, between the HDFS data
nodes, but for an HDFS client it looks like one directory and the fact
that the data is replicated is hidden. Every client should see the same
data, just like every client should see the same data in ZooKeeper
(every ZK node also has a full replica). So with 2 Solr replicas there should
only be two disjoint data sets. Thus it should not matter which Solr
node claims the replica and then continues where things were left off. Solr
should only be concerned about the replication between the Solr replicas
but not about the replication between the HDFS data nodes, just as it
does not have to deal with the replication between the ZK nodes.


Anyhow, for now I would be happy if my patch for SOLR-10092 could get
included soon, as the auto add replica feature does not work at all for me
without it :-)


On 22.02.2017 16:15, Erick Erickson wrote:

bq: in the non-HDFS case that sounds logical but in the HDFS case all
the index data is in the shared HDFS file system

That's not really the point, and it's not quite true. The Solr index is
unique _per replica_. So replica1 points to an HDFS directory (that's
triply replicated to be sure). replica2 points to a totally different
set of index files. So with the default replication of 3 your two
replicas will have 6 copies of the index that are totally disjoint in
two sets of three. From Solr's point of view, the fact that HDFS
replicates the data doesn't really alter much.
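
For illustration (not from the original mail), Erick's point can be checked
against a running cluster by listing the dataDir of every replica via the
Collections API; a minimal Python sketch, assuming a placeholder Solr URL and
collection name, and assuming dataDir is recorded in the replica state as in
the clusterstate.json quoted further down:

import json
from urllib.request import urlopen

SOLR = "http://localhost:8983/solr"   # placeholder: any live Solr node
COLLECTION = "test1.collection-0"     # placeholder collection name

url = ("%s/admin/collections?action=CLUSTERSTATUS&collection=%s&wt=json"
       % (SOLR, COLLECTION))
with urlopen(url) as resp:
    status = json.loads(resp.read().decode("utf-8"))

shards = status["cluster"]["collections"][COLLECTION]["shards"]
for shard_name, shard in shards.items():
    for replica_name, replica in shard["replicas"].items():
        # Each replica should report its own dataDir (its own set of index
        # files), even though all of them live in the same HDFS filesystem.
        print(shard_name, replica_name, replica.get("dataDir", "<not in state>"))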

Autoaddreplica will indeed be able to re-use the HDFS data if a
Solr node goes away. But that doesn't change the replication issue I
described.

At least that's my understanding, I admit I'm not an HDFS guy and it
may be out of date.

Erick

On Tue, Feb 21, 2017 at 10:30 PM, Hendrik Haddorp
 wrote:

Hi Erick,

in the non-HDFS case that sounds logical but in the HDFS case all the index
data is in the shared HDFS file system. Even the transaction logs should be
in there. So the node that once had the replica should not really have more
information than any other node, especially if legacyCloud is set to false
so that ZooKeeper is the source of truth.

regards,
Hendrik

On 22.02.2017 02:28, Erick Erickson wrote:

Hendrik:

bq: Not really sure why one replica needs to be up though.

I didn't write the code so I'm guessing a bit, but consider the
situation where you have no replicas for a shard up and add a new one.
Eventually it could become the leader but there would have been no
chance for it to check if its version of the index was up to date.
But since it would be the leader, when other replicas for that shard
_do_ come on line they'd replicate the index down from the newly added
replica, possibly using very old data.

FWIW,
Erick

On Tue, Feb 21, 2017 at 1:12 PM, Hendrik Haddorp
 wrote:

Hi,

I had opened SOLR-10092 (https://issues.apache.org/jira/browse/SOLR-10092)
for this a while ago. I was now able to get this feature working with a very
small code change. After a few seconds Solr reassigns the replica to a
different Solr instance as long as one replica is still up. Not really sure
why one replica needs to be up though. I added the patch based on Solr 6.3
to the bug report. Would be great if it could be merged soon.

regards,
Hendrik

On 19.01.2017 17:08, Hendrik Haddorp wrote:

HDFS is like a shared filesystem so every Solr Cloud instance can access
the data using the same path or URL. The clusterstate.json looks like
this:

"shards":{"shard1":{
  "range":"8000-7fff",
  "state":"active",
  "replicas":{
"core_node1":{
  "core":"test1.collection-0_shard1_replica1",
"dataDir":"hdfs://master...:8000/test1.collection-0/core_node1/data/",
  "base_url":"http://slave3:9000/solr;,
  "node_name":"slave3:9000_solr",
  "state":"active",


"ulogDir":"hdfs://master:8000/test1.collection-0/core_node1/data/tlog"},
"core_node2":{
  "core":"test1.collection-0_shard1_replica2",
"dataDir":"hdfs://master:8000/test1.collection-0/core_node2/data/",
  "base_url":"http://slave2:9000/solr;,
  "node_name":"slave2:9000_solr",
  "state":"active",


"ulogDir":"hdfs://master:8000/test1.collection-0/core_node2/data/tlog",
  "leader":"true"},
"core_node3":{
  "core":"test1.collection-0_shard1_replica3",
"dataDir":"hdfs://master:8000/test1.collection-0/core_node3/data/",
  "base_url":"http://slave4:9005/solr;,
  "node_name":"slave4:9005_solr",
  "state":"active",


"ulogDir":"hdfs://master:8000/test1.collection-0/core_node3/data/tlog"

So every replica is always assigned to one node and this is being stored
in ZK, pretty much the same as for non-HDFS setups. Just as the data is
not ...

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-02-22 Thread Erick Erickson
bq: in the non-HDFS case that sounds logical but in the HDFS case all
the index data is in the shared HDFS file system

That's not really the point, and it's not quite true. The Solr index is
unique _per replica_. So replica1 points to an HDFS directory (that's
triply replicated to be sure). replica2 points to a totally different
set of index files. So with the default replication of 3 your two
replicas will have 6 copies of the index that are totally disjoint in
two sets of three. From Solr's point of view, the fact that HDFS
replicates the data doesn't really alter much.

Autoaddreplica will indeed be able to re-use the HDFS data if a
Solr node goes away. But that doesn't change the replication issue I
described.

At least that's my understanding, I admit I'm not an HDFS guy and it
may be out of date.

Erick

On Tue, Feb 21, 2017 at 10:30 PM, Hendrik Haddorp
 wrote:
> Hi Erick,
>
> in the non-HDFS case that sounds logical but in the HDFS case all the index
> data is in the shared HDFS file system. Even the transaction logs should be
> in there. So the node that once had the replica should not really have more
> information than any other node, especially if legacyCloud is set to false
> so that ZooKeeper is the source of truth.
>
> regards,
> Hendrik
>
> On 22.02.2017 02:28, Erick Erickson wrote:
>>
>> Hendrik:
>>
>> bq: Not really sure why one replica needs to be up though.
>>
>> I didn't write the code so I'm guessing a bit, but consider the
>> situation where you have no replicas for a shard up and add a new one.
>> Eventually it could become the leader but there would have been no
>> chance for it to check if its version of the index was up to date.
>> But since it would be the leader, when other replicas for that shard
>> _do_ come on line they'd replicate the index down from the newly added
>> replica, possibly using very old data.
>>
>> FWIW,
>> Erick
>>
>> On Tue, Feb 21, 2017 at 1:12 PM, Hendrik Haddorp
>>  wrote:
>>>
>>> Hi,
>>>
>>> I had opened SOLR-10092 (https://issues.apache.org/jira/browse/SOLR-10092)
>>> for this a while ago. I was now able to get this feature working with a very
>>> small code change. After a few seconds Solr reassigns the replica to a
>>> different Solr instance as long as one replica is still up. Not really sure
>>> why one replica needs to be up though. I added the patch based on Solr 6.3
>>> to the bug report. Would be great if it could be merged soon.
>>>
>>> regards,
>>> Hendrik
>>>
>>> On 19.01.2017 17:08, Hendrik Haddorp wrote:

 HDFS is like a shared filesystem so every Solr Cloud instance can access
 the data using the same path or URL. The clusterstate.json looks like
 this:

 "shards":{"shard1":{
  "range":"8000-7fff",
  "state":"active",
  "replicas":{
"core_node1":{
  "core":"test1.collection-0_shard1_replica1",
 "dataDir":"hdfs://master...:8000/test1.collection-0/core_node1/data/",
  "base_url":"http://slave3:9000/solr;,
  "node_name":"slave3:9000_solr",
  "state":"active",


 "ulogDir":"hdfs://master:8000/test1.collection-0/core_node1/data/tlog"},
"core_node2":{
  "core":"test1.collection-0_shard1_replica2",
 "dataDir":"hdfs://master:8000/test1.collection-0/core_node2/data/",
  "base_url":"http://slave2:9000/solr;,
  "node_name":"slave2:9000_solr",
  "state":"active",


 "ulogDir":"hdfs://master:8000/test1.collection-0/core_node2/data/tlog",
  "leader":"true"},
"core_node3":{
  "core":"test1.collection-0_shard1_replica3",
 "dataDir":"hdfs://master:8000/test1.collection-0/core_node3/data/",
  "base_url":"http://slave4:9005/solr;,
  "node_name":"slave4:9005_solr",
  "state":"active",


 "ulogDir":"hdfs://master:8000/test1.collection-0/core_node3/data/tlog"

So every replica is always assigned to one node and this is being stored
in ZK, pretty much the same as for non-HDFS setups. Just as the data is not
stored locally but on the network and as the path does not contain any node
information you can of course easily take over the work to a different Solr
node. You should just need to update the owner of the replica in ZK and you
should basically be done, I assume. That's why the documentation states that
an advantage of using HDFS is that a failing node can be replaced by a
different one. The Overseer just has to move the ownership of the replica,
which seems like what the code is trying to do. There just seems to be a bug
in the code so that the core does not get created on the target node.

 

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-02-21 Thread Hendrik Haddorp

Hi Erick,

in the non-HDFS case that sounds logical but in the HDFS case all the 
index data is in the shared HDFS file system. Even the transaction logs 
should be in there. So the node that once had the replica should not 
really have more information than any other node, especially if 
legacyCloud is set to false so that ZooKeeper is the source of truth.


regards,
Hendrik

On 22.02.2017 02:28, Erick Erickson wrote:

Hendrik:

bq: Not really sure why one replica needs to be up though.

I didn't write the code so I'm guessing a bit, but consider the
situation where you have no replicas for a shard up and add a new one.
Eventually it could become the leader but there would have been no
chance for it to check if its version of the index was up to date.
But since it would be the leader, when other replicas for that shard
_do_ come on line they'd replicate the index down from the newly added
replica, possibly using very old data.

FWIW,
Erick

On Tue, Feb 21, 2017 at 1:12 PM, Hendrik Haddorp
 wrote:

Hi,

I had opened SOLR-10092 (https://issues.apache.org/jira/browse/SOLR-10092)
for this a while ago. I was now able to get this feature working with a very
small code change. After a few seconds Solr reassigns the replica to a
different Solr instance as long as one replica is still up. Not really sure
why one replica needs to be up though. I added the patch based on Solr 6.3
to the bug report. Would be great if it could be merged soon.

regards,
Hendrik

On 19.01.2017 17:08, Hendrik Haddorp wrote:

HDFS is like a shared filesystem so every Solr Cloud instance can access
the data using the same path or URL. The clusterstate.json looks like this:

"shards":{"shard1":{
 "range":"8000-7fff",
 "state":"active",
 "replicas":{
   "core_node1":{
 "core":"test1.collection-0_shard1_replica1",
"dataDir":"hdfs://master...:8000/test1.collection-0/core_node1/data/",
 "base_url":"http://slave3:9000/solr;,
 "node_name":"slave3:9000_solr",
 "state":"active",

"ulogDir":"hdfs://master:8000/test1.collection-0/core_node1/data/tlog"},
   "core_node2":{
 "core":"test1.collection-0_shard1_replica2",
"dataDir":"hdfs://master:8000/test1.collection-0/core_node2/data/",
 "base_url":"http://slave2:9000/solr;,
 "node_name":"slave2:9000_solr",
 "state":"active",

"ulogDir":"hdfs://master:8000/test1.collection-0/core_node2/data/tlog",
 "leader":"true"},
   "core_node3":{
 "core":"test1.collection-0_shard1_replica3",
"dataDir":"hdfs://master:8000/test1.collection-0/core_node3/data/",
 "base_url":"http://slave4:9005/solr;,
 "node_name":"slave4:9005_solr",
 "state":"active",

"ulogDir":"hdfs://master:8000/test1.collection-0/core_node3/data/tlog"

So every replica is always assigned to one node and this is being stored
in ZK, pretty much the same as for non-HDFS setups. Just as the data is not
stored locally but on the network and as the path does not contain any node
information you can of course easily take over the work to a different Solr
node. You should just need to update the owner of the replica in ZK and you
should basically be done, I assume. That's why the documentation states that
an advantage of using HDFS is that a failing node can be replaced by a
different one. The Overseer just has to move the ownership of the replica,
which seems like what the code is trying to do. There just seems to be a bug
in the code so that the core does not get created on the target node.
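
As an aside (not from the original mail), until such a fix lands the
reassignment can be triggered by hand with the Collections API ADDREPLICA
call, which creates the new core on a node of your choice; a minimal Python
sketch, with placeholder host, collection, shard and node names:

from urllib.request import urlopen

SOLR = "http://localhost:8983/solr"   # placeholder: any live Solr node
COLLECTION = "test1.collection-0"     # placeholder collection name
SHARD = "shard1"                      # placeholder shard name
TARGET_NODE = "slave5:9000_solr"      # placeholder node_name of the node that should take over

# Ask the Overseer to create a new replica of the shard on the chosen node.
url = ("%s/admin/collections?action=ADDREPLICA&collection=%s&shard=%s&node=%s&wt=json"
       % (SOLR, COLLECTION, SHARD, TARGET_NODE))
print(urlopen(url).read().decode("utf-8"))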

Each data directory also contains a lock file. The documentation states
that one should use the HdfsLockFactory, which unfortunately can easily lead
to SOLR-8335, which hopefully will be fixed by SOLR-8169. A manual cleanup
is however also easily done but seems to require a node restart to take
effect. But I'm also only recently playing around with all this ;-)
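
For reference (not from the original mail), the manual cleanup mentioned above
can be scripted; a minimal sketch, assuming the stale lock is the standard
write.lock file inside the index directory of the replica's HDFS dataDir (the
path below is a placeholder modelled on the clusterstate.json shown earlier):

import subprocess

# Placeholder dataDir; the lock file name/location (index/write.lock) is an
# assumption about how HdfsLockFactory lays out the lock.
data_dir = "hdfs://master:8000/test1.collection-0/core_node1/data"
lock_path = data_dir + "/index/write.lock"

# Remove the leftover lock via the HDFS CLI; as noted above, the affected
# Solr node may still need a restart before the change takes effect.
subprocess.run(["hdfs", "dfs", "-rm", lock_path], check=True)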

regards,
Hendrik

On 19.01.2017 16:40, Shawn Heisey wrote:

On 1/19/2017 4:09 AM, Hendrik Haddorp wrote:

Given that the data is on HDFS it shouldn't matter if any active
replica is left as the data does not need to get transferred from
another instance but the new core will just take over the existing
data. Thus a replication factor of 1 should also work, just in that
case the shard would be down until the new core is up. Anyhow, it
looks like the above call is missing to set the shard id, I guess, or
some code is checking wrongly.

I know very little about how SolrCloud interacts with HDFS, so although
I'm reasonably certain about what comes below, I could be wrong.

I have not ever heard of SolrCloud being able to automatically take over
an existing index directory when it creates a replica, or even share
index directories unless the admin fools it into doing so without its
knowledge.  Sharing an index directory for replicas with SolrCloud would
NOT work correctly.

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-02-21 Thread Erick Erickson
Hendrik:

bq: Not really sure why one replica needs to be up though.

I didn't write the code so I'm guessing a bit, but consider the
situation where you have no replicas for a shard up and add a new one.
Eventually it could become the leader but there would have been no
chance for it to check if its version of the index was up to date.
But since it would be the leader, when other replicas for that shard
_do_ come on line they'd replicate the index down from the newly added
replica, possibly using very old data.

FWIW,
Erick

On Tue, Feb 21, 2017 at 1:12 PM, Hendrik Haddorp
 wrote:
> Hi,
>
> I had opened SOLR-10092 (https://issues.apache.org/jira/browse/SOLR-10092)
> for this a while ago. I was now able to get this feature working with a very
> small code change. After a few seconds Solr reassigns the replica to a
> different Solr instance as long as one replica is still up. Not really sure
> why one replica needs to be up though. I added the patch based on Solr 6.3
> to the bug report. Would be great if it could be merged soon.
>
> regards,
> Hendrik
>
> On 19.01.2017 17:08, Hendrik Haddorp wrote:
>>
>> HDFS is like a shared filesystem so every Solr Cloud instance can access
>> the data using the same path or URL. The clusterstate.json looks like this:
>>
>> "shards":{"shard1":{
>> "range":"8000-7fff",
>> "state":"active",
>> "replicas":{
>>   "core_node1":{
>> "core":"test1.collection-0_shard1_replica1",
>> "dataDir":"hdfs://master...:8000/test1.collection-0/core_node1/data/",
>> "base_url":"http://slave3:9000/solr;,
>> "node_name":"slave3:9000_solr",
>> "state":"active",
>>
>> "ulogDir":"hdfs://master:8000/test1.collection-0/core_node1/data/tlog"},
>>   "core_node2":{
>> "core":"test1.collection-0_shard1_replica2",
>> "dataDir":"hdfs://master:8000/test1.collection-0/core_node2/data/",
>> "base_url":"http://slave2:9000/solr;,
>> "node_name":"slave2:9000_solr",
>> "state":"active",
>>
>> "ulogDir":"hdfs://master:8000/test1.collection-0/core_node2/data/tlog",
>> "leader":"true"},
>>   "core_node3":{
>> "core":"test1.collection-0_shard1_replica3",
>> "dataDir":"hdfs://master:8000/test1.collection-0/core_node3/data/",
>> "base_url":"http://slave4:9005/solr;,
>> "node_name":"slave4:9005_solr",
>> "state":"active",
>>
>> "ulogDir":"hdfs://master:8000/test1.collection-0/core_node3/data/tlog"
>>
>> So every replica is always assigned to one node and this is being stored
>> in ZK, pretty much the same as for non-HDFS setups. Just as the data is not
>> stored locally but on the network and as the path does not contain any node
>> information you can of course easily take over the work to a different Solr
>> node. You should just need to update the owner of the replica in ZK and you
>> should basically be done, I assume. That's why the documentation states that
>> an advantage of using HDFS is that a failing node can be replaced by a
>> different one. The Overseer just has to move the ownership of the replica,
>> which seems like what the code is trying to do. There just seems to be a bug
>> in the code so that the core does not get created on the target node.
>>
>> Each data directory also contains a lock file. The documentation states
>> that one should use the HdfsLockFactory, which unfortunately can easily lead
>> to SOLR-8335, which hopefully will be fixed by SOLR-8169. A manual cleanup
>> is however also easily done but seems to require a node restart to take
>> effect. But I'm also only recently playing around with all this ;-)
>>
>> regards,
>> Hendrik
>>
>> On 19.01.2017 16:40, Shawn Heisey wrote:
>>>
>>> On 1/19/2017 4:09 AM, Hendrik Haddorp wrote:

 Given that the data is on HDFS it shouldn't matter if any active
 replica is left as the data does not need to get transferred from
 another instance but the new core will just take over the existing
 data. Thus a replication factor of 1 should also work, just in that
 case the shard would be down until the new core is up. Anyhow, it
 looks like the above call is missing to set the shard id, I guess, or
 some code is checking wrongly.
>>>
>>> I know very little about how SolrCloud interacts with HDFS, so although
>>> I'm reasonably certain about what comes below, I could be wrong.
>>>
>>> I have not ever heard of SolrCloud being able to automatically take over
>>> an existing index directory when it creates a replica, or even share
>>> index directories unless the admin fools it into doing so without its
>>> knowledge.  Sharing an index directory for replicas with SolrCloud would
>>> NOT work correctly.  Solr must be able to update all replicas
>>> independently, which means that each of them will lock its index
>>> directory and write to it.
>>>

Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-02-21 Thread Hendrik Haddorp

Hi,

I had opened SOLR-10092 
(https://issues.apache.org/jira/browse/SOLR-10092) for this a while ago. 
I was now able to get this feature working with a very small code change. 
After a few seconds Solr reassigns the replica to a different Solr 
instance as long as one replica is still up. Not really sure why one 
replica needs to be up though. I added the patch based on Solr 6.3 to 
the bug report. Would be great if it could be merged soon.


regards,
Hendrik

On 19.01.2017 17:08, Hendrik Haddorp wrote:
HDFS is like a shared filesystem so every Solr Cloud instance can 
access the data using the same path or URL. The clusterstate.json 
looks like this:


"shards":{"shard1":{
"range":"8000-7fff",
"state":"active",
"replicas":{
  "core_node1":{
"core":"test1.collection-0_shard1_replica1",
"dataDir":"hdfs://master...:8000/test1.collection-0/core_node1/data/",
"base_url":"http://slave3:9000/solr;,
"node_name":"slave3:9000_solr",
"state":"active",
"ulogDir":"hdfs://master:8000/test1.collection-0/core_node1/data/tlog"}, 


  "core_node2":{
"core":"test1.collection-0_shard1_replica2",
"dataDir":"hdfs://master:8000/test1.collection-0/core_node2/data/",
"base_url":"http://slave2:9000/solr;,
"node_name":"slave2:9000_solr",
"state":"active",
"ulogDir":"hdfs://master:8000/test1.collection-0/core_node2/data/tlog", 


"leader":"true"},
  "core_node3":{
"core":"test1.collection-0_shard1_replica3",
"dataDir":"hdfs://master:8000/test1.collection-0/core_node3/data/",
"base_url":"http://slave4:9005/solr;,
"node_name":"slave4:9005_solr",
"state":"active",
"ulogDir":"hdfs://master:8000/test1.collection-0/core_node3/data/tlog" 



So every replica is always assigned to one node and this is being 
stored in ZK, pretty much the same as for non-HDFS setups. Just as 
the data is not stored locally but on the network and as the path does 
not contain any node information you can of course easily take over 
the work to a different Solr node. You should just need to update the 
owner of the replica in ZK and you should basically be done, I assume. 
That's why the documentation states that an advantage of using HDFS is 
that a failing node can be replaced by a different one. The Overseer 
just has to move the ownership of the replica, which seems like what 
the code is trying to do. There just seems to be a bug in the code so 
that the core does not get created on the target node.


Each data directory also contains a lock file. The documentation 
states that one should use the HdfsLockFactory, which unfortunately 
can easily lead to SOLR-8335, which hopefully will be fixed by 
SOLR-8169. A manual cleanup is however also easily done but seems to 
require a node restart to take effect. But I'm also only recently 
playing around with all this ;-)


regards,
Hendrik

On 19.01.2017 16:40, Shawn Heisey wrote:

On 1/19/2017 4:09 AM, Hendrik Haddorp wrote:

Given that the data is on HDFS it shouldn't matter if any active
replica is left as the data does not need to get transferred from
another instance but the new core will just take over the existing
data. Thus a replication factor of 1 should also work, just in that
case the shard would be down until the new core is up. Anyhow, it
looks like the above call is missing to set the shard id, I guess, or
some code is checking wrongly.

I know very little about how SolrCloud interacts with HDFS, so although
I'm reasonably certain about what comes below, I could be wrong.

I have not ever heard of SolrCloud being able to automatically take over
an existing index directory when it creates a replica, or even share
index directories unless the admin fools it into doing so without its
knowledge.  Sharing an index directory for replicas with SolrCloud would
NOT work correctly.  Solr must be able to update all replicas
independently, which means that each of them will lock its index
directory and write to it.

It is my understanding (from reading messages on mailing lists) that
when using HDFS, Solr replicas are all separate and consume additional
disk space, just like on a regular filesystem.

I found the code that generates the "No shard id" exception, but my
knowledge of how the zookeeper code in Solr works is not deep enough to
understand what it means or how to fix it.

Thanks,
Shawn







Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-01-19 Thread Hendrik Haddorp
HDFS is like a shared filesystem so every Solr Cloud instance can access 
the data using the same path or URL. The clusterstate.json looks like this:


"shards":{"shard1":{
"range":"8000-7fff",
"state":"active",
"replicas":{
  "core_node1":{
"core":"test1.collection-0_shard1_replica1",
"dataDir":"hdfs://master...:8000/test1.collection-0/core_node1/data/",
"base_url":"http://slave3:9000/solr;,
"node_name":"slave3:9000_solr",
"state":"active",
"ulogDir":"hdfs://master:8000/test1.collection-0/core_node1/data/tlog"},
  "core_node2":{
"core":"test1.collection-0_shard1_replica2",
"dataDir":"hdfs://master:8000/test1.collection-0/core_node2/data/",
"base_url":"http://slave2:9000/solr;,
"node_name":"slave2:9000_solr",
"state":"active",
"ulogDir":"hdfs://master:8000/test1.collection-0/core_node2/data/tlog",
"leader":"true"},
  "core_node3":{
"core":"test1.collection-0_shard1_replica3",
"dataDir":"hdfs://master:8000/test1.collection-0/core_node3/data/",
"base_url":"http://slave4:9005/solr;,
"node_name":"slave4:9005_solr",
"state":"active",
"ulogDir":"hdfs://master:8000/test1.collection-0/core_node3/data/tlog"

So every replica is always assigned to one node and this is being stored 
in ZK, pretty much the same as for non-HDFS setups. Just as the data is 
not stored locally but on the network and as the path does not contain 
any node information you can of course easily take over the work to a 
different Solr node. You should just need to update the owner of the 
replica in ZK and you should basically be done, I assume. That's why the 
documentation states that an advantage of using HDFS is that a failing 
node can be replaced by a different one. The Overseer just has to move 
the ownership of the replica, which seems like what the code is trying 
to do. There just seems to be a bug in the code so that the core does 
not get created on the target node.


Each data directory also contains a lock file. The documentation states 
that one should use the HdfsLockFactory, which unfortunately can easily 
lead to SOLR-8335, which hopefully will be fixed by SOLR-8169. A manual 
cleanup is however also easily done but seems to require a node restart 
to take effect. But I'm also only recently playing around with all this ;-)


regards,
Hendrik

On 19.01.2017 16:40, Shawn Heisey wrote:

On 1/19/2017 4:09 AM, Hendrik Haddorp wrote:

Given that the data is on HDFS it shouldn't matter if any active
replica is left as the data does not need to get transferred from
another instance but the new core will just take over the existing
data. Thus a replication factor of 1 should also work, just in that
case the shard would be down until the new core is up. Anyhow, it
looks like the above call is missing to set the shard id, I guess, or
some code is checking wrongly.

I know very little about how SolrCloud interacts with HDFS, so although
I'm reasonably certain about what comes below, I could be wrong.

I have not ever heard of SolrCloud being able to automatically take over
an existing index directory when it creates a replica, or even share
index directories unless the admin fools it into doing so without its
knowledge.  Sharing an index directory for replicas with SolrCloud would
NOT work correctly.  Solr must be able to update all replicas
independently, which means that each of them will lock its index
directory and write to it.

It is my understanding (from reading messages on mailing lists) that
when using HDFS, Solr replicas are all separate and consume additional
disk space, just like on a regular filesystem.

I found the code that generates the "No shard id" exception, but my
knowledge of how the zookeeper code in Solr works is not deep enough to
understand what it means or how to fix it.

Thanks,
Shawn





Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-01-19 Thread Shawn Heisey
On 1/19/2017 4:09 AM, Hendrik Haddorp wrote:
> Given that the data is on HDFS it shouldn't matter if any active
> replica is left as the data does not need to get transferred from
> another instance but the new core will just take over the existing
> data. Thus a replication factor of 1 should also work, just in that
> case the shard would be down until the new core is up. Anyhow, it
> looks like the above call is missing to set the shard id, I guess, or
> some code is checking wrongly.

I know very little about how SolrCloud interacts with HDFS, so although
I'm reasonably certain about what comes below, I could be wrong.

I have not ever heard of SolrCloud being able to automatically take over
an existing index directory when it creates a replica, or even share
index directories unless the admin fools it into doing so without its
knowledge.  Sharing an index directory for replicas with SolrCloud would
NOT work correctly.  Solr must be able to update all replicas
independently, which means that each of them will lock its index
directory and write to it.

It is my understanding (from reading messages on mailing lists) that
when using HDFS, Solr replicas are all separate and consume additional
disk space, just like on a regular filesystem.

I found the code that generates the "No shard id" exception, but my
knowledge of how the zookeeper code in Solr works is not deep enough to
understand what it means or how to fix it.

Thanks,
Shawn



Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-01-19 Thread Hendrik Haddorp

Hi,
I'm seeing the same issue on Solr 6.3 using HDFS and a replication 
factor of 3, even though I believe a replication factor of 1 should work 
the same. When I stop a Solr instance this is detected and Solr actually 
wants to create a replica on a different instance. The command for that 
does however fail:


o.a.s.c.OverseerAutoReplicaFailoverThread Exception trying to create new 
replica on 
http://...:9000/solr:org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: 
Error from server at http://...:9000/solr: Error CREATEing SolrCore 
'test2.collection-09_shard1_replica1': Unable to create core 
[test2.collection-09_shard1_replica1] Caused by: No shard id for 
CoreDescriptor[name=test2.collection-09_shard1_replica1;instanceDir=/var/opt/solr/test2.collection-09_shard1_replica1]
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:593)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:262)
at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:251)
at 
org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1219)
at 
org.apache.solr.cloud.OverseerAutoReplicaFailoverThread.createSolrCore(OverseerAutoReplicaFailoverThread.java:456)
at 
org.apache.solr.cloud.OverseerAutoReplicaFailoverThread.lambda$addReplica$0(OverseerAutoReplicaFailoverThread.java:251)

at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)

Given that the data is on HDFS it shouldn't matter if any active replica 
is left as the data does not need to get transferred from another 
instance but the new core will just take over the existing data. Thus a 
replication factor of 1 should also work, just in that case the shard 
would be down until the new core is up. Anyhow, it looks like the above 
call is missing to set the shard id, I guess, or some code is checking 
wrongly.


On 14.01.2017 02:44, Shawn Heisey wrote:

On 1/13/2017 5:46 PM, Chetas Joshi wrote:

One of the things I have observed is: if I use the collection API to
create a replica for that shard, it does not complain about the config
which has been set to ReplicationFactor=1. If replication factor was
the issue as suggested by Shawn, shouldn't it complain?

The replicationFactor value is used by exactly two things:  initial
collection creation, and autoAddReplicas.  It will not affect ANY other
command or operation, including ADDREPLICA.  You can create MORE
replicas than replicationFactor indicates, and there will be no error
messages or warnings.

In order to have a replica automatically added, your replicationFactor
must be at least two, and the number of active replicas in the cloud for
a shard must be less than that number.  If that's the case and the
expiration times have been reached without recovery, then Solr will
automatically add replicas until there are at least as many replicas
operational as specified in replicationFactor.


I would also like to mention that I experience some instance dirs
getting deleted and also found this open bug
(https://issues.apache.org/jira/browse/SOLR-8905)

The description on that issue is incomprehensible.  I can't make any
sense out of it.  It mentions the core.properties file, but the error
message shown doesn't talk about the properties file at all.  The error
and issue description seem to have nothing at all to do with the code
lines that were quoted.  Also, it was reported on version 4.10.3 ... but
this is going to be significantly different from current 6.x versions,
and the 4.x versions will NOT be updated with bugfixes.

Thanks,
Shawn





Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-01-13 Thread Shawn Heisey
On 1/13/2017 5:46 PM, Chetas Joshi wrote:
> One of the things I have observed is: if I use the collection API to
> create a replica for that shard, it does not complain about the config
> which has been set to ReplicationFactor=1. If replication factor was
> the issue as suggested by Shawn, shouldn't it complain? 

The replicationFactor value is used by exactly two things:  initial
collection creation, and autoAddReplicas.  It will not affect ANY other
command or operation, including ADDREPLICA.  You can create MORE
replicas than replicationFactor indicates, and there will be no error
messages or warnings.

In order to have a replica automatically added, your replicationFactor
must be at least two, and the number of active replicas in the cloud for
a shard must be less than that number.  If that's the case and the
expiration times have been reached without recovery, then Solr will
automatically add replicas until there are at least as many replicas
operational as specified in replicationFactor.
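
To make that concrete (this sketch is not part of the original mails), a
collection that autoAddReplicas can actually repair needs a replicationFactor
of at least two at creation time; a minimal Python sketch of such a CREATE
call, with placeholder names and sizes:

from urllib.request import urlopen

SOLR = "http://localhost:8983/solr"        # placeholder: any live Solr node

url = (SOLR + "/admin/collections?action=CREATE"
       "&name=test.collection"             # placeholder collection name
       "&collection.configName=myconf"     # placeholder config set already in ZK
       "&numShards=80"
       "&maxShardsPerNode=2"
       "&replicationFactor=2"              # at least 2, otherwise autoAddReplicas never triggers
       "&autoAddReplicas=true"
       "&wt=json")
print(urlopen(url).read().decode("utf-8"))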

> I would also like to mention that I experience some instance dirs
> getting deleted and also found this open bug
> (https://issues.apache.org/jira/browse/SOLR-8905) 

The description on that issue is incomprehensible.  I can't make any
sense out of it.  It mentions the core.properties file, but the error
message shown doesn't talk about the properties file at all.  The error
and issue description seem to have nothing at all to do with the code
lines that were quoted.  Also, it was reported on version 4.10.3 ... but
this is going to be significantly different from current 6.x versions,
and the 4.x versions will NOT be updated with bugfixes.

Thanks,
Shawn



Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-01-13 Thread Chetas Joshi
Erick, I have not changed any config. I have autoaddReplica = true for
individual collection config as well as the overall cluster config. Still,
it does not add a replica when I decommission a node.

Adding a replica is the overseer's job. I looked at the logs of the overseer of
the SolrCloud but could not find anything there either.
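
For anyone checking the same thing (sketch not from the original mail), the
node currently acting as Overseer can be looked up with the Collections API
OVERSEERSTATUS call before digging through logs; a minimal Python sketch,
with a placeholder Solr URL:

import json
from urllib.request import urlopen

SOLR = "http://localhost:8983/solr"   # placeholder: any live Solr node

with urlopen(SOLR + "/admin/collections?action=OVERSEERSTATUS&wt=json") as resp:
    status = json.loads(resp.read().decode("utf-8"))

# The "leader" entry of the OVERSEERSTATUS response names the node that
# currently runs the Overseer, e.g. "host:8983_solr".
print("Current overseer:", status.get("leader"))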

I am doing some testing using different configs. I would be happy to share
my findings.

One of the things I have observed is: if I use the collection API to create
a replica for that shard, it does not complain about the config which has
been set to ReplicationFactor=1. If replication factor was the issue as
suggested by Shawn, shouldn't it complain?

I would also like to mention that I experience some instance dirs getting
deleted and also found this open bug (
https://issues.apache.org/jira/browse/SOLR-8905)

Thanks!

On Thu, Jan 12, 2017 at 9:50 AM, Erick Erickson 
wrote:

> Hmmm, have you changed any of the settings for autoAddReplicas? There
> are several parameters that govern how long before a replica would be
> added.
>
> But I suggest you use the Cloudera resources for this question, not
> only did they write this functionality, but Cloudera support is deeply
> embedded in HDFS and I suspect has _by far_ the most experience with
> it.
>
> And that said, anything you find out that would suggest good ways to
> clarify the docs would be most welcome!
>
> Best,
> Erick
>
> On Thu, Jan 12, 2017 at 8:42 AM, Shawn Heisey  wrote:
> > On 1/11/2017 7:14 PM, Chetas Joshi wrote:
> >> This is what I understand about how Solr works on HDFS. Please correct me
> >> if I am wrong.
> >>
> >> Although solr shard replication Factor = 1, HDFS default replication = 3.
> >> When the node goes down, the solr server running on that node goes down and
> >> hence the instance (core) representing the replica goes down. The data is
> >> on HDFS (distributed across all the datanodes of the hadoop cluster with 3X
> >> replication).  This is the reason why I have kept replicationFactor=1.
> >>
> >> As per the link:
> >> https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS
> >> One benefit to running Solr in HDFS is the ability to automatically add new
> >> replicas when the Overseer notices that a shard has gone down. Because the
> >> "gone" index shards are stored in HDFS, a new core will be created and the
> >> new core will point to the existing indexes in HDFS.
> >>
> >> This is the expected behavior of Solr overseer which I am not able to see.
> >> After a couple of hours a node was assigned to host the shard but the
> >> status of the shard is still "down" and the instance dir is missing on that
> >> node for that particular shard_replica.
> >
> > As I said before, I know very little about HDFS, so the following could
> > be wrong, but it makes sense so I'll say it:
> >
> > I would imagine that Solr doesn't know or care what your HDFS
> > replication is ... the only replicas it knows about are the ones that it
> > is managing itself.  The autoAddReplicas feature manages *SolrCloud*
> > replicas, not HDFS replicas.
> >
> > I have seen people say that multiple SolrCloud replicas will take up
> > additional space in HDFS -- they do not point at the same index files.
> > This is because proper Lucene operation requires that it lock an index
> > and prevent any other thread/process from writing to the index at the
> > same time.  When you index, SolrCloud updates all replicas independently
> > -- the only time indexes are replicated is when you add a new replica or
> > a serious problem has occurred and an index needs to be recovered.
> >
> > Thanks,
> > Shawn
> >
>


Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-01-12 Thread Erick Erickson
Hmmm, have you changed any of the settings for autoAddReplicas? There
are several parameters that govern how long before a replica would be
added.

But I suggest you use the Cloudera resources for this question, not
only did they write this functionality, but Cloudera support is deeply
embedded in HDFS and I suspect has _by far_ the most experience with
it.

And that said, anything you find out that would suggest good ways to
clarify the docs would be most welcome!

Best,
Erick

On Thu, Jan 12, 2017 at 8:42 AM, Shawn Heisey  wrote:
> On 1/11/2017 7:14 PM, Chetas Joshi wrote:
>> This is what I understand about how Solr works on HDFS. Please correct me
>> if I am wrong.
>>
>> Although solr shard replication Factor = 1, HDFS default replication = 3.
>> When the node goes down, the solr server running on that node goes down and
>> hence the instance (core) representing the replica goes down. The data is
>> on HDFS (distributed across all the datanodes of the hadoop cluster with 3X
>> replication).  This is the reason why I have kept replicationFactor=1.
>>
>> As per the link:
>> https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS
>> One benefit to running Solr in HDFS is the ability to automatically add new
>> replicas when the Overseer notices that a shard has gone down. Because the
>> "gone" index shards are stored in HDFS, a new core will be created and the
>> new core will point to the existing indexes in HDFS.
>>
>> This is the expected behavior of Solr overseer which I am not able to see.
>> After a couple of hours a node was assigned to host the shard but the
>> status of the shard is still "down" and the instance dir is missing on that
>> node for that particular shard_replica.
>
> As I said before, I know very little about HDFS, so the following could
> be wrong, but it makes sense so I'll say it:
>
> I would imagine that Solr doesn't know or care what your HDFS
> replication is ... the only replicas it knows about are the ones that it
> is managing itself.  The autoAddReplicas feature manages *SolrCloud*
> replicas, not HDFS replicas.
>
> I have seen people say that multiple SolrCloud replicas will take up
> additional space in HDFS -- they do not point at the same index files.
> This is because proper Lucene operation requires that it lock an index
> and prevent any other thread/process from writing to the index at the
> same time.  When you index, SolrCloud updates all replicas independently
> -- the only time indexes are replicated is when you add a new replica or
> a serious problem has occurred and an index needs to be recovered.
>
> Thanks,
> Shawn
>


Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-01-12 Thread Shawn Heisey
On 1/11/2017 7:14 PM, Chetas Joshi wrote:
> This is what I understand about how Solr works on HDFS. Please correct me
> if I am wrong.
>
> Although solr shard replication Factor = 1, HDFS default replication = 3.
> When the node goes down, the solr server running on that node goes down and
> hence the instance (core) representing the replica goes down. The data is
> on HDFS (distributed across all the datanodes of the hadoop cluster with 3X
> replication).  This is the reason why I have kept replicationFactor=1.
>
> As per the link:
> https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS
> One benefit to running Solr in HDFS is the ability to automatically add new
> replicas when the Overseer notices that a shard has gone down. Because the
> "gone" index shards are stored in HDFS, a new core will be created and the
> new core will point to the existing indexes in HDFS.
>
> This is the expected behavior of Solr overseer which I am not able to see.
> After a couple of hours a node was assigned to host the shard but the
> status of the shard is still "down" and the instance dir is missing on that
> node for that particular shard_replica.

As I said before, I know very little about HDFS, so the following could
be wrong, but it makes sense so I'll say it:

I would imagine that Solr doesn't know or care what your HDFS
replication is ... the only replicas it knows about are the ones that it
is managing itself.  The autoAddReplicas feature manages *SolrCloud*
replicas, not HDFS replicas.

I have seen people say that multiple SolrCloud replicas will take up
additional space in HDFS -- they do not point at the same index files. 
This is because proper Lucene operation requires that it lock an index
and prevent any other thread/process from writing to the index at the
same time.  When you index, SolrCloud updates all replicas independently
-- the only time indexes are replicated is when you add a new replica or
a serious problem has occurred and an index needs to be recovered.

Thanks,
Shawn



Re: Solr on HDFS: AutoAddReplica does not add a replica

2017-01-11 Thread Shawn Heisey
On 1/11/2017 1:47 PM, Chetas Joshi wrote:
> I have deployed a SolrCloud (solr 5.5.0) on hdfs using cloudera 5.4.7. The
> cloud has 86 nodes.
>
> This is my config for the collection
>
> numShards=80
> ReplicationFactor=1
> maxShardsPerNode=1
> autoAddReplica=true
>
> I recently decommissioned a node to resolve some disk issues. The shard
> that was being hosted on that host is now being shown as "gone" on the solr
> admin UI.
>
> I got the cluster status using the collection API. It says
> shard: active, replica: down
>
> The overseer does not seem to be creating an extra core even though
> autoAddReplica=true (
> https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS).
>
> Is this happening because the overseer sees the shard as active as
> suggested by the cluster status?
> If yes, is "autoAddReplica" not reliable? should I add a replica for this
> shard when such cases arise?

Your replicationFactor is one.  When there's one replica, you have no
redundancy.  If that replica goes down, the shard is completely gone.

As I understand it (I've got no experience with HDFS at all),
autoAddReplicas is designed to automatically add replicas until
replicationFactor is satisfied.  As already mentioned, your
replicationFactor is one.  This means that it will always be satisfied.

If autoAddReplicas were to kick in any time a replica went down, then
Solr would be busy adding replicas anytime you restarted a node ...
which would be a very bad idea.

If your number of replicas is one, and that replica goes down, where
would Solr go to get the data to create another replica?  The single
replica is down, so there's nothing to copy from.  You might be thinking
"from the leader" ... but a leader is nothing more than a replica that
has been temporarily elected to have an extra job.  A replicationFactor
of two doesn't mean a leader and two copies ... it means there are a
total of two replicas, one of which is elected leader.

If you want autoAddReplicas to work, you're going to need to have a
replicationFactor of at least two, and you're probably going to have to
delete the dead replica before another will be created.
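
A minimal Python sketch of that manual sequence (not from the original mail),
i.e. deleting the dead replica and then adding a fresh one via the Collections
API, with placeholder collection, shard and replica names:

from urllib.request import urlopen

SOLR = "http://localhost:8983/solr"   # placeholder: any live Solr node
COLLECTION = "mycollection"           # placeholder collection name
SHARD = "shard1"                      # placeholder shard name
DEAD_REPLICA = "core_node1"           # placeholder: replica shown as down in CLUSTERSTATUS

def collections_api(params):
    # Tiny helper around the Collections API; wt=json keeps the output readable.
    url = SOLR + "/admin/collections?" + params + "&wt=json"
    return urlopen(url).read().decode("utf-8")

# Drop the dead replica entry, then let Solr create a fresh one on a live node.
print(collections_api("action=DELETEREPLICA&collection=%s&shard=%s&replica=%s"
                      % (COLLECTION, SHARD, DEAD_REPLICA)))
print(collections_api("action=ADDREPLICA&collection=%s&shard=%s"
                      % (COLLECTION, SHARD)))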

Thanks,
Shawn