Re: Seed nodes and bootstrap (was: Re: Initializing a multiple node cluster (multiple datacenters))

2018-02-26 Thread Oleksandr Shulgin
On Mon, Feb 26, 2018 at 7:05 PM, Jeff Jirsa  wrote:

>
> I'll happily click the re-open button (you could have, too), but I'm not
> sure what the 'right' fix is. Feel free to move discussion to 5836.
>

Thanks, Jeff.  Somehow, I don't see any control elements to change the
issue status, even though I'm logged in, so I assume only project members
or devs can do that.

--
Alex


Re: Seed nodes and bootstrap (was: Re: Initializing a multiple node cluster (multiple datacenters))

2018-02-26 Thread Jeff Jirsa
That ticket was before I was really active contributing, but I tend to
agree with your assessment: clearly there's a pain point there, and we can
do better than the status quo.

The problem (as Jonathan notes) is that it's a complicated subsystem, and
the "obvious" fix probably isn't as obvious as it seems.

I'll happily click the re-open button (you could have, too), but I'm not
sure what the 'right' fix is. Feel free to move discussion to 5836.




On Mon, Feb 26, 2018 at 12:51 AM, Oleksandr Shulgin <
oleksandr.shul...@zalando.de> wrote:

> On Fri, Feb 23, 2018 at 7:35 PM, Jeff Jirsa  wrote:
>
>> It comes up from time to time.  Rob Coli spent years arguing that this
>> behavior was confusing
>> ( https://issues.apache.org/jira/browse/CASSANDRA-5836 ), especially in
>> the "I'm replacing a failed
>> seed" sense. It also comes up when you're adding the first few hosts to a
>> new DC (where they're new, but they're definitely going to be the seeds for
>> the new DC).
>>
>
> Jeff,
>
> I find the response on this ticket quite terrible: given the number of
> independent reports of significant problems caused by this behavior, the
> "Won't Fix" status doesn't seem justified, IMO.
>
> We were also hit by this once, when the expected location of the data
> directory changed in our Docker image.  We were performing a rolling
> update of the cluster, and the first two nodes we updated happened to be
> seeds.  They started happily with a blank data directory and began
> serving read requests.  Ouch.  We only realized there was a problem when
> the next node we updated failed to start, and that was only because it
> *did* try to bootstrap and failed.
>
> People like to repeat "seed nodes are no different from non-seeds", and
> that's true from the perspective of a client application.  The same
> people would repeat "seeds don't bootstrap" as some kind of magical
> incantation, so seeds *are* different, and in a subtle way, for the
> operator.  But I don't believe that this difference is justified.  When
> creating a brand new cluster there is no practical difference between
> auto_bootstrap=true and false, because there is no data and no clients,
> so the seed nodes behave exactly the same way as non-seeds.  When adding
> a new DC you are supposed to set auto_bootstrap=false explicitly, so
> again no difference.
>
> Where it matters, however, is node behavior in *unexpected* circumstances.
> If seed nodes were truly no different from non-seeds in this regard,
> there would be fewer surprises, because of the total node uniformity
> within the cluster.
>
> Therefore, I argue that the ticket should be reopened.
>
> Regards,
> --
> Alex
>
>


Seed nodes and bootstrap (was: Re: Initializing a multiple node cluster (multiple datacenters))

2018-02-26 Thread Oleksandr Shulgin
On Fri, Feb 23, 2018 at 7:35 PM, Jeff Jirsa  wrote:

> It comes up from time to time.  Rob Coli spent years arguing that this
> behavior was confusing
> ( https://issues.apache.org/jira/browse/CASSANDRA-5836 ), especially in
> the "I'm replacing a failed
> seed" sense. It also comes up when you're adding the first few hosts to a
> new DC (where they're new, but they're definitely going to be the seeds for
> the new DC).
>

Jeff,

I find the response on this ticket quite terrible: given the number of
independent reports of significant problems caused by this behavior, the
"Won't Fix" status doesn't seem justified, IMO.

We were also hit by this once, when the expected location of the data
directory changed in our Docker image.  We were performing a rolling
update of the cluster, and the first two nodes we updated happened to be
seeds.  They started happily with a blank data directory and began serving
read requests.  Ouch.  We only realized there was a problem when the next
node we updated failed to start, and that was only because it *did* try to
bootstrap and failed.

People like to repeat "seed nodes are no different from non-seeds", and
that's true from the perspective of a client application.  The same people
would repeat "seeds don't bootstrap" as some kind of magical incantation,
so seeds *are* different, and in a subtle way, for the operator.  But I
don't believe that this difference is justified.  When creating a brand
new cluster there is no practical difference between auto_bootstrap=true
and false, because there is no data and no clients, so the seed nodes
behave exactly the same way as non-seeds.  When adding a new DC you are
supposed to set auto_bootstrap=false explicitly, so again no difference.
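
For concreteness, the settings in question look roughly like this (a
sketch of the relevant cassandra.yaml fragment; the addresses below are
just placeholders):

    # cassandra.yaml (sketch; placeholder addresses)
    seed_provider:
        - class_name: org.apache.cassandra.locator.SimpleSeedProvider
          parameters:
              - seeds: "10.0.1.1,10.0.1.2"

    # Not present in the stock file; it defaults to true when omitted.
    # A node that appears in its own seeds list skips bootstrap regardless
    # of this setting, which is exactly the subtle difference discussed
    # here.
    auto_bootstrap: false

In the new-DC case the data is then streamed explicitly afterwards with
"nodetool rebuild -- <name of existing DC>", so the seed/non-seed
distinction does not buy you anything there either.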

Where it matters, however, is node behavior in *unexpected* circumstances.
If seed nodes were truly no different from non-seeds in this regard, there
would be fewer surprises, because of the total node uniformity within the
cluster.

Therefore, I argue that the ticket should be reopened.

Regards,
--
Alex