Re: UnavailableException with 1 node down and RF=2?

2011-10-28 Thread Peter Schuller

>  Thank you for your explanations. Even with a RF=1 and one node down I don't
> understand why I can't at least read the data in the nodes that are still
> up?

You will be able to read data for row keys that do not live on the
node that is down. But for any request to a row which is on the node
that is down, Unavailable is the expected result. If the data simply
does not exist other than on the one single node, and that node is
down, there's nothing Cassandra, or any other system, can do ;)
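A minimal sketch of why only some row keys are affected with RF=1. It assumes a hypothetical 4-node ring with evenly spaced tokens and an MD5-based hash of the row key; the node names, token values and helper functions are made up for illustration and are not Cassandra's actual code.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # token space assumed for this sketch

# Hypothetical 4-node ring with evenly spaced tokens.
TOKENS = [i * RING_SIZE // 4 for i in range(4)]
NODES = ["node-a", "node-b", "node-c", "node-d"]


def token_for(key: str) -> int:
    """Hash a row key onto the ring (MD5-based, for illustration only)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % RING_SIZE


def owner(key: str) -> str:
    """With RF=1 the only replica is the first node at or after the key's token."""
    i = bisect.bisect_left(TOKENS, token_for(key)) % len(TOKENS)
    return NODES[i]


def can_read(key: str, down: set) -> bool:
    """A row is readable only while its single owner is up."""
    return owner(key) not in down


down = {"node-b"}
for key in ("user:1", "user:2", "user:3", "user:4", "user:5", "user:6"):
    state = "ok" if can_read(key, down) else "Unavailable"
    print(key, owner(key), state)
```

Rows whose owner is still up stay readable; rows owned by the down node do not, whichever node the client happens to contact.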

> Also, why can't I at least perform writes with consistency level ANY and
> failover policy ON_FAIL_TRY_ALL_AVAILABLE...shouldn't the nodes that are up
> be able to take in the writes destined for the node that is down and perform
> hinted handoffs when it comes back again?

You seem to be mixing Hector stuff and Cassandra concepts here. So to
be clear: You can use CL.ANY in order to make writes be accepted even
if the one and only node that owns the data in question is down.
However, that data won't be *readable* until that node (1) comes back
up, and (2) hints are delivered to it. This is all in Cassandra.
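A toy model of the behaviour described above, assuming a single replica (RF=1) and a coordinator that stores hints. The class and method names are hypothetical; this is a sketch of the idea, not how Cassandra implements hinted handoff.

```python
class ToyCluster:
    """Toy model: one replica per key (RF=1) plus hints kept by a coordinator."""

    def __init__(self):
        self.replica_up = True
        self.replica_data = {}
        self.hints = {}  # writes waiting for the replica to return

    def write_any(self, key, value):
        """CL.ANY: accept the write even if it can only be stored as a hint."""
        if self.replica_up:
            self.replica_data[key] = value
        else:
            self.hints[key] = value

    def read_one(self, key):
        """CL.ONE: needs the real replica; hints are never read from."""
        if not self.replica_up:
            raise RuntimeError("Unavailable: the only replica is down")
        return self.replica_data.get(key)

    def deliver_hints(self):
        """Replay stored hints once the replica is reachable again."""
        if not self.replica_up:
            return
        self.replica_data.update(self.hints)
        self.hints.clear()


cluster = ToyCluster()
cluster.replica_up = False
cluster.write_any("row1", "v1")   # accepted, stored only as a hint
cluster.replica_up = True         # (1) the node comes back up
cluster.deliver_hints()           # (2) hints are delivered to it
print(cluster.read_one("row1"))
```

Between steps (1) and (2) a read succeeds but does not yet see the hinted write, which is why both conditions matter.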

The failover policy stuff applies to Hector and how it chooses to
select nodes, and should be orthogonal to whether or not data is
readable as such. Basically, don't try to use that to get around lack
of data due to nodes being down.

(Also, note that while I don't know/remember off hand, I don't think
an Unavailable response is going to be retried on all available hosts,
as it indicates that the node responded correctly and that replicas are
in fact down. I would expect the policy to apply to cases where
communication with the co-ordinator node fails. But I am speculating
here and this might be wrong.)

> Unless by construction Cassandra
> behaves in the way you describe (which is perfectly fine and I will use it
> that way from now on) it would be logical for the RF=1 to not affect the
> behaviour I expect from just reading the top level descriptions of Cassandra
> behaviour I found in the documentation.

If you mean that rows that are NOT on the node that is down should be
readable, then that is indeed the case. If you are unable to read data
from other rows, that is definitely unexpected.

In *that* case, the failover policy that you mention might be at play.
I.e., you want the hector client not to fail a request just because a
single node happens to be down. But since you're getting an
"unavailable" exception, that indicates that Hector was able to talk
to the selected Cassandra node, and that the node in question gave an
Unavailable exception back indicating that the read or write could not
be serviced at the given consistency level due to nodes being down.
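The distinction can be sketched as follows. This is not Hector's implementation: the exception classes and the execute_with_failover helper are hypothetical, and the sketch only assumes that a client fails over when it cannot reach a host, and passes an Unavailable answer straight back to the caller.

```python
class ConnectionFailed(Exception):
    """Hypothetical: the client could not talk to the chosen host."""


class Unavailable(Exception):
    """Hypothetical: the host answered, but too few replicas are alive."""


def execute_with_failover(hosts, request):
    """Try each host in turn on connection failure; surface Unavailable at once."""
    last_error = None
    for host in hosts:
        try:
            return request(host)
        except ConnectionFailed as err:
            last_error = err  # this host is unreachable; another may work
        # Unavailable is deliberately not caught: another coordinator would
        # see the same dead replicas, so retrying elsewhere cannot help.
    raise last_error if last_error else ConnectionFailed("no hosts configured")
```

Under this assumption, a failover policy helps when the contacted node itself is down, and does nothing for rows whose replicas are down.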

I would start by double checking exactly which row key(s) are being
written to/read from, and whether they are truly not on the node(s)
that are down.

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: UnavailableException with 1 node down and RF=2?

2011-10-28 Thread Alexandru Dan Sicoe

Hi Peter,
 Thank you for your explanations. Even with a RF=1 and one node down I don't
understand why I can't at least read the data in the nodes that are still
up? Also, why can't I at least perform writes with consistency level ANY and
failover policy ON_FAIL_TRY_ALL_AVAILABLE...shouldn't the nodes that are up
be able to take in the writes destined for the node that is down and perform
hinted handoffs when it comes back again? Unless by construction Cassandra
behaves in the way you describe (which is perfectly fine and I will use it
that way from now on) it would be logical for the RF=1 to not affect the
behaviour I expect from just reading the top level descriptions of Cassandra
behaviour I found in the documentation.

Cheers,
Alex


-- 
Alexandru Dan Sicoe
MEng, CERN Marie Curie ACEOLE Fellow

Re: UnavailableException with 1 node down and RF=2?

2011-10-28 Thread Peter Schuller

> If you want to survive node failures, use an RF above 1. And then make
> sure to use an appropriate consistency level.

To elaborate a bit: RF, or replication factor, is the *total* number
of copies of any piece of data in the cluster. So with only one copy,
the data will not be available when a single node is down.

Consistency levels control how many replicas are required to respond to
a request before it is considered successful, and this has implications
for availability. For example, if you want to survive a single node
going down and you use RF=2, you must use ConsistencyLevel.ONE. If you
used QUORUM or ALL, any read or write to a row with a replica on the
down node would fail (QUORUM of 2 is 2).

Probably a common setup is to use RF=3 because it allows you to
survive a node going down, while also allowing you to use QUORUM. But,
whether that matters will be up to your use-case.
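The arithmetic behind this can be sketched in a few lines, using the usual definition of QUORUM as floor(RF / 2) + 1. The helper names are made up for illustration.

```python
def quorum(rf: int) -> int:
    """Replicas that must respond for QUORUM: floor(rf / 2) + 1."""
    return rf // 2 + 1


def required(cl: str, rf: int) -> int:
    """Replica responses required at a given consistency level."""
    return {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}[cl]


def tolerated_failures(cl: str, rf: int) -> int:
    """How many of a row's replicas can be down before requests fail."""
    return rf - required(cl, rf)


for rf in (1, 2, 3):
    for cl in ("ONE", "QUORUM", "ALL"):
        print(f"RF={rf} CL={cl:<6} needs {required(cl, rf)}, "
              f"tolerates {tolerated_failures(cl, rf)} replica(s) down")
```

With RF=2, QUORUM and ALL both need 2 replicas and tolerate none down; RF=3 is the smallest value at which QUORUM still tolerates one replica being down.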

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: UnavailableException with 1 node down and RF=2?

2011-10-28 Thread Peter Schuller

> took a node down to see how it behaves. All of a sudden I couldn't write or
[snip]
> me.prettyprint.hector.api.exceptions.HUnavailableException: : May not be
[snip]
>     Default replication factor = 1

So you have an RF=1 cluster (only one copy of data) and you bring a
node down. This fundamentally and necessarily means that the data on
the node you brought down will be unavailable.

If you want to survive node failures, use an RF above 1. And then make
sure to use an appropriate consistency level.

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: UnavailableException with 1 node down and RF=2?

2011-10-28 Thread Alexandru Dan Sicoe


Re: UnavailableException with 1 node down and RF=2?

2011-10-27 Thread R. Verlangen

That's correct. It was a read consistency problem, not so smart of me ;-)

Thank you anyway.

2011/10/27 Jonathan Ellis 

> (I see that you did start a new thread and solved it with Jake's help.)

Re: UnavailableException with 1 node down and RF=2?

2011-10-27 Thread Jonathan Ellis

(I see that you did start a new thread and solved it with Jake's help.)




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: UnavailableException with 1 node down and RF=2?

2011-10-27 Thread Javier Canillas

The problem might be that you are setting the Consistency Level to a
value higher than ONE. In such cases, Cassandra will respond with an
UnavailableException since it can't achieve the level of consistency you are
asking for.

Remember that, when you have RF=2, the CL values ALL and QUORUM are the same.

Regards,

Javier.


Re: UnavailableException with 1 node down and RF=2?

2011-10-27 Thread Jonathan Ellis

Ha.  On the one hand, good on you for searching the list archives for
similar problems.  On the other hand, after over a year it's probably
worth starting a new thread. :)

Standard questions:

- What Cassandra version are you running?
- Are there exceptions in the log for the machine still running?
- What does "not responding anymore" mean?  Reporting timeouts,
reporting unavailable, refusing client connections, ... ?

On Thu, Oct 27, 2011 at 10:22 AM, RobinUs2  wrote:
> I'm currently having a similar problem with a 2-node cluster. When I shut down
> one of the nodes, the other isn't responding any more.
>
> Did you find a solution for your problem?
>
> /I'm new to mailing lists, if it's inappropriate to reply here, please let
> me know../
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/2-node-cluster-1-node-down-overall-failure-td6936722.html
>
> --
> View this message in context: 
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/UnavailableException-with-1-node-down-and-RF-2-tp5242055p6936767.html
> Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
> Nabble.com.
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: UnavailableException with 1 node down and RF=2?

2011-10-27 Thread RobinUs2

I'm currently having a similar problem with a 2-node cluster. When I shut down
one of the nodes, the other isn't responding any more.

Did you find a solution for your problem?

/I'm new to mailing lists, if it's inappropriate to reply here, please let
me know../
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/2-node-cluster-1-node-down-overall-failure-td6936722.html
 

--
View this message in context: 
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/UnavailableException-with-1-node-down-and-RF-2-tp5242055p6936767.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at 
Nabble.com.

Re: UnavailableException with 1 node down and RF=2?

2010-07-01 Thread Jonathan Ellis

Then either you have at least one machine that thinks RF=1 or you found a bug.
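A small sketch of that reasoning, assuming a hypothetical 4-node ring and a simple placement in which each range's replicas are its owner plus the next RF - 1 nodes clockwise. The node names and helpers are illustrative only.

```python
NODES = ["n1", "n2", "n3", "n4"]  # hypothetical 4-node ring, in ring order


def replicas(primary: int, rf: int) -> list:
    """Range owner plus the next rf - 1 nodes clockwise (assumed placement)."""
    return [NODES[(primary + i) % len(NODES)] for i in range(rf)]


def unavailable_ranges(rf: int, down: set, required: int = 1) -> list:
    """Primary ranges that cannot gather `required` live replicas."""
    bad = []
    for p in range(len(NODES)):
        live = [n for n in replicas(p, rf) if n not in down]
        if len(live) < required:
            bad.append(NODES[p])
    return bad


for rf in (1, 2):
    for node in NODES:
        result = unavailable_ranges(rf, {node}) or "all ranges readable"
        print(f"RF={rf}, {node} down, CL.ONE:", result)
```

Under these assumptions, RF=2 at CL.ONE survives any single node being down, while RF=1 always loses one range. RF=2 at CL.ONE fails only if two adjacent nodes are considered down at once, e.g. `unavailable_ranges(2, {"n2", "n3"})`, which is the case a failure-detector false positive would produce.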

On Thu, Jul 1, 2010 at 7:08 AM, James Golick  wrote:
> It's happening consistently when I take any node out of rotation.
>
> On Thu, Jul 1, 2010 at 2:24 AM, Jonathan Ellis  wrote:
>>
>> Presumably the failure detector generated a false positive for a
>> second node temporarily
>>
>> On Wed, Jun 30, 2010 at 10:55 PM, James Golick 
>> wrote:
>> > Oops. I meant to say that I'm reading with CL.ONE.
>> >
>> > J.
>> >
>> > Sent from my iPhone.
>> >
>> > On 2010-07-01, at 1:39 AM, Benjamin Black  wrote:
>> >
>> >> .QUORUM or .ALL (they are the same with RF=2).
>> >>
>> >> On Wed, Jun 30, 2010 at 10:22 PM, James Golick 
>> >> wrote:
>> >>> 4 nodes, RF=2, 1 node down.
>> >>> How can I get an UnavailableException in that scenario?
>> >>> - J.
>> >
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: UnavailableException with 1 node down and RF=2?

2010-07-01 Thread James Golick

It's happening consistently when I take any node out of rotation.

On Thu, Jul 1, 2010 at 2:24 AM, Jonathan Ellis  wrote:

> Presumably the failure detector generated a false positive for a
> second node temporarily
>
> On Wed, Jun 30, 2010 at 10:55 PM, James Golick 
> wrote:
> > Oops. I meant to say that I'm reading with CL.ONE.
> >
> > J.
> >
> > Sent from my iPhone.
> >
> > On 2010-07-01, at 1:39 AM, Benjamin Black  wrote:
> >
> >> .QUORUM or .ALL (they are the same with RF=2).
> >>
> >> On Wed, Jun 30, 2010 at 10:22 PM, James Golick 
> wrote:
> >>> 4 nodes, RF=2, 1 node down.
> >>> How can I get an UnavailableException in that scenario?
> >>> - J.
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>

Re: UnavailableException with 1 node down and RF=2?

2010-06-30 Thread Jonathan Ellis

Presumably the failure detector generated a false positive for a
second node temporarily

On Wed, Jun 30, 2010 at 10:55 PM, James Golick  wrote:
> Oops. I meant to say that I'm reading with CL.ONE.
>
> J.
>
> Sent from my iPhone.
>
> On 2010-07-01, at 1:39 AM, Benjamin Black  wrote:
>
>> .QUORUM or .ALL (they are the same with RF=2).
>>
>> On Wed, Jun 30, 2010 at 10:22 PM, James Golick  wrote:
>>> 4 nodes, RF=2, 1 node down.
>>> How can I get an UnavailableException in that scenario?
>>> - J.
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: UnavailableException with 1 node down and RF=2?

2010-06-30 Thread James Golick

Oops. I meant to say that I'm reading with CL.ONE. 

J.

Sent from my iPhone.

On 2010-07-01, at 1:39 AM, Benjamin Black  wrote:

> .QUORUM or .ALL (they are the same with RF=2).
> 
> On Wed, Jun 30, 2010 at 10:22 PM, James Golick  wrote:
>> 4 nodes, RF=2, 1 node down.
>> How can I get an UnavailableException in that scenario?
>> - J.

Re: UnavailableException with 1 node down and RF=2?

2010-06-30 Thread Benjamin Black

.QUORUM or .ALL (they are the same with RF=2).

On Wed, Jun 30, 2010 at 10:22 PM, James Golick  wrote:
> 4 nodes, RF=2, 1 node down.
> How can I get an UnavailableException in that scenario?
> - J.

UnavailableException with 1 node down and RF=2?

2010-06-30 Thread James Golick

4 nodes, RF=2, 1 node down.

How can I get an UnavailableException in that scenario?

- J.
