Error returned that means unknown

2019-04-26 Thread Long Quanzheng
Hi

We found that Cassandra can return a timeout error even though the actual
operation succeeded.
https://github.com/gocql/gocql/blob/master/conn.go#L1441

Is there a way to know all errors of this kind?

Here is the background on why we need this:
We are using two-phase commit:
1) append data to tableA
2) execute an LWT on tableB
3) if 2) fails, we need to clean up the data written into tableA.
Otherwise we lose data.

Previously we always cleaned up in 3) if we got any error from 2), which is
wrong for ErrTimeoutNoResponse. That error simply means the result is
unknown, so we should whitelist it. But we don't know which other errors
should also be whitelisted.

*Alternative:*
What we can also do, instead of whitelisting, is blacklisting: we only
clean up in 3) for certain errors where we know that 2) did not succeed. But
from the gocql code, I don't know how to find all of them. If we don't
clean up correctly in those cases, we may leak data that will never be
deleted.
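To make the blacklist idea concrete, here is a minimal sketch (in Python rather than Go, with stand-in exception classes instead of real driver types, so treat the classification itself as an assumption to verify against your driver's documentation):

```python
# Stand-in exception classes mirroring common Cassandra driver errors.
# In gocql the real value would be e.g. ErrTimeoutNoResponse; verify the
# actual classification against your driver before relying on it.

class Unavailable(Exception):
    """Coordinator rejected the request up front: nothing was written."""

class InvalidRequest(Exception):
    """Statement was rejected before execution: nothing was written."""

class WriteTimeout(Exception):
    """Replicas did not respond in time: the outcome is UNKNOWN."""

def lwt_definitely_failed(exc: Exception) -> bool:
    """Return True only for errors that guarantee step 2) was not applied."""
    # Only clean up tableA for these errors; a timeout means the LWT
    # may still have succeeded, so cleaning up could lose data.
    return isinstance(exc, (Unavailable, InvalidRequest))
```

With a blacklist, an unrecognized error defaults to "do not clean up", which at worst leaks data rather than losing it.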

Thanks


Re: Backup Restore

2019-04-26 Thread Alain RODRIGUEZ
Hello Ivan,

> Is there a way I can do one command to backup and one to restore a backup?



Handling backups and restores automatically is not an easy task to work on.
It's not straightforward. But it's doable, and some tools (with
both open source and commercial licences) do this process (or part of it)
for you.

I wrote a post last year aiming to present the existing 'ways' to do
backups; I then evaluated and compared them. Even though it's getting old, you
might find it interesting:
http://thelastpickle.com/blog/2018/04/03/cassandra-backup-and-restore-aws-ebs.html

> Or the only way is to create snapshots of all the tables and then restore
> one by one?


You can also 'just' copy the data over (a process described in the post above),
but using snapshots reduces the chances of inconsistencies, especially when
they are run at the same time on all nodes.
Also, for restores, nothing obliges you to act node by node. A restore
often means the service is off. Restoring all nodes at once is
possible and a good thing to do imho.

Hope that helps!

C*heers,
---
Alain Rodriguez - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

On Fri, Apr 26, 2019 at 05:49, Naman Gupta wrote:

> You should take a look at the sstableloader Cassandra utility,
>
> https://docs.datastax.com/en/cassandra/3.0/cassandra/tools/toolsBulkloader.html
>
> On Fri, Apr 26, 2019 at 1:33 AM Ivan Junckes Filho 
> wrote:
>
>> Hi guys,
>>
>> I am trying to do a backup and restore script in a simple way. Is there a way
>> I can do one command to backup and one to restore a backup?
>>
>> Or the only way is to create snapshots of all the tables and then restore
>> one by one?
>>
>


Re: Re: Re: how to configure the Token Allocation Algorithm

2019-04-26 Thread Jean Carlo
Creating a fresh new cluster in AWS using this procedure, I got this
problem while bootstrapping the second rack of a cluster of 6
machines with 3 racks and a keyspace with RF=3:

WARN  [main] 2019-04-26 11:37:43,845 TokenAllocation.java:63 - Selected
tokens [-5106267594614944625, 623001446449719390, 7048665031315327212,
3265006217757525070, 5054577454645148534, 314677103601736696,
7660890915606146375, -5329427405842523680]
ERROR [main] 2019-04-26 11:37:43,860 CassandraDaemon.java:749 - Fatal
configuration error
org.apache.cassandra.exceptions.ConfigurationException: Token allocation
failed: the number of racks 2 in datacenter eu-west-3 is lower than its
replication factor 3.

Has anyone else run into this problem?

I am not quite sure why I get this, since my cluster has 3 racks.
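For what it's worth, the precondition behind that exception can be paraphrased roughly as below (an illustrative sketch, not Cassandra's actual code; the key point is that the allocator only counts racks that already contain nodes, which is why a 3-rack cluster can trip the check while it is being bootstrapped rack by rack):

```python
def check_rack_count(racks_with_nodes: int, replication_factor: int) -> None:
    # Illustrative paraphrase of the token-allocation precondition.
    # A single rack is a special case (everything in one rack); with
    # several racks, there must be at least RF of them so that one
    # replica can be placed per rack.
    if 1 < racks_with_nodes < replication_factor:
        raise ValueError(
            "Token allocation failed: the number of racks %d is lower "
            "than its replication factor %d."
            % (racks_with_nodes, replication_factor))
```

Under this reading, bootstrapping the third rack's nodes first (or one node per rack before filling racks up) would avoid the error.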

Cluster Information:
Name: test
Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
DynamicEndPointSnitch: enabled
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
3bf63440-fad7-3371-9c14-4855ad11ee83: [192.0.0.1, 192.0.0.2]



Jean Carlo

"The best way to predict the future is to invent it" Alan Kay


On Thu, Jan 24, 2019 at 10:32 AM Ahmed Eljami 
wrote:

> Hi folks,
>
> What about adding new keyspaces to the existing cluster, e.g. test_2 with the
> same RF?
>
> Will it use the same logic as the existing keyspace test? Or should I
> restart the nodes and add the new keyspace to the cassandra.yaml ?
>
> Thanks.
>
> On Tue, Oct 2, 2018 at 10:28, Varun Barala wrote:
>
>> Hi,
>>
>> Managing `initial_token` yourself will give you more control over
>> scale-in and scale-out.
>> Let's say you have a three-node cluster with `num_tokens: 1`
>>
>> And your initial range looks like:-
>>
>> Datacenter: datacenter1
>> ==
>> AddressRackStatus State   LoadOwns
>>  Token
>>
>>3074457345618258602
>> 127.0.0.1  rack1   Up Normal  98.96 KiB   66.67%
>>  -9223372036854775808
>> 127.0.0.2  rack1   Up Normal  98.96 KiB   66.67%
>>  -3074457345618258603
>> 127.0.0.3  rack1   Up Normal  98.96 KiB   66.67%
>>  3074457345618258602
>>
>> Now let's say you want to scale out the cluster to twice the current
>> throughput (meaning you are adding 3 more nodes).
>>
>> If you are using AWS EBS volumes, then you can reuse the same volumes and
>> spin up three more nodes by selecting the midpoints of the existing ranges,
>> which means your new nodes already have data.
>> Once you have mounted the volumes on your new nodes:-
>> * You need to delete every system table except the schema-related tables.
>> * You need to generate the system/local table yourself, with
>> `Bootstrap state` set to completed and the schema-version the same as on
>> the other existing nodes.
>> * You need to remove the extra data on all machines using cleanup commands.
>>
>> This is how you can scale out a Cassandra cluster in minutes. In case
>> you want to add nodes one by one, you need to write a small tool
>> which will always figure out the biggest range in the existing cluster and
>> split it in half.
>>
>> I have never tested this thoroughly, but it should work conceptually.
>> Here we are taking advantage of the fact that we have the volumes (data)
>> for the new nodes beforehand, so we don't need to bootstrap them.
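The "small tool" mentioned above could look roughly like this (a sketch assuming Murmur3 tokens and `num_tokens: 1` per node; `next_token` is a hypothetical helper, not an existing utility):

```python
RING = 2**64  # size of the Murmur3 token space: [-2**63, 2**63 - 1]

def next_token(tokens):
    """Return the midpoint of the widest range in the ring.

    Ranges wrap around, so the distance from the highest token back to
    the lowest one is also considered.
    """
    ts = sorted(tokens)
    best_mid, best_width = None, -1
    for i, start in enumerate(ts):
        end = ts[(i + 1) % len(ts)]
        width = (end - start) % RING  # wrap-around distance
        if width > best_width:
            # Midpoint, folded back into [-2**63, 2**63 - 1].
            mid = (start + width // 2 + 2**63) % RING - 2**63
            best_mid, best_width = mid, width
    return best_mid
```

Adding nodes one at a time with `initial_token: next_token(current_tokens)` always splits the biggest remaining range in half.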
>>
>> Thanks & Regards,
>> Varun Barala
>>
>> On Tue, Oct 2, 2018 at 2:31 PM onmstester onmstester 
>> wrote:
>>
>>>
>>>
>>>
>>>
>>> On Mon, 01 Oct 2018 18:36:03 +0330, *Alain RODRIGUEZ* wrote:
>>>
>>> Hello again :),
>>>
>>> I thought a little bit more about this question, and I was actually
>>> wondering if something like this would work:
>>>
>>> Imagine a 3-node cluster, and create the nodes using:
>>> For the 3 nodes: `num_tokens: 4`
>>> Node 1: `initial_token: -9223372036854775808, -4611686018427387905, -2,
>>> 4611686018427387901`
>>> Node 2: `initial_token: -7686143364045646507, -3074457345618258604,
>>> 1537228672809129299, 6148914691236517202`
>>> Node 3: `initial_token: -6148914691236517206, -1537228672809129303,
>>> 3074457345618258600, 7686143364045646503`
>>>
>>>  If you know the initial size of your cluster, you can calculate the
>>> total number of tokens: number of nodes * vnodes, and use the
>>> formula/Python code below to get the tokens. Then use the first token for
>>> the first node, move to the second node, use the second token, and repeat.
>>> In my case there is a total of 12 tokens (3 nodes, 4 tokens each):
>>> ```
>>> >>> number_of_tokens = 12
>>> >>> [str(((2**64 // number_of_tokens) * i) - 2**63) for i in
>>> range(number_of_tokens)]
>>> ['-9223372036854775808', '-7686143364045646507', '-6148914691236517206',
>>> '-4611686018427387905', '-3074457345618258604', '-1537228672809129303',
>>> '-2', '1537228672809129299', '3074457345618258600', '4611686018427387901',
>>> '6148914691236517202', '7686143364045646503']
>>> ```
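The "deal one token per node in turn" assignment described above can be sketched as follows (illustrative; `initial_tokens` is a hypothetical helper, not part of Cassandra):

```python
def initial_tokens(num_nodes, vnodes):
    """Deal evenly spaced tokens round-robin across nodes.

    Node k receives tokens k, k + num_nodes, k + 2 * num_nodes, ...
    """
    total = num_nodes * vnodes
    tokens = [((2**64 // total) * i) - 2**63 for i in range(total)]
    return {node: tokens[node::num_nodes] for node in range(num_nodes)}
```

For 3 nodes with 4 vnodes each, this reproduces the three `initial_token` lists shown above.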
>>>
>>>
>>> Using manual initial_token 

Re: A cluster (RF=3) not recovering after two nodes are stopped

2019-04-26 Thread Hiroyuki Yamada
Hello,

Thank you for some feedbacks.

>Ben
Thank you.
I've tested with lower concurrency on my side, and the issue still occurs.
We are using 3 x T3.xlarge instances for C* and a small, separate instance
for the client program.
But when we tried 1 host running 3 C* nodes, the issue didn't occur.

> Alok
We also thought so and tested with hints disabled, but it doesn't make any
difference (the issue still occurs).

Thanks,
Hiro




On Fri, Apr 26, 2019 at 8:19 AM Alok Dwivedi 
wrote:

> Could it be related to hinted handoffs being stored on Node1 and then
> replayed on Node2 when it comes back, causing more load while
> new mutations are also being applied from cassandra-stress at the same time?
>
> Alok Dwivedi
> Senior Consultant
> https://www.instaclustr.com/
>
>
>
>
> On 26 Apr 2019, at 09:04, Ben Slater  wrote:
>
> In the absence of anyone else having any bright ideas - it still sounds to
> me like the kind of scenario that can occur in a heavily overloaded
> cluster. I would try again with a lower load.
>
> What size machines are you using for stress client and the nodes? Are they
> all on separate machines?
>
> Cheers
> Ben
>
> ---
>
>
> *Ben Slater*
> *Chief Product Officer*
>
> Read our latest technical blog posts here.
>
> This email has been sent on behalf of Instaclustr Pty. Limited (Australia)
> and Instaclustr Inc (USA).
>
> This email and any attachments may contain confidential and legally
> privileged information.  If you are not the intended recipient, do not copy
> or disclose its content, but please reply to this email immediately and
> highlight the error to the sender and then immediately delete the message.
>
>
> On Thu, 25 Apr 2019 at 17:26, Hiroyuki Yamada  wrote:
>
>> Hello,
>>
>> Sorry again.
>> We found yet another weird thing in this.
>> If we stop nodes with systemctl or just kill (TERM), it causes the
>> problem,
>> but if we kill -9, it doesn't cause the problem.
>>
>> Thanks,
>> Hiro
>>
>> On Wed, Apr 24, 2019 at 11:31 PM Hiroyuki Yamada 
>> wrote:
>>
>>> Sorry, I didn't write the version and the configurations.
>>> I've tested with C* 3.11.4, and
>>> the configurations are mostly set to default except for the replication
>>> factor and listen_address for proper networking.
>>>
>>> Thanks,
>>> Hiro
>>>
>>> On Wed, Apr 24, 2019 at 5:12 PM Hiroyuki Yamada 
>>> wrote:
>>>
 Hello Ben,

 Thank you for the quick reply.
 I haven't tried that case, but it doesn't recover even after I stopped the
 stress.

 Thanks,
 Hiro

 On Wed, Apr 24, 2019 at 3:36 PM Ben Slater 
 wrote:

> Is it possible that stress is overloading node 1 so it’s not
> recovering state properly when node 2 comes up? Have you tried running 
> with
> a lower load (say 2 or 3 threads)?
>
> Cheers
> Ben
>
> ---
>
> *Ben Slater*
> *Chief Product Officer*
>
>
> On Wed, 24 Apr 2019 at 16:28, Hiroyuki Yamada 
> wrote:
>
>> Hello,
>>
>> I faced a weird issue when recovering a cluster after two nodes were
>> stopped.
>> It is easily reproducible and looks like a bug or an issue to fix,
>> so let me write down the steps to reproduce.
>>
>> === STEPS TO REPRODUCE ===
>> * Create a 3-node cluster with RF=3
>>- node1(seed), node2, node3
>> * Start requests to the cluster with cassandra-stress (it continues
>> until the end)
>>- what we did: cassandra-stress mixed cl=QUORUM duration=10m
>> -errors ignore -node node1,node2,node3 -rate threads\>=16
>> threads\<=256
>> * Stop node3 normally (with systemctl stop)
>>- the system is still available because the quorum of nodes is
>> still available
>> * Stop node2 normally (with systemctl stop)
>>- the system is NOT available after it's stopped.
>>- the client gets `UnavailableException: Not enough replicas
>> available for query at consistency QUORUM`
>>- the client gets errors right away (within a few milliseconds)
>>- so far it's all expected
>> * Wait for 1