Re: Reads not returning data after adding node

2023-04-04 Thread David Tinker
The DataStax doc says to run cleanup one node at a time after bootstrapping
has completed. The myadventuresincoding post says to run a repair on each
node first. Is it necessary to run the repairs first? Thanks.
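
In case it helps anyone following along, the sequence I have in mind, one
node at a time, is something like this (untested sketch; the keyspace
argument to both commands is optional):

# on each existing node in turn, after the new node has finished joining:
nodetool repair -pr     # per the blog post, repair primary ranges first
nodetool cleanup        # then drop data this node no longer owns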

On Tue, Apr 4, 2023 at 1:11 PM Bowen Song via user <
user@cassandra.apache.org> wrote:

> Perhaps have a read here?
> https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/operations/opsAddNodeToCluster.html
>
>
> On 04/04/2023 06:41, David Tinker wrote:
>
> Ok. Have to psych myself up to the add node task a bit. Didn't go well the
> first time round!
>
> Tasks
> - Make sure the new node is not in seeds list!
> - Check cluster name, listen address, rpc address
> - Give it its own rack in cassandra-rackdc.properties
> - Delete cassandra-topology.properties if it exists
> - Make sure no compactions are on the go
> - rm -rf /var/lib/cassandra/*
> - rm /data/cassandra/commitlog/* (this is on different disk)
> - systemctl start cassandra
>
> And it should start streaming data from the other nodes and join the
> cluster. Anything else I have to watch out for? Tx.
>
>
> On Tue, Apr 4, 2023 at 5:25 AM Jeff Jirsa  wrote:
>
>> Because executing “removenode” streamed extra data from live nodes to the
>> “gaining” replica
>>
>> Oversimplified (if you had one token per node)
>>
>> If you  start with A B C
>>
>> Then add D
>>
>> D should bootstrap a range from each of A B and C, but at the end, some
>> of the data that was A B C becomes B C D
>>
>> When you removenode, you tell B and C to send data back to A.
>>
>> A B and C will eventually compact that data away. Eventually.
>>
>> If you get around to adding D again, running “cleanup” when you’re done
>> (successfully) will remove a lot of it.
>>
>>
>>
>> On Apr 3, 2023, at 8:14 PM, David Tinker  wrote:
>>
>> 
>> Looks like the remove has sorted things out. Thanks.
>>
>> One thing I am wondering about is why the nodes are carrying a lot more
>> data? The loads were about 2.7T before, now 3.4T.
>>
>> # nodetool status
>> Datacenter: dc1
>> ===
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address  Load  Tokens  Owns (effective)  Host ID
>>   Rack
>> UN  xxx.xxx.xxx.105  3.4 TiB   256 100.0%
>>  afd02287-3f88-4c6f-8b27-06f7a8192402  rack3
>> UN  xxx.xxx.xxx.253  3.34 TiB  256 100.0%
>>  e1af72be-e5df-4c6b-a124-c7bc48c6602a  rack2
>> UN  xxx.xxx.xxx.107  3.44 TiB  256 100.0%
>>  ab72f017-be96-41d2-9bef-a551dec2c7b5  rack1
>>
>> On Mon, Apr 3, 2023 at 5:42 PM Bowen Song via user <
>> user@cassandra.apache.org> wrote:
>>
>>> That's correct. nodetool removenode is strongly preferred when your
>>> node is already down. If the node is still functional, use nodetool
>>> decommission on the node instead.
>>> On 03/04/2023 16:32, Jeff Jirsa wrote:
>>>
>>> FWIW, `nodetool decommission` is strongly preferred. `nodetool
>>> removenode` is designed to be run when a host is offline. Only decommission
>>> is guaranteed to maintain consistency / correctness, and removenode
>>> probably streams a lot more data around than decommission.
>>>
>>>
>>> On Mon, Apr 3, 2023 at 6:47 AM Bowen Song via user <
>>> user@cassandra.apache.org> wrote:
>>>
>>>> Using nodetool removenode is strongly preferred in most circumstances;
>>>> only resort to assassinate if you do not care about data consistency or
>>>> you know there won't be any consistency issues (e.g. no new writes and
>>>> nodetool cleanup has not been run).
>>>>
>>>> Since the size of data on the new node is small, nodetool removenode
>>>> should finish fairly quickly and bring your cluster back.
>>>>
>>>> Next time you are doing something like this, please test it out in a
>>>> non-production environment and make sure everything works as expected
>>>> before moving on to production.
>>>>
>>>>
>>>> On 03/04/2023 06:28, David Tinker wrote:
>>>>
>>>> Should I use assassinate or removenode? Given that there is some data
>>>> on the node. Or will that be found on the other nodes? Sorry for all the
>>>> questions but I really don't want to mess up.
>>>>
>>>> On Mon, Apr 3, 2023 at 7:21 AM Carlos Diaz  wrote:
>>>>
>>>>> That's what nodetool assassinate will do.
>>>>>
>>>>

Re: Reads not returning data after adding node

2023-04-04 Thread David Tinker
Thanks. I also found this useful:
https://myadventuresincoding.wordpress.com/2020/08/03/cassandra-how-to-add-a-new-node-to-an-existing-cluster/

The node seems to be joining fine and is streaming in lots of data. Cluster
is still operating normally.
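
For the record, I'm watching the join with this rough sketch (run on/against
the new node):

nodetool netstats | grep -i receiving   # streaming progress
nodetool status | grep UJ               # node should show UJ (Up/Joining), then UN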



On Tue, Apr 4, 2023 at 1:11 PM Bowen Song via user <
user@cassandra.apache.org> wrote:

> Perhaps have a read here?
> https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/operations/opsAddNodeToCluster.html
>
>
> On 04/04/2023 06:41, David Tinker wrote:
>
> Ok. Have to psych myself up to the add node task a bit. Didn't go well the
> first time round!
>
> Tasks
> - Make sure the new node is not in seeds list!
> - Check cluster name, listen address, rpc address
> - Give it its own rack in cassandra-rackdc.properties
> - Delete cassandra-topology.properties if it exists
> - Make sure no compactions are on the go
> - rm -rf /var/lib/cassandra/*
> - rm /data/cassandra/commitlog/* (this is on different disk)
> - systemctl start cassandra
>
> And it should start streaming data from the other nodes and join the
> cluster. Anything else I have to watch out for? Tx.
>
>
> On Tue, Apr 4, 2023 at 5:25 AM Jeff Jirsa  wrote:
>
>> Because executing “removenode” streamed extra data from live nodes to the
>> “gaining” replica
>>
>> Oversimplified (if you had one token per node)
>>
>> If you  start with A B C
>>
>> Then add D
>>
>> D should bootstrap a range from each of A B and C, but at the end, some
>> of the data that was A B C becomes B C D
>>
>> When you removenode, you tell B and C to send data back to A.
>>
>> A B and C will eventually compact that data away. Eventually.
>>
>> If you get around to adding D again, running “cleanup” when you’re done
>> (successfully) will remove a lot of it.
>>
>>
>>
>> On Apr 3, 2023, at 8:14 PM, David Tinker  wrote:
>>
>> 
>> Looks like the remove has sorted things out. Thanks.
>>
>> One thing I am wondering about is why the nodes are carrying a lot more
>> data? The loads were about 2.7T before, now 3.4T.
>>
>> # nodetool status
>> Datacenter: dc1
>> ===
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address  Load  Tokens  Owns (effective)  Host ID
>>   Rack
>> UN  xxx.xxx.xxx.105  3.4 TiB   256 100.0%
>>  afd02287-3f88-4c6f-8b27-06f7a8192402  rack3
>> UN  xxx.xxx.xxx.253  3.34 TiB  256 100.0%
>>  e1af72be-e5df-4c6b-a124-c7bc48c6602a  rack2
>> UN  xxx.xxx.xxx.107  3.44 TiB  256 100.0%
>>  ab72f017-be96-41d2-9bef-a551dec2c7b5  rack1
>>
>> On Mon, Apr 3, 2023 at 5:42 PM Bowen Song via user <
>> user@cassandra.apache.org> wrote:
>>
>>> That's correct. nodetool removenode is strongly preferred when your
>>> node is already down. If the node is still functional, use nodetool
>>> decommission on the node instead.
>>> On 03/04/2023 16:32, Jeff Jirsa wrote:
>>>
>>> FWIW, `nodetool decommission` is strongly preferred. `nodetool
>>> removenode` is designed to be run when a host is offline. Only decommission
>>> is guaranteed to maintain consistency / correctness, and removenode
>>> probably streams a lot more data around than decommission.
>>>
>>>
>>> On Mon, Apr 3, 2023 at 6:47 AM Bowen Song via user <
>>> user@cassandra.apache.org> wrote:
>>>
>>>> Using nodetool removenode is strongly preferred in most circumstances;
>>>> only resort to assassinate if you do not care about data consistency or
>>>> you know there won't be any consistency issues (e.g. no new writes and
>>>> nodetool cleanup has not been run).
>>>>
>>>> Since the size of data on the new node is small, nodetool removenode
>>>> should finish fairly quickly and bring your cluster back.
>>>>
>>>> Next time you are doing something like this, please test it out in a
>>>> non-production environment and make sure everything works as expected
>>>> before moving on to production.
>>>>
>>>>
>>>> On 03/04/2023 06:28, David Tinker wrote:
>>>>
>>>> Should I use assassinate or removenode? Given that there is some data
>>>> on the node. Or will that be found on the other nodes? Sorry for all the
>>>> questions but I really don't want to mess up.
>>>>
>>>> On Mon, Apr 3, 2023 at 7:21 AM Carlos Diaz  wrote:
>>>>
>>>>> That's what nodetool assassinate will

Re: Reads not returning data after adding node

2023-04-03 Thread David Tinker
Ok. Have to psych myself up to the add node task a bit. Didn't go well the
first time round!

Tasks
- Make sure the new node is not in seeds list!
- Check cluster name, listen address, rpc address
- Give it its own rack in cassandra-rackdc.properties
- Delete cassandra-topology.properties if it exists
- Make sure no compactions are on the go
- rm -rf /var/lib/cassandra/*
- rm /data/cassandra/commitlog/* (this is on different disk)
- systemctl start cassandra

And it should start streaming data from the other nodes and join the
cluster. Anything else I have to watch out for? Tx.
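
Config sanity checks I'll run before starting it (sketch; paths assume a
package install):

grep -E 'cluster_name|listen_address|rpc_address|seeds' /etc/cassandra/cassandra.yaml
cat /etc/cassandra/cassandra-rackdc.properties   # expect dc=dc1 and this node's own rack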


On Tue, Apr 4, 2023 at 5:25 AM Jeff Jirsa  wrote:

> Because executing “removenode” streamed extra data from live nodes to the
> “gaining” replica
>
> Oversimplified (if you had one token per node)
>
> If you  start with A B C
>
> Then add D
>
> D should bootstrap a range from each of A B and C, but at the end, some of
> the data that was A B C becomes B C D
>
> When you removenode, you tell B and C to send data back to A.
>
> A B and C will eventually compact that data away. Eventually.
>
> If you get around to adding D again, running “cleanup” when you’re done
> (successfully) will remove a lot of it.
>
>
>
> On Apr 3, 2023, at 8:14 PM, David Tinker  wrote:
>
> 
> Looks like the remove has sorted things out. Thanks.
>
> One thing I am wondering about is why the nodes are carrying a lot more
> data? The loads were about 2.7T before, now 3.4T.
>
> # nodetool status
> Datacenter: dc1
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address  Load  Tokens  Owns (effective)  Host ID
> Rack
> UN  xxx.xxx.xxx.105  3.4 TiB   256 100.0%
>  afd02287-3f88-4c6f-8b27-06f7a8192402  rack3
> UN  xxx.xxx.xxx.253  3.34 TiB  256 100.0%
>  e1af72be-e5df-4c6b-a124-c7bc48c6602a  rack2
> UN  xxx.xxx.xxx.107  3.44 TiB  256 100.0%
>  ab72f017-be96-41d2-9bef-a551dec2c7b5  rack1
>
> On Mon, Apr 3, 2023 at 5:42 PM Bowen Song via user <
> user@cassandra.apache.org> wrote:
>
>> That's correct. nodetool removenode is strongly preferred when your node
>> is already down. If the node is still functional, use nodetool
>> decommission on the node instead.
>> On 03/04/2023 16:32, Jeff Jirsa wrote:
>>
>> FWIW, `nodetool decommission` is strongly preferred. `nodetool
>> removenode` is designed to be run when a host is offline. Only decommission
>> is guaranteed to maintain consistency / correctness, and removenode
>> probably streams a lot more data around than decommission.
>>
>>
>> On Mon, Apr 3, 2023 at 6:47 AM Bowen Song via user <
>> user@cassandra.apache.org> wrote:
>>
>>> Using nodetool removenode is strongly preferred in most circumstances;
>>> only resort to assassinate if you do not care about data consistency or
>>> you know there won't be any consistency issues (e.g. no new writes and
>>> nodetool cleanup has not been run).
>>>
>>> Since the size of data on the new node is small, nodetool removenode
>>> should finish fairly quickly and bring your cluster back.
>>>
>>> Next time you are doing something like this, please test it out in a
>>> non-production environment and make sure everything works as expected
>>> before moving on to production.
>>>
>>>
>>> On 03/04/2023 06:28, David Tinker wrote:
>>>
>>> Should I use assassinate or removenode? Given that there is some data on
>>> the node. Or will that be found on the other nodes? Sorry for all the
>>> questions but I really don't want to mess up.
>>>
>>> On Mon, Apr 3, 2023 at 7:21 AM Carlos Diaz  wrote:
>>>
>>>> That's what nodetool assassinate will do.
>>>>
>>>> On Sun, Apr 2, 2023 at 10:19 PM David Tinker 
>>>> wrote:
>>>>
>>>>> Is it possible for me to remove the node from the cluster i.e. to undo
>>>>> this mess and get the cluster operating again?
>>>>>
>>>>> On Mon, Apr 3, 2023 at 7:13 AM Carlos Diaz 
>>>>> wrote:
>>>>>
>>>>>> You can leave it in the seed list of the other nodes, just make sure
>>>>>> it's not included in this node's seed list.  However, if you do decide to
>>>>>> fix the issue with the racks, first assassinate this node (nodetool
>>>>>> assassinate ), and update the rack name before you restart.
>>>>>>
>>>>>> On Sun, Apr 2, 2023 at 10:06 PM David Tinker 
>>>>>> wrote:
>>>>>>
>>>

Re: Reads not returning data after adding node

2023-04-03 Thread David Tinker
Looks like the remove has sorted things out. Thanks.

One thing I am wondering about is why the nodes are carrying a lot more
data? The loads were about 2.7T before, now 3.4T.

# nodetool status
Datacenter: dc1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load  Tokens  Owns (effective)  Host ID
Rack
UN  xxx.xxx.xxx.105  3.4 TiB   256 100.0%
 afd02287-3f88-4c6f-8b27-06f7a8192402  rack3
UN  xxx.xxx.xxx.253  3.34 TiB  256 100.0%
 e1af72be-e5df-4c6b-a124-c7bc48c6602a  rack2
UN  xxx.xxx.xxx.107  3.44 TiB  256 100.0%
 ab72f017-be96-41d2-9bef-a551dec2c7b5  rack1
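
In case it's relevant, this is how I'm checking where the extra space sits
(sketch):

du -sh /var/lib/cassandra/data            # actual on-disk usage
nodetool tablestats | grep 'Space used'   # live vs total, per table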

On Mon, Apr 3, 2023 at 5:42 PM Bowen Song via user <
user@cassandra.apache.org> wrote:

> That's correct. nodetool removenode is strongly preferred when your node
> is already down. If the node is still functional, use nodetool
> decommission on the node instead.
> On 03/04/2023 16:32, Jeff Jirsa wrote:
>
> FWIW, `nodetool decommission` is strongly preferred. `nodetool removenode`
> is designed to be run when a host is offline. Only decommission is
> guaranteed to maintain consistency / correctness, and removenode probably
> streams a lot more data around than decommission.
>
>
> On Mon, Apr 3, 2023 at 6:47 AM Bowen Song via user <
> user@cassandra.apache.org> wrote:
>
>> Using nodetool removenode is strongly preferred in most circumstances;
>> only resort to assassinate if you do not care about data consistency or
>> you know there won't be any consistency issues (e.g. no new writes and
>> nodetool cleanup has not been run).
>>
>> Since the size of data on the new node is small, nodetool removenode
>> should finish fairly quickly and bring your cluster back.
>>
>> Next time you are doing something like this, please test it out in a
>> non-production environment and make sure everything works as expected
>> before moving on to production.
>>
>>
>> On 03/04/2023 06:28, David Tinker wrote:
>>
>> Should I use assassinate or removenode? Given that there is some data on
>> the node. Or will that be found on the other nodes? Sorry for all the
>> questions but I really don't want to mess up.
>>
>> On Mon, Apr 3, 2023 at 7:21 AM Carlos Diaz  wrote:
>>
>>> That's what nodetool assassinate will do.
>>>
>>> On Sun, Apr 2, 2023 at 10:19 PM David Tinker 
>>> wrote:
>>>
>>>> Is it possible for me to remove the node from the cluster i.e. to undo
>>>> this mess and get the cluster operating again?
>>>>
>>>> On Mon, Apr 3, 2023 at 7:13 AM Carlos Diaz  wrote:
>>>>
>>>>> You can leave it in the seed list of the other nodes, just make sure
>>>>> it's not included in this node's seed list.  However, if you do decide to
>>>>> fix the issue with the racks, first assassinate this node (nodetool
>>>>> assassinate ), and update the rack name before you restart.
>>>>>
>>>>> On Sun, Apr 2, 2023 at 10:06 PM David Tinker 
>>>>> wrote:
>>>>>
>>>>>> It is also in the seeds list for the other nodes. Should I remove it
>>>>>> from those, restart them one at a time, then restart it?
>>>>>>
>>>>>> /etc/cassandra # grep -i bootstrap *
>>>>>> doesn't show anything so I don't think I have auto_bootstrap false.
>>>>>>
>>>>>> Thanks very much for the help.
>>>>>>
>>>>>>
>>>>>> On Mon, Apr 3, 2023 at 7:01 AM Carlos Diaz 
>>>>>> wrote:
>>>>>>
>>>>>>> Just remove it from the seed list in the cassandra.yaml file and
>>>>>>> restart the node.  Make sure that auto_bootstrap is set to true first
>>>>>>> though.
>>>>>>>
>>>>>>> On Sun, Apr 2, 2023 at 9:59 PM David Tinker 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> So likely because I made it a seed node when I added it to the
>>>>>>>> cluster it didn't do the bootstrap process. How can I recover this?
>>>>>>>>
>>>>>>>> On Mon, Apr 3, 2023 at 6:41 AM David Tinker 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Yes replication factor is 3.
>>>>>>>>>
>>>>>>>>> I ran nodetool repair -pr on all the nodes (one at a time) and am
>>>>>>>>> still having issues getting data back

Re: Reads not returning data after adding node

2023-04-03 Thread David Tinker
Thanks. Yes my big screwup here was to make the new node a seed node so it
didn't get any data. I am going to add 3 more nodes, one at a time when the
cluster has finished with the remove and everything seems stable. Should I
do a full repair first or is the remove node operation basically doing that?

Re the racks. I just asked that question on this list and the answer was
that adding the new nodes as rack4, rack5 and rack6 is fine. They are all
on separate physical racks. Is that ok?
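
My rough per-node plan, pending answers to the above (sketch):

# 1. confirm the cluster is stable (no node Joining/Leaving):
nodetool status
# 2. start the new node with its own rack in cassandra-rackdc.properties
#    and wait for it to go UJ -> UN
# 3. once the last node has joined, one node at a time:
nodetool cleanup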

On Mon, Apr 3, 2023 at 5:16 PM Aaron Ploetz  wrote:

> The time it takes to stream data off of a node varies by network, cloud
> region, and other factors.  So it's not unheard of for it to take a bit to
> finish.
>
> Just thought I'd mention that auto_bootstrap is true by default.  So if
> you're not setting it, the node should bootstrap as long as it's not a seed
> node.
>
> As for the rack issue, yes, it's a good idea to keep your racks in
> multiples of your RF.  When performing token ownership calculations,
> Cassandra takes rack designation into consideration.  It tries to ensure
> that multiple replicas for a row are not placed in the same rack.  TBH -
> I'd build out two more nodes to have 6 nodes across 3 racks (2 in each),
> just to ensure even distribution.  Otherwise, you might notice that the
> nodes sharing a rack will consume disk at a different rate than the nodes
> which have their own rack.
>
> On Mon, Apr 3, 2023 at 8:57 AM David Tinker 
> wrote:
>
>> Thanks. Hmm, the remove has been busy for hours but seems to be
>> progressing.
>>
>> I have been running this on the nodes to monitor progress:
>> # nodetool netstats | grep Already
>> Receiving 92 files, 843934103369 bytes total. Already received 82
>> files (89.13%), 590204687299 bytes total (69.93%)
>> Sending 84 files, 860198753783 bytes total. Already sent 56 files
>> (66.67%), 307038785732 bytes total (35.69%)
>> Sending 78 files, 815573435637 bytes total. Already sent 56 files
>> (71.79%), 313079823738 bytes total (38.39%)
>>
>> The percentages are ticking up.
>>
>> # nodetool ring | head -20
>> Datacenter: dc1
>> ==
>> Address   RackStatus State   LoadOwns
>>Token
>>
>>9189523899826545641
>> xxx.xxx.xxx.24    rack4   Down   Leaving 26.62 GiB   79.95%
>>  -9194674091837769168
>> xxx.xxx.xxx.107   rack1   Up Normal  2.68 TiB73.25%
>>-9168781258594813088
>> xxx.xxx.xxx.253   rack2   Up Normal  2.63 TiB73.92%
>>-9163037340977721917
>> xxx.xxx.xxx.105   rack3   Up Normal  2.68 TiB72.88%
>>-9148860739730046229
>>
>>
>> On Mon, Apr 3, 2023 at 3:46 PM Bowen Song via user <
>> user@cassandra.apache.org> wrote:
>>
>>> Using nodetool removenode is strongly preferred in most circumstances;
>>> only resort to assassinate if you do not care about data consistency or
>>> you know there won't be any consistency issues (e.g. no new writes and
>>> nodetool cleanup has not been run).
>>>
>>> Since the size of data on the new node is small, nodetool removenode
>>> should finish fairly quickly and bring your cluster back.
>>>
>>> Next time you are doing something like this, please test it out in a
>>> non-production environment and make sure everything works as expected
>>> before moving on to production.
>>>
>>>
>>> On 03/04/2023 06:28, David Tinker wrote:
>>>
>>> Should I use assassinate or removenode? Given that there is some data on
>>> the node. Or will that be found on the other nodes? Sorry for all the
>>> questions but I really don't want to mess up.
>>>
>>> On Mon, Apr 3, 2023 at 7:21 AM Carlos Diaz  wrote:
>>>
>>>> That's what nodetool assassinate will do.
>>>>
>>>> On Sun, Apr 2, 2023 at 10:19 PM David Tinker 
>>>> wrote:
>>>>
>>>>> Is it possible for me to remove the node from the cluster i.e. to undo
>>>>> this mess and get the cluster operating again?
>>>>>
>>>>> On Mon, Apr 3, 2023 at 7:13 AM Carlos Diaz 
>>>>> wrote:
>>>>>
>>>>>> You can leave it in the seed list of the other nodes, just make sure
>>>>>> it's not included in this node's seed list.  However, if you do decide to
>>>>>> fix the issue with the racks, first assassinate this node (nodetool
>>>>>> assassinate ), 

Understanding rack in cassandra-rackdc.properties

2023-04-03 Thread David Tinker
I have a 3 node cluster using the GossipingPropertyFileSnitch and
replication factor of 3. All nodes are leased hardware and more or less the
same. The cassandra-rackdc.properties files look like this:

dc=dc1
rack=rack1
(rack2 and rack3 for the other nodes)

Now I need to expand the cluster. I was going to use rack4 for the next
node, then rack5 and rack6 because the nodes are physically all on
different racks. Elsewhere on this list someone mentioned that I should use
rack1, rack2 and rack3 again.

Why is that?

Thanks
David
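
For context, the keyspaces are RF 3 and, assuming NetworkTopologyStrategy
(keyspace name below is a placeholder), that strategy is the rack-aware
part: it tries to put each of the 3 replicas on a distinct rack. Roughly:

cqlsh -e "DESCRIBE KEYSPACE mykeyspace"
# replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3}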


Re: Reads not returning data after adding node

2023-04-03 Thread David Tinker
Thanks. Hmm, the remove has been busy for hours but seems to be progressing.

I have been running this on the nodes to monitor progress:
# nodetool netstats | grep Already
Receiving 92 files, 843934103369 bytes total. Already received 82
files (89.13%), 590204687299 bytes total (69.93%)
Sending 84 files, 860198753783 bytes total. Already sent 56 files
(66.67%), 307038785732 bytes total (35.69%)
Sending 78 files, 815573435637 bytes total. Already sent 56 files
(71.79%), 313079823738 bytes total (38.39%)

The percentages are ticking up.

# nodetool ring | head -20
Datacenter: dc1
==
Address   RackStatus State   LoadOwns
 Token

 9189523899826545641
xxx.xxx.xxx.24    rack4   Down   Leaving 26.62 GiB   79.95%
   -9194674091837769168
xxx.xxx.xxx.107   rack1   Up Normal  2.68 TiB73.25%
 -9168781258594813088
xxx.xxx.xxx.253   rack2   Up Normal  2.63 TiB73.92%
 -9163037340977721917
xxx.xxx.xxx.105   rack3   Up Normal  2.68 TiB72.88%
 -9148860739730046229
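
And a crude loop to poll it (sketch):

while true; do nodetool netstats | grep Already; sleep 60; done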


On Mon, Apr 3, 2023 at 3:46 PM Bowen Song via user <
user@cassandra.apache.org> wrote:

> Using nodetool removenode is strongly preferred in most circumstances;
> only resort to assassinate if you do not care about data consistency or
> you know there won't be any consistency issues (e.g. no new writes and
> nodetool cleanup has not been run).
>
> Since the size of data on the new node is small, nodetool removenode
> should finish fairly quickly and bring your cluster back.
>
> Next time you are doing something like this, please test it out in a
> non-production environment and make sure everything works as expected
> before moving on to production.
>
>
> On 03/04/2023 06:28, David Tinker wrote:
>
> Should I use assassinate or removenode? Given that there is some data on
> the node. Or will that be found on the other nodes? Sorry for all the
> questions but I really don't want to mess up.
>
> On Mon, Apr 3, 2023 at 7:21 AM Carlos Diaz  wrote:
>
>> That's what nodetool assassinate will do.
>>
>> On Sun, Apr 2, 2023 at 10:19 PM David Tinker 
>> wrote:
>>
>>> Is it possible for me to remove the node from the cluster i.e. to undo
>>> this mess and get the cluster operating again?
>>>
>>> On Mon, Apr 3, 2023 at 7:13 AM Carlos Diaz  wrote:
>>>
>>>> You can leave it in the seed list of the other nodes, just make sure
>>>> it's not included in this node's seed list.  However, if you do decide to
>>>> fix the issue with the racks, first assassinate this node (nodetool
>>>> assassinate ), and update the rack name before you restart.
>>>>
>>>> On Sun, Apr 2, 2023 at 10:06 PM David Tinker 
>>>> wrote:
>>>>
>>>>> It is also in the seeds list for the other nodes. Should I remove it
>>>>> from those, restart them one at a time, then restart it?
>>>>>
>>>>> /etc/cassandra # grep -i bootstrap *
>>>>> doesn't show anything so I don't think I have auto_bootstrap false.
>>>>>
>>>>> Thanks very much for the help.
>>>>>
>>>>>
>>>>> On Mon, Apr 3, 2023 at 7:01 AM Carlos Diaz 
>>>>> wrote:
>>>>>
>>>>>> Just remove it from the seed list in the cassandra.yaml file and
>>>>>> restart the node.  Make sure that auto_bootstrap is set to true first
>>>>>> though.
>>>>>>
>>>>>> On Sun, Apr 2, 2023 at 9:59 PM David Tinker 
>>>>>> wrote:
>>>>>>
>>>>>>> So likely because I made it a seed node when I added it to the
>>>>>>> cluster it didn't do the bootstrap process. How can I recover this?
>>>>>>>
>>>>>>> On Mon, Apr 3, 2023 at 6:41 AM David Tinker 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Yes replication factor is 3.
>>>>>>>>
>>>>>>>> I ran nodetool repair -pr on all the nodes (one at a time) and am
>>>>>>>> still having issues getting data back from queries.
>>>>>>>>
>>>>>>>> I did make the new node a seed node.
>>>>>>>>
>>>>>>>> Re "rack4": I assumed that was just an indication as to the
>>>>>>>> physical location of the server for redundancy. This one is separate 
>>>>>>>> from

Re: Reads not returning data after adding node

2023-04-02 Thread David Tinker
If I have messed up with the rack thing I would like to get this node out
of the cluster so the cluster is functioning as quickly as possible. Then
do some more research and try again. So I am looking for the safest way to
do that.
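
If removenode turns out to be the answer, I gather the commands would be
(sketch, using the new node's host ID from nodetool status earlier in the
thread):

nodetool removenode c4e8b4a0-f014-45e6-afb4-648aad4f8500
nodetool removenode status    # to track progress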

On Mon, Apr 3, 2023 at 7:27 AM Jeff Jirsa  wrote:

> Just run nodetool rebuild on the new node
>
> If you assassinate it now you violate consistency for your most recent
> writes
>
>
>
> On Apr 2, 2023, at 10:22 PM, Carlos Diaz  wrote:
>
> 
> That's what nodetool assassinate will do.
>
> On Sun, Apr 2, 2023 at 10:19 PM David Tinker 
> wrote:
>
>> Is it possible for me to remove the node from the cluster i.e. to undo
>> this mess and get the cluster operating again?
>>
>> On Mon, Apr 3, 2023 at 7:13 AM Carlos Diaz  wrote:
>>
>>> You can leave it in the seed list of the other nodes, just make sure
>>> it's not included in this node's seed list.  However, if you do decide to
>>> fix the issue with the racks, first assassinate this node (nodetool
>>> assassinate ), and update the rack name before you restart.
>>>
>>> On Sun, Apr 2, 2023 at 10:06 PM David Tinker 
>>> wrote:
>>>
>>>> It is also in the seeds list for the other nodes. Should I remove it
>>>> from those, restart them one at a time, then restart it?
>>>>
>>>> /etc/cassandra # grep -i bootstrap *
>>>> doesn't show anything so I don't think I have auto_bootstrap false.
>>>>
>>>> Thanks very much for the help.
>>>>
>>>>
>>>> On Mon, Apr 3, 2023 at 7:01 AM Carlos Diaz  wrote:
>>>>
>>>>> Just remove it from the seed list in the cassandra.yaml file and
>>>>> restart the node.  Make sure that auto_bootstrap is set to true first
>>>>> though.
>>>>>
>>>>> On Sun, Apr 2, 2023 at 9:59 PM David Tinker 
>>>>> wrote:
>>>>>
>>>>>> So likely because I made it a seed node when I added it to the
>>>>>> cluster it didn't do the bootstrap process. How can I recover this?
>>>>>>
>>>>>> On Mon, Apr 3, 2023 at 6:41 AM David Tinker 
>>>>>> wrote:
>>>>>>
>>>>>>> Yes replication factor is 3.
>>>>>>>
>>>>>>> I ran nodetool repair -pr on all the nodes (one at a time) and am
>>>>>>> still having issues getting data back from queries.
>>>>>>>
>>>>>>> I did make the new node a seed node.
>>>>>>>
>>>>>>> Re "rack4": I assumed that was just an indication as to the physical
>>>>>>> location of the server for redundancy. This one is separate from the 
>>>>>>> others
>>>>>>> so I used rack4.
>>>>>>>
>>>>>>> On Mon, Apr 3, 2023 at 6:30 AM Carlos Diaz 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I'm assuming that your replication factor is 3.  If that's the
>>>>>>>> case, did you intentionally put this node in rack 4?  Typically, you 
>>>>>>>> want
>>>>>>>> to add nodes in multiples of your replication factor in order to keep 
>>>>>>>> the
>>>>>>>> "racks" balanced.  In other words, this node should have been added to 
>>>>>>>> rack
>>>>>>>> 1, 2 or 3.
>>>>>>>>
>>>>>>>> Having said that, you should be able to easily fix your problem by
>>>>>>>> running a nodetool repair -pr on the new node.
>>>>>>>>
>>>>>>>> On Sun, Apr 2, 2023 at 8:16 PM David Tinker 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi All
>>>>>>>>>
>>>>>>>>> I recently added a node to my 3 node Cassandra 4.0.5 cluster and
>>>>>>>>> now many reads are not returning rows! What do I need to do to fix 
>>>>>>>>> this?
>>>>>>>>> There weren't any errors in the logs or other problems that I could 
>>>>>>>>> see. I
>>>>>>>>> expected the cluster to balance itself but this hasn't happened 
>>>>>>>>> (yet?). The
>>>>>>>>> nodes are similar so I have num_tokens=256 for each. I am using the
>

Re: Reads not returning data after adding node

2023-04-02 Thread David Tinker
Should I use assassinate or removenode? Given that there is some data on
the node. Or will that be found on the other nodes? Sorry for all the
questions but I really don't want to mess up.

On Mon, Apr 3, 2023 at 7:21 AM Carlos Diaz  wrote:

> That's what nodetool assassinate will do.
>
> On Sun, Apr 2, 2023 at 10:19 PM David Tinker 
> wrote:
>
>> Is it possible for me to remove the node from the cluster i.e. to undo
>> this mess and get the cluster operating again?
>>
>> On Mon, Apr 3, 2023 at 7:13 AM Carlos Diaz  wrote:
>>
>>> You can leave it in the seed list of the other nodes, just make sure
>>> it's not included in this node's seed list.  However, if you do decide to
>>> fix the issue with the racks, first assassinate this node (nodetool
>>> assassinate ), and update the rack name before you restart.
>>>
>>> On Sun, Apr 2, 2023 at 10:06 PM David Tinker 
>>> wrote:
>>>
>>>> It is also in the seeds list for the other nodes. Should I remove it
>>>> from those, restart them one at a time, then restart it?
>>>>
>>>> /etc/cassandra # grep -i bootstrap *
>>>> doesn't show anything so I don't think I have auto_bootstrap false.
>>>>
>>>> Thanks very much for the help.
>>>>
>>>>
>>>> On Mon, Apr 3, 2023 at 7:01 AM Carlos Diaz  wrote:
>>>>
>>>>> Just remove it from the seed list in the cassandra.yaml file and
>>>>> restart the node.  Make sure that auto_bootstrap is set to true first
>>>>> though.
>>>>>
>>>>> On Sun, Apr 2, 2023 at 9:59 PM David Tinker 
>>>>> wrote:
>>>>>
>>>>>> So likely because I made it a seed node when I added it to the
>>>>>> cluster it didn't do the bootstrap process. How can I recover this?
>>>>>>
>>>>>> On Mon, Apr 3, 2023 at 6:41 AM David Tinker 
>>>>>> wrote:
>>>>>>
>>>>>>> Yes replication factor is 3.
>>>>>>>
>>>>>>> I ran nodetool repair -pr on all the nodes (one at a time) and am
>>>>>>> still having issues getting data back from queries.
>>>>>>>
>>>>>>> I did make the new node a seed node.
>>>>>>>
>>>>>>> Re "rack4": I assumed that was just an indication as to the physical
>>>>>>> location of the server for redundancy. This one is separate from the 
>>>>>>> others
>>>>>>> so I used rack4.
>>>>>>>
>>>>>>> On Mon, Apr 3, 2023 at 6:30 AM Carlos Diaz 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I'm assuming that your replication factor is 3.  If that's the
>>>>>>>> case, did you intentionally put this node in rack 4?  Typically, you 
>>>>>>>> want
>>>>>>>> to add nodes in multiples of your replication factor in order to keep 
>>>>>>>> the
>>>>>>>> "racks" balanced.  In other words, this node should have been added to 
>>>>>>>> rack
>>>>>>>> 1, 2 or 3.
>>>>>>>>
>>>>>>>> Having said that, you should be able to easily fix your problem by
>>>>>>>> running a nodetool repair -pr on the new node.
>>>>>>>>
>>>>>>>> On Sun, Apr 2, 2023 at 8:16 PM David Tinker 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi All
>>>>>>>>>
>>>>>>>>> I recently added a node to my 3 node Cassandra 4.0.5 cluster and
>>>>>>>>> now many reads are not returning rows! What do I need to do to fix 
>>>>>>>>> this?
>>>>>>>>> There weren't any errors in the logs or other problems that I could 
>>>>>>>>> see. I
>>>>>>>>> expected the cluster to balance itself but this hasn't happened 
>>>>>>>>> (yet?). The
>>>>>>>>> nodes are similar so I have num_tokens=256 for each. I am using the
>>>>>>>>> Murmur3Partitioner.
>>>>>>>>>
>>>>>>>>> # nodetool status
>>>>>>>>> Datacenter: dc1
>>>>>>>>> ===

Re: Reads not returning data after adding node

2023-04-02 Thread David Tinker
Is it possible for me to remove the node from the cluster i.e. to undo this
mess and get the cluster operating again?

On Mon, Apr 3, 2023 at 7:13 AM Carlos Diaz  wrote:

> You can leave it in the seed list of the other nodes, just make sure it's
> not included in this node's seed list.  However, if you do decide to fix
> the issue with the racks, first assassinate this node (nodetool assassinate
> ), and update the rack name before you restart.
>
> On Sun, Apr 2, 2023 at 10:06 PM David Tinker 
> wrote:
>
>> It is also in the seeds list for the other nodes. Should I remove it from
>> those, restart them one at a time, then restart it?
>>
>> /etc/cassandra # grep -i bootstrap *
>> doesn't show anything so I don't think I have auto_bootstrap false.
>>
>> Thanks very much for the help.
>>
>>
>> On Mon, Apr 3, 2023 at 7:01 AM Carlos Diaz  wrote:
>>
>>> Just remove it from the seed list in the cassandra.yaml file and restart
>>> the node.  Make sure that auto_bootstrap is set to true first though.
>>>
>>> On Sun, Apr 2, 2023 at 9:59 PM David Tinker 
>>> wrote:
>>>
>>>> So likely because I made it a seed node when I added it to the cluster
>>>> it didn't do the bootstrap process. How can I recover this?
>>>>
>>>> On Mon, Apr 3, 2023 at 6:41 AM David Tinker 
>>>> wrote:
>>>>
>>>>> Yes replication factor is 3.
>>>>>
>>>>> I ran nodetool repair -pr on all the nodes (one at a time) and am
>>>>> still having issues getting data back from queries.
>>>>>
>>>>> I did make the new node a seed node.
>>>>>
>>>>> Re "rack4": I assumed that was just an indication as to the physical
>>>>> location of the server for redundancy. This one is separate from the 
>>>>> others
>>>>> so I used rack4.
>>>>>
>>>>> On Mon, Apr 3, 2023 at 6:30 AM Carlos Diaz 
>>>>> wrote:
>>>>>
>>>>>> I'm assuming that your replication factor is 3.  If that's the case,
>>>>>> did you intentionally put this node in rack 4?  Typically, you want to 
>>>>>> add
>>>>>> nodes in multiples of your replication factor in order to keep the 
>>>>>> "racks"
>>>>>> balanced.  In other words, this node should have been added to rack 1, 2 
>>>>>> or
>>>>>> 3.
>>>>>>
>>>>>> Having said that, you should be able to easily fix your problem by
>>>>>> running a nodetool repair -pr on the new node.
>>>>>>
>>>>>> On Sun, Apr 2, 2023 at 8:16 PM David Tinker 
>>>>>> wrote:
>>>>>>
>>>>>>> Hi All
>>>>>>>
>>>>>>> I recently added a node to my 3 node Cassandra 4.0.5 cluster and now
>>>>>>> many reads are not returning rows! What do I need to do to fix this? 
>>>>>>> There
>>>>>>> weren't any errors in the logs or other problems that I could see. I
>>>>>>> expected the cluster to balance itself but this hasn't happened (yet?). 
>>>>>>> The
>>>>>>> nodes are similar so I have num_tokens=256 for each. I am using the
>>>>>>> Murmur3Partitioner.
>>>>>>>
>>>>>>> # nodetool status
>>>>>>> Datacenter: dc1
>>>>>>> ===
>>>>>>> Status=Up/Down
>>>>>>> |/ State=Normal/Leaving/Joining/Moving
>>>>>>> --  Address  Load   Tokens  Owns (effective)  Host ID
>>>>>>> Rack
>>>>>>> UN  xxx.xxx.xxx.105  2.65 TiB   256 72.9%
>>>>>>> afd02287-3f88-4c6f-8b27-06f7a8192402  rack3
>>>>>>> UN  xxx.xxx.xxx.253  2.6 TiB256 73.9%
>>>>>>> e1af72be-e5df-4c6b-a124-c7bc48c6602a  rack2
>>>>>>> UN  xxx.xxx.xxx.24   93.82 KiB  256 80.0%
>>>>>>> c4e8b4a0-f014-45e6-afb4-648aad4f8500  rack4
>>>>>>> UN  xxx.xxx.xxx.107  2.65 TiB   256 73.2%
>>>>>>> ab72f017-be96-41d2-9bef-a551dec2c7b5  rack1
>>>>>>>
>>>>>>> # nodetool netstats
>>>>>>> Mode: NORMAL
>>>>>>> Not sending any streams.
>>>>>>> Read Repair 

Re: Reads not returning data after adding node

2023-04-02 Thread David Tinker
I should add that the new node does have some data.

On Mon, Apr 3, 2023 at 7:04 AM David Tinker  wrote:

> It is also in the seeds list for the other nodes. Should I remove it from
> those, restart them one at a time, then restart it?
>
> /etc/cassandra # grep -i bootstrap *
> doesn't show anything so I don't think I have auto_bootstrap false.
>
> Thanks very much for the help.
>
>
> On Mon, Apr 3, 2023 at 7:01 AM Carlos Diaz  wrote:
>
>> Just remove it from the seed list in the cassandra.yaml file and restart
>> the node.  Make sure that auto_bootstrap is set to true first though.
>>
>> On Sun, Apr 2, 2023 at 9:59 PM David Tinker 
>> wrote:
>>
>>> So likely because I made it a seed node when I added it to the cluster
>>> it didn't do the bootstrap process. How can I recover this?
>>>
>>> On Mon, Apr 3, 2023 at 6:41 AM David Tinker 
>>> wrote:
>>>
>>>> Yes replication factor is 3.
>>>>
>>>> I ran nodetool repair -pr on all the nodes (one at a time) and am
>>>> still having issues getting data back from queries.
>>>>
>>>> I did make the new node a seed node.
>>>>
>>>> Re "rack4": I assumed that was just an indication as to the physical
>>>> location of the server for redundancy. This one is separate from the others
>>>> so I used rack4.
>>>>
>>>> On Mon, Apr 3, 2023 at 6:30 AM Carlos Diaz  wrote:
>>>>
>>>>> I'm assuming that your replication factor is 3.  If that's the case,
>>>>> did you intentionally put this node in rack 4?  Typically, you want to add
>>>>> nodes in multiples of your replication factor in order to keep the "racks"
>>>>> balanced.  In other words, this node should have been added to rack 1, 2 
>>>>> or
>>>>> 3.
>>>>>
>>>>> Having said that, you should be able to easily fix your problem by
>>>>> running a nodetool repair -pr on the new node.
>>>>>
>>>>> On Sun, Apr 2, 2023 at 8:16 PM David Tinker 
>>>>> wrote:
>>>>>
>>>>>> Hi All
>>>>>>
>>>>>> I recently added a node to my 3 node Cassandra 4.0.5 cluster and now
>>>>>> many reads are not returning rows! What do I need to do to fix this? 
>>>>>> There
>>>>>> weren't any errors in the logs or other problems that I could see. I
>>>>>> expected the cluster to balance itself but this hasn't happened (yet?). 
>>>>>> The
>>>>>> nodes are similar so I have num_tokens=256 for each. I am using the
>>>>>> Murmur3Partitioner.
>>>>>>
>>>>>> # nodetool status
>>>>>> Datacenter: dc1
>>>>>> ===
>>>>>> Status=Up/Down
>>>>>> |/ State=Normal/Leaving/Joining/Moving
>>>>>> --  Address  Load   Tokens  Owns (effective)  Host ID
>>>>>>   Rack
>>>>>> UN  xxx.xxx.xxx.105  2.65 TiB   256 72.9%
>>>>>> afd02287-3f88-4c6f-8b27-06f7a8192402  rack3
>>>>>> UN  xxx.xxx.xxx.253  2.6 TiB256 73.9%
>>>>>> e1af72be-e5df-4c6b-a124-c7bc48c6602a  rack2
>>>>>> UN  xxx.xxx.xxx.24   93.82 KiB  256 80.0%
>>>>>> c4e8b4a0-f014-45e6-afb4-648aad4f8500  rack4
>>>>>> UN  xxx.xxx.xxx.107  2.65 TiB   256 73.2%
>>>>>> ab72f017-be96-41d2-9bef-a551dec2c7b5  rack1
>>>>>>
>>>>>> # nodetool netstats
>>>>>> Mode: NORMAL
>>>>>> Not sending any streams.
>>>>>> Read Repair Statistics:
>>>>>> Attempted: 0
>>>>>> Mismatch (Blocking): 0
>>>>>> Mismatch (Background): 0
>>>>>> Pool NameActive   Pending  Completed   Dropped
>>>>>> Large messages  n/a 0  71754 0
>>>>>> Small messages  n/a 0839818414
>>>>>> Gossip messages n/a 01303634 0
>>>>>>
>>>>>> # nodetool ring
>>>>>> Datacenter: dc1
>>>>>> ==
>>>>>> Address   RackStatus State   LoadOwns
>>>>>>Token
>>>>>>
>>>>>>9189523899826545641
>>>>>> xxx.xxx.xxx.24    rack4   Up Normal  93.82 KiB
>>>>>> 79.95%  -9194674091837769168
>>>>>> xxx.xxx.xxx.107   rack1   Up Normal  2.65 TiB
>>>>>>  73.25%  -9168781258594813088
>>>>>> xxx.xxx.xxx.253   rack2   Up Normal  2.6 TiB
>>>>>> 73.92%  -9163037340977721917
>>>>>> xxx.xxx.xxx.105   rack3   Up Normal  2.65 TiB
>>>>>>  72.88%  -9148860739730046229
>>>>>> xxx.xxx.xxx.107   rack1   Up Normal  2.65 TiB
>>>>>>  73.25%  -9125240034139323535
>>>>>> xxx.xxx.xxx.253   rack2   Up Normal  2.6 TiB
>>>>>> 73.92%  -9112518853051755414
>>>>>> xxx.xxx.xxx.105   rack3   Up Normal  2.65 TiB
>>>>>>  72.88%  -9100516173422432134
>>>>>> ...
>>>>>>
>>>>>> This is causing a serious production issue. Please help if you can.
>>>>>>
>>>>>> Thanks
>>>>>> David
>>>>>>
>>>>>>
>>>>>>
>>>>>>


Re: Reads not returning data after adding node

2023-04-02 Thread David Tinker
It is also in the seeds list for the other nodes. Should I remove it from
those, restart them one at a time, then restart it?

/etc/cassandra # grep -i bootstrap *
doesn't show anything so I don't think I have auto_bootstrap false.

Thanks very much for the help.
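
For completeness, the relevant cassandra.yaml bits would look like this if
set explicitly (auto_bootstrap defaults to true when absent; IPs masked):

auto_bootstrap: true
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "xxx.xxx.xxx.107,xxx.xxx.xxx.253"   # other nodes only, not this one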


On Mon, Apr 3, 2023 at 7:01 AM Carlos Diaz  wrote:

> Just remove it from the seed list in the cassandra.yaml file and restart
> the node.  Make sure that auto_bootstrap is set to true first though.
>
> On Sun, Apr 2, 2023 at 9:59 PM David Tinker 
> wrote:
>
>> So likely because I made it a seed node when I added it to the cluster it
>> didn't do the bootstrap process. How can I recover this?
>>
>> On Mon, Apr 3, 2023 at 6:41 AM David Tinker 
>> wrote:
>>
>>> Yes replication factor is 3.
>>>
>>> I ran nodetool repair -pr on all the nodes (one at a time) and am still
>>> having issues getting data back from queries.
>>>
>>> I did make the new node a seed node.
>>>
>>> Re "rack4": I assumed that was just an indication as to the physical
>>> location of the server for redundancy. This one is separate from the others
>>> so I used rack4.
>>>
>>> On Mon, Apr 3, 2023 at 6:30 AM Carlos Diaz  wrote:
>>>
>>>> I'm assuming that your replication factor is 3.  If that's the case,
>>>> did you intentionally put this node in rack 4?  Typically, you want to add
>>>> nodes in multiples of your replication factor in order to keep the "racks"
>>>> balanced.  In other words, this node should have been added to rack 1, 2 or
>>>> 3.
>>>>
>>>> Having said that, you should be able to easily fix your problem by
>>>> running a nodetool repair -pr on the new node.
>>>>
>>>> On Sun, Apr 2, 2023 at 8:16 PM David Tinker 
>>>> wrote:
>>>>
>>>>> Hi All
>>>>>
>>>>> I recently added a node to my 3 node Cassandra 4.0.5 cluster and now
>>>>> many reads are not returning rows! What do I need to do to fix this? There
>>>>> weren't any errors in the logs or other problems that I could see. I
>>>>> expected the cluster to balance itself but this hasn't happened (yet?). 
>>>>> The
>>>>> nodes are similar so I have num_tokens=256 for each. I am using the
>>>>> Murmur3Partitioner.
>>>>>
>>>>> # nodetool status
>>>>> Datacenter: dc1
>>>>> ===
>>>>> Status=Up/Down
>>>>> |/ State=Normal/Leaving/Joining/Moving
>>>>> --  Address  Load   Tokens  Owns (effective)  Host ID
>>>>>   Rack
>>>>> UN  xxx.xxx.xxx.105  2.65 TiB   256 72.9%
>>>>> afd02287-3f88-4c6f-8b27-06f7a8192402  rack3
>>>>> UN  xxx.xxx.xxx.253  2.6 TiB256 73.9%
>>>>> e1af72be-e5df-4c6b-a124-c7bc48c6602a  rack2
>>>>> UN  xxx.xxx.xxx.24   93.82 KiB  256 80.0%
>>>>> c4e8b4a0-f014-45e6-afb4-648aad4f8500  rack4
>>>>> UN  xxx.xxx.xxx.107  2.65 TiB   256 73.2%
>>>>> ab72f017-be96-41d2-9bef-a551dec2c7b5  rack1
>>>>>
>>>>> # nodetool netstats
>>>>> Mode: NORMAL
>>>>> Not sending any streams.
>>>>> Read Repair Statistics:
>>>>> Attempted: 0
>>>>> Mismatch (Blocking): 0
>>>>> Mismatch (Background): 0
>>>>> Pool NameActive   Pending  Completed   Dropped
>>>>> Large messages  n/a 0  71754 0
>>>>> Small messages  n/a 0839818414
>>>>> Gossip messages n/a 01303634 0
>>>>>
>>>>> # nodetool ring
>>>>> Datacenter: dc1
>>>>> ==
>>>>> Address   RackStatus State   LoadOwns
>>>>>Token
>>>>>
>>>>>9189523899826545641
>>>>> xxx.xxx.xxx.24    rack4   Up Normal  93.82 KiB
>>>>> 79.95%  -9194674091837769168
>>>>> xxx.xxx.xxx.107   rack1   Up Normal  2.65 TiB
>>>>>  73.25%  -9168781258594813088
>>>>> xxx.xxx.xxx.253   rack2   Up Normal  2.6 TiB
>>>>> 73.92%  -9163037340977721917
>>>>> xxx.xxx.xxx.105   rack3   Up Normal  2.65 TiB
>>>>>  72.88%  -9148860739730046229
>>>>> xxx.xxx.xxx.107   rack1   Up Normal  2.65 TiB
>>>>>  73.25%  -9125240034139323535
>>>>> xxx.xxx.xxx.253   rack2   Up Normal  2.6 TiB
>>>>> 73.92%  -9112518853051755414
>>>>> xxx.xxx.xxx.105   rack3   Up Normal  2.65 TiB
>>>>>  72.88%  -9100516173422432134
>>>>> ...
>>>>>
>>>>> This is causing a serious production issue. Please help if you can.
>>>>>
>>>>> Thanks
>>>>> David
>>>>>
>>>>>
>>>>>
>>>>>


Re: Reads not returning data after adding node

2023-04-02 Thread David Tinker
So likely because I made it a seed node when I added it to the cluster it
didn't do the bootstrap process. How can I recover this?

On Mon, Apr 3, 2023 at 6:41 AM David Tinker  wrote:

> Yes replication factor is 3.
>
> I ran nodetool repair -pr on all the nodes (one at a time) and am still
> having issues getting data back from queries.
>
> I did make the new node a seed node.
>
> Re "rack4": I assumed that was just an indication as to the physical
> location of the server for redundancy. This one is separate from the others
> so I used rack4.
>
> On Mon, Apr 3, 2023 at 6:30 AM Carlos Diaz  wrote:
>
>> I'm assuming that your replication factor is 3.  If that's the case, did
>> you intentionally put this node in rack 4?  Typically, you want to add
>> nodes in multiples of your replication factor in order to keep the "racks"
>> balanced.  In other words, this node should have been added to rack 1, 2 or
>> 3.
>>
>> Having said that, you should be able to easily fix your problem by
>> running a nodetool repair -pr on the new node.
>>
>> On Sun, Apr 2, 2023 at 8:16 PM David Tinker 
>> wrote:
>>
>>> Hi All
>>>
>>> I recently added a node to my 3 node Cassandra 4.0.5 cluster and now
>>> many reads are not returning rows! What do I need to do to fix this? There
>>> weren't any errors in the logs or other problems that I could see. I
>>> expected the cluster to balance itself but this hasn't happened (yet?). The
>>> nodes are similar so I have num_tokens=256 for each. I am using the
>>> Murmur3Partitioner.
>>>
>>> # nodetool status
>>> Datacenter: dc1
>>> ===
>>> Status=Up/Down
>>> |/ State=Normal/Leaving/Joining/Moving
>>> --  Address  Load   Tokens  Owns (effective)  Host ID
>>> Rack
>>> UN  xxx.xxx.xxx.105  2.65 TiB   256 72.9%
>>> afd02287-3f88-4c6f-8b27-06f7a8192402  rack3
>>> UN  xxx.xxx.xxx.253  2.6 TiB256 73.9%
>>> e1af72be-e5df-4c6b-a124-c7bc48c6602a  rack2
>>> UN  xxx.xxx.xxx.24   93.82 KiB  256 80.0%
>>> c4e8b4a0-f014-45e6-afb4-648aad4f8500  rack4
>>> UN  xxx.xxx.xxx.107  2.65 TiB   256 73.2%
>>> ab72f017-be96-41d2-9bef-a551dec2c7b5  rack1
>>>
>>> # nodetool netstats
>>> Mode: NORMAL
>>> Not sending any streams.
>>> Read Repair Statistics:
>>> Attempted: 0
>>> Mismatch (Blocking): 0
>>> Mismatch (Background): 0
>>> Pool NameActive   Pending  Completed   Dropped
>>> Large messages  n/a 0  71754 0
>>> Small messages  n/a 0839818414
>>> Gossip messages n/a 01303634 0
>>>
>>> # nodetool ring
>>> Datacenter: dc1
>>> ==
>>> Address   RackStatus State   LoadOwns
>>>  Token
>>>
>>>  9189523899826545641
>>> xxx.xxx.xxx.24    rack4   Up Normal  93.82 KiB   79.95%
>>>  -9194674091837769168
>>> xxx.xxx.xxx.107   rack1   Up Normal  2.65 TiB73.25%
>>>  -9168781258594813088
>>> xxx.xxx.xxx.253   rack2   Up Normal  2.6 TiB 73.92%
>>>  -9163037340977721917
>>> xxx.xxx.xxx.105   rack3   Up Normal  2.65 TiB72.88%
>>>  -9148860739730046229
>>> xxx.xxx.xxx.107   rack1   Up Normal  2.65 TiB73.25%
>>>  -9125240034139323535
>>> xxx.xxx.xxx.253   rack2   Up Normal  2.6 TiB 73.92%
>>>  -9112518853051755414
>>> xxx.xxx.xxx.105   rack3   Up Normal  2.65 TiB72.88%
>>>  -9100516173422432134
>>> ...
>>>
>>> This is causing a serious production issue. Please help if you can.
>>>
>>> Thanks
>>> David
>>>
>>>
>>>
>>>


Re: Reads not returning data after adding node

2023-04-02 Thread David Tinker
Yes replication factor is 3.

I ran nodetool repair -pr on all the nodes (one at a time) and am still
having issues getting data back from queries.

I did make the new node a seed node.

Re "rack4": I assumed that was just an indication as to the physical
location of the server for redundancy. This one is separate from the others
so I used rack4.

On Mon, Apr 3, 2023 at 6:30 AM Carlos Diaz  wrote:

> I'm assuming that your replication factor is 3.  If that's the case, did
> you intentionally put this node in rack 4?  Typically, you want to add
> nodes in multiples of your replication factor in order to keep the "racks"
> balanced.  In other words, this node should have been added to rack 1, 2 or
> 3.
>
> Having said that, you should be able to easily fix your problem by running
> a nodetool repair -pr on the new node.
>
> On Sun, Apr 2, 2023 at 8:16 PM David Tinker 
> wrote:
>
>> Hi All
>>
>> I recently added a node to my 3 node Cassandra 4.0.5 cluster and now many
>> reads are not returning rows! What do I need to do to fix this? There
>> weren't any errors in the logs or other problems that I could see. I
>> expected the cluster to balance itself but this hasn't happened (yet?). The
>> nodes are similar so I have num_tokens=256 for each. I am using the
>> Murmur3Partitioner.
>>
>> # nodetool status
>> Datacenter: dc1
>> ===
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address  Load   Tokens  Owns (effective)  Host ID
>>   Rack
>> UN  xxx.xxx.xxx.105  2.65 TiB   256 72.9%
>> afd02287-3f88-4c6f-8b27-06f7a8192402  rack3
>> UN  xxx.xxx.xxx.253  2.6 TiB256 73.9%
>> e1af72be-e5df-4c6b-a124-c7bc48c6602a  rack2
>> UN  xxx.xxx.xxx.24   93.82 KiB  256 80.0%
>> c4e8b4a0-f014-45e6-afb4-648aad4f8500  rack4
>> UN  xxx.xxx.xxx.107  2.65 TiB   256 73.2%
>> ab72f017-be96-41d2-9bef-a551dec2c7b5  rack1
>>
>> # nodetool netstats
>> Mode: NORMAL
>> Not sending any streams.
>> Read Repair Statistics:
>> Attempted: 0
>> Mismatch (Blocking): 0
>> Mismatch (Background): 0
>> Pool NameActive   Pending  Completed   Dropped
>> Large messages  n/a 0  71754 0
>> Small messages  n/a 0839818414
>> Gossip messages n/a 01303634 0
>>
>> # nodetool ring
>> Datacenter: dc1
>> ==
>> Address   RackStatus State   LoadOwns
>>Token
>>
>>9189523899826545641
>> xxx.xxx.xxx.24    rack4   Up Normal  93.82 KiB   79.95%
>>-9194674091837769168
>> xxx.xxx.xxx.107   rack1   Up Normal  2.65 TiB73.25%
>>-9168781258594813088
>> xxx.xxx.xxx.253   rack2   Up Normal  2.6 TiB 73.92%
>>-9163037340977721917
>> xxx.xxx.xxx.105   rack3   Up Normal  2.65 TiB72.88%
>>-9148860739730046229
>> xxx.xxx.xxx.107   rack1   Up Normal  2.65 TiB73.25%
>>-9125240034139323535
>> xxx.xxx.xxx.253   rack2   Up Normal  2.6 TiB 73.92%
>>-9112518853051755414
>> xxx.xxx.xxx.105   rack3   Up Normal  2.65 TiB72.88%
>>-9100516173422432134
>> ...
>>
>> This is causing a serious production issue. Please help if you can.
>>
>> Thanks
>> David
>>
>>
>>
>>


Reads not returning data after adding node

2023-04-02 Thread David Tinker
Hi All

I recently added a node to my 3 node Cassandra 4.0.5 cluster and now many
reads are not returning rows! What do I need to do to fix this? There
weren't any errors in the logs or other problems that I could see. I
expected the cluster to balance itself but this hasn't happened (yet?). The
nodes are similar so I have num_tokens=256 for each. I am using the
Murmur3Partitioner.

# nodetool status
Datacenter: dc1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load   Tokens  Owns (effective)  Host ID
Rack
UN  xxx.xxx.xxx.105  2.65 TiB   256 72.9%
afd02287-3f88-4c6f-8b27-06f7a8192402  rack3
UN  xxx.xxx.xxx.253  2.6 TiB256 73.9%
e1af72be-e5df-4c6b-a124-c7bc48c6602a  rack2
UN  xxx.xxx.xxx.24   93.82 KiB  256 80.0%
c4e8b4a0-f014-45e6-afb4-648aad4f8500  rack4
UN  xxx.xxx.xxx.107  2.65 TiB   256 73.2%
ab72f017-be96-41d2-9bef-a551dec2c7b5  rack1

# nodetool netstats
Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool NameActive   Pending  Completed   Dropped
Large messages  n/a 0  71754 0
Small messages  n/a 0839818414
Gossip messages n/a 01303634 0

# nodetool ring
Datacenter: dc1
==
Address   RackStatus State   LoadOwns
 Token

 9189523899826545641
xxx.xxx.xxx.24    rack4   Up Normal  93.82 KiB   79.95%
 -9194674091837769168
xxx.xxx.xxx.107   rack1   Up Normal  2.65 TiB73.25%
 -9168781258594813088
xxx.xxx.xxx.253   rack2   Up Normal  2.6 TiB 73.92%
 -9163037340977721917
xxx.xxx.xxx.105   rack3   Up Normal  2.65 TiB72.88%
 -9148860739730046229
xxx.xxx.xxx.107   rack1   Up Normal  2.65 TiB73.25%
 -9125240034139323535
xxx.xxx.xxx.253   rack2   Up Normal  2.6 TiB 73.92%
 -9112518853051755414
xxx.xxx.xxx.105   rack3   Up Normal  2.65 TiB72.88%
 -9100516173422432134
...

This is causing a serious production issue. Please help if you can.

Thanks
David
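
A check that might help diagnose this: asking which replicas should hold a
given partition key (keyspace, table, and key below are placeholders):

nodetool getendpoints mykeyspace mytable some_key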


Barman equivalent for Cassandra?

2021-03-12 Thread David Tinker
Hi Guys

I need to backup my 3 node Cassandra cluster to a remote machine. Is there
a tool like Barman (really nice streaming backup tool for PostgreSQL) for
Cassandra? Or does everyone roll their own scripts using snapshots and so
on?

The data is on all 3 nodes using about 900G of space on each.

It would be difficult for me to recover even a day of lost data. An hour
might be ok.

Thanks
David
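
In case nothing Barman-like exists, my fallback would be a cron script along
these lines (untested sketch; the backup host and paths are placeholders):

#!/bin/sh
TAG=backup_$(date +%Y%m%d)
nodetool snapshot -t "$TAG"
rsync -aR /var/lib/cassandra/data/*/*/snapshots/"$TAG" backuphost:/backups/
nodetool clearsnapshot -t "$TAG"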


Re: Recovery after server crash 4.0b3

2021-03-01 Thread David Tinker
Thanks guys. The IP address hasn't changed so I will go ahead and start the
server and repair.
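
For the record, the plan is just (sketch):

systemctl start cassandra
nodetool repair    # full repair on this node to pick up the missed mutations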

On Mon, Mar 1, 2021 at 1:50 PM Erick Ramirez 
wrote:

> If the node's only been down for less than gc_grace_seconds and the data
> in the drives are intact, you should be fine just booting the server and it
> will join the cluster. You will need to run a repair so it picks up the
> missed mutations.
>
> @Bowen FWIW no need to do a "replace" -- the node will just take over the
> new IP. You'll just see a warning in the system.log that looks like:
>
> Not updating host ID  for  because it's mine
>
>
> See
> https://github.com/apache/cassandra/blob/cassandra-4.0-beta3/src/java/org/apache/cassandra/service/StorageService.java#L2620.
> Cheers!
>


Recovery after server crash 4.0b3

2021-03-01 Thread David Tinker
Hi Guys

I have a 3 node cluster running 4.0b3 with all data replicated to all 3
nodes. This morning one of the servers started randomly rebooting (up for a
minute or two then reboot) for a couple of hours. The cluster continued
running normally during this time (nice!).

My hosting company has replaced the server and moved the drives across. Is
it safe for me to boot the machine and let it join the cluster?

Thanks
David


How beta is 4.0-beta3?

2020-11-24 Thread David Tinker
I could really use zstd compression! So if it's not too buggy I will take a
chance :) Tx
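
For reference, the change I have in mind is per table (names below are
placeholders):

cqlsh -e "ALTER TABLE mykeyspace.mytable WITH compression = {'class': 'ZstdCompressor'};"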


Re: Tracking word frequencies

2014-01-20 Thread David Tinker
I haven't actually tried to use that schema yet, it was just my first idea.
If we use that solution our app would have to read the whole table once a
day or so to find the top 5000'ish words.
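
The snapshot idea would look something like this (untested sketch):

CREATE TABLE IF NOT EXISTS word_count_snapshot (
  day text,        -- one fat partition per day, e.g. '2014-01-20'
  count bigint,
  word text,
  PRIMARY KEY (day, count, word)
) WITH CLUSTERING ORDER BY (count DESC, word ASC);

-- the daily job copies word_count into today's partition; the top 5000
-- then becomes a single-partition read:
SELECT word, count FROM word_count_snapshot WHERE day = '2014-01-20' LIMIT 5000;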


On Fri, Jan 17, 2014 at 2:49 PM, Jonathan Lacefield jlacefi...@datastax.com
 wrote:

 Hi David,

  How do you know that you are incurring a seek for each row?  Are you
 querying for a specific word at a time or do the queries span multiple
 words, i.e. what's the query pattern? Also, what is your goal for read
 latency?  Most customers can achieve microsecond partition key base query
 reads with Cassanda.  This can be done through tuning, data modeling,
 and/or scaling.  Please post a cfhistograms for this table as well as
 provide some details on the specific queries you are running.

 Thanks,

 Jonathan

 Jonathan Lacefield
 Solutions Architect, DataStax
 (404) 822 3487
  http://www.linkedin.com/in/jlacefield



 http://www.datastax.com/what-we-offer/products-services/training/virtual-training


 On Fri, Jan 17, 2014 at 1:41 AM, David Tinker david.tin...@gmail.comwrote:

 I have an app that stores lots of bits of text in Cassandra. One of
 the things I need to do is keep a global word frequency table.
 Something like this:

 CREATE TABLE IF NOT EXISTS word_count (
   word text,
  count bigint,
   PRIMARY KEY (word)
 );

 This is slow to read as the rows (100's of thousands of them) each
 need a seek. Is there a better way to model this in Cassandra? I could
 periodically snapshot the rows into a fat row in another table I
 suppose.

 Or should I use Redis or something instead? I would prefer to keep it
 all Cassandra if possible.





-- 
http://qdb.io/ Persistent Message Queues With Replay and #RabbitMQ
Integration


Tracking word frequencies

2014-01-16 Thread David Tinker
I have an app that stores lots of bits of text in Cassandra. One of
the things I need to do is keep a global word frequency table.
Something like this:

CREATE TABLE IF NOT EXISTS word_count (
  word text,
  count bigint,
  PRIMARY KEY (word)
);

This is slow to read as the rows (hundreds of thousands of them) each
need a seek. Is there a better way to model this in Cassandra? I could
periodically snapshot the rows into a fat row in another table I
suppose.

Or should I use Redis or something instead? I would prefer to keep it
all Cassandra if possible.
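
For what it's worth, with count as a counter column each word update becomes
a single increment, with no read before write (a sketch):

UPDATE word_count SET count = count + 1 WHERE word = 'cassandra';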


Re: Crash with TombstoneOverwhelmingException

2014-01-13 Thread David Tinker
We are seeing the exact same exception in our logs. Is there any workaround?

We never delete rows, but we do a lot of updates. Is that where the
tombstones are coming from?
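
For what it's worth, updates can create tombstones without any explicit
DELETE; two common cases, sketched against a hypothetical table:

-- Hypothetical table for illustration:
CREATE TABLE users (id int PRIMARY KEY, email text, tags set<text>);

-- Writing null to a column creates a cell tombstone:
UPDATE users SET email = null WHERE id = 42;

-- Replacing a whole collection writes a range tombstone over the old
-- contents before inserting the new ones:
UPDATE users SET tags = {'a', 'b'} WHERE id = 42;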

On Wed, Dec 25, 2013 at 5:24 PM, Sanjeeth Kumar sanje...@exotel.in wrote:
 Hi all,
   One of my cassandra nodes crashes with the following exception
 periodically -
 ERROR [HintedHandoff:33] 2013-12-25 20:29:22,276 SliceQueryFilter.java (line
 200) Scanned over 100000 tombstones; query aborted (see
 tombstone_fail_threshold)
 ERROR [HintedHandoff:33] 2013-12-25 20:29:22,278 CassandraDaemon.java (line
 187) Exception in thread Thread[HintedHandoff:33,1,main]
 org.apache.cassandra.db.filter.TombstoneOverwhelmingException
 at
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:201)
 at
 org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
 at
 org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
 at
 org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
 at
 org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
 at
 org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
 at
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1487)
 at
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1306)
 at
 org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:351)
 at
 org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:309)
 at
 org.apache.cassandra.db.HintedHandOffManager.access$300(HintedHandOffManager.java:92)
 at
 org.apache.cassandra.db.HintedHandOffManager$4.run(HintedHandOffManager.java:530)
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)

 Why does this happen? Does this relate to any incorrect config value?

 The Cassandra Version I'm running is
 ReleaseVersion: 2.0.3

 - Sanjeeth




-- 
http://qdb.io/ Persistent Message Queues With Replay and #RabbitMQ Integration


Occasional NPE using DataStax Java driver

2013-12-18 Thread David Tinker
We are using Cassandra 2.0.3-1 installed on Ubuntu 12.04 from the
DataStax repo with the DataStax Java driver version 2.0.0-rc1. Every
now and then we get the following exception:

2013-12-19 06:56:34,619 [sql-2-t15] ERROR core.RequestHandler  -
Unexpected error while querying /x.x.x.x
java.lang.NullPointerException
at 
com.datastax.driver.core.HostConnectionPool.waitForConnection(HostConnectionPool.java:203)
at 
com.datastax.driver.core.HostConnectionPool.borrowConnection(HostConnectionPool.java:107)
at com.datastax.driver.core.RequestHandler.query(RequestHandler.java:112)
at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:93)
at com.datastax.driver.core.Session$Manager.execute(Session.java:513)
at com.datastax.driver.core.Session$Manager.executeQuery(Session.java:549)
at com.datastax.driver.core.Session.executeAsync(Session.java:172)

This happens during a big data load process which will do up to 256
executeAsync calls in parallel.

Any ideas? It's not causing huge problems because the operation is just
retried by our code, but it would be nice to eliminate it.
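
One way to keep a load like this from overrunning the connection pools (a
sketch only, not a fix for the NPE itself; the contact point, table and
128-permit limit are illustrative) is to bound the number of in-flight
executeAsync calls with a semaphore:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.Futures;
import java.util.concurrent.Semaphore;

public class ThrottledLoad {
    public static void main(String[] args) throws InterruptedException {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();
        // Prepare once, outside the loop.
        PreparedStatement ps = session.prepare(
                "INSERT INTO test.wibble (id, info) VALUES (?, ?)");
        final Semaphore permits = new Semaphore(128); // max in-flight queries
        for (int i = 0; i < 100000; i++) {
            permits.acquire(); // blocks while 128 queries are outstanding
            ResultSetFuture f = session.executeAsync(ps.bind("" + i, "aa" + i));
            Futures.addCallback(f, new FutureCallback<ResultSet>() {
                public void onSuccess(ResultSet rs) { permits.release(); }
                public void onFailure(Throwable t) { permits.release(); }
            });
        }
        permits.acquire(128); // wait for the tail of in-flight queries
        cluster.close();      // close() in 2.0 drivers (shutdown() in 1.x)
    }
}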


Re: What is the fastest way to get data into Cassandra 2 from a Java application?

2013-12-13 Thread David Tinker
I wrote some scripts to test this: https://github.com/davidtinker/cassandra-perf

3 node cluster, each node: Intel® Xeon® E3-1270 v3 Quadcore Haswell
32GB RAM, 1 x 2TB commit log disk, 2 x 4TB data disks (RAID0)

Using a batch of prepared statements is about 5% faster than inline parameters:

InsertBatchOfPreparedStatements: Inserted 2551704 rows in 100000
batches using 256 concurrent operations in 15.785 secs, 161653 rows/s,
6335 batches/s

InsertInlineBatch: Inserted 2551704 rows in 100000 batches using 256
concurrent operations in 16.712 secs, 152686 rows/s, 5983 batches/s

On Wed, Dec 11, 2013 at 2:40 PM, Sylvain Lebresne sylv...@datastax.com wrote:
 Then I suspect that this is an artifact of your test methodology. Prepared
 statements *are* faster than non-prepared ones in general. They save some
 parsing and some bytes on the wire. The savings will tend to be bigger for
 bigger queries, and it's possible that for very small queries (like the one
 you are testing) the performance difference is somewhat negligible, but
 seeing non-prepared statements being significantly faster than prepared
 ones almost surely means you're doing something wrong (of course, a bug in
 either the driver or C* is always possible, and always make sure to test
 recent versions, but I'm not aware of any such bug).

 Are you sure you are warming up the JVMs (client and drivers) properly, for
 instance? 1000 iterations is *really small*; if you're not warming things
 up properly, you're not measuring anything relevant. Also, are you including
 the preparation of the query itself in the timing? Preparing a query is not
 particularly fast, but it's meant to be done just once at the beginning of
 the application lifetime. But with only 1000 iterations, if you include the
 preparation in the timing, it's entirely possible it's eating a good chunk
 of the whole time.

 But other than prepared versus non-prepared, you won't get proper
 performance unless you parallelize your inserts. Unlogged batches are one
 way to do it (it's really all Cassandra does with an unlogged batch:
 parallelize). But as John Sanda mentioned, another option is to do the
 parallelization client side, with executeAsync.

 --
 Sylvain



 On Wed, Dec 11, 2013 at 11:37 AM, David Tinker david.tin...@gmail.com
 wrote:

 Yes, that's what I found.

 This is faster:

 for (int i = 0; i < 1000; i++) session.execute("INSERT INTO
 test.wibble (id, info) VALUES ('${"" + i}', '${"aa" + i}')")

 Than this:

 def ps = session.prepare("INSERT INTO test.wibble (id, info) VALUES (?,
 ?)")
 for (int i = 0; i < 1000; i++) session.execute(ps.bind(["" + i, "aa" +
 i] as Object[]))

 This is the fastest option of all (hand rolled batch):

 StringBuilder b = new StringBuilder()
 b.append("BEGIN UNLOGGED BATCH\n")
 for (int i = 0; i < 1000; i++) {
 b.append("INSERT INTO ").append(ks).append(".wibble (id, info)
 VALUES ('").append(i).append("','")
 .append("aa").append(i).append("')\n")
 }
 b.append("APPLY BATCH\n")
 session.execute(b.toString())


 On Wed, Dec 11, 2013 at 10:56 AM, Sylvain Lebresne sylv...@datastax.com
 wrote:
 
  This loop takes 2500ms or so on my test cluster:
 
  PreparedStatement ps = session.prepare("INSERT INTO perf_test.wibble
  (id, info) VALUES (?, ?)")
  for (int i = 0; i < 1000; i++) session.execute(ps.bind("" + i, "aa" +
  i));
 
  The same loop with the parameters inline is about 1300ms. It gets
  worse if there are many parameters.
 
 
  Do you mean that:
  for (int i = 0; i < 1000; i++)
  session.execute("INSERT INTO perf_test.wibble (id, info) VALUES ("
  + i
  + ", aa" + i + ")");
  is twice as fast as using a prepared statement? And that the difference
  is even greater if you add more columns than id and info?
 
  That would certainly be unexpected, are you sure you're not re-preparing
  the
  statement every time in the loop?
 
  --
  Sylvain
 
  I know I can use batching to
   insert all the rows at once but that's not the purpose of this test. I
  also tried using session.execute(cql, params) and it is faster but
  still doesn't match inline values.
 
  Composing CQL strings is certainly convenient and simple but is there
  a much faster way?
 
  Thanks
  David
 
  I have also posted this on Stackoverflow if anyone wants the points:
 
 
  http://stackoverflow.com/questions/20491090/what-is-the-fastest-way-to-get-data-into-cassandra-2-from-a-java-application
 
 



 --
 http://qdb.io/ Persistent Message Queues With Replay and #RabbitMQ
 Integration





-- 
http://qdb.io/ Persistent Message Queues With Replay and #RabbitMQ Integration


Re: What is the fastest way to get data into Cassandra 2 from a Java application?

2013-12-11 Thread David Tinker
I didn't do any warming up etc. I am new to Cassandra and was just
poking around with some scripts to try to find the fastest way to do
things. That said, all the mini-tests ran under the same conditions.

In our case the batches will have a variable number of different
inserts/updates in them, so doing a whole batch as a PreparedStatement
won't help. However, using a BatchStatement and stuffing it full of
bound PreparedStatements might be better than a batch with inlined
parameters (something like the sketch below). I will do a test of that
and see. I will also let the VM warm up and whatnot this time.
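
A minimal sketch of that approach (assuming the 2.0 driver's BatchStatement;
the contact point and table names are illustrative):

import com.datastax.driver.core.BatchStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class BatchOfBoundStatements {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();
        // Prepare once, bind many times.
        PreparedStatement ps = session.prepare(
                "INSERT INTO test.wibble (id, info) VALUES (?, ?)");
        // UNLOGGED skips the batch log since atomicity isn't needed here.
        BatchStatement batch = new BatchStatement(BatchStatement.Type.UNLOGGED);
        for (int i = 0; i < 1000; i++) {
            batch.add(ps.bind("" + i, "aa" + i));
        }
        session.execute(batch);
        cluster.close();
    }
}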



On Wed, Dec 11, 2013 at 2:40 PM, Sylvain Lebresne sylv...@datastax.com wrote:
 Then I suspect that this is an artifact of your test methodology. Prepared
 statements *are* faster than non-prepared ones in general. They save some
 parsing and some bytes on the wire. The savings will tend to be bigger for
 bigger queries, and it's possible that for very small queries (like the one
 you are testing) the performance difference is somewhat negligible, but
 seeing non-prepared statements being significantly faster than prepared
 ones almost surely means you're doing something wrong (of course, a bug in
 either the driver or C* is always possible, and always make sure to test
 recent versions, but I'm not aware of any such bug).

 Are you sure you are warming up the JVMs (client and drivers) properly, for
 instance? 1000 iterations is *really small*; if you're not warming things
 up properly, you're not measuring anything relevant. Also, are you including
 the preparation of the query itself in the timing? Preparing a query is not
 particularly fast, but it's meant to be done just once at the beginning of
 the application lifetime. But with only 1000 iterations, if you include the
 preparation in the timing, it's entirely possible it's eating a good chunk
 of the whole time.

 But other than prepared versus non-prepared, you won't get proper
 performance unless you parallelize your inserts. Unlogged batches are one
 way to do it (it's really all Cassandra does with an unlogged batch:
 parallelize). But as John Sanda mentioned, another option is to do the
 parallelization client side, with executeAsync.

 --
 Sylvain



 On Wed, Dec 11, 2013 at 11:37 AM, David Tinker david.tin...@gmail.com
 wrote:

 Yes, that's what I found.

 This is faster:

 for (int i = 0; i < 1000; i++) session.execute("INSERT INTO
 test.wibble (id, info) VALUES ('${"" + i}', '${"aa" + i}')")

 Than this:

 def ps = session.prepare("INSERT INTO test.wibble (id, info) VALUES (?,
 ?)")
 for (int i = 0; i < 1000; i++) session.execute(ps.bind(["" + i, "aa" +
 i] as Object[]))

 This is the fastest option of all (hand rolled batch):

 StringBuilder b = new StringBuilder()
 b.append("BEGIN UNLOGGED BATCH\n")
 for (int i = 0; i < 1000; i++) {
 b.append("INSERT INTO ").append(ks).append(".wibble (id, info)
 VALUES ('").append(i).append("','")
 .append("aa").append(i).append("')\n")
 }
 b.append("APPLY BATCH\n")
 session.execute(b.toString())


 On Wed, Dec 11, 2013 at 10:56 AM, Sylvain Lebresne sylv...@datastax.com
 wrote:
 
  This loop takes 2500ms or so on my test cluster:
 
  PreparedStatement ps = session.prepare("INSERT INTO perf_test.wibble
  (id, info) VALUES (?, ?)")
  for (int i = 0; i < 1000; i++) session.execute(ps.bind("" + i, "aa" +
  i));
 
  The same loop with the parameters inline is about 1300ms. It gets
  worse if there are many parameters.
 
 
  Do you mean that:
  for (int i = 0; i < 1000; i++)
  session.execute("INSERT INTO perf_test.wibble (id, info) VALUES ("
  + i
  + ", aa" + i + ")");
  is twice as fast as using a prepared statement? And that the difference
  is even greater if you add more columns than id and info?
 
  That would certainly be unexpected, are you sure you're not re-preparing
  the
  statement every time in the loop?
 
  --
  Sylvain
 
  I know I can use batching to
   insert all the rows at once but that's not the purpose of this test. I
  also tried using session.execute(cql, params) and it is faster but
  still doesn't match inline values.
 
  Composing CQL strings is certainly convenient and simple but is there
  a much faster way?
 
  Thanks
  David
 
  I have also posted this on Stackoverflow if anyone wants the points:
 
 
  http://stackoverflow.com/questions/20491090/what-is-the-fastest-way-to-get-data-into-cassandra-2-from-a-java-application
 
 



 --
 http://qdb.io/ Persistent Message Queues With Replay and #RabbitMQ
 Integration





-- 
http://qdb.io/ Persistent Message Queues With Replay and #RabbitMQ Integration


What is the fastest way to get data into Cassandra 2 from a Java application?

2013-12-10 Thread David Tinker
I have tried the DataStax Java driver and it seems the fastest way to
insert data is to compose a CQL string with all parameters inline.

This loop takes 2500ms or so on my test cluster:

PreparedStatement ps = session.prepare("INSERT INTO perf_test.wibble
(id, info) VALUES (?, ?)")
for (int i = 0; i < 1000; i++) session.execute(ps.bind("" + i, "aa" + i));

The same loop with the parameters inline is about 1300ms. It gets
worse if there are many parameters. I know I can use batching to
insert all the rows at once but that's not the purpose of this test. I
also tried using session.execute(cql, params) and it is faster but
still doesn't match inline values.

Composing CQL strings is certainly convenient and simple but is there
a much faster way?

Thanks
David

I have also posted this on Stackoverflow if anyone wants the points:
 
http://stackoverflow.com/questions/20491090/what-is-the-fastest-way-to-get-data-into-cassandra-2-from-a-java-application


Re: What is the fastest way to get data into Cassandra 2 from a Java application?

2013-12-10 Thread David Tinker
Hmm. I have read that the thrift interface to Cassandra is out of
favour and the CQL interface is in. Where does that leave Astyanax?

On Tue, Dec 10, 2013 at 1:14 PM, graham sanderson gra...@vast.com wrote:
 Perhaps not the way forward; however, I can bulk insert data via astyanax at
 a rate that maxes out our (fast) networks. That said, for our next release
 (of this part of our product - our other current client is node.js via the
 binary protocol) we will be looking at insert speed via the Java driver, and
 also at alternative Scala/Java implementations of the binary protocol.

 On Dec 10, 2013, at 4:49 AM, David Tinker david.tin...@gmail.com wrote:

 I have tried the DataStax Java driver and it seems the fastest way to
 insert data is to compose a CQL string with all parameters inline.

 This loop takes 2500ms or so on my test cluster:

 PreparedStatement ps = session.prepare("INSERT INTO perf_test.wibble
 (id, info) VALUES (?, ?)")
 for (int i = 0; i < 1000; i++) session.execute(ps.bind("" + i, "aa" + i));

 The same loop with the parameters inline is about 1300ms. It gets
 worse if there are many parameters. I know I can use batching to
 insert all the rows at once but that's not the purpose of this test. I
 also tried using session.execute(cql, params) and it is faster but
 still doesn't match inline values.

 Composing CQL strings is certainly convenient and simple but is there
 a much faster way?

 Thanks
 David

 I have also posted this on Stackoverflow if anyone wants the points:
 http://stackoverflow.com/questions/20491090/what-is-the-fastest-way-to-get-data-into-cassandra-2-from-a-java-application




-- 
http://qdb.io/ Persistent Message Queues With Replay and #RabbitMQ Integration


Re: Commit log on USB flash disk?

2013-11-17 Thread David Tinker
Not using a commit log at all isn't something I had considered. We may
very well be able to do that for our application. Thanks.
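
For reference, that per-keyspace switch is a one-liner (a sketch; the
keyspace name is illustrative). With durable_writes off, anything not yet
flushed to SSTables is lost if a node crashes, so replication becomes the
only durability:

ALTER KEYSPACE myks WITH durable_writes = false;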

On Sat, Nov 16, 2013 at 6:17 PM, Tupshin Harper tups...@tupshin.com wrote:
 It's conceivable that one of the faster USB 3.0 sticks would be sufficient
 for this. I wouldn't exactly call it an enterprise configuration, but it's
 worth considering. Keep in mind that if you are comfortable using your RF
 for durability, you can turn off durable_writes on your keyspace and not
 write to the commitlog at all.



 On Sat, Nov 16, 2013 at 11:05 AM, Philippe watche...@gmail.com wrote:

  Hi David, we tried it two years ago and the performance of the USB stick
 was so dismal we stopped.
 Cheers

  On 16 Nov 2013 at 15:13, David Tinker david.tin...@gmail.com wrote:

 Our hosting provider has a cost effective server with 2 x 4TB disks
 with a 16G (or 64G) USB thumb drive option. Would it make sense to put
 the Cassandra commit log on the USB thumb disk and use RAID0 to use
 both 4TB disks for data (and Ubuntu 12.04)?

  Anyone know how long USB flash disks last when used for a write-heavy
 workload like this?

 Please tell me if this is a really bad idea.

 Our alternative is to use one 4TB disk for commit log and one for
 data. Of course this will give us only half the space.

 Thanks
 David





-- 
http://qdb.io/ Persistent Message Queues With Replay and #RabbitMQ Integration


Re: Commit log on USB flash disk?

2013-11-17 Thread David Tinker
Hmm. That "device about to die" write latency signature is
interesting. I have pinged our hosting company asking for specifics as
to exactly what USB thumb drive they supply.

On Sat, Nov 16, 2013 at 6:30 PM, Dan Simpson dan.simp...@gmail.com wrote:
 It doesn't seem like a great idea.  The USB drives typically use dynamic
 wear leveling.  See this analysis on wear:
 http://www.usenix.org/event/fast10/tech/full_papers/boboila.pdf

 If you do end up using it, make sure to monitor write latency so you
 don't get hit by the bus.


 On Sat, Nov 16, 2013 at 6:12 AM, David Tinker david.tin...@gmail.com
 wrote:

 Our hosting provider has a cost effective server with 2 x 4TB disks
 with a 16G (or 64G) USB thumb drive option. Would it make sense to put
 the Cassandra commit log on the USB thumb disk and use RAID0 to use
 both 4TB disks for data (and Ubuntu 12.04)?

 Anyone know how long USB flash disks last when used for a write-heavy
 workload like this?

 Please tell me if this is a really bad idea.

 Our alternative is to use one 4TB disk for commit log and one for
 data. Of course this will give us only half the space.

 Thanks
 David





-- 
http://qdb.io/ Persistent Message Queues With Replay and #RabbitMQ Integration


Commit log on USB flash disk?

2013-11-16 Thread David Tinker
Our hosting provider has a cost effective server with 2 x 4TB disks
with a 16G (or 64G) USB thumb drive option. Would it make sense to put
the Cassandra commit log on the USB thumb disk and use RAID0 to use
both 4TB disks for data (and Ubuntu 12.04)?

Anyone know how long USB flash disks last when used for a write-heavy
workload like this?

Please tell me if this is a really bad idea.

Our alternative is to use one 4TB disk for commit log and one for
data. Of course this will give us only half the space.

Thanks
David