Re: upgrading 2.1.x cluster with ec2multiregionsnitch system.peers "corruption"

2019-03-26 Thread Oleksandr Shulgin
On Mon, Mar 25, 2019 at 11:13 PM Carl Mueller
 wrote:

>
> Since the internal IPs are given when the client app connects to the
> cluster, the client app cannot communicate with other nodes in other
> datacenters.
>

Why should it?  The client should only connect to its local data center and
leave communication with remote DCs to the query coordinator.
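
With the 3.x Java driver that usually just means pinning the load balancing
policy to the local DC -- a minimal sketch, where the contact point and DC name
are placeholders for your own values:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;

public class LocalDcClient
{
    public static void main(String[] args)
    {
        // Only coordinators in the client's own data center are contacted;
        // anything involving remote DCs is left to the coordinator node.
        Cluster cluster = Cluster.builder()
                .addContactPoint("10.0.0.1")                  // placeholder: a node in the local DC
                .withLoadBalancingPolicy(DCAwareRoundRobinPolicy.builder()
                        .withLocalDc("eu-west")               // placeholder DC name
                        .build())
                .build();
        Session session = cluster.connect();
        session.execute("SELECT release_version FROM system.local");
        cluster.close();
    }
}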


> They seem to be able to communicate within its own datacenter of the
> initial connection.
>

Did you configure address translation on the client?  See:
https://docs.datastax.com/en/developer/java-driver/3.0/manual/address_resolution/#ec2-multi-region
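
A minimal sketch of what that looks like with the 3.x driver (the contact point
below is a placeholder public IP):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.policies.EC2MultiRegionAddressTranslator;

public class Ec2MultiRegionClient
{
    public static void main(String[] args)
    {
        // The translator maps the public broadcast addresses found in
        // system.peers back to private IPs for nodes in the client's own
        // region, so intra-region traffic stays on the internal addresses.
        Cluster cluster = Cluster.builder()
                .addContactPoint("203.0.113.10")   // placeholder: public IP of one node
                .withAddressTranslator(new EC2MultiRegionAddressTranslator())
                .build();
        Session session = cluster.connect();
        session.execute("SELECT peer, rpc_address FROM system.peers");
        cluster.close();
    }
}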

> It appears we fixed this by manually updating the system.peers table's
> rpc_address column back to the public IP. This appears to survive a restart
> of the cassandra nodes without being switched back to private IPs.
>

I don't think updating system tables is a supported solution.  I'm
surprised that it doesn't even give you an error.

> Our cassandra.yaml (these parameters are the same in our confs for 2.1 and
> 2.2) has:
>
> listen_address: internal aws vpc ip
> rpc_address: 0.0.0.0
> broadcast_rpc_address: internal aws vpc ip
>

It is not straightforward to find the docs for version 2.x anymore, but at
least for 3.0 it is documented that you should set broadcast_rpc_address to
the public IP:
https://docs.datastax.com/en/cassandra/3.0/cassandra/architecture/archSnitchEC2MultiRegion.html

Regards,
--
Alex


Re: Merging two clusters into one without any downtime

2019-03-26 Thread Rahul Singh
In my experience,

I'd use two methods to make sure that you are covering your ass.
1. "old school" methodology would be to do the SStable load from old to new
cluster -- if you do incremental snapshots, then you could technically
minimize downtime and just load the latest increments with a little
downtime. This is your fall back.
2. "new school" methodology would be to do all your insert/updates through
event sourcing , in which you use the CQRS (command/query request
segregation) which makes all updates into a series of commands, processed
by a processor. If you have this architecture already, this means you have
a durable message queue from which you can either a) replay all the
mutations or b) do old school method 1 from above to get the bulk of the
data, and then c) use a simultaneous processor for writing data to both old
and new clusters.
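
A rough sketch of what the simultaneous processor in c) can look like with the
Java driver -- the ks.events table, contact points and command shape are all
placeholders; the only point is that every command from the durable queue is
applied to both sessions:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class DualWriteProcessor
{
    private final Session oldSession;
    private final Session newSession;
    private final PreparedStatement oldInsert;
    private final PreparedStatement newInsert;

    DualWriteProcessor(Session oldSession, Session newSession)
    {
        this.oldSession = oldSession;
        this.newSession = newSession;
        String cql = "INSERT INTO ks.events (id, payload) VALUES (?, ?)";  // placeholder table
        this.oldInsert = oldSession.prepare(cql);
        this.newInsert = newSession.prepare(cql);
    }

    // Apply one command from the durable queue to both clusters.
    void apply(String id, String payload)
    {
        oldSession.execute(oldInsert.bind(id, payload));
        newSession.execute(newInsert.bind(id, payload));
    }

    public static void main(String[] args)
    {
        Cluster oldCluster = Cluster.builder().addContactPoint("10.0.0.1").build();  // placeholder: old cluster
        Cluster newCluster = Cluster.builder().addContactPoint("10.1.0.1").build();  // placeholder: new cluster
        DualWriteProcessor processor = new DualWriteProcessor(oldCluster.connect(), newCluster.connect());
        processor.apply("42", "example-event");
        oldCluster.close();
        newCluster.close();
    }
}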

Triggers can work, but they're super kludgy in C* 2. Also, you don't have CDC
in C* 2. Event sourcing + CQRS is the _literal_ best approach. Period. You
can do a true blue/green test on both clusters (old and new) to see if
your shit is consistent.

Pardon the language, but you get the message.
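
For the blue/green check itself, something as simple as reading the same key
from both clusters and comparing is enough to start with (again, table and
contact points are placeholders):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class BlueGreenChecker
{
    public static void main(String[] args)
    {
        Cluster oldCluster = Cluster.builder().addContactPoint("10.0.0.1").build();  // placeholder: old cluster
        Cluster newCluster = Cluster.builder().addContactPoint("10.1.0.1").build();  // placeholder: new cluster
        Session oldSession = oldCluster.connect();
        Session newSession = newCluster.connect();

        // Read the same row from both clusters and compare the payloads.
        String cql = "SELECT payload FROM ks.events WHERE id = ?";  // placeholder table
        Row oldRow = oldSession.execute(oldSession.prepare(cql).bind("42")).one();
        Row newRow = newSession.execute(newSession.prepare(cql).bind("42")).one();
        boolean consistent = oldRow != null && newRow != null
                && oldRow.getString("payload").equals(newRow.getString("payload"));
        System.out.println("clusters agree for id=42: " + consistent);

        oldCluster.close();
        newCluster.close();
    }
}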


rahul.xavier.si...@gmail.com

http://cassandra.link



On Mon, Mar 25, 2019 at 7:31 PM Carl Mueller
 wrote:

> Either:
>
> double-write at the driver level from one of the apps and perform an
> initial and a subsequent sstable loads (or whatever ETL method you want to
> use) to merge the data with good assurances.
>
> use a trigger to replicate the writes, with some sstable loads / ETL.
>
> use change data capture with some sstable loads/ETL
>
> On Mon, Mar 25, 2019 at 5:48 PM Nick Hatfield 
> wrote:
>
>> Maybe others will have a different or better solution, but in my
>> experience, to accomplish HA we simply Y-write from our application to the
>> new cluster. You then export the data from the old cluster, using cql2json
>> or any method you choose, to the new cluster. That will cover all live (now)
>> data via the Y-write, while supplying the old data from the copy you run. Once
>> complete, set up a single reader that reads data from the new cluster and
>> verify all is as expected!
>>
>>
>> Sent from my BlackBerry 10 smartphone on the Verizon Wireless 4G LTE network.
>> *From: *Nandakishore Tokala
>> *Sent: *Monday, March 25, 2019 18:39
>> *To: *user@cassandra.apache.org
>> *Reply To: *user@cassandra.apache.org
>> *Subject: *Merging two clusters into one without any downtime
>>
>> Please let me know the best practices to combine two different clusters
>> into one without having any downtime.
>>
>> Thanks & Regards,
>> Nanda Kishore
>>
>


TWCS Compactions & Tombstones

2019-03-26 Thread Nick Hatfield
How does one properly get rid of sstables that have fallen victim to overlapping
timestamps? I realized that we had TWCS set in our CF, which also had a
read_repair = 0.1, and after correcting this to 0.0 I can clearly see the
effects over time on the new sstables. However, I still have old sstables that
date back to some time last year, and I need to remove them:

Max: 09/05/2018 Min: 09/04/2018 Estimated droppable tombstones: 0.8832057909932046
13G Mar 26 11:34 mc-254400-big-Data.db


What is the best way to do this? This is on a production system so any help 
would be greatly appreciated.

Thanks,


Re: upgrading 2.1.x cluster with ec2multiregionsnitch system.peers "corruption"

2019-03-26 Thread Carl Mueller
Looking at the code it appears it shouldn't matter what we set the yaml
params to. The Ec2MultiRegionSnitch should be using the aws metadata
169.254.169.254 to pick up the internal/external ips as needed.
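
For reference, the lookups the snitch does boil down to two GETs against the
metadata service -- roughly like this plain-Java sketch (not the actual snitch
code, just the same endpoints):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class Ec2MetadataLookup
{
    // Fetch one value from the EC2 instance metadata service.
    static String metadata(String path) throws Exception
    {
        HttpURLConnection conn = (HttpURLConnection)
                new URL("http://169.254.169.254/latest/meta-data/" + path).openConnection();
        conn.setRequestMethod("GET");
        try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream())))
        {
            return in.readLine();
        }
        finally
        {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws Exception
    {
        // The snitch resolves these at startup, which is why the yaml should
        // not need to carry the public IP at all.
        System.out.println("private: " + metadata("local-ipv4"));
        System.out.println("public:  " + metadata("public-ipv4"));
    }
}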

I think I'll just have to dig into the code differences between 2.1 and
2.2. We don't want to specify the global IP in any of the yaml fields
because the global IP for the instance changes if we do an aws instance
restart. Don't want yaml editing to be a part of the instance restart
process.

And I was misinformed: an instance restart in our 2.2 cluster does
overwrite the manual system.peers entries, which is what I expected to happen.

On Tue, Mar 26, 2019 at 3:33 AM Oleksandr Shulgin <
oleksandr.shul...@zalando.de> wrote:

> On Mon, Mar 25, 2019 at 11:13 PM Carl Mueller
>  wrote:
>
>>
>> Since the internal IPs are given when the client app connects to the
>> cluster, the client app cannot communicate with other nodes in other
>> datacenters.
>>
>
> Why should it?  The client should only connect to its local data center
> and leave communication with remote DCs to the query coordinator.
>
>
>> They seem to be able to communicate within its own datacenter of the
>> initial connection.
>>
>
> Did you configure address translation on the client?  See:
> https://docs.datastax.com/en/developer/java-driver/3.0/manual/address_resolution/#ec2-multi-region
>
>> It appears we fixed this by manually updating the system.peers table's
>> rpc_address column back to the public IP. This appears to survive a restart
>> of the cassandra nodes without being switched back to private IPs.
>>
>
> I don't think updating system tables is a supported solution.  I'm
> surprised that it doesn't even give you an error.
>
>> Our cassandra.yaml (these parameters are the same in our confs for 2.1 and
>> 2.2) has:
>>
>> listen_address: internal aws vpc ip
>> rpc_address: 0.0.0.0
>> broadcast_rpc_address: internal aws vpc ip
>>
>
> It is not straightforward to find the docs for version 2.x anymore, but at
> least for 3.0 it is documented that you should set broadcast_rpc_address to
> the public IP:
> https://docs.datastax.com/en/cassandra/3.0/cassandra/architecture/archSnitchEC2MultiRegion.html
>
> Regards,
> --
> Alex
>
>


Re: upgrading 2.1.x cluster with ec2multiregionsnitch system.peers "corruption"

2019-03-26 Thread Oleksandr Shulgin
On Tue, Mar 26, 2019 at 5:49 PM Carl Mueller
 wrote:

> Looking at the code it appears it shouldn't matter what we set the yaml
> params to. The Ec2MultiRegionSnitch should be using the aws metadata
> 169.254.169.254 to pick up the internal/external ips as needed.
>

This is my expectation as well, so maybe the docs are just outdated.

> I think I'll just have to dig into the code differences between 2.1 and
> 2.2. We don't want to specify the global IP in any of the yaml fields
> because the global IP for the instance changes if we do an aws instance
> restart. Don't want yaml editing to be a part of the instance restart
> process.
>

We did solve this in the past by using Elastic IPs: is there anything
preventing you from using those?

--
Alex


Re: upgrading 2.1.x cluster with ec2multiregionsnitch system.peers "corruption"

2019-03-26 Thread Carl Mueller
- the AWS people say EIPs are a PITA.
- if we hardcode the global IPs in the yaml, then yaml editing is required
for the occasional hard instance reboot in aws and its attendant global IP
reassignment
- if we try leaving broadcast_rpc_address blank, null, or commented out
with rpc_address set to 0.0.0.0, then cassandra refuses to start
- if we take out rpc_address and broadcast_rpc_address, then cqlsh doesn't
work with localhost anymore and that fucks up some of our cluster
management tooling

- we kind of are being lazy and just want what worked in 2.1 to work in 2.2

Ok, the code in 2.1:
https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java

Of interest

DatabaseDescriptor.setBroadcastAddress(localPublicAddress);
DatabaseDescriptor.setBroadcastRpcAddress(localPublicAddress);

The code in 2.2+:
https://github.com/apache/cassandra/blob/cassandra-2.2/src/java/org/apache/cassandra/locator/Ec2MultiRegionSnitch.java


Becomes

DatabaseDescriptor.setBroadcastAddress(localPublicAddress);
if (DatabaseDescriptor.getBroadcastRpcAddress() == null)
{
    logger.info("broadcast_rpc_address unset, broadcasting public IP as rpc_address: {}", localPublicAddress);
    DatabaseDescriptor.setBroadcastRpcAddress(localPublicAddress);
}
And that if clause, added as part of the CASSANDRA-11356 patch, is what is
submarining us. I don't otherwise know the intricacies of the various
address settings in the yaml vis-a-vis EC2MRS, but since we can't configure
it the good old 2.1 way in 2.2+, this seems broken to us.

I'll try to track down where cassandra startup is complaining to us about
rpc_address: 0.0.0.0 and broadcast_rpc_address being blank/null/commented
out. That section of code may need an exception for EC2MRS.



On Tue, Mar 26, 2019 at 12:01 PM Oleksandr Shulgin <
oleksandr.shul...@zalando.de> wrote:

> On Tue, Mar 26, 2019 at 5:49 PM Carl Mueller
>  wrote:
>
>> Looking at the code it appears it shouldn't matter what we set the yaml
>> params to. The Ec2MultiRegionSnitch should be using the aws metadata
>> 169.254.169.254 to pick up the internal/external ips as needed.
>>
>
> This is my expectation as well, so maybe the docs are just
> outdated.
>
>> I think I'll just have to dig into the code differences between 2.1 and
>> 2.2. We don't want to specify the global IP in any of the yaml fields
>> because the global IP for the instance changes if we do an aws instance
>> restart. Don't want yaml editing to be a part of the instance restart
>> process.
>>
>
> We did solve this in the past by using Elastic IPs: is there anything
> preventing you from using those?
>
> --
> Alex
>
>


Re: TWCS Compactions & Tombstones

2019-03-26 Thread Rahul Singh
What's your time window? Roughly how much data is in each window?

If you examine the sstable data and see that it is truly old data with little
chance that it has any new data, you can just remove the SSTables. You can
do a rolling restart -- take down a node, remove mc-254400-* and then start
it up.


rahul.xavier.si...@gmail.com

http://cassandra.link



On Tue, Mar 26, 2019 at 8:01 AM Nick Hatfield 
wrote:

> How does one properly get rid of sstables that have fallen victim to
> overlapping timestamps? I realized that we had TWCS set in our CF, which
> also had a read_repair = 0.1, and after correcting this to 0.0 I can clearly
> see the effects over time on the new sstables. However, I still have old
> sstables that date back to some time last year, and I need to remove them:
>
> Max: 09/05/2018 Min: 09/04/2018 Estimated droppable tombstones: 0.8832057909932046
> 13G Mar 26 11:34 mc-254400-big-Data.db
>
>
> What is the best way to do this? This is on a production system so any
> help would be greatly appreciated.
>
> Thanks,
>


Re: TWCS Compactions & Tombstones

2019-03-26 Thread Jeff Jirsa
Or upgrade to a version with
https://issues.apache.org/jira/browse/CASSANDRA-13418 and enable that feature.

-- 
Jeff Jirsa


> On Mar 26, 2019, at 6:23 PM, Rahul Singh  wrote:
> 
> What's your time window? Roughly how much data is in each window?
> 
> If you examine the sstable data and see that it is truly old data with little
> chance that it has any new data, you can just remove the SSTables. You can do
> a rolling restart -- take down a node, remove mc-254400-* and then start it
> up.
> 
> 
> rahul.xavier.si...@gmail.com
> 
> http://cassandra.link 
> 
> 
> 
>> On Tue, Mar 26, 2019 at 8:01 AM Nick Hatfield  
>> wrote:
>> How does one properly get rid of sstables that have fallen victim to overlapping
>> timestamps? I realized that we had TWCS set in our CF, which also had a
>> read_repair = 0.1, and after correcting this to 0.0 I can clearly see the
>> effects over time on the new sstables. However, I still have old sstables
>> that date back to some time last year, and I need to remove them:
>> 
>> Max: 09/05/2018 Min: 09/04/2018 Estimated droppable tombstones: 
>> 0.8832057909932046  
>>  13G Mar 26 11:34 mc-254400-big-Data.db
>> 
>> 
>> What is the best way to do this? This is on a production system so any help 
>> would be greatly appreciated.
>> 
>> Thanks,


RE: TWCS Compactions & Tombstones

2019-03-26 Thread Nick Hatfield
Thanks for the insight, Rahul. We’re using 1 day for the time window.

compaction = {'class': 
'com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy',
  'compaction_window_size': '1',
  'compaction_window_unit': 'DAYS',
  'max_threshold': '32',
  'min_threshold': '4',
  'timestamp_resolution': 'MILLISECONDS',
  'tombstone_compaction_interval': '86400',
  'tombstone_threshold': '0.2',
  'unchecked_tombstone_compaction': 'true'}

  AND
default_time_to_live = 7884009
  AND
gc_grace_seconds = 86400
  AND
read_repair_chance = 0


What's the best way to examine the sstable data so that I can verify that it is
old data, other than by the min/max timestamps?

Thanks for your help

From: Rahul Singh [mailto:rahul.xavier.si...@gmail.com]
Sent: Tuesday, March 26, 2019 9:24 PM
To: user 
Subject: Re: TWCS Compactions & Tombstones

What's your time window? Roughly how much data is in each window?

If you examine the sstable data and see that it is truly old data with little
chance that it has any new data, you can just remove the SSTables. You can do a
rolling restart -- take down a node, remove mc-254400-* and then start it up.


rahul.xavier.si...@gmail.com

http://cassandra.link



On Tue, Mar 26, 2019 at 8:01 AM Nick Hatfield 
<nick.hatfi...@metricly.com> wrote:
How does one properly get rid of sstables that have fallen victim to overlapping
timestamps? I realized that we had TWCS set in our CF, which also had a
read_repair = 0.1, and after correcting this to 0.0 I can clearly see the
effects over time on the new sstables. However, I still have old sstables that
date back to some time last year, and I need to remove them:

Max: 09/05/2018 Min: 09/04/2018 Estimated droppable tombstones: 0.8832057909932046
13G Mar 26 11:34 mc-254400-big-Data.db


What is the best way to do this? This is on a production system so any help 
would be greatly appreciated.

Thanks,


Re: TWCS Compactions & Tombstones

2019-03-26 Thread James Brown
Have you tried enabling 'unchecked_tombstone_compaction' on the affected
tables?

On Tue, Mar 26, 2019 at 5:01 AM Nick Hatfield 
wrote:

> How does one properly get rid of sstables that have fallen victim to
> overlapping timestamps? I realized that we had TWCS set in our CF, which
> also had a read_repair = 0.1, and after correcting this to 0.0 I can clearly
> see the effects over time on the new sstables. However, I still have old
> sstables that date back to some time last year, and I need to remove them:
>
> Max: 09/05/2018 Min: 09/04/2018 Estimated droppable tombstones: 0.8832057909932046
> 13G Mar 26 11:34 mc-254400-big-Data.db
>
>
> What is the best way to do this? This is on a production system so any
> help would be greatly appreciated.
>
> Thanks,
>


-- 
James Brown
Engineer