Re: regarding drain process

2016-07-27 Thread Jeff Jirsa
Are you running chef/puppet or similar? 

 

From: Varun Barala 
Reply-To: "user@cassandra.apache.org" 
Date: Tuesday, July 26, 2016 at 10:15 PM
To: "user@cassandra.apache.org" 
Subject: regarding drain process

 

Hi all,


Recently I've been facing a problem with our Cassandra nodes: they go down
very frequently. I went through system.log and found that somehow C* is
triggering the drain process.

I know the purpose of nodetool drain, but it should not trigger
automatically, right?

Or is there any specific setting that could cause this?

We are using C* 2.1.13.


please let me know if you need more info.

Thanking you!!

Regards,
Varun Barala
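
One way to narrow this down (a sketch; the log path is an assumption for a
package install) is to look at what immediately precedes the drain messages:

    # Show the 20 log lines leading up to each drain message
    grep -B 20 -i "drain" /var/log/cassandra/system.log

Note that stopping the service (for example from a config-management run, as
Jeff asks above) fires Cassandra's drain-on-shutdown hook, so drain messages
can show up in system.log without anyone running nodetool drain by hand.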





Approximate row count

2016-07-27 Thread Luke Jolly
I have a table where I'm storing ad impression data, with every row being
an impression.  I want to get a count of total rows / impressions.  I know
that there are in the ballpark of 200-400 million rows in this table, and
from my reading, "Number of keys" in the output of cfstats should be a
reasonably accurate estimate. However, it is 39434. Am I misunderstanding
something? Every node in my cluster has a complete copy of the keyspace.


Table: impressions_2
SSTable count: 22
Space used (live): 51255709817
Space used (total): 51255709817
Space used by snapshots (total): 49415721741
Off heap memory used (total): 30824975
SSTable Compression Ratio: 0.20347134631246266
Number of keys (estimate): 39434
Memtable cell count: 18279
Memtable data size: 15897457
Memtable off heap memory used: 0
Memtable switch count: 1294
Local read count: 347016
Local read latency: 12.573 ms
Local write count: 109226238
Local write latency: 0.023 ms
Pending flushes: 0
Bloom filter false positives: 655
Bloom filter false ratio: 0.0
Bloom filter space used: 97552
Bloom filter off heap memory used: 97376
Index summary off heap memory used: 26719
Compression metadata off heap memory used: 30700880
Compacted partition minimum bytes: 311
Compacted partition maximum bytes: 386857368
Compacted partition mean bytes: 6424107
Average live cells per slice (last five minutes): 
1027.9502011434631
Maximum live cells per slice (last five minutes): 5722
Average tombstones per slice (last five minutes): 1.0
Maximum tombstones per slice (last five minutes): 1


Re: Approximate row count

2016-07-27 Thread Chris Lohfink
The number of keys is the number of *partition keys*, not row keys. You
have ~39434 partitions, ranging from 311 bytes to 386 MB. It looks like you
have some wide partitions that contain many of your rows.
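
For getting at the actual row count (a sketch, with placeholder names): one
option is to multiply the partition estimate by a sampled average of rows per
partition; another is an exact count by letting cqlsh's COPY stream the table
to /dev/null, which prints the number of rows exported (expect this to take a
long time at 200-400 million rows):

    # Exact row count; 'id' stands in for any single column of the table
    cqlsh -e "COPY <keyspace>.impressions_2 (id) TO '/dev/null';"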

Chris Lohfink

On Wed, Jul 27, 2016 at 1:44 PM, Luke Jolly  wrote:

> I have a table that I'm storing ad impression data in with every row being
> an impression.  I want to get a count of total rows / impressions.  I know
> that there is in the ball park of 200-400 million rows in this table and
> from my reading "Number of keys" in the output of cfstats should be a
> reasonably accurate estimate. However, it is 39434. Am I misunderstanding
> something? Every node in my cluster has a complete copy of the keyspace.
>
>
>   Table: impressions_2
>   SSTable count: 22
>   Space used (live): 51255709817
>   Space used (total): 51255709817
>   Space used by snapshots (total): 49415721741
>   Off heap memory used (total): 30824975
>   SSTable Compression Ratio: 0.20347134631246266
>   Number of keys (estimate): 39434
>   Memtable cell count: 18279
>   Memtable data size: 15897457
>   Memtable off heap memory used: 0
>   Memtable switch count: 1294
>   Local read count: 347016
>   Local read latency: 12.573 ms
>   Local write count: 109226238
>   Local write latency: 0.023 ms
>   Pending flushes: 0
>   Bloom filter false positives: 655
>   Bloom filter false ratio: 0.0
>   Bloom filter space used: 97552
>   Bloom filter off heap memory used: 97376
>   Index summary off heap memory used: 26719
>   Compression metadata off heap memory used: 30700880
>   Compacted partition minimum bytes: 311
>   Compacted partition maximum bytes: 386857368
>   Compacted partition mean bytes: 6424107
>   Average live cells per slice (last five minutes): 
> 1027.9502011434631
>   Maximum live cells per slice (last five minutes): 5722
>   Average tombstones per slice (last five minutes): 1.0
>   Maximum tombstones per slice (last five minutes): 1
>
>


Re: Re : Purging tombstones from a particular row in SSTable

2016-07-27 Thread DuyHai Doan
This feature is also exposed directly in nodetool as of Cassandra 3.4:

nodetool compact --user-defined <SSTable filename>
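
To pick a candidate SSTable, sorting by the droppable-tombstone estimate can
help (a sketch; the data path and placeholders are assumptions, adjust for
your install):

    # Print the estimated droppable tombstones for each Data file of the table
    for f in /var/lib/cassandra/data/<keyspace>/<table>/*-Data.db; do
        echo "$f: $(sstablemetadata "$f" | grep -i droppable)"
    done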

On Wed, Jul 27, 2016 at 9:58 PM, Vinay Chella  wrote:

> You can run file level compaction using JMX to get rid of tombstones in
> one SSTable. Ensure you set GC_Grace_seconds such that
>
> current time >= deletion(tombstone time)+ GC_Grace_seconds
>
>
> File level compaction
>
> /usr/bin/java -jar cmdline-jmxclient-0.10.3.jar - localhost:${port} \
>   org.apache.cassandra.db:type=CompactionManager \
>   forceUserDefinedCompaction="'${KEYSPACE}','${SSTABLEFILENAME}'"
>
>
>
> On Wed, Jul 27, 2016 at 11:59 AM, sai krishnam raju potturi <
> pskraj...@gmail.com> wrote:
>
>> hi;
>>   we have a columnfamily that has around 1000 rows, with one row is
>> really huge (million columns). 95% of the row contains tombstones. Since
>> there exists just one SSTable , there is going to be no compaction kicked
>> in. Any way we can get rid of the tombstones in that row?
>>
>> Userdefined compaction nor nodetool compact had no effect. Any ideas
>> folks?
>>
>> thanks
>>
>>
>>
>
>


Re: Re : Purging tombstones from a particular row in SSTable

2016-07-27 Thread Vinay Chella
You can run file-level compaction using JMX to get rid of tombstones in one
SSTable. Ensure gc_grace_seconds is set such that

current time >= deletion (tombstone) time + gc_grace_seconds


File level compaction

/usr/bin/java -jar cmdline-jmxclient-0.10.3.jar - localhost:${port} \
  org.apache.cassandra.db:type=CompactionManager \
  forceUserDefinedCompaction="'${KEYSPACE}','${SSTABLEFILENAME}'"



On Wed, Jul 27, 2016 at 11:59 AM, sai krishnam raju potturi <
pskraj...@gmail.com> wrote:

> hi;
>   we have a columnfamily that has around 1000 rows, with one row is really
> huge (million columns). 95% of the row contains tombstones. Since there
> exists just one SSTable , there is going to be no compaction kicked in. Any
> way we can get rid of the tombstones in that row?
>
> Userdefined compaction nor nodetool compact had no effect. Any ideas folks?
>
> thanks
>
>
>


Re : Purging tombstones from a particular row in SSTable

2016-07-27 Thread sai krishnam raju potturi
hi;
  we have a column family that has around 1000 rows, with one row being
really huge (a million columns). 95% of that row consists of tombstones.
Since there exists just one SSTable, no compaction is going to kick in.
Is there any way we can get rid of the tombstones in that row?

Neither user-defined compaction nor nodetool compact had any effect. Any
ideas, folks?

thanks


Re: Re : Purging tombstones from a particular row in SSTable

2016-07-27 Thread sai krishnam raju potturi
thanks Vinay and DuyHai.

we are using version 2.0.14. I did a "user defined compaction" following
the instructions in the link below, but the tombstones still persist even
after that.

https://gist.github.com/jeromatron/e238e5795b3e79866b83

Also, we changed tombstone_compaction_interval to 1800 and
tombstone_threshold to 0.1, but it did not help.

thanks



On Wed, Jul 27, 2016 at 4:05 PM, DuyHai Doan  wrote:

> This feature is also exposed directly in nodetool from version Cassandra
> 3.4
>
> nodetool compact --user-defined <SSTable filename>
>
> On Wed, Jul 27, 2016 at 9:58 PM, Vinay Chella  wrote:
>
>> You can run file level compaction using JMX to get rid of tombstones in
>> one SSTable. Ensure you set GC_Grace_seconds such that
>>
>> current time >= deletion(tombstone time)+ GC_Grace_seconds
>>
>>
>> File level compaction
>>
>> /usr/bin/java -jar cmdline-jmxclient-0.10.3.jar - localhost:${port} \
>>   org.apache.cassandra.db:type=CompactionManager \
>>   forceUserDefinedCompaction="'${KEYSPACE}','${SSTABLEFILENAME}'"
>>
>>
>>
>> On Wed, Jul 27, 2016 at 11:59 AM, sai krishnam raju potturi <
>> pskraj...@gmail.com> wrote:
>>
>>> hi;
>>>   we have a columnfamily that has around 1000 rows, with one row is
>>> really huge (million columns). 95% of the row contains tombstones. Since
>>> there exists just one SSTable , there is going to be no compaction kicked
>>> in. Any way we can get rid of the tombstones in that row?
>>>
>>> Userdefined compaction nor nodetool compact had no effect. Any ideas
>>> folks?
>>>
>>> thanks
>>>
>>>
>>>
>>
>>
>


Re: Node after restart sees other nodes down for 10 minutes

2016-07-27 Thread Farzad Panahi
Thanks Paulo for the reply.

Cassandra version is 3.0.8. I will test what you said and share the results.

On Wed, Jul 27, 2016 at 2:01 PM, Paulo Motta 
wrote:

> This looks somewhat related to CASSANDRA-9630. What is the C* version?
>
> Can you check with netstats if other nodes keep connections with the
> stopped node in the CLOSE_WAIT state? And also if the problem disappears if
> you run nodetool disablegossip before stopping the node?
>
> 2016-07-26 16:54 GMT-03:00 Farzad Panahi :
>
>> I am new to Cassandra and trying to figure out how the cluster behaves
>> when things go south.
>>
>> I have a 6-node cluster, RF=3.
>>
>> I stop Cassandra service on a node for a while. All nodes see the node as
>> DN. After a while I start the Cassandra service on DN. Interesting point is
>> that all other nodes see the node now as UN but the node itself sees 4
>> nodes as DN and only one node as UN. After about 10 minutes the node sees
>> other nodes as up as well.
>>
>> I am trying to figure out where this delay is coming from.
>>
>> I have attached part of system.log that looks interesting. It looks like
>> only after Gossiper logs "InetAddress <node> is now UP" does the node
>> actually see that node as up, even though it had already handshaked with
>> that node before.
>>
>> Any ideas?
>>
>> Cheers
>>
>> Farzad
>>
>> --
>> INFO  [main] 2016-07-25 21:58:46,044 StorageService.java:533 - Cassandra
>> version: 3.0.8
>> INFO  [main] 2016-07-25 21:58:46,098 StorageService.java:534 - Thrift API
>> version: 20.1.0
>> INFO  [main] 2016-07-25 21:58:46,150 StorageService.java:535 - CQL
>> supported versions: 3.4.0 (default: 3.4.0)
>> INFO  [main] 2016-07-25 21:58:46,284 IndexSummaryManager.java:85 -
>> Initializing index summary manager with a memory pool size of 198 MB and a
>> resize interval of 60 minutes
>> INFO  [main] 2016-07-25 21:58:46,343 StorageService.java:554 - Loading
>> persisted ring state
>> INFO  [main] 2016-07-25 21:58:46,418 StorageService.java:743 - Starting
>> up server gossip
>> INFO  [main] 2016-07-25 21:58:46,680 TokenMetadata.java:429 - Updating
>> topology for ip-10-4-43-66.ec2.internal/10.4.43.66
>> INFO  [main] 2016-07-25 21:58:46,707 TokenMetadata.java:429 - Updating
>> topology for ip-10-4-43-66.ec2.internal/10.4.43.66
>> INFO  [main] 2016-07-25 21:58:46,792 MessagingService.java:557 - Starting
>> Messaging Service on ip-10-4-43-66.ec2.internal/10.4.43.66:7000 (eth0)
>>
>> INFO  [HANDSHAKE-/10.4.68.222] 2016-07-25 21:58:46,920
>> OutboundTcpConnection.java:515 - Handshaking version with /10.4.68.222
>> INFO  [GossipStage:1] 2016-07-25 21:58:47,011 Gossiper.java:1028 - Node /
>> 10.4.68.221 has restarted, now UP
>> INFO  [HANDSHAKE-/10.4.68.222] 2016-07-25 21:58:47,007
>> OutboundTcpConnection.java:515 - Handshaking version with /10.4.68.222
>> INFO  [main] 2016-07-25 21:58:47,030 StorageService.java:1902 - Node
>> ip-10-4-43-66.ec2.internal/10.4.43.66 state jump to NORMAL
>> INFO  [main] 2016-07-25 21:58:47,096 CassandraDaemon.java:644 - Waiting
>> for gossip to settle before accepting client requests...
>> INFO  [GossipStage:1] 2016-07-25 21:58:47,134 StorageService.java:1902 -
>> Node /10.4.68.221 state jump to NORMAL
>> INFO  [HANDSHAKE-/10.4.68.221] 2016-07-25 21:58:47,137
>> OutboundTcpConnection.java:515 - Handshaking version with /10.4.68.221
>> INFO  [GossipStage:1] 2016-07-25 21:58:47,211 TokenMetadata.java:429 -
>> Updating topology for /10.4.68.221
>> INFO  [GossipStage:1] 2016-07-25 21:58:47,261 TokenMetadata.java:429 -
>> Updating topology for /10.4.68.221
>> INFO  [GossipStage:1] 2016-07-25 21:58:47,295 Gossiper.java:1028 - Node /
>> 10.4.68.222 has restarted, now UP
>> INFO  [GossipStage:1] 2016-07-25 21:58:47,337 StorageService.java:1902 -
>> Node /10.4.68.222 state jump to NORMAL
>> INFO  [GossipStage:1] 2016-07-25 21:58:47,385 TokenMetadata.java:429 -
>> Updating topology for /10.4.68.222
>> INFO  [GossipStage:1] 2016-07-25 21:58:47,452 TokenMetadata.java:429 -
>> Updating topology for /10.4.68.222
>> INFO  [GossipStage:1] 2016-07-25 21:58:47,497 Gossiper.java:1028 - Node /
>> 10.4.54.176 has restarted, now UP
>> INFO  [GossipStage:1] 2016-07-25 21:58:47,544 StorageService.java:1902 -
>> Node /10.4.54.176 state jump to NORMAL
>> INFO  [HANDSHAKE-/10.4.54.176] 2016-07-25 21:58:47,548
>> OutboundTcpConnection.java:515 - Handshaking version with /10.4.54.176
>> INFO  [GossipStage:1] 2016-07-25 21:58:47,594 TokenMetadata.java:429 -
>> Updating topology for /10.4.54.176
>> INFO  [GossipStage:1] 2016-07-25 21:58:47,639 TokenMetadata.java:429 -
>> Updating topology for /10.4.54.176
>> WARN  [GossipTasks:1] 2016-07-25 21:58:47,678 FailureDetector.java:287 -
>> Not marking nodes down due to local pause of 43226235115 > 50
>> INFO  [HANDSHAKE-/10.4.43.65] 2016-07-25 21:58:47,679
>> OutboundTcpConnection.java:515 - Handshaking version with /10.4.43.65
>> INFO  [GossipStage:1] 2016-07-25 21:58:47,757 

Re: Re : Purging tombstones from a particular row in SSTable

2016-07-27 Thread sai krishnam raju potturi
The read queries are continuously failing because of the tombstones,
though: "Request did not complete within rpc_timeout."

thanks


On Wed, Jul 27, 2016 at 5:51 PM, Jeff Jirsa 
wrote:

> 220kb worth of tombstones doesn’t seem like enough to worry about.
>
>
>
>
>
> From: sai krishnam raju potturi
> Reply-To: "user@cassandra.apache.org"
> Date: Wednesday, July 27, 2016 at 2:43 PM
> To: Cassandra Users
> Subject: Re: Re : Purging tombstones from a particular row in SSTable
>
>
>
> and also the SSTable in question is only about 220 KB in size.
>
>
>
> thanks
>
>
>
>
>
> On Wed, Jul 27, 2016 at 5:41 PM, sai krishnam raju potturi <
> pskraj...@gmail.com> wrote:
>
> it's set to 1800 Vinay.
>
>
>
>  bloom_filter_fp_chance=0.01 AND
>
>   caching='KEYS_ONLY' AND
>
>   comment='' AND
>
>   dclocal_read_repair_chance=0.10 AND
>
>   gc_grace_seconds=1800 AND
>
>   index_interval=128 AND
>
>   read_repair_chance=0.00 AND
>
>   replicate_on_write='true' AND
>
>   populate_io_cache_on_flush='false' AND
>
>   default_time_to_live=0 AND
>
>   speculative_retry='99.0PERCENTILE' AND
>
>   memtable_flush_period_in_ms=0 AND
>
>   compaction={'min_sstable_size': '1024', 'tombstone_threshold': '0.01',
> 'tombstone_compaction_interval': '1800', 'class':
> 'SizeTieredCompactionStrategy'} AND
>
>   compression={'sstable_compression': 'LZ4Compressor'};
>
>
>
> thanks
>
>
>
>
>
> On Wed, Jul 27, 2016 at 5:34 PM, Vinay Kumar Chella <
> vinaykumar...@gmail.com> wrote:
>
> What is your GC_grace_seconds set to?
>
>
>
> On Wed, Jul 27, 2016 at 1:13 PM, sai krishnam raju potturi <
> pskraj...@gmail.com> wrote:
>
> thanks Vinay and DuyHai.
>
>
>
> we are using verison 2.0.14. I did "user defined compaction" following
> the instructions in the below link, The tombstones still persist even after
> that.
>
>
>
> https://gist.github.com/jeromatron/e238e5795b3e79866b83
> 
>
>
>
> Also, we changed the tombstone_compaction_interval : 1800
> and tombstone_threshold : 0.1, but it did not help.
>
>
>
> thanks
>
>
>
>
>
>
>
> On Wed, Jul 27, 2016 at 4:05 PM, DuyHai Doan  wrote:
>
> This feature is also exposed directly in nodetool from version Cassandra
> 3.4
>
>
>
> nodetool compact --user-defined <SSTable filename>
>
>
>
> On Wed, Jul 27, 2016 at 9:58 PM, Vinay Chella  wrote:
>
> You can run file level compaction using JMX to get rid of tombstones in
> one SSTable. Ensure you set GC_Grace_seconds such that
>
>
>
> current time >= deletion(tombstone time)+ GC_Grace_seconds
>
>
>
> File level compaction
>
>
>
> /usr/bin/java -jar cmdline-jmxclient-0.10.3.jar - localhost:${port} \
>   org.apache.cassandra.db:type=CompactionManager \
>   forceUserDefinedCompaction="'${KEYSPACE}','${SSTABLEFILENAME}'"
>
>
>
>
>
>
>
>
> On Wed, Jul 27, 2016 at 11:59 AM, sai krishnam raju potturi <
> pskraj...@gmail.com> wrote:
>
> hi;
>
>   we have a columnfamily that has around 1000 rows, with one row is really
> huge (million columns). 95% of the row contains tombstones. Since there
> exists just one SSTable , there is going to be no compaction kicked in. Any
> way we can get rid of the tombstones in that row?
>
>
>
> Userdefined compaction nor nodetool compact had no effect. Any ideas folks?
>
>
>
> thanks
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>


Re: use private ip for internode and public IP for seeds

2016-07-27 Thread Paulo Motta
Were you able to troubleshoot this yet? Private IPs for listen_address,
public IP for broadcast_address, and prefer_local=true in
cassandra-rackdc.properties should be sufficient to make nodes in the same
DC communicate over the private address, so something must be going on
there.

Can you check in your system.log which address the messaging service is
binding to, and the values of listen_address and broadcast_address? Did
you try telnet <private IP> 7000?

> Using "prefer_local=true" on cassandra-rackdc.properties just makes the
clients try to connect to the private IPs which fails.

Do you see any error in the logs related to this?
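
For reference, the combination being described looks roughly like this (a
sketch; the addresses are placeholders, not from the thread):

    # cassandra.yaml
    listen_address: 10.240.0.2        # this node's private IP
    broadcast_address: 104.154.0.2    # this node's public IP

    # cassandra-rackdc.properties
    prefer_local=true

and the suggested connectivity test from a peer in the same DC:

    telnet 10.240.0.2 7000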

2016-07-14 10:15 GMT-03:00 Spiros Ioannou :

> Hello,
>
> Let's say we have a 3-node (linux) cassandra 3.7 cluster on GCE (same
> probably for EC2). VMs know their private IPs, and also have a public IP.
>
> Nodes were configured according to doc: multiple network interfaces which
> in short says to use private IPs for listen_address, public IP for
> broadcast_address, and public ip for seeds.
>
> According to the doc above, "Cassandra switches to the private IP after
> establishing a connection." but this does not happen, tcpdump shows one end
> of traffic to port 7000 is always a public IP.
>
> Using "prefer_local=true" on cassandra-rackdc.properties just makes the
> clients try to connect to the private IPs which fails.
>
> All this works, clients connect and nodes see each other, but
> communication between nodes happens through their public IPs. We want
> clients to connect to the public IP and get a list of public IPs as
> contact points (endpoints) from the coordinator, but the coordinator to
> forward requests through the private IPs. Can this be done?
>
>
>
>
>
>
>
>
>
>
> Spiros Ioannou
> Infrastructure Lead Engineer
> inAccess - www.inaccess.com
> M: +30 6973-903808  W: +30 210-6802-358
>


Re: Approximate row count

2016-07-27 Thread Luke Jolly
Is there any other way to get an estimate of rows?

On Wed, Jul 27, 2016 at 2:49 PM Chris Lohfink  wrote:

> the number of keys are the number of *partition keys, *not row keys. You
> have ~39434 partitions, ranging from 311 bytes to 386mb. Looks like you
> have some wide partitions that contain many of your rows.
>
> Chris Lohfink
>
> On Wed, Jul 27, 2016 at 1:44 PM, Luke Jolly  wrote:
>
>> I have a table that I'm storing ad impression data in with every row
>> being an impression.  I want to get a count of total rows / impressions.  I
>> know that there is in the ball park of 200-400 million rows in this
>> table and from my reading "Number of keys" in the output of cfstats
>> should be a reasonably accurate estimate. However, it is 39434. Am I
>> misunderstanding something? Every node in my cluster has a complete copy of
>> the keyspace.
>>
>>
>>  Table: impressions_2
>>  SSTable count: 22
>>  Space used (live): 51255709817
>>  Space used (total): 51255709817
>>  Space used by snapshots (total): 49415721741
>>  Off heap memory used (total): 30824975
>>  SSTable Compression Ratio: 0.20347134631246266
>>  Number of keys (estimate): 39434
>>  Memtable cell count: 18279
>>  Memtable data size: 15897457
>>  Memtable off heap memory used: 0
>>  Memtable switch count: 1294
>>  Local read count: 347016
>>  Local read latency: 12.573 ms
>>  Local write count: 109226238
>>  Local write latency: 0.023 ms
>>  Pending flushes: 0
>>  Bloom filter false positives: 655
>>  Bloom filter false ratio: 0.0
>>  Bloom filter space used: 97552
>>  Bloom filter off heap memory used: 97376
>>  Index summary off heap memory used: 26719
>>  Compression metadata off heap memory used: 30700880
>>  Compacted partition minimum bytes: 311
>>  Compacted partition maximum bytes: 386857368
>>  Compacted partition mean bytes: 6424107
>>  Average live cells per slice (last five minutes): 
>> 1027.9502011434631
>>  Maximum live cells per slice (last five minutes): 5722
>>  Average tombstones per slice (last five minutes): 1.0
>>  Maximum tombstones per slice (last five minutes): 1
>>
>>
>


Re: Node after restart sees other nodes down for 10 minutes

2016-07-27 Thread Paulo Motta
This looks somewhat related to CASSANDRA-9630. What is the C* version?

Can you check with netstats if other nodes keep connections with the
stopped node in the CLOSE_WAIT state? And also if the problem disappears if
you run nodetool disablegossip before stopping the node?
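
Something like this on each of the other nodes would show it (a sketch;
substitute the stopped node's address):

    netstat -tan | grep '<stopped-node-ip>:7000' | grep CLOSE_WAIT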

2016-07-26 16:54 GMT-03:00 Farzad Panahi :

> I am new to Cassandra and trying to figure out how the cluster behaves
> when things go south.
>
> I have a 6-node cluster, RF=3.
>
> I stop Cassandra service on a node for a while. All nodes see the node as
> DN. After a while I start the Cassandra service on DN. Interesting point is
> that all other nodes see the node now as UN but the node itself sees 4
> nodes as DN and only one node as UN. After about 10 minutes the node sees
> other nodes as up as well.
>
> I am trying to figure out where this delay is coming from.
>
> I have attached part of system.log that looks interesting. It looks like
> only after Gossiper logs "InetAddress <node> is now UP" does the node
> actually see that node as up, even though it had already handshaked with
> that node before.
>
> Any ideas?
>
> Cheers
>
> Farzad
>
> --
> INFO  [main] 2016-07-25 21:58:46,044 StorageService.java:533 - Cassandra
> version: 3.0.8
> INFO  [main] 2016-07-25 21:58:46,098 StorageService.java:534 - Thrift API
> version: 20.1.0
> INFO  [main] 2016-07-25 21:58:46,150 StorageService.java:535 - CQL
> supported versions: 3.4.0 (default: 3.4.0)
> INFO  [main] 2016-07-25 21:58:46,284 IndexSummaryManager.java:85 -
> Initializing index summary manager with a memory pool size of 198 MB and a
> resize interval of 60 minutes
> INFO  [main] 2016-07-25 21:58:46,343 StorageService.java:554 - Loading
> persisted ring state
> INFO  [main] 2016-07-25 21:58:46,418 StorageService.java:743 - Starting up
> server gossip
> INFO  [main] 2016-07-25 21:58:46,680 TokenMetadata.java:429 - Updating
> topology for ip-10-4-43-66.ec2.internal/10.4.43.66
> INFO  [main] 2016-07-25 21:58:46,707 TokenMetadata.java:429 - Updating
> topology for ip-10-4-43-66.ec2.internal/10.4.43.66
> INFO  [main] 2016-07-25 21:58:46,792 MessagingService.java:557 - Starting
> Messaging Service on ip-10-4-43-66.ec2.internal/10.4.43.66:7000 (eth0)
>
> INFO  [HANDSHAKE-/10.4.68.222] 2016-07-25 21:58:46,920
> OutboundTcpConnection.java:515 - Handshaking version with /10.4.68.222
> INFO  [GossipStage:1] 2016-07-25 21:58:47,011 Gossiper.java:1028 - Node /
> 10.4.68.221 has restarted, now UP
> INFO  [HANDSHAKE-/10.4.68.222] 2016-07-25 21:58:47,007
> OutboundTcpConnection.java:515 - Handshaking version with /10.4.68.222
> INFO  [main] 2016-07-25 21:58:47,030 StorageService.java:1902 - Node
> ip-10-4-43-66.ec2.internal/10.4.43.66 state jump to NORMAL
> INFO  [main] 2016-07-25 21:58:47,096 CassandraDaemon.java:644 - Waiting
> for gossip to settle before accepting client requests...
> INFO  [GossipStage:1] 2016-07-25 21:58:47,134 StorageService.java:1902 -
> Node /10.4.68.221 state jump to NORMAL
> INFO  [HANDSHAKE-/10.4.68.221] 2016-07-25 21:58:47,137
> OutboundTcpConnection.java:515 - Handshaking version with /10.4.68.221
> INFO  [GossipStage:1] 2016-07-25 21:58:47,211 TokenMetadata.java:429 -
> Updating topology for /10.4.68.221
> INFO  [GossipStage:1] 2016-07-25 21:58:47,261 TokenMetadata.java:429 -
> Updating topology for /10.4.68.221
> INFO  [GossipStage:1] 2016-07-25 21:58:47,295 Gossiper.java:1028 - Node /
> 10.4.68.222 has restarted, now UP
> INFO  [GossipStage:1] 2016-07-25 21:58:47,337 StorageService.java:1902 -
> Node /10.4.68.222 state jump to NORMAL
> INFO  [GossipStage:1] 2016-07-25 21:58:47,385 TokenMetadata.java:429 -
> Updating topology for /10.4.68.222
> INFO  [GossipStage:1] 2016-07-25 21:58:47,452 TokenMetadata.java:429 -
> Updating topology for /10.4.68.222
> INFO  [GossipStage:1] 2016-07-25 21:58:47,497 Gossiper.java:1028 - Node /
> 10.4.54.176 has restarted, now UP
> INFO  [GossipStage:1] 2016-07-25 21:58:47,544 StorageService.java:1902 -
> Node /10.4.54.176 state jump to NORMAL
> INFO  [HANDSHAKE-/10.4.54.176] 2016-07-25 21:58:47,548
> OutboundTcpConnection.java:515 - Handshaking version with /10.4.54.176
> INFO  [GossipStage:1] 2016-07-25 21:58:47,594 TokenMetadata.java:429 -
> Updating topology for /10.4.54.176
> INFO  [GossipStage:1] 2016-07-25 21:58:47,639 TokenMetadata.java:429 -
> Updating topology for /10.4.54.176
> WARN  [GossipTasks:1] 2016-07-25 21:58:47,678 FailureDetector.java:287 -
> Not marking nodes down due to local pause of 43226235115 > 50
> INFO  [HANDSHAKE-/10.4.43.65] 2016-07-25 21:58:47,679
> OutboundTcpConnection.java:515 - Handshaking version with /10.4.43.65
> INFO  [GossipStage:1] 2016-07-25 21:58:47,757 Gossiper.java:1028 - Node /
> 10.4.54.177 has restarted, now UP
> INFO  [GossipStage:1] 2016-07-25 21:58:47,788 StorageService.java:1902 -
> Node /10.4.54.177 state jump to NORMAL
> INFO  [HANDSHAKE-/10.4.54.177] 2016-07-25 21:58:47,789
> OutboundTcpConnection.java:515 - Handshaking 

Re: Re : Purging tombstones from a particular row in SSTable

2016-07-27 Thread Ben Slater
No real evidence that's the case here, but the one time I've seen
tombstones that refused to go away despite many attempts at compaction,
etc., it turned out to be due to the data being written (and deleted) with
invalid timestamps years in the future (we guessed the time had been set
wrong somewhere at some point).

Cheers
Ben
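
That is easy to check with sstablemetadata (a sketch; the path is an
assumption). The reported timestamps are typically microseconds since the
epoch, so anything far beyond the current time is suspect:

    sstablemetadata /var/lib/cassandra/data/<keyspace>/<table>/<name>-Data.db | grep -i timestamp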

On Thu, 28 Jul 2016 at 09:17 Alain RODRIGUEZ  wrote:

> Hi,
>
> I just released a detailed post about tombstones today that might be of
> some interest for you:
> http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html
>
> 220kb worth of tombstones doesn’t seem like enough to worry about.
>
>
> +1
>
> I believe you might be missing some other bigger SSTable having a lot of
> tombstones as well. Finding the biggest sstable and reading the tombstone
> ratio from there might be more relevant.
>
> You also should give a try to: "unchecked_tombstone_compaction" set to
> true rather than tuning other options so aggressively. The "single SSTable
> compaction" section of my post might help you on this issue:
> http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html#single-sstable-compaction
>
> Other thoughts:
>
> Also if you use TTLs and timeseries, using TWCS instead of STCS could be
> more efficient evicting tombstones.
>
> we have a columnfamily that has around 1000 rows, with one row is really
>> huge (million columns)
>
>
> I am sorry to say that this model does not look that great. Imbalances
> might become an issue as a few nodes will handle a lot more load than the
> rest of the nodes. Also even if this is getting improved in newer versions
> of Cassandra, wide rows are something you want to avoid while using 2.0.14
> (which is no longer supported for about a year now). I know it is not
> always easy and never the good time, but maybe should you consider
> upgrading both your model and your version of Cassandra (regardless of the
> fact you manage to solve this issue or not with
> "unchecked_tombstone_compaction").
>
> Good luck,
>
> C*heers,
> ---
> Alain Rodriguez - al...@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2016-07-28 0:00 GMT+02:00 sai krishnam raju potturi :
>
>> The read queries are continuously failing though because of the
>> tombstones. "Request did not complete within rpc_timeout."
>>
>> thanks
>>
>>
>> On Wed, Jul 27, 2016 at 5:51 PM, Jeff Jirsa 
>> wrote:
>>
>>> 220kb worth of tombstones doesn’t seem like enough to worry about.
>>>
>>>
>>>
>>>
>>>
>>> From: sai krishnam raju potturi
>>> Reply-To: "user@cassandra.apache.org"
>>> Date: Wednesday, July 27, 2016 at 2:43 PM
>>> To: Cassandra Users
>>> Subject: Re: Re : Purging tombstones from a particular row in SSTable
>>>
>>>
>>>
>>> and also the sstable size in question is like 220 kb in size.
>>>
>>>
>>>
>>> thanks
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Jul 27, 2016 at 5:41 PM, sai krishnam raju potturi <
>>> pskraj...@gmail.com> wrote:
>>>
>>> it's set to 1800 Vinay.
>>>
>>>
>>>
>>>  bloom_filter_fp_chance=0.01 AND
>>>
>>>   caching='KEYS_ONLY' AND
>>>
>>>   comment='' AND
>>>
>>>   dclocal_read_repair_chance=0.10 AND
>>>
>>>   gc_grace_seconds=1800 AND
>>>
>>>   index_interval=128 AND
>>>
>>>   read_repair_chance=0.00 AND
>>>
>>>   replicate_on_write='true' AND
>>>
>>>   populate_io_cache_on_flush='false' AND
>>>
>>>   default_time_to_live=0 AND
>>>
>>>   speculative_retry='99.0PERCENTILE' AND
>>>
>>>   memtable_flush_period_in_ms=0 AND
>>>
>>>   compaction={'min_sstable_size': '1024', 'tombstone_threshold': '0.01',
>>> 'tombstone_compaction_interval': '1800', 'class':
>>> 'SizeTieredCompactionStrategy'} AND
>>>
>>>   compression={'sstable_compression': 'LZ4Compressor'};
>>>
>>>
>>>
>>> thanks
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Jul 27, 2016 at 5:34 PM, Vinay Kumar Chella <
>>> vinaykumar...@gmail.com> wrote:
>>>
>>> What is your GC_grace_seconds set to?
>>>
>>>
>>>
>>> On Wed, Jul 27, 2016 at 1:13 PM, sai krishnam raju potturi <
>>> pskraj...@gmail.com> wrote:
>>>
>>> thanks Vinay and DuyHai.
>>>
>>>
>>>
>>> we are using verison 2.0.14. I did "user defined compaction"
>>> following the instructions in the below link, The tombstones still persist
>>> even after that.
>>>
>>>
>>>
>>> https://gist.github.com/jeromatron/e238e5795b3e79866b83
>>> 
>>>
>>>
>>>
>>> Also, we changed the tombstone_compaction_interval : 1800
>>> and tombstone_threshold : 0.1, but it did not help.
>>>
>>>
>>>
>>> thanks
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Jul 27, 

Re: Re : Purging tombstones from a particular row in SSTable

2016-07-27 Thread Vinay Kumar Chella
What is your GC_grace_seconds set to?

On Wed, Jul 27, 2016 at 1:13 PM, sai krishnam raju potturi <
pskraj...@gmail.com> wrote:

> thanks Vinay and DuyHai.
>
> we are using verison 2.0.14. I did "user defined compaction" following
> the instructions in the below link, The tombstones still persist even after
> that.
>
> https://gist.github.com/jeromatron/e238e5795b3e79866b83
>
> Also, we changed the tombstone_compaction_interval : 1800 and 
> tombstone_threshold
> : 0.1, but it did not help.
>
> thanks
>
>
>
> On Wed, Jul 27, 2016 at 4:05 PM, DuyHai Doan  wrote:
>
>> This feature is also exposed directly in nodetool from version Cassandra
>> 3.4
>>
>> nodetool compact --user-defined <SSTable filename>
>>
>> On Wed, Jul 27, 2016 at 9:58 PM, Vinay Chella 
>> wrote:
>>
>>> You can run file level compaction using JMX to get rid of tombstones in
>>> one SSTable. Ensure you set GC_Grace_seconds such that
>>>
>>> current time >= deletion(tombstone time)+ GC_Grace_seconds
>>>
>>>
>>> File level compaction
>>>
>>> /usr/bin/java -jar cmdline-jmxclient-0.10.3.jar - localhost:${port} \
>>>   org.apache.cassandra.db:type=CompactionManager \
>>>   forceUserDefinedCompaction="'${KEYSPACE}','${SSTABLEFILENAME}'"


>>>
>>>
>>>
>>> On Wed, Jul 27, 2016 at 11:59 AM, sai krishnam raju potturi <
>>> pskraj...@gmail.com> wrote:
>>>
 hi;
   we have a columnfamily that has around 1000 rows, with one row is
 really huge (million columns). 95% of the row contains tombstones. Since
 there exists just one SSTable , there is going to be no compaction kicked
 in. Any way we can get rid of the tombstones in that row?

 Userdefined compaction nor nodetool compact had no effect. Any ideas
 folks?

 thanks



>>>
>>>
>>
>


Re: Re : Purging tombstones from a particular row in SSTable

2016-07-27 Thread sai krishnam raju potturi
it's set to 1800, Vinay.

 bloom_filter_fp_chance=0.01 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.10 AND
  gc_grace_seconds=1800 AND
  index_interval=128 AND
  read_repair_chance=0.00 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'min_sstable_size': '1024', 'tombstone_threshold': '0.01',
'tombstone_compaction_interval': '1800', 'class':
'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};

thanks


On Wed, Jul 27, 2016 at 5:34 PM, Vinay Kumar Chella  wrote:

> What is your GC_grace_seconds set to?
>
> On Wed, Jul 27, 2016 at 1:13 PM, sai krishnam raju potturi <
> pskraj...@gmail.com> wrote:
>
>> thanks Vinay and DuyHai.
>>
>> we are using verison 2.0.14. I did "user defined compaction"
>> following the instructions in the below link, The tombstones still persist
>> even after that.
>>
>> https://gist.github.com/jeromatron/e238e5795b3e79866b83
>>
>> Also, we changed the tombstone_compaction_interval : 1800 and 
>> tombstone_threshold
>> : 0.1, but it did not help.
>>
>> thanks
>>
>>
>>
>> On Wed, Jul 27, 2016 at 4:05 PM, DuyHai Doan 
>> wrote:
>>
>>> This feature is also exposed directly in nodetool from version Cassandra
>>> 3.4
>>>
>>> nodetool compact --user-defined <SSTable filename>
>>>
>>> On Wed, Jul 27, 2016 at 9:58 PM, Vinay Chella 
>>> wrote:
>>>
>>>> You can run file level compaction using JMX to get rid of tombstones in
>>>> one SSTable. Ensure you set gc_grace_seconds such that
>>>>
>>>> current time >= deletion (tombstone) time + gc_grace_seconds
>>>>
>>>> File level compaction
>>>>
>>>> /usr/bin/java -jar cmdline-jmxclient-0.10.3.jar - localhost:${port} \
>>>>   org.apache.cassandra.db:type=CompactionManager \
>>>>   forceUserDefinedCompaction="'${KEYSPACE}','${SSTABLEFILENAME}'"
>
>



 On Wed, Jul 27, 2016 at 11:59 AM, sai krishnam raju potturi <
 pskraj...@gmail.com> wrote:

> hi;
>   we have a columnfamily that has around 1000 rows, with one row is
> really huge (million columns). 95% of the row contains tombstones. Since
> there exists just one SSTable , there is going to be no compaction kicked
> in. Any way we can get rid of the tombstones in that row?
>
> Userdefined compaction nor nodetool compact had no effect. Any ideas
> folks?
>
> thanks
>
>
>


>>>
>>
>


Re: Node after restart sees other nodes down for 10 minutes

2016-07-27 Thread Farzad Panahi
Paulo,

I can confirm that the problem is as you stated. Some or all of the other
nodes are keeping a connection in the CLOSE_WAIT state. Those nodes are
seen as DN from the point of view of the node I restarted the Cassandra
service on. But nodetool disablegossip did not fix the problem.

This sounds like an issue that can potentially affect many users. Is it not
the case?
Do we have a solution for this?



Here is netstat and nodetool status output
1. right after stopping cassandra service on 10.4.68.222:
--
ip-10-4-54-176
tcp  0  0  10.4.54.176:51268  10.4.68.222:7000  TIME_WAIT
tcp  0  0  10.4.54.176:56135  10.4.68.222:7000  TIME_WAIT
tcp  1  0  10.4.54.176:43697  10.4.68.222:7000  CLOSE_WAIT
tcp  0  0  10.4.54.176:52372  10.4.68.222:7000  TIME_WAIT
--
--
ip-10-4-54-177
tcp  0  0  10.4.54.177:56960  10.4.68.222:7000  TIME_WAIT
tcp  0  0  10.4.54.177:54539  10.4.68.222:7000  TIME_WAIT
tcp  0  0  10.4.54.177:32823  10.4.68.222:7000  TIME_WAIT
tcp  1  0  10.4.54.177:48985  10.4.68.222:7000  CLOSE_WAIT
--
--
ip-10-4-68-222
tcp  0  0  10.4.68.222:7000   10.4.54.176:43697  FIN_WAIT2
tcp  0  0  10.4.68.222:7000   10.4.54.177:48985  FIN_WAIT2
tcp  0  0  10.4.68.222:7000   10.4.68.222:54419  TIME_WAIT
tcp  0  0  10.4.68.222:7000   10.4.43.65:43197   FIN_WAIT2
tcp  0  0  10.4.68.222:7000   10.4.68.221:44149  FIN_WAIT2
tcp  0  0  10.4.68.222:7000   10.4.68.222:41302  TIME_WAIT
tcp  0  0  10.4.68.222:7000   10.4.43.66:54321   FIN_WAIT2
--
--
ip-10-4-68-221
tcp  0  0  10.4.68.221:49599  10.4.68.222:7000  TIME_WAIT
tcp  0  0  10.4.68.221:55033  10.4.68.222:7000  TIME_WAIT
tcp  0  0  10.4.68.221:51628  10.4.68.222:7000  TIME_WAIT
tcp  1  0  10.4.68.221:44149  10.4.68.222:7000  CLOSE_WAIT
--
--
ip-10-4-43-66
tcp  0  0  10.4.43.66:55930   10.4.68.222:7000  TIME_WAIT
tcp  1  0  10.4.43.66:54321   10.4.68.222:7000  CLOSE_WAIT
tcp  0  0  10.4.43.66:60968   10.4.68.222:7000  TIME_WAIT
tcp  0  0  10.4.43.66:49087   10.4.68.222:7000  TIME_WAIT
--
--
ip-10-4-43-65
tcp  1  0  10.4.43.65:43197   10.4.68.222:7000  CLOSE_WAIT
tcp  0  0  10.4.43.65:36467   10.4.68.222:7000  TIME_WAIT
tcp  0  0  10.4.43.65:53317   10.4.68.222:7000  TIME_WAIT
tcp  0  0  10.4.43.65:54897   10.4.68.222:7000  TIME_WAIT
--
2. a bit after stopping cassandra service on 10.4.68.222:
--
ip-10-4-54-176
tcp  1  0  10.4.54.176:43697  10.4.68.222:7000  CLOSE_WAIT
--
--
ip-10-4-54-177
--
--
ip-10-4-68-222
--
--
ip-10-4-68-221
tcp  1  0  10.4.68.221:44149  10.4.68.222:7000  CLOSE_WAIT
--
--
ip-10-4-43-66
tcp  1  0  10.4.43.66:54321   10.4.68.222:7000  CLOSE_WAIT
--
--
ip-10-4-43-65
tcp  1  0  10.4.43.65:43197   10.4.68.222:7000  CLOSE_WAIT
--
3. after starting cassandra service on 10.4.68.222:
--
ip-10-4-54-176
tcp  0  0       10.4.54.176:42460  10.4.68.222:7000  ESTABLISHED
tcp  1  303403  10.4.54.176:43697  10.4.68.222:7000  CLOSE_WAIT
tcp  0  0       10.4.54.176:42109  10.4.68.222:7000  ESTABLISHED
--
--
ip-10-4-54-177
tcp  0  0       10.4.54.177:43687  10.4.68.222:7000  ESTABLISHED
tcp  0  0       10.4.54.177:56107  10.4.68.222:7000  ESTABLISHED
tcp  0  0       10.4.54.177:39426  10.4.68.222:7000  ESTABLISHED
--
--
ip-10-4-68-222
tcp  0  0       10.4.68.222:7000   0.0.0.0:*          LISTEN
tcp  0  0       10.4.68.222:7000   10.4.54.176:42109  ESTABLISHED
tcp  0  0       10.4.68.222:7000

Re: Re : Purging tombstones from a particular row in SSTable

2016-07-27 Thread Alain RODRIGUEZ
Hi,

I just released a detailed post about tombstones today that might be of
some interest for you:
http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html

220kb worth of tombstones doesn’t seem like enough to worry about.


+1

I believe you might be missing some other, bigger SSTable that also has a
lot of tombstones. Finding the biggest SSTable and reading the tombstone
ratio from it might be more relevant.

You should also try setting "unchecked_tombstone_compaction" to true rather
than tuning other options so aggressively. The "single SSTable compaction"
section of my post might help you with this issue:
http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html#single-sstable-compaction
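
Enabling it is a one-liner (a sketch; keyspace/table are placeholders):

    cqlsh -e "ALTER TABLE <keyspace>.<table> WITH compaction = {
        'class': 'SizeTieredCompactionStrategy',
        'unchecked_tombstone_compaction': 'true'};"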

Other thoughts:

Also, if you use TTLs and time series, using TWCS instead of STCS could be
more efficient at evicting tombstones.

we have a columnfamily that has around 1000 rows, with one row is really
> huge (million columns)


I am sorry to say that this model does not look that great. Imbalances
might become an issue, as a few nodes will handle a lot more load than the
rest. Also, even though this is improved in newer versions of Cassandra,
wide rows are something you want to avoid while using 2.0.14 (which has
been unsupported for about a year now). I know it is never easy and never
the right time, but maybe you should consider upgrading both your model and
your version of Cassandra (regardless of whether you manage to solve this
issue with "unchecked_tombstone_compaction").

Good luck,

C*heers,
---
Alain Rodriguez - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-07-28 0:00 GMT+02:00 sai krishnam raju potturi :

> The read queries are continuously failing though because of the
> tombstones. "Request did not complete within rpc_timeout."
>
> thanks
>
>
> On Wed, Jul 27, 2016 at 5:51 PM, Jeff Jirsa 
> wrote:
>
>> 220kb worth of tombstones doesn’t seem like enough to worry about.
>>
>>
>>
>>
>>
>> From: sai krishnam raju potturi
>> Reply-To: "user@cassandra.apache.org"
>> Date: Wednesday, July 27, 2016 at 2:43 PM
>> To: Cassandra Users
>> Subject: Re: Re : Purging tombstones from a particular row in SSTable
>>
>>
>>
>> and also the sstable size in question is like 220 kb in size.
>>
>>
>>
>> thanks
>>
>>
>>
>>
>>
>> On Wed, Jul 27, 2016 at 5:41 PM, sai krishnam raju potturi <
>> pskraj...@gmail.com> wrote:
>>
>> it's set to 1800 Vinay.
>>
>>
>>
>>  bloom_filter_fp_chance=0.01 AND
>>
>>   caching='KEYS_ONLY' AND
>>
>>   comment='' AND
>>
>>   dclocal_read_repair_chance=0.10 AND
>>
>>   gc_grace_seconds=1800 AND
>>
>>   index_interval=128 AND
>>
>>   read_repair_chance=0.00 AND
>>
>>   replicate_on_write='true' AND
>>
>>   populate_io_cache_on_flush='false' AND
>>
>>   default_time_to_live=0 AND
>>
>>   speculative_retry='99.0PERCENTILE' AND
>>
>>   memtable_flush_period_in_ms=0 AND
>>
>>   compaction={'min_sstable_size': '1024', 'tombstone_threshold': '0.01',
>> 'tombstone_compaction_interval': '1800', 'class':
>> 'SizeTieredCompactionStrategy'} AND
>>
>>   compression={'sstable_compression': 'LZ4Compressor'};
>>
>>
>>
>> thanks
>>
>>
>>
>>
>>
>> On Wed, Jul 27, 2016 at 5:34 PM, Vinay Kumar Chella <
>> vinaykumar...@gmail.com> wrote:
>>
>> What is your GC_grace_seconds set to?
>>
>>
>>
>> On Wed, Jul 27, 2016 at 1:13 PM, sai krishnam raju potturi <
>> pskraj...@gmail.com> wrote:
>>
>> thanks Vinay and DuyHai.
>>
>>
>>
>> we are using verison 2.0.14. I did "user defined compaction"
>> following the instructions in the below link, The tombstones still persist
>> even after that.
>>
>>
>>
>> https://gist.github.com/jeromatron/e238e5795b3e79866b83
>> 
>>
>>
>>
>> Also, we changed the tombstone_compaction_interval : 1800
>> and tombstone_threshold : 0.1, but it did not help.
>>
>>
>>
>> thanks
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Jul 27, 2016 at 4:05 PM, DuyHai Doan 
>> wrote:
>>
>> This feature is also exposed directly in nodetool from version Cassandra
>> 3.4
>>
>>
>>
>> nodetool compact --user-defined <SSTable filename>
>>
>>
>>
>> On Wed, Jul 27, 2016 at 9:58 PM, Vinay Chella 
>> wrote:
>>
>> You can run file level compaction using JMX to get rid of tombstones in
>> one SSTable. Ensure you set GC_Grace_seconds such that
>>
>>
>>
>> current time >= deletion(tombstone time)+ GC_Grace_seconds
>>
>>
>>
>> File level compaction
>>
>>
>>
>> /usr/bin/java -jar cmdline-jmxclient-0.10.3.jar - localhost:${port} \
>>   org.apache.cassandra.db:type=CompactionManager \
>>   forceUserDefinedCompaction="'${KEYSPACE}','${SSTABLEFILENAME}'"

Listing Keys Hierarchically Using a Prefix and Delimiter

2016-07-27 Thread Jacob Willoughby
Hi, a data modeling question:


I have been investigating Cassandra to store small objects as a trivial
replacement for S3.  GET/PUT/DELETE are all easy, but LIST is what is
tripping me up.


S3 does a hierarchical list that kinda simulates traversing folders.

http://docs.aws.amazon.com/AmazonS3/latest/dev/ListingKeysHierarchy.html


So say my schema is this:

CREATE TABLE "stuff" (key BLOB PRIMARY KEY, value BLOB)


I know that the prefix part is easy with a ByteOrderedPartitioner (and
possibly with a secondary index in Cassandra 3.x?).  What trips me up is
the delimiter part.


I have looked at a handful of open source projects that are S3 clones built
on Cassandra, and they seem to do the prefix match and then manually search
for the delimiter.  I have looked at doing a UDA, but UDAs also seem to
send all of the data to a single node to do the aggregation.


What I am hoping to do is achieve what S3 does: "List performance is not 
substantially affected by the total number of keys in your bucket, nor by the 
presence or absence of the prefix, marker, maxkeys, or delimiter arguments." (

http://docs.aws.amazon.com/AmazonS3/latest/dev/ListingKeysUsingAPIs.html)


Is there some sort of denormalization, indexing, or querying that I am
missing that might help solve this?  I think it would work if UDAs could do
a summary operation on each node before returning and then aggregate the
results, but as far as I know that isn't possible.  It seems like a binary
search of each partition involved in the list prefix would be a really
quick and easy way to return the first 1000 results.


Is this even possible using Cassandra?


Thanks,

Jake Willoughby
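
One denormalization that can get S3-like LIST behavior (a sketch, with
placeholder names; not something the projects above do): maintain a listing
table keyed by the parent prefix, with the next path component as the
clustering column, updated on every PUT/DELETE. A prefix+delimiter LIST then
becomes a single-partition slice:

    cqlsh -e "
    CREATE TABLE objstore.listing (
        parent text,   -- key prefix up to the last delimiter, e.g. 'photos/2016/'
        child  text,   -- next path component, e.g. 'jan/' or 'cat.jpg'
        PRIMARY KEY (parent, child)
    );"

    # LIST with prefix='photos/2016/' and delimiter='/'
    cqlsh -e "SELECT child FROM objstore.listing WHERE parent = 'photos/2016/' LIMIT 1000;"

The trade-off is one extra write per path level on PUT/DELETE and a
potentially hot partition per directory, but reads stay proportional to the
result size regardless of bucket size, which is the S3 property described
above.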



Re: Node after restart sees other nodes down for 10 minutes

2016-07-27 Thread Paulo Motta
> This sounds like an issue that can potentially affect many users. Is it
not the case?

This seems to affect only some configurations, especially EC2, but not all
for some reason (it might be related to the default TCP timeout
configuration).

> Do we have a solution for this?

Watch https://issues.apache.org/jira/browse/CASSANDRA-9630 and add your
report there to reinforce that the issue is still present in recent
versions (it was reported a while ago and there were no newer reports, so
it didn't get much attention).
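
For completeness, the graceful-stop sequence people often use looks like
this (a sketch; the service command depends on your packaging):

    nodetool disablegossip
    sleep 10          # give peers time to notice and close connections
    nodetool drain
    sudo service cassandra stop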

2016-07-27 19:39 GMT-03:00 Farzad Panahi :

> Paulo,
>
> I can confirm that the problem is as you stated. Some or all of the other
> nodes are keeping a connection in CLOSE_WAIT state. Those nodes are seen as
> DN from the point of the node I have restarted the Cassandra service on.
> But nodetool disablegossip did not fix the problem.
>
> This sounds like an issue that can potentially affect many users. Is it
> not the case?
> Do we have a solution for this?
>
>
> 
> Here is netstat and nodetool status output
> 1. right after stopping cassandra service on 10.4.68.222:
> --
> ip-10-4-54-176
> tcp  0  0  10.4.54.176:51268  10.4.68.222:7000  TIME_WAIT
> tcp  0  0  10.4.54.176:56135  10.4.68.222:7000  TIME_WAIT
> tcp  1  0  10.4.54.176:43697  10.4.68.222:7000  CLOSE_WAIT
> tcp  0  0  10.4.54.176:52372  10.4.68.222:7000  TIME_WAIT
> --
> --
> ip-10-4-54-177
> tcp  0  0  10.4.54.177:56960  10.4.68.222:7000  TIME_WAIT
> tcp  0  0  10.4.54.177:54539  10.4.68.222:7000  TIME_WAIT
> tcp  0  0  10.4.54.177:32823  10.4.68.222:7000  TIME_WAIT
> tcp  1  0  10.4.54.177:48985  10.4.68.222:7000  CLOSE_WAIT
> --
> --
> ip-10-4-68-222
> tcp  0  0  10.4.68.222:7000   10.4.54.176:43697  FIN_WAIT2
> tcp  0  0  10.4.68.222:7000   10.4.54.177:48985  FIN_WAIT2
> tcp  0  0  10.4.68.222:7000   10.4.68.222:54419  TIME_WAIT
> tcp  0  0  10.4.68.222:7000   10.4.43.65:43197   FIN_WAIT2
> tcp  0  0  10.4.68.222:7000   10.4.68.221:44149  FIN_WAIT2
> tcp  0  0  10.4.68.222:7000   10.4.68.222:41302  TIME_WAIT
> tcp  0  0  10.4.68.222:7000   10.4.43.66:54321   FIN_WAIT2
> --
> --
> ip-10-4-68-221
> tcp  0  0  10.4.68.221:49599  10.4.68.222:7000  TIME_WAIT
> tcp  0  0  10.4.68.221:55033  10.4.68.222:7000  TIME_WAIT
> tcp  0  0  10.4.68.221:51628  10.4.68.222:7000  TIME_WAIT
> tcp  1  0  10.4.68.221:44149  10.4.68.222:7000  CLOSE_WAIT
> --
> --
> ip-10-4-43-66
> tcp  0  0  10.4.43.66:55930   10.4.68.222:7000  TIME_WAIT
> tcp  1  0  10.4.43.66:54321   10.4.68.222:7000  CLOSE_WAIT
> tcp  0  0  10.4.43.66:60968   10.4.68.222:7000  TIME_WAIT
> tcp  0  0  10.4.43.66:49087   10.4.68.222:7000  TIME_WAIT
> --
> --
> ip-10-4-43-65
> tcp  1  0  10.4.43.65:43197   10.4.68.222:7000  CLOSE_WAIT
> tcp  0  0  10.4.43.65:36467   10.4.68.222:7000  TIME_WAIT
> tcp  0  0  10.4.43.65:53317   10.4.68.222:7000  TIME_WAIT
> tcp  0  0  10.4.43.65:54897   10.4.68.222:7000  TIME_WAIT
> --
> 2. a bit after stopping cassandra service on 10.4.68.222:
> --
> ip-10-4-54-176
> tcp  1  0  10.4.54.176:43697  10.4.68.222:7000  CLOSE_WAIT
> --
> --
> ip-10-4-54-177
> --
> --
> ip-10-4-68-222
> --
> --
> ip-10-4-68-221
> tcp  1  0  10.4.68.221:44149  10.4.68.222:7000  CLOSE_WAIT
> --
> --
> ip-10-4-43-66
> tcp  1  0  10.4.43.66:54321   10.4.68.222:7000  CLOSE_WAIT
> --
> --
> ip-10-4-43-65
> tcp  1  0  10.4.43.65:43197   10.4.68.222:7000  CLOSE_WAIT
> --
> 3. after starting cassandra service on 10.4.68.222:
>