Re: Bootstrap streaming issues

2018-10-23 Thread Jai Bheemsen Rao Dhanwada
Also, I see this issue only when I have more column families; it looks to be
a function of the number of vnodes * number of CFs.
Does anyone have any idea on this?
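A back-of-the-envelope sketch of why heap use scales with table count (all numbers below are illustrative assumptions, not measurements):

```python
# Illustrative only: per-table metric overhead is assumed, not measured.
num_tables = 3000          # assumed cluster-wide column family count
metrics_per_table = 50     # assumed metrics registered per table
bytes_per_metric = 2_000   # assumed heap cost each; histograms dominate
metric_heap = num_tables * metrics_per_table * bytes_per_metric
print(f"~{metric_heap / 1e6:.0f} MB of heap just for table metrics")
```

With these assumed figures, metrics alone account for hundreds of MB before any streaming or memtable data, which is why a large CF count amplifies bootstrap heap pressure.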

On Tue, Oct 23, 2018 at 9:48 AM Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:

> Did anyone run into similar issues?
>
> On Thu, Sep 6, 2018 at 10:27 AM Jai Bheemsen Rao Dhanwada <
> jaibheem...@gmail.com> wrote:
>
>> Here is the stacktrace from the failure, it looks like it's trying to
>> gather all the column family metrics and going OOM. Is this just for the JMX
>> metrics?
>>
>>
>> https://github.com/apache/cassandra/blob/cassandra-2.1.16/src/java/org/apache/cassandra/metrics/ColumnFamilyMetrics.java
>>
>> ERROR [MessagingService-Incoming-/10.133.33.57] 2018-09-06 15:43:19,280
>> CassandraDaemon.java:231 - Exception in thread
>> Thread[MessagingService-Incoming-/x.x.x.x,5,main]
>> java.lang.OutOfMemoryError: Java heap space
>> at java.io.DataInputStream.<init>(DataInputStream.java:58)
>> ~[na:1.8.0_151]
>> at
>> org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:139)
>> ~[apache-cassandra-2.1.16.jar:2.1.16]
>> at
>> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:88)
>> ~[apache-cassandra-2.1.16.jar:2.1.16]
>> ERROR [InternalResponseStage:1] 2018-09-06 15:43:19,281
>> CassandraDaemon.java:231 - Exception in thread
>> Thread[InternalResponseStage:1,5,main]
>> java.lang.OutOfMemoryError: Java heap space
>> at
>> org.apache.cassandra.metrics.ColumnFamilyMetrics$AllColumnFamilyMetricNameFactory.createMetricName(
>> *ColumnFamilyMetrics.java:784*) ~[apache-cassandra-2.1.16.jar:2.1.16]
>> at
>> org.apache.cassandra.metrics.ColumnFamilyMetrics.createColumnFamilyHistogram(ColumnFamilyMetrics.java:716)
>> ~[apache-cassandra-2.1.16.jar:2.1.16]
>> at
>> org.apache.cassandra.metrics.ColumnFamilyMetrics.<init>(ColumnFamilyMetrics.java:597)
>> ~[apache-cassandra-2.1.16.jar:2.1.16]
>> at
>> org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:361)
>> ~[apache-cassandra-2.1.16.jar:2.1.16]
>> at
>> org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:527)
>> ~[apache-cassandra-2.1.16.jar:2.1.16]
>> at
>> org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:498)
>> ~[apache-cassandra-2.1.16.jar:2.1.16]
>> at org.apache.cassandra.db.Keyspace.initCf(Keyspace.java:335)
>> ~[apache-cassandra-2.1.16.jar:2.1.16]
>> at
>> org.apache.cassandra.db.DefsTables.addColumnFamily(DefsTables.java:385)
>> ~[apache-cassandra-2.1.16.jar:2.1.16]
>> at
>> org.apache.cassandra.db.DefsTables.mergeColumnFamilies(DefsTables.java:293)
>> ~[apache-cassandra-2.1.16.jar:2.1.16]
>> at
>> org.apache.cassandra.db.DefsTables.mergeSchemaInternal(DefsTables.java:194)
>> ~[apache-cassandra-2.1.16.jar:2.1.16]
>> at
>> org.apache.cassandra.db.DefsTables.mergeSchema(DefsTables.java:166)
>> ~[apache-cassandra-2.1.16.jar:2.1.16]
>> at
>> org.apache.cassandra.service.MigrationTask$1.response(MigrationTask.java:75)
>> ~[apache-cassandra-2.1.16.jar:2.1.16]
>> at
>> org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:54)
>> ~[apache-cassandra-2.1.16.jar:2.1.16]
>> at
>> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:64)
>> ~[apache-cassandra-2.1.16.jar:2.1.16]
>> at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>> ~[na:1.8.0_151]
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>> ~[na:1.8.0_151]
>> at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_151]
>>
>> On Thu, Aug 30, 2018 at 12:51 PM Jai Bheemsen Rao Dhanwada <
>> jaibheem...@gmail.com> wrote:
>>
>>> thank you
>>>
>>> On Thu, Aug 30, 2018 at 11:58 AM Jeff Jirsa  wrote:
>>>
 This is the closest JIRA that comes to mind (from memory, I didn't
 search, there may be others):
 https://issues.apache.org/jira/browse/CASSANDRA-8150

 The best blog that's all in one place on tuning GC in cassandra is
 actually Amy's 2.1 tuning guide:
 https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html -
 it's somewhat out of date as it's for 2.1, but since that's what you're
 running, that works out in your favor.
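As a hedged starting point in that spirit (values here are illustrative, not a recommendation from the guide), 2.1 heap sizes can be pinned explicitly in cassandra-env.sh rather than left to auto-sizing:

```shell
# cassandra-env.sh: pin heap sizes explicitly (illustrative values,
# tune to your hardware and workload)
MAX_HEAP_SIZE="8G"     # total heap; keep well under available RAM
HEAP_NEWSIZE="800M"    # CMS young gen; ~100 MB per physical core is a common rule of thumb
```

MAX_HEAP_SIZE and HEAP_NEWSIZE must be set together; setting only one is rejected by the startup script.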





 On Thu, Aug 30, 2018 at 10:53 AM Jai Bheemsen Rao Dhanwada <
 jaibheem...@gmail.com> wrote:

> Hi Jeff,
>
> Is there any JIRA that suggests increasing the HEAP will help?
> Also, are there any alternatives to increasing the heap size? Last time I
> tried increasing the heap, longer GC pauses caused more damage in terms of
> latencies during the pauses.
>
> On Wed, Aug 29, 2018 at 11:07 PM Jai Bheemsen Rao Dhanwada <
> jaibheem...@gmail.com> wrote:
>
>> okay, thank you
>>
>> On Wed, Aug 29, 

Re: Aggregation of Set Data Type

2018-10-23 Thread DuyHai Doan
You will need to use user defined aggregates for this

On 23 Oct 2018 16:46, "Joseph Wonesh"  wrote:

> Hello all,
>
>  I am trying to aggregate rows which each contain a column of Set.
> I would like the result to contain the sum of all sets, where null would be
> equivalent to the empty set. I expected a query like: "select
> sum(my_set_column) from my_table group by my_key_column" to do this, but
> the set type is not supported by this aggregate. Does anyone know of a way
> to aggregate this using existing cassandra built-ins? Thanks!
>
> This message is private and confidential. If you have received message in
> error, please notify us and remove from your system.
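A minimal sketch of such a user-defined aggregate (assuming a set<int> column, and that UDFs are enabled via enable_user_defined_functions: true in cassandra.yaml; all names here are illustrative):

```sql
-- State function: set union, tolerating null state and null values
CREATE OR REPLACE FUNCTION union_state(state set<int>, val set<int>)
  CALLED ON NULL INPUT
  RETURNS set<int>
  LANGUAGE java
  AS 'if (val == null) return state; if (state == null) return new java.util.HashSet<Integer>(val); state.addAll(val); return state;';

-- Aggregate starts with a null state; the state function handles it
CREATE OR REPLACE AGGREGATE union_sets(set<int>)
  SFUNC union_state
  STYPE set<int>;

-- Usage, e.g.:
-- SELECT my_key_column, union_sets(my_set_column)
-- FROM my_table GROUP BY my_key_column;
```

Note that GROUP BY on key columns requires Cassandra 3.10 or later.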


RE: [EXTERNAL] Re: Installing a Cassandra cluster with multiple Linux OSs (Ubuntu+CentOS)

2018-10-23 Thread Durity, Sean R
Agreed. I have run clusters with both RHEL5 and RHEL6 nodes.


Sean Durity
From: Jeff Jirsa 
Sent: Sunday, October 14, 2018 12:40 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Installing a Cassandra cluster with multiple Linux OSs 
(Ubuntu+CentOS)

Should be fine, just get the java and kernel versions and kernel tuning params 
as close as possible



--
Jeff Jirsa


On Oct 14, 2018, at 5:09 PM, Eyal Bar <eyal@kenshoo.com> wrote:
Hi all,

Did anyone install a Cassandra cluster with mixed Linux OSs, where some of the 
nodes were Ubuntu 12/14/16 and some were CentOS 7?

Will it work without issues?

Rationale: We have a 40-server cluster which was originally installed only with 
Ubuntu servers. Now we want to move to CentOS 7, but the effort of reinstalling 
the entire cluster and migrating to CentOS 7 is not trivial. So we thought about 
adding new CentOS 7 nodes to the existing cluster and gradually removing the 
Ubuntu ones.

Would love to read your thoughts.

Best,

--
Eyal Bar
Big Data Ops Team Lead | Data Platform and Monitoring  | Kenshoo
Office +972 (3) 746-6500 *473
Mobile +972 (52) 458-6100
www.Kenshoo.com






Re: Bootstrap streaming issues

2018-10-23 Thread Jai Bheemsen Rao Dhanwada
Did anyone run into similar issues?

On Thu, Sep 6, 2018 at 10:27 AM Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:

> Here is the stacktrace from the failure, it looks like it's trying to
> gather all the column family metrics and going OOM. Is this just for the JMX
> metrics?
>
>
> https://github.com/apache/cassandra/blob/cassandra-2.1.16/src/java/org/apache/cassandra/metrics/ColumnFamilyMetrics.java
>
>
> On Thu, Aug 30, 2018 at 12:51 PM Jai Bheemsen Rao Dhanwada <
> jaibheem...@gmail.com> wrote:
>
>> thank you
>>
>> On Thu, Aug 30, 2018 at 11:58 AM Jeff Jirsa  wrote:
>>
>>> This is the closest JIRA that comes to mind (from memory, I didn't
>>> search, there may be others):
>>> https://issues.apache.org/jira/browse/CASSANDRA-8150
>>>
>>> The best blog that's all in one place on tuning GC in cassandra is
>>> actually Amy's 2.1 tuning guide:
>>> https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html -
>>> it's somewhat out of date as it's for 2.1, but since that's what you're
>>> running, that works out in your favor.
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Aug 30, 2018 at 10:53 AM Jai Bheemsen Rao Dhanwada <
>>> jaibheem...@gmail.com> wrote:
>>>
 Hi Jeff,

 Is there any JIRA that suggests increasing the HEAP will help?
 Also, are there any alternatives to increasing the heap size? Last time I
 tried increasing the heap, longer GC pauses caused more damage in terms of
 latencies during the pauses.

 On Wed, Aug 29, 2018 at 11:07 PM Jai Bheemsen Rao Dhanwada <
 jaibheem...@gmail.com> wrote:

> okay, thank you
>
> On Wed, Aug 29, 2018 at 11:04 PM Jeff Jirsa  wrote:
>
>> You’re seeing an OOM, not a socket error / timeout.
>>
>> --
>> Jeff Jirsa
>>
>>
>> On Aug 29, 2018, at 10:56 PM, Jai Bheemsen Rao Dhanwada <
>> jaibheem...@gmail.com> wrote:
>>
>> Jeff,
>>
>> any idea if this is somehow related to :
>> 

Cassandra 4.0

2018-10-23 Thread Abdul Patel
Hi all,

Any idea when 4.0 is planned for release?


RE: TWCS: Repair create new buckets with old data

2018-10-23 Thread Meg Mara
Hi Maik,

I noticed in your table description that your default_time_to_live = 0, so 
where is your TTL property set? At what point do your sstables get tombstoned?

Also, could you please mention what kind of repair you performed on this table? 
(Incremental, Full, Full repair with -pr option)

Thank you,
Meg


From: Caesar, Maik [mailto:maik.cae...@dxc.com]
Sent: Monday, October 22, 2018 10:17 AM
To: user@cassandra.apache.org
Subject: RE: TWCS: Repair create new buckets with old data

Ok, thanks.
My conclusion:

1.   I will set unchecked_tombstone_compaction to true to get old data with 
tombstones removed

2.   I will exclude TWCS tables from repair
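For step 1, assuming the stat.spa table described later in this thread, the flag goes inside the compaction map; note that ALTER TABLE replaces the whole map, so the existing options must be restated (sketch only, restate your actual settings):

```sql
ALTER TABLE stat.spa WITH compaction = {
  'class': 'TimeWindowCompactionStrategy',
  'compaction_window_size': '1',
  'compaction_window_unit': 'DAYS',
  'max_threshold': '100',
  'min_threshold': '4',
  'tombstone_compaction_interval': '86460',
  'unchecked_tombstone_compaction': 'true'
};
```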

Regarding excluding tables from repair, is there any easy way to do this? 
Nodetool repair does not support excludes.

Regards
Maik
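Since nodetool repair accepts an explicit "keyspace table..." argument list, exclusion can be done by repairing the complement. A hedged sketch (keyspace and table names are illustrative; only "spa" comes from this thread):

```shell
# Repair every table in keyspace "stat" except the TWCS table "spa".
# In practice ALL_TABLES would come from e.g.: cqlsh -e 'USE stat; DESCRIBE TABLES;'
ALL_TABLES="spa hourly_stats daily_stats"   # illustrative list
EXCLUDE="spa"
KEEP=""
for t in $ALL_TABLES; do
  [ "$t" = "$EXCLUDE" ] || KEEP="$KEEP $t"
done
echo nodetool repair stat$KEEP   # dry run; remove "echo" to actually repair
```

The dry-run echo lets you verify the table list before running the real repair.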

From: wxn...@zjqunshuo.com
Sent: Freitag, 19. Oktober 2018 03:58
To: user <user@cassandra.apache.org>
Subject: RE: TWCS: Repair create new buckets with old data

> Is the repair not necessary to get data files removed from the filesystem?
The answer is no. IMO, Cassandra will remove sstable files automatically once it 
can make sure the sstable files are 100% tombstones and safe to delete. If you 
use TWCS and have only insertions and no updates, you don't need to run repair 
manually.

-Simon

From: Caesar, Maik
Date: 2018-10-18 20:30
To: user@cassandra.apache.org
Subject: RE: TWCS: Repair create new buckets with old data
Hello Simon,
Is the repair not necessary to get data files removed from the filesystem? My 
assumption was that only repaired data would be removed after the TTL is 
reached.

Regards
Maik

From: wxn...@zjqunshuo.com
Sent: Mittwoch, 17. Oktober 2018 09:02
To: user <user@cassandra.apache.org>
Subject: Re: TWCS: Repair create new buckets with old data

Hi Maik,
IMO, when using TWCS, you had better not run repair. For repair, TWCS behaves 
the same as STCS when merging sstables, and the result is sstables spanning 
multiple time buckets, but maybe I'm wrong. In my use case, I don't run repair 
on tables using TWCS.

-Simon

From: Caesar, Maik
Date: 2018-10-16 17:46
To: user@cassandra.apache.org
Subject: TWCS: Repair create new buckets with old data
Hello,
we work with Cassandra version 3.0.9 and have a problem with a table using TWCS. 
The command “nodetool repair” always creates new files containing old data, 
which prevents the old data from being deleted.
The layout of the Table is following:
cqlsh> desc stat.spa

CREATE TABLE stat.spa (
    region int,
    id int,
    date text,
    hour int,
    zippedjsonstring blob,
    PRIMARY KEY ((region, id), date, hour)
) WITH CLUSTERING ORDER BY (date ASC, hour ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 
'compaction_window_size': '1', 'compaction_window_unit': 'DAYS', 
'max_threshold': '100', 'min_threshold': '4', 'tombstone_compaction_interval': 
'86460'}
AND compression = {'chunk_length_in_kb': '64', 'class': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';

Currently the oldest data are from 2017/04/15 and are not removed:

$ for f in *Data.db; do \
    meta=$(sudo sstablemetadata $f); \
    echo -e "Max:" $(date --date=@$(echo "$meta" | grep Maximum\ time | cut -d" " -f3 | cut -c 1-10) '+%Y/%m/%d %H:%M') \
         "Min:" $(date --date=@$(echo "$meta" | grep Minimum\ time | cut -d" " -f3 | cut -c 1-10) '+%Y/%m/%d %H:%M') \
         $(echo "$meta" | grep droppable) $(echo "$meta" | grep "Repaired at") ' \t ' \
         $(ls -lh $f | awk '{print $5" "$6" "$7" "$8" "$9}'); \
  done | sort
Max: 2017/04/15 12:08 Min: 2017/03/31 13:09 Estimated droppable tombstones: 
1.7731048805815162 Repaired at: 1525685601400 42K May 7 19:56 
mc-22922-big-Data.db
Max: 2017/04/17 13:49 Min: 2017/03/31 13:09 Estimated droppable tombstones: 
1.9600207684319835 Repaired at: 1525685601400 116M May 7 13:31 
mc-15096-big-Data.db
Max: 2017/04/21 13:43 Min: 2017/04/15 13:34 Estimated droppable tombstones: 
1.9090909090909092 Repaired at: 1525685601400 11K May 7 19:56 
mc-22921-big-Data.db
Max: 2017/05/23 21:45 Min: 2017/04/21 14:00 Estimated droppable tombstones: 
1.8360655737704918 Repaired at: 1525685601400 21M May 7 19:56 
mc-22919-big-Data.db
Max: 2017/06/12 15:19 Min: 2017/04/25 14:45 Estimated droppable tombstones: 
1.8091397849462365 Repaired at: 

Aggregation of Set Data Type

2018-10-23 Thread Joseph Wonesh
Hello all,

 I am trying to aggregate rows which each contain a column of Set.
I would like the result to contain the sum of all sets, where null would be
equivalent to the empty set. I expected a query like: "select
sum(my_set_column) from my_table group by my_key_column" to do this, but
the set type is not supported by this aggregate. Does anyone know of a way
to aggregate this using existing cassandra built-ins? Thanks!



Re: Nodetool info for heap usage

2018-10-23 Thread Horia Mocioi
Hello,

I agree with Anup; nodetool adds some overhead.

You could use dstat or sjk ttop.

See below some links to get started:
https://tobert.github.io/pages/als-cassandra-21-tuning-guide.html
https://academy.datastax.com/support-blog/deeper-dive-diagnosing-dse-performance-issues-ttop-and-multidump

Side note: sjk is only included in DSE, not Apache Cassandra, but you can still 
use it standalone.

Regards,
Horia

On tis, 2018-10-23 at 10:25 +1100, Anup Shirolkar wrote:
Hi,

The nodetool output should be accurate and reliable.

But using nodetool command for monitoring is not a very good idea.
Nodetool has its own resource overhead each time it is invoked.

You should ideally use a standard monitoring tool/method.

Regards,

Anup Shirolkar



On Tue, 23 Oct 2018 at 07:16, Abdul Patel <abd786...@gmail.com> wrote:
Hi All,

Is the nodetool info output accurate for monitoring memory usage? Initially, 
with 3.1.0, we monitored nodetool info for heap usage and it never reported 
high usage. After upgrading to 3.11.2 we started seeing high usage via nodetool 
info, and after a later upgrade to 3.11.3 the behaviour is the same.
I just want to make sure whether monitoring heap memory usage via nodetool info 
is correct, or whether it's actually a memory leak issue in 3.11.2 and 3.11.3.
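For what it's worth, the heap figures nodetool info reports come from the JVM's java.lang:type=Memory MBean over JMX; the same HeapMemoryUsage attribute can be read in-process as a sanity check (minimal sketch, class name is illustrative):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapCheck {
    public static void main(String[] args) {
        // Same attribute nodetool info reads remotely over JMX
        // (java.lang:type=Memory / HeapMemoryUsage), here read in-process.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.printf("heap used: %d MB of %d MB committed%n",
                heap.getUsed() >> 20, heap.getCommitted() >> 20);
    }
}
```

If an external tool polling this MBean agrees with nodetool info, the reading itself is sound and any sustained growth points at the heap, not at the monitoring.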