Re: Reading Commit log files

2016-11-27 Thread Kamesh
Hi All,
  I am able to read CDC events from keyspaces like *system* and
*system_schema*, but not from the one that I created. Any help on this?

Thanks & Regards
Kamesh.

On Wed, Nov 23, 2016 at 9:14 PM, Kamesh  wrote:

> Hi Carlos,
>  durable_writes = true.
>
>  *cqlsh:test> describe test;*
> * CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy',
> 'replication_factor': '1'}  AND durable_writes = true;*
>
> Thanks & Regards
> Kamesh.
>
> On Wed, Nov 23, 2016 at 9:10 PM, Carlos Alonso  wrote:
>
>> Did you configure your keyspace with durable_writes = false by any
>> chance? That would keep operations from reaching the commitlog.
>>
>>
>> On Wed, 23 Nov 2016 at 13:06 Kamesh  wrote:
>>
>>> Hi Carlos,
>>>  Thanks for your response.
>>>  I performed a few insert statements and ran my application without
>>> flushing. I am still not able to read the commit logs.
>>>  However, I am able to read the commit logs of the *system* and
>>> *system_schema* keyspaces, but not those of the application keyspace
>>> (the keyspace I created).
>>>
>>> Thanks & Regards
>>>
>>> Kamesh.
>>>
>>> On Wed, Nov 23, 2016 at 5:24 PM, Carlos Alonso 
>>> wrote:
>>>
>>> Hi Kamesh.
>>>
>>> Flushing memtables to disk causes the corresponding commitlog segments
>>> to be deleted. Once the data is flushed into SSTables it can be considered
>>> durable (in case of a node crash, the data won't be lost), and therefore
>>> there's no point in keeping it in the commitlog as well.
>>>
>>> Try without flushing and see if you can see your operations there.
>>>
>>> Regards
>>>
>>> On Wed, 23 Nov 2016 at 11:04 Kamesh  wrote:
>>>
>>> Hi All,
>>>  I am trying to read Cassandra commit log files, but I am unable to do so.
>>> I am experimenting with a 1-node cluster (laptop).
>>>
>>>  Cassandra Version : *3.8*
>>>  Updated cassandra.yaml with *cdc_enabled: true*
>>>
>>>  After executing the statements below and flushing memtables, I tried
>>> reading the commit log files, but there are no CDC events corresponding to
>>> the *test* keyspace.
>>>
>>>  CREATE KEYSPACE *test* WITH replication = {'class': 'SimpleStrategy',
>>> 'replication_factor': '1'};
>>>  CREATE TABLE foo (a int, b text, PRIMARY KEY(a)) WITH cdc=true;
>>>
>>>
>>>  INSERT INTO foo(a, b) VALUES (0, 'static0');
>>>  INSERT INTO foo(a, b) VALUES (1, 'static1');
>>>  INSERT INTO foo(a, b) VALUES (2, 'static2');
>>>  INSERT INTO foo(a, b) VALUES (3, 'static3');
>>>  INSERT INTO foo(a, b) VALUES (4, 'static4');
>>>  INSERT INTO foo(a, b) VALUES (5, 'static5');
>>>  INSERT INTO foo(a, b) VALUES (6, 'static6');
>>>  INSERT INTO foo(a, b) VALUES (7, 'static7');
>>>  INSERT INTO foo(a, b) VALUES (8, 'static8');
>>>
>>>  Can someone please help us.
>>>
>>> Thanks & Regards
>>>
>>> Kamesh.
>>>
>>>
>>>
>
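
Carlos's point in the quoted thread (flushing memtables lets the matching commit log entries be discarded, and durable_writes = false bypasses the commit log) can be illustrated with a toy model. This is only a sketch: `ToyNode` and its fields are hypothetical names, not Cassandra classes, and real commit logs work on segments rather than single entries.

```python
# Toy model of the commit log behaviour described in the thread; not Cassandra code.
class ToyNode:
    def __init__(self):
        self.commitlog = []   # pending (unflushed) mutations, in write order
        self.memtable = {}    # in-memory view of recent writes
        self.sstables = []    # immutable flushed snapshots

    def write(self, key, value, durable_writes=True):
        # With durable_writes = false the mutation skips the commit log.
        if durable_writes:
            self.commitlog.append((key, value))
        self.memtable[key] = value

    def flush(self):
        # Flushing makes the data durable in an SSTable, so the commit log
        # entries that covered it are no longer needed and are dropped.
        if self.memtable:
            self.sstables.append(dict(self.memtable))
        self.memtable.clear()
        self.commitlog.clear()


node = ToyNode()
for a in range(9):
    node.write(a, f"static{a}")
entries_before_flush = len(node.commitlog)
node.flush()
entries_after_flush = len(node.commitlog)
print(entries_before_flush, entries_after_flush)  # prints "9 0"
```

In this model, a reader that opens the commit log only after the flush finds nothing, which matches the advice to try again without flushing.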


Re: Java GC pauses, reality check

2016-11-27 Thread Bill Hastings
Hi Hari

Could you share your G1GC settings please?

On Sun, Nov 27, 2016 at 9:57 PM, Harikrishnan Pillai <
hpil...@walmartlabs.com> wrote:

> Hi @Kant Kodali,
>
> We have multiple clusters running Zing.
>
> One cluster has 11/11 and another one also has 11/11 (190 GB memory, 6 TB
> hard disk, and 16 physical cores per machine).
>
> The average read size is around 200 KB and it can go up to 6 MB.
>
> We are using G1GC in most clusters with a *26 GB heap* and extra threads
> given to parallel and old-gen collection. The 99% latency in those clusters
> is also under 5 ms and they are doing well. We used Zing to remove all
> timeouts. If the application does not have that requirement, G1GC is good.
>
> With G1GC I have seen average 200-300 ms min pauses every 4 minutes and
> 600 ms pauses every 6 hours, and 99% latency is under 5-10 ms for most of
> the clusters having 10-100 KB of read data.
>
> Regards
>
> Hari
> --
> *From:* Kant Kodali 
> *Sent:* Saturday, November 26, 2016 8:39:01 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Java GC pauses, reality check
>
> @Harikrishnan Pillai: How many nodes are you running, and what are the
> approximate read and write sizes?
>
> On Fri, Nov 25, 2016 at 7:32 PM, Harikrishnan Pillai <
> hpil...@walmartlabs.com> wrote:
>
>> We are running Azul Zing in prod with 1 million reads/s and 100 K
>> writes/s. We never had a major GC above 10 ms.
>>
>> > On Nov 25, 2016, at 3:49 PM, Martin Schröder  wrote:
>> >
>> > 2016-11-25 23:38 GMT+01:00 Kant Kodali :
>> >> I would also restate the following sentence "java GC pauses are pretty
>> >> much a fact of life" to "Any GC based system pauses are pretty much a
>> >> fact of life".
>> >>
>> >> I would be more than happy to see if someone can counter prove.
>> >
>> > Azul disagrees.
>> > https://www.azul.com/products/zing/pgc/
>> >
>> > Best
>> >   Martin
>>
>
>


-- 
Cheers
Bill


Re: Java GC pauses, reality check

2016-11-27 Thread Harikrishnan Pillai
Hi @Kant Kodali,

We have multiple clusters running Zing.

One cluster has 11/11 and another one also has 11/11 (190 GB memory, 6 TB hard
disk, and 16 physical cores per machine).

The average read size is around 200 KB and it can go up to 6 MB.

We are using G1GC in most clusters with a 26 GB heap and extra threads given to
parallel and old-gen collection. The 99% latency in those clusters is also
under 5 ms and they are doing well. We used Zing to remove all timeouts. If the
application does not have that requirement, G1GC is good.

With G1GC I have seen average 200-300 ms min pauses every 4 minutes and 600 ms
pauses every 6 hours, and 99% latency is under 5-10 ms for most of the clusters
having 10-100 KB of read data.

Regards

Hari
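
A quick back-of-the-envelope check shows why the pause figures and the 99% latency figure above do not contradict each other. This is a rough sketch, assuming requests arrive evenly over time and only requests that land inside a pause are delayed (queueing after a pause is ignored); the 300 ms value is the upper end of the quoted range.

```python
# Rough arithmetic on the pause figures quoted in the message above.
def paused_fraction(pause_seconds, interval_seconds):
    """Fraction of wall-clock time spent inside pauses of this kind."""
    return pause_seconds / interval_seconds

# Assumption: 300 ms (upper end of "200-300 ms") every 4 minutes,
# plus 600 ms every 6 hours.
frequent = paused_fraction(0.300, 4 * 60)
rare = paused_fraction(0.600, 6 * 60 * 60)
total = frequent + rare

print(f"{total:.4%}")  # prints "0.1278%"
```

Under these assumptions roughly 0.13% of requests would land inside a pause, well below the 1% of requests that a 99% latency figure leaves out.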


From: Kant Kodali 
Sent: Saturday, November 26, 2016 8:39:01 PM
To: user@cassandra.apache.org
Subject: Re: Java GC pauses, reality check

@Harikrishnan Pillai: How many nodes are you running, and what are the
approximate read and write sizes?

On Fri, Nov 25, 2016 at 7:32 PM, Harikrishnan Pillai wrote:
We are running Azul Zing in prod with 1 million reads/s and 100 K writes/s.
We never had a major GC above 10 ms.

> On Nov 25, 2016, at 3:49 PM, Martin Schröder wrote:
>
> 2016-11-25 23:38 GMT+01:00 Kant Kodali:
>> I would also restate the following sentence "java GC pauses are pretty much
>> a fact of life" to "Any GC based system pauses are pretty much a fact of
>> life".
>>
>> I would be more than happy to see if someone can counter prove.
>
> Azul disagrees.
> https://www.azul.com/products/zing/pgc/
>
> Best
>   Martin



Cassandra Multi DC with diff version.

2016-11-27 Thread Abhishek Kumar Maheshwari
Hi All,

We have 2 Cassandra DCs with the config below:

DC1: In DC1 we have 9 servers, each a 40-core machine with 64 GB RAM. This DC
runs Cassandra version 2.1.4, and we have 2 TB of data on each server. The
application is connected to this DC.
DC2: In DC2 we have 5 servers, each a 40-core machine with 64 GB RAM. This DC
runs Cassandra version 3.0.9.

My question is: will both DCs stay perfectly in sync?

What will happen if I use LOCAL_QUORUM on both DCs with the same queries?
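
For reference, the replica counts behind the quorum consistency levels can be worked out as follows. This is a minimal sketch: the replication factors are hypothetical (the message does not state them), and it says nothing about whether mixed 2.1.x and 3.0.x nodes interoperate.

```python
def quorum(replication_factor):
    """Replicas that must respond for quorum-style consistency levels."""
    return replication_factor // 2 + 1

# Hypothetical keyspace settings: the message does not state them.
replication = {"DC1": 3, "DC2": 3}

# LOCAL_QUORUM counts only replicas in the coordinator's own DC;
# QUORUM counts replicas across all DCs.
local_quorum = {dc: quorum(rf) for dc, rf in replication.items()}
global_quorum = quorum(sum(replication.values()))
print(local_quorum, global_quorum)  # prints "{'DC1': 2, 'DC2': 2} 4"
```

With LOCAL_QUORUM, a request succeeds once the local DC's quorum has answered; the other DC still receives the writes, but its acknowledgement is not awaited.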

Thanks & Regards,
Abhishek Kumar Maheshwari
+91- 805591 (Mobile)
Times Internet Ltd. | A Times of India Group Company
FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA


Re: Does recovery continue after truncating a table?

2016-11-27 Thread Yuji Ito
Thanks Ben and Hiro,

I've reported it at https://issues.apache.org/jira/browse/CASSANDRA-12960.

I'll use `truncatehints` or DROP command after this.
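
The sequence from the analysis quoted further down (first flush, a hinted-handoff mutation, second flush, then a discard based on the first flush's ReplayPosition) can be modeled in a few lines. This is a toy sketch with hypothetical names (`ToyTable`, `discard_up_to`), not Cassandra's actual truncation code.

```python
# Toy model of the truncation race; names are illustrative, not Cassandra's.
class ToyTable:
    def __init__(self):
        self.position = 0    # stand-in for the commit log ReplayPosition
        self.memtable = []
        self.sstables = []   # list of (flush_position, rows)

    def write(self, row):
        self.position += 1
        self.memtable.append(row)

    def flush(self):
        self.sstables.append((self.position, list(self.memtable)))
        self.memtable.clear()
        return self.position

    def discard_up_to(self, position):
        # Drop only SSTables flushed at or before the given position.
        self.sstables = [s for s in self.sstables if s[0] > position]

    def rows(self):
        return [r for _, rows in self.sstables for r in rows] + self.memtable


table = ToyTable()
table.write("row written before truncate")

truncated_at = table.flush()            # first flush, inside TRUNCATE
table.write("hinted handoff mutation")  # replayed hint arrives mid-truncate
table.flush()                           # second flush, before the snapshot
table.discard_up_to(truncated_at)       # discard uses the first position

print(table.rows())  # prints "['hinted handoff mutation']"
```

The row replayed from the hint survives because its SSTable was flushed after the position used for the discard.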


On Sun, Nov 27, 2016 at 12:33 PM, Ben Slater 
wrote:

> By “undocumented limitation”, I meant “TRUNCATE” is mainly only used in
> development and testing, not production scenarios so a sufficient fix (and
> certainly a better than nothing fix) might be just to document that if you
> issue a TRUNCATE while there are still hinted hand-offs pending the hinted
> hand-offs replayed after the truncate will come back to life. Of course, an
> actual fix would be better.
>
> Cheers
> Ben
>
> On Sat, 26 Nov 2016 at 21:08 Hiroyuki Yamada  wrote:
>
>> Hi Yuji and Ben,
>>
>> I tried out this revised script and the same issue occurred to me, too.
>> I think it's definitely a bug to be solved asap.
>>
>> >Ben
>> What do you mean "an undocumented limitation" ?
>>
>> Thanks,
>> Hiro
>>
>> On Sat, Nov 26, 2016 at 3:13 PM, Ben Slater 
>> wrote:
>> > Nice detective work! Seems to me that it’s at best an undocumented
>> > limitation and potentially could be viewed as a bug - maybe log another
>> > JIRA?
>> >
>> > One note - there is a nodetool truncatehints command that could be used
>> > to clear out the hints
>> > (http://cassandra.apache.org/doc/latest/tools/nodetool/truncatehints.html?highlight=truncate).
>> > However, it seems to clear all hints on a particular endpoint, not just
>> > for a specific table.
>> >
>> > Cheers
>> > Ben
>> >
>> > On Fri, 25 Nov 2016 at 17:42 Yuji Ito  wrote:
>> >>
>> >> Hi all,
>> >>
>> >> I revised the script to reproduce the issue.
>> >> I think the issue happens more frequently than before.
>> >> Killing another node is added to the previous script.
>> >>
>> >>  [script] 
>> >> #!/bin/sh
>> >>
>> >> node1_ip=
>> >> node2_ip=
>> >> node3_ip=
>> >> node2_user=
>> >> node3_user=
>> >> rows=1
>> >>
>> >> echo "consistency quorum;" > init_data.cql
>> >> for key in $(seq 0 $(expr $rows - 1))
>> >> do
>> >> echo "insert into testdb.testtbl (key, val) values($key, ) IF NOT EXISTS;" >> init_data.cql
>> >> done
>> >>
>> >> while true
>> >> do
>> >> echo "truncate the table"
>> >> cqlsh $node1_ip -e "truncate table testdb.testtbl" > /dev/null 2>&1
>> >> if [ $? -ne 0 ]; then
>> >> echo "truncating failed"
>> >> continue
>> >> else
>> >> break
>> >> fi
>> >> done
>> >>
>> >> echo "kill C* process on node3"
>> >> pdsh -l $node3_user -R ssh -w $node3_ip "ps auxww | grep CassandraDaemon | awk '{if (\$13 ~ /cassand/) print \$2}' | xargs sudo kill -9"
>> >>
>> >> echo "insert $rows rows"
>> >> cqlsh $node1_ip -f init_data.cql > insert_log 2>&1
>> >>
>> >> echo "restart C* process on node3"
>> >> pdsh -l $node3_user -R ssh -w $node3_ip "sudo /etc/init.d/cassandra start"
>> >>
>> >> while true
>> >> do
>> >> echo "truncate the table again"
>> >> cqlsh $node1_ip -e "truncate table testdb.testtbl"
>> >> if [ $? -ne 0 ]; then
>> >> echo "truncating failed"
>> >> continue
>> >> else
>> >> echo "truncation succeeded!"
>> >> break
>> >> fi
>> >> done
>> >>
>> >> echo "kill C* process on node2"
>> >> pdsh -l $node2_user -R ssh -w $node2_ip "ps auxww | grep CassandraDaemon | awk '{if (\$13 ~ /cassand/) print \$2}' | xargs sudo kill -9"
>> >>
>> >> cqlsh $node1_ip --request-timeout 3600 -e "consistency serial; select
>> >> count(*) from testdb.testtbl;"
>> >> sleep 10
>> >> cqlsh $node1_ip --request-timeout 3600 -e "consistency serial; select
>> >> count(*) from testdb.testtbl;"
>> >>
>> >> echo "restart C* process on node2"
>> >> pdsh -l $node2_user -R ssh -w $node2_ip "sudo /etc/init.d/cassandra start"
>> >>
>> >>
>> >> Thanks,
>> >> yuji
>> >>
>> >>
>> >> On Fri, Nov 18, 2016 at 7:52 PM, Yuji Ito 
>> wrote:
>> >>>
>> >>> I investigated the source code and the logs of the killed node.
>> >>> I guess that unexpected writes are executed while truncation is in
>> >>> progress.
>> >>>
>> >>> Some writes were executed after the flush (the first flush) in
>> >>> truncation, and these writes could be read.
>> >>> These writes were requested as MUTATION by another node for hinted
>> >>> handoff.
>> >>> Their data was stored in a new memtable and flushed (the second flush)
>> >>> to a new SSTable before the snapshot in truncation.
>> >>> So the truncation discarded only the old SSTables, not the new SSTable.
>> >>> That's because the ReplayPosition used for discarding SSTables was
>> >>> that of the first flush.
>> >>>
>> >>> I copied some parts of the log below.
>> >>> Lines starting with "##" are my comments.
>> >>> The point is that the ReplayPosition is moved forward by the second
>> >>> flush.
>> >>> It means some writes are executed after the first flush.
>> >>>
>> >>> == log ==
>> >>> ## started truncation
>> >>> TRACE [SharedPool-Worker-16] 

Re: Java GC pauses, reality check

2016-11-27 Thread Benjamin Roth
Maybe I was not totally clear. Reference counting is of course done at
runtime, but the compiler automates where and when to do the counting.
Before, the developer had to retain and release objects manually. Since ARC,
this is done by the compiler at file level.
Nothing is "free" in this world. It has drawbacks too. But there
is indeed no GC like in Java (at least not in Clang). Cycles have to be
avoided by the developer.
See here https://en.wikipedia.org/wiki/Automatic_Reference_Counting
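
What "cycles have to be avoided by the developer" means in practice can be sketched with a weak back-reference. The example uses Python rather than Objective-C or Swift, on the assumption that it runs on CPython, whose reference counting behaves similarly once the cycle collector is switched off; `Parent` and `Child` are hypothetical classes.

```python
import gc
import weakref

class Parent:
    def __init__(self):
        self.children = []

class Child:
    def __init__(self, parent):
        # Weak back-reference: does not keep the parent alive.
        self.parent = weakref.ref(parent)

gc.disable()  # rely on reference counting alone

parent = Parent()
child = Child(parent)
parent.children.append(child)

parent_alive = weakref.ref(parent)
child_alive = weakref.ref(child)
del child    # the parent's list still holds the child
del parent   # last strong reference to the parent

both_freed = parent_alive() is None and child_alive() is None
gc.enable()
print(both_freed)  # True on CPython
```

Because the child only holds a weak reference to its parent, no strong cycle exists and plain reference counting frees both objects.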

2016-11-27 15:28 GMT+01:00 Jonathan Haddad :

> Reference counting happens at run time, not compile time. It's not free
> either. Every time a reference is added, there's overhead in tracking it.
> It also doesn't catch cycles. You still need garbage collection to avoid
> memory leaks.
>
> On Sun, Nov 27, 2016 at 12:31 AM Benjamin Roth 
> wrote:
>
>> ARC means Automatic Reference Counting, which is done at compile time. E.g.
>> Objective-C and Swift use this technique. There are absolutely no GCs. It's
>> a completely different memory management technique.
>>
>> Why don't I like Java on the server side? Because GC is a pain in the ass.
>> I have been in this business for over 15 years, and running/maintaining
>> apps that are built in C or C++ has never been such a pain.
>>
>> On the other hand, Java is easier for developers to handle. And coding
>> plain C is also a pain.
>>
>> That's why I said it's a philosophical discussion.
>> Anyway, Cassandra runs on Java, so we have to deal with it.
>>
>> On 27.11.2016 05:28, "Kant Kodali" wrote:
>>
>> Benjamin Roth: How do you know Arc eliminates GC pauses completely? By
>> completely I mean no GC pauses whatsoever.
>>
>> When you say Java is NOT the First choice for Server Applications you
>> are generalizing it too much I would say since many of them fall under that
>> category. Either way the statement you made is purely subjective.
>>
>> On Fri, Nov 25, 2016 at 2:41 PM, Benjamin Roth 
>> wrote:
>>
>> Lol. The counterproof is to use another memory model like ARC. That's why
>> I personally think Java is NOT the first choice for server applications.
>> But that's a philosophical discussion.
>>
>> On 25.11.2016 23:38, "Kant Kodali" wrote:
>>
>> +1 Chris Lohfink response
>>
>> I would also restate the following sentence "java GC pauses are pretty
>> much a fact of life" to "Any GC based system pauses are pretty much a
>> fact of life".
>>
>> I would be more than happy to see if someone can counter prove.
>>
>>
>>
>> On Fri, Nov 25, 2016 at 1:41 PM, Chris Lohfink 
>> wrote:
>>
>> No tuning will eliminate gcs.
>>
>> 20-30 seconds is horrific and out of the ordinary. Most likely
>> implementing antipatterns and/or poorly configured. Sub 1s is realistic but
>> with some workloads still may require some tuning to maintain. Some
>> workloads are very unfriendly to GCs though (ie heavy tombstones, very wide
>> partitions).
>>
>> Chris
>>
>> On Fri, Nov 25, 2016 at 3:25 PM, S Ahmed  wrote:
>>
>> Hello!
>>
>> From what I understand java GC pauses are pretty much a fact of life, but
>> you can tune the jvm to reduce the likelihood of the frequency and length
>> of GC pauses.
>>
>> When using Cassandra, how frequent or long have these pauses been known to
>> be?  Even with tuning, is it safe to assume they cannot be eliminated?
>>
>> Would a 20-30 second pause be something out of the ordinary?
>>
>> Thanks.
>>
>>
>>
>>
>>


-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer


Re: Java GC pauses, reality check

2016-11-27 Thread Jonathan Haddad
Reference counting happens at run time, not compile time. It's not free
either. Every time a reference is added, there's overhead in tracking it.
It also doesn't catch cycles. You still need garbage collection to avoid
memory leaks.
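
The point about cycles can be demonstrated directly. This is a small sketch, assuming CPython (which combines reference counting with a separate cycle collector); `Node` is a hypothetical class, and the cycle collector is disabled so that only reference counting is at work until `gc.collect()` is called.

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.other = None

gc.disable()  # leave only reference counting active

# No cycle: the object is freed as soon as its last reference goes away.
single = Node()
single_alive = weakref.ref(single)
del single
freed_without_cycle = single_alive() is None

# Cycle: a and b keep each other's count above zero.
a, b = Node(), Node()
a.other, b.other = b, a
cycle_alive = weakref.ref(a)
del a, b
leaked_by_refcount = cycle_alive() is not None

gc.collect()  # the tracing cycle collector reclaims the pair
freed_by_collector = cycle_alive() is None
gc.enable()

# On CPython all three values are True.
print(freed_without_cycle, leaked_by_refcount, freed_by_collector)
```

The pair that references itself stays allocated under pure reference counting and is reclaimed only once a tracing pass runs.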

On Sun, Nov 27, 2016 at 12:31 AM Benjamin Roth 
wrote:

> ARC means Automatic Reference Counting, which is done at compile time. E.g.
> Objective-C and Swift use this technique. There are absolutely no GCs. It's
> a completely different memory management technique.
>
> Why don't I like Java on the server side? Because GC is a pain in the ass.
> I have been in this business for over 15 years, and running/maintaining
> apps that are built in C or C++ has never been such a pain.
>
> On the other hand, Java is easier for developers to handle. And coding
> plain C is also a pain.
>
> That's why I said it's a philosophical discussion.
> Anyway, Cassandra runs on Java, so we have to deal with it.
>
> On 27.11.2016 05:28, "Kant Kodali" wrote:
>
> Benjamin Roth: How do you know Arc eliminates GC pauses completely? By
> completely I mean no GC pauses whatsoever.
>
> When you say Java is NOT the First choice for Server Applications you are
> generalizing it too much I would say since many of them fall under that
> category. Either way the statement you made is purely subjective.
>
> On Fri, Nov 25, 2016 at 2:41 PM, Benjamin Roth 
> wrote:
>
> Lol. The counterproof is to use another memory model like ARC. That's why
> I personally think Java is NOT the first choice for server applications.
> But that's a philosophical discussion.
>
> On 25.11.2016 23:38, "Kant Kodali" wrote:
>
> +1 Chris Lohfink response
>
> I would also restate the following sentence "java GC pauses are pretty
> much a fact of life" to "Any GC based system pauses are pretty much a
> fact of life".
>
> I would be more than happy to see if someone can counter prove.
>
>
>
> On Fri, Nov 25, 2016 at 1:41 PM, Chris Lohfink 
> wrote:
>
> No tuning will eliminate gcs.
>
> 20-30 seconds is horrific and out of the ordinary. Most likely
> implementing antipatterns and/or poorly configured. Sub 1s is realistic but
> with some workloads still may require some tuning to maintain. Some
> workloads are very unfriendly to GCs though (ie heavy tombstones, very wide
> partitions).
>
> Chris
>
> On Fri, Nov 25, 2016 at 3:25 PM, S Ahmed  wrote:
>
> Hello!
>
> From what I understand java GC pauses are pretty much a fact of life, but
> you can tune the jvm to reduce the likelihood of the frequency and length
> of GC pauses.
>
> When using Cassandra, how frequent or long have these pauses been known to
> be? Even with tuning, is it safe to assume they cannot be eliminated?
>
> Would a 20-30 second pause be something out of the ordinary?
>
> Thanks.
>
>
>
>
>


Re: Java GC pauses, reality check

2016-11-27 Thread Benjamin Roth
I didn't even know there are plans to move to TPC in Cs. Thanks for that
update. After all I will follow the development of both Scylla and Cs and
am excited about the future of both!

On 27.11.2016 10:02, "Kant Kodali" wrote:

> Yes, I am well aware of ScyllaDB. It might be well written in C++, but the
> performance gain they are claiming has very little to do with moving from
> Java to C++. They made major design changes such as moving away from SEDA to
> TPC and so on. Moreover, I would say it still needs to mature. A lot of users
> have complained that they cannot reproduce the benchmarks that are posted
> online, and I keep seeing comments stating that you need to use specific
> hardware and specific tuning mechanisms and so on (I don't mean to say that
> what ScyllaDB is claiming is wrong; I certainly haven't verified it, but I
> do know for a fact that a lot of people are having trouble reaching those
> benchmarks).
>
> SEDA to TPC is a very big change. Let's see how long it takes for
> Apache C*.
>
> https://issues.apache.org/jira/browse/CASSANDRA-10989
>
>
>
>
> On Sat, Nov 26, 2016 at 11:45 PM, Benjamin Roth 
> wrote:
>
>> You are of course right. There is no solution and no language that is a
>> perfect match for every situation, and every solution and language has its
>> own pros, cons, pitfalls and drawbacks.
>> Actually, the article you posted points at some aspects of ARC that I
>> wasn't aware of yet.
>> Nevertheless, GC is an issue for Cassandra, otherwise this thread would
>> not exist, right? But we have to deal with it and get the best out of it.
>>
>> Another option, besides optimizing your GC: You could check if
>> http://www.scylladb.com/ is an option for you.
>> They rewrote CS from scratch. The goal is to be completely compatible
>> with CS but to be much, much faster. Check their benchmarks and their
>> architecture.
>> I really do not want to depreciate the work of all the Cassandra
>> developers - they did a great job - but what I have seen there looked very
>> interesting and promising! By the way, it's written in C++.
>>
>>
>> 2016-11-27 7:06 GMT+01:00 Kant Kodali :
>>
>>> Automatic Reference Counting sounds like a college-level idea that we
>>> have all been hearing about since GC was born! There seem to be a bunch of
>>> cons of ARC, as explained here:
>>>
>>> https://www.quora.com/Why-doesnt-Apple-Swift-adopt-the-memory-management-method-of-garbage-collection-like-in-Java
>>>
>>> Maintaining C and C++ apps is never a pain? How about versioning and
>>> static time libraries? There is work there too, so it's all pros and cons.
>>>
>>> "gc is a pain in the ass". How about seg faults? They aren't any less of a
>>> pain :)
>>>
>>> It is not only Cassandra that runs on the JVM. The majority of Apache
>>> projects run on the JVM for a reason.
>>>
>>> Bottom line: my point here is that there are pros and cons to every
>>> language. It doesn't make much sense to target one language.
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Sat, Nov 26, 2016 at 9:31 PM, Benjamin Roth 
>>> wrote:
>>>
>>>> ARC means Automatic Reference Counting, which is done at compile time.
>>>> E.g. Objective-C and Swift use this technique. There are absolutely no GCs.
>>>> It's a completely different memory management technique.
>>>>
>>>> Why don't I like Java on the server side? Because GC is a pain in the ass.
>>>> I have been in this business for over 15 years, and running/maintaining
>>>> apps that are built in C or C++ has never been such a pain.
>>>>
>>>> On the other hand, Java is easier for developers to handle. And coding
>>>> plain C is also a pain.
>>>>
>>>> That's why I said it's a philosophical discussion.
>>>> Anyway, Cassandra runs on Java, so we have to deal with it.
>>>>
>>>> On 27.11.2016 05:28, "Kant Kodali" wrote:
>>>>
>>>>> Benjamin Roth: How do you know Arc eliminates GC pauses completely? By
>>>>> completely I mean no GC pauses whatsoever.
>>>>>
>>>>> When you say Java is NOT the First choice for Server Applications you
>>>>> are generalizing it too much I would say since many of them fall under
>>>>> that category. Either way the statement you made is purely subjective.
>>>>>
>>>>> On Fri, Nov 25, 2016 at 2:41 PM, Benjamin Roth <
>>>>> benjamin.r...@jaumo.com> wrote:
>>>>>
>>>>>> Lol. The counterproof is to use another memory model like ARC. That's
>>>>>> why I personally think Java is NOT the first choice for server
>>>>>> applications. But that's a philosophical discussion.
>>>>>>
>>>>>> On 25.11.2016 23:38, "Kant Kodali" wrote:
>>>>>>
>>>>>>> +1 Chris Lohfink response
>>>>>>>
>>>>>>> I would also restate the following sentence "java GC pauses are
>>>>>>> pretty much a fact of life" to "Any GC based system pauses are
>>>>>>> pretty much a fact of life".
>>>>>>>
>>>>>>> I would be more than happy to see if someone can counter prove.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Nov 25, 2016 at 1:41 PM, Chris Lohfink

Re: Java GC pauses, reality check

2016-11-27 Thread Kant Kodali
Yes, I am well aware of ScyllaDB. It might be well written in C++, but the
performance gain they are claiming has very little to do with moving from
Java to C++. They made major design changes such as moving away from SEDA to
TPC and so on. Moreover, I would say it still needs to mature. A lot of users
have complained that they cannot reproduce the benchmarks that are posted
online, and I keep seeing comments stating that you need to use specific
hardware and specific tuning mechanisms and so on (I don't mean to say that
what ScyllaDB is claiming is wrong; I certainly haven't verified it, but I
do know for a fact that a lot of people are having trouble reaching those
benchmarks).

SEDA to TPC is a very big change. Let's see how long it takes for
Apache C*.

https://issues.apache.org/jira/browse/CASSANDRA-10989
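
To make the design difference concrete: in SEDA (staged event-driven architecture) a request hops between stages that each have their own queue and thread pool, whereas in TPC (thread per core) each key is owned by one long-lived thread that handles the request from start to finish. The following is only an illustrative sketch of the thread-per-core idea; `NUM_SHARDS`, `shard_for` and the in-memory stores are hypothetical and unrelated to either code base.

```python
import queue
import threading
import zlib

NUM_SHARDS = 4  # stand-in for the number of cores; hypothetical value

def shard_for(key):
    # Stable hash so a given key always maps to the same shard.
    return zlib.crc32(key.encode()) % NUM_SHARDS

inboxes = [queue.Queue() for _ in range(NUM_SHARDS)]
stores = [{} for _ in range(NUM_SHARDS)]  # each touched by one thread only

def worker(shard):
    # One long-lived thread per shard handles a request start to finish,
    # so its private store needs no locks.
    while True:
        item = inboxes[shard].get()
        if item is None:
            break
        key, value = item
        stores[shard][key] = value

threads = [threading.Thread(target=worker, args=(s,)) for s in range(NUM_SHARDS)]
for t in threads:
    t.start()

keys = [f"key{i}" for i in range(100)]
for k in keys:
    inboxes[shard_for(k)].put((k, k.upper()))
for inbox in inboxes:
    inbox.put(None)  # shutdown signal
for t in threads:
    t.join()

# Each key ends up in exactly one shard's private store.
owners = {k: [s for s in range(NUM_SHARDS) if k in stores[s]] for k in keys}
```

The only shared structures are the per-shard inboxes; the data itself is never touched by more than one thread, which is the property that removes cross-thread hand-offs from the request path.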




On Sat, Nov 26, 2016 at 11:45 PM, Benjamin Roth 
wrote:

> You are of course right. There is no solution and no language that is a
> perfect match for every situation, and every solution and language has its
> own pros, cons, pitfalls and drawbacks.
> Actually, the article you posted points at some aspects of ARC that I
> wasn't aware of yet.
> Nevertheless, GC is an issue for Cassandra, otherwise this thread would
> not exist, right? But we have to deal with it and get the best out of it.
>
> Another option, besides optimizing your GC: You could check if
> http://www.scylladb.com/ is an option for you.
> They rewrote CS from scratch. The goal is to be completely compatible
> with CS but to be much, much faster. Check their benchmarks and their
> architecture.
> I really do not want to depreciate the work of all the Cassandra
> developers - they did a great job - but what I have seen there looked very
> interesting and promising! By the way, it's written in C++.
>
>
> 2016-11-27 7:06 GMT+01:00 Kant Kodali :
>
>> Automatic Reference Counting sounds like a college-level idea that we
>> have all been hearing about since GC was born! There seem to be a bunch of
>> cons of ARC, as explained here:
>>
>> https://www.quora.com/Why-doesnt-Apple-Swift-adopt-the-memory-management-method-of-garbage-collection-like-in-Java
>>
>> Maintaining C and C++ apps is never a pain? How about versioning and
>> static time libraries? There is work there too, so it's all pros and cons.
>>
>> "gc is a pain in the ass". How about seg faults? They aren't any less of a
>> pain :)
>>
>> It is not only Cassandra that runs on the JVM. The majority of Apache
>> projects run on the JVM for a reason.
>>
>> Bottom line: my point here is that there are pros and cons to every
>> language. It doesn't make much sense to target one language.
>>
>>
>>
>>
>>
>>
>> On Sat, Nov 26, 2016 at 9:31 PM, Benjamin Roth 
>> wrote:
>>
>>> ARC means Automatic Reference Counting, which is done at compile time.
>>> E.g. Objective-C and Swift use this technique. There are absolutely no GCs.
>>> It's a completely different memory management technique.
>>>
>>> Why don't I like Java on the server side? Because GC is a pain in the ass.
>>> I have been in this business for over 15 years, and running/maintaining
>>> apps that are built in C or C++ has never been such a pain.
>>>
>>> On the other hand, Java is easier for developers to handle. And coding
>>> plain C is also a pain.
>>>
>>> That's why I said it's a philosophical discussion.
>>> Anyway, Cassandra runs on Java, so we have to deal with it.
>>>
>>> On 27.11.2016 05:28, "Kant Kodali" wrote:
>>>
>>>> Benjamin Roth: How do you know Arc eliminates GC pauses completely? By
>>>> completely I mean no GC pauses whatsoever.
>>>>
>>>> When you say Java is NOT the First choice for Server Applications you
>>>> are generalizing it too much I would say since many of them fall under that
>>>> category. Either way the statement you made is purely subjective.
>>>>
>>>> On Fri, Nov 25, 2016 at 2:41 PM, Benjamin Roth  wrote:
>>>>
>>>>> Lol. The counterproof is to use another memory model like ARC. That's
>>>>> why I personally think Java is NOT the first choice for server
>>>>> applications. But that's a philosophical discussion.
>>>>>
>>>>> On 25.11.2016 23:38, "Kant Kodali" wrote:
>>>>>
>>>>>> +1 Chris Lohfink response
>>>>>>
>>>>>> I would also restate the following sentence "java GC pauses are
>>>>>> pretty much a fact of life" to "Any GC based system pauses are
>>>>>> pretty much a fact of life".
>>>>>>
>>>>>> I would be more than happy to see if someone can counter prove.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Nov 25, 2016 at 1:41 PM, Chris Lohfink
>>>>>> wrote:
>>>>>>
>>>>>>> No tuning will eliminate gcs.
>>>>>>>
>>>>>>> 20-30 seconds is horrific and out of the ordinary. Most likely
>>>>>>> implementing antipatterns and/or poorly configured. Sub 1s is realistic
>>>>>>> but with some workloads still may require some tuning to maintain. Some
>>>>>>> workloads are very unfriendly to GCs though