Re: if the heap size exceeds 32GB..

2018-02-13 Thread Thakrar, Jayesh
atically tune your CMS initiating occupancy, or you'd probably see horrible, horrible pauses. On Tue, Feb 13, 2018 at 8:44 AM, James Rothering <jrother...@codojo.me> wrote: Wow, an 84GB heap! Would you mind disclosing the kind of data requirements behind this choice? And what

Re: if the heap size exceeds 32GB..

2018-02-13 Thread Thakrar, Jayesh
In most cases, Cassandra is pretty efficient about memory usage. However, if your use case does require/need/demand more memory for your workload, I would not hesitate to use heap > 32 GB. FYI, we have configured our heap for 84 GB. However there's more tuning that we have done beyond just the hea

Re: TWCS not deleting expired sstables

2018-02-01 Thread Thakrar, Jayesh
't happen. Seems like it's possibly not looking in the correct location for data directories. Try setting CASSANDRA_INCLUDE to point at cassandra.in.sh prior to running the script? e.g.: CASSANDRA_INCLUDE=/cassandra.in.sh sstableexpiredblockers ae raw_log

Re: TWCS not deleting expired sstables

2018-01-30 Thread Thakrar, Jayesh
Thanks Kurt and Kenneth. Now if only they would work as expected. node111.ord.ae.tsg.cnvr.net:/ae/disk1/data/ae/raw_logs_by_user-f58b9960980311e79ac26928246f09c1>ls -lt | tail -rw-r--r--. 1 vchadoop vchadoop 286889260 Sep 18 14:14 mc-1070-big-Index.db -rw-r--r--. 1 vchadoop vchadoop 12

Re: TWCS not deleting expired sstables

2018-01-28 Thread Thakrar, Jayesh
pairs” and “Timestamp overlap” sections might be of use. -B On Jan 25, 2018, at 11:05 AM, Thakrar, Jayesh <jthak...@conversantmedia.com> wrote: Wondering if I can get some pointers to what's happening here and why sstables that I think should be expired are not being dropped

TWCS not deleting expired sstables

2018-01-25 Thread Thakrar, Jayesh
Wondering if I can get some pointers to what's happening here and why sstables that I think should be expired are not being dropped. Here's the table's compaction property - note also set "unchecked_tombstone_compaction" to true. compaction = {'class': 'org.apache.cassandra.db.compaction.TimeW

Re: C* Logs to Kibana

2018-01-10 Thread Thakrar, Jayesh
Wondering what the purpose is - is it to get some insight into the cluster? Besides the logs themselves, another approach that I and many others have taken is to pull the JMX metrics from Cassandra and push them to an appropriate metrics/timeseries system. Here's one approach of getting JMX metrics ou

Re: Question upon gracefully restarting c* node(s)

2018-01-10 Thread Thakrar, Jayesh
Just curious - aside from the "sleep", is this all not part of the shutdown command? Is this an "opportunity" to improve C*? Having worked with RDBMSes, Hadoop and HBase, stopping communication, flushing the memstore (HBase), and relinquishing ownership of data (HBase) is all part of the shutdown pr

Re: Why don't I see my spark jobs running in parallel in Cassandra/Spark DSE cluster?

2017-10-27 Thread Thakrar, Jayesh
What you have is sequential code and hence sequential processing. Also, Spark/Scala are not parallel programming languages. But even if they were, statements are executed sequentially unless you exploit the parallel/concurrent execution features. Anyway, see if this works: val (RDD1, RDD2) = (JavaFunc

Re: Cassandra crashes....

2017-08-22 Thread Thakrar, Jayesh
Yep, similar symptoms - but no, there's no OOM killer. Also, if you look in the gc log around the time of failure, the heap memory was much below the 16 GB limit. And if I look at the second-to-last GC log before the crash, here’s what we see. And you will notice that cleaning up the 4 GB Eden (along wi

Re: Cassandra crashes....

2017-08-22 Thread Thakrar, Jayesh
executed are not needed anymore, so it's OK for the older prepared statements to be purged. All the same, I will do some analysis on the prepared statements table. Thanks for the tip/pointer! On 8/22/17, 5:17 PM, "Alain Rastoul" wrote: On 08/22/2017 05:39 PM, Thakrar, Jayesh wrote:

Re: Cassandra crashes....

2017-08-22 Thread Thakrar, Jayesh
s.LZ4Compressor'} AND crc_check_chance = 1.0 AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance =

Cassandra crashes....

2017-08-22 Thread Thakrar, Jayesh
Hi All, We are somewhat new users to Cassandra 3.10 on Linux and wanted to ping the user group for their experiences. Our usage profile is batch jobs that load millions of rows to Cassandra every hour. And there are similar periodic batch jobs that read millions of rows and do some processing,

Re: READ Queries timing out.

2017-07-07 Thread Thakrar, Jayesh
Can you provide more details? E.g. table structure, the app used for the query, the query itself and the error message. Also get the output of the following commands from your cluster nodes (note that one command uses "." and the other "space" between keyspace and tablename) nodetool -h tables

Re: Question: Behavior of inserting a list multiple times with same timestamp

2017-06-30 Thread Thakrar, Jayesh
Thanks Sander - this helps get a better understanding! From: Fridtjof Sander Date: Friday, June 30, 2017 at 4:19 AM To: Vladimir Yudovin , "Thakrar, Jayesh" Cc: Subroto Barua , Zhongxiang Zheng , "user@cassandra.apache.org" Subject: Re: Question: Behavior of inserting a

Re: Question: Behavior of inserting a list multiple times with same timestamp

2017-06-20 Thread Thakrar, Jayesh
ECT * FROM test.test ; k | v ---+- 1 | [3] // = EXPECTED RESULT = From: Subroto Barua Date: Monday, June 19, 2017 at 11:09 PM To: "Thakrar, Jayesh" , Subroto Barua , Zhongxiang Zheng Cc: "user@cassandra.apache.org" Subject: Re: Question: Behavior of in

Re: Question: Behavior of inserting a list multiple times with same timestamp

2017-06-19 Thread Thakrar, Jayesh
Subroto, Cassandra docs say otherwise. Writing list data is accomplished with a JSON-style syntax. To write a record using INSERT, specify the entire list as a JSON array. Note: An INSERT will always replace the entire list. Maybe you can elaborate/shed some more light? Thanks, Jayesh Lists

Re: Apache Cassandra - Memory usage on server

2017-06-14 Thread Thakrar, Jayesh
Asad, The rest of the 42 GB of memory on your server is used by the filesystem buffer cache - see the "cached" column and the -/+ buffers/cache line. The OS (Linux) uses all free memory for filesystem buffer cache and if applications need memory, will relinquish it appropriately. To see the ac

Re: Question: Large partition warning

2017-06-14 Thread Thakrar, Jayesh
Thank you Kurt - that makes sense. Will certainly reduce it to 1024. Greatly appreciate your quick reply. Thanks, Jayesh From: kurt greaves Sent: Wednesday, June 14, 5:53 PM Subject: Re: Question: Large partition warning To: Fay Hou [Data Pipeline & Real-time Analytics] Cc: Thakrar, Ja

Question: Large partition warning

2017-06-14 Thread Thakrar, Jayesh
We are on Cassandra 2.2.5 and I am constantly seeing warning messages about large partitions in system.log even though our partition warning threshold is set to 4096 (MB). WARN [CompactionExecutor:43180] 2017-06-14 20:02:13,189 BigTableWriter.java:184 - Writing large partition tsg

Re: Effect of frequent mutations / memtable

2017-05-25 Thread Thakrar, Jayesh
"polling" cycle overhead. Furthermore, zk does not have the "overhead" of other things that Cassandra does. Honestly I am not familiar with Paxos and stuff, so can't speak to it. On 5/25/17, 3:40 PM, "Jan Algermissen" wrote: Hi Jayesh, On

Re: Effect of frequent mutations / memtable

2017-05-25 Thread Thakrar, Jayesh
Hi Jan, I would suggest looking at using Zookeeper for such a use case. See http://zookeeper.apache.org/doc/trunk/recipes.html for some examples. Zookeeper is used for such purposes in Apache HBase (active master), Apache Kafka (active controller), Apache Hadoop, etc. Look for the "Leader Elect

Re: Why are automatic anti-entropy repairs required when hinted hand-off is enabled?

2017-04-27 Thread Thakrar, Jayesh
me data everywhere increases. C*heers, --- Alain Rodriguez - @arodream - al...@thelastpickle.com France The Last Pickle - Apache Cassandra Consulting http://www.thelastpickle.com 2017-04-21 15:54 GMT+02:00 Thakrar, Jayesh <jthak...@conver

Re: Why are automatic anti-entropy repairs required when hinted hand-off is enabled?

2017-04-21 Thread Thakrar, Jayesh
e 2 failure cases I mentioned earlier, the only other way data can become inconsistent is an error when replicating the data in the background. Does Cassandra have a retry policy for internal replication? Is there a setting to change it? On Thu, Apr 6, 2017 at 10:54 PM, Thakrar,

Re: Why are automatic anti-entropy repairs required when hinted hand-off is enabled?

2017-04-06 Thread Thakrar, Jayesh
I had asked a similar/related question - on how to carry out repair, etc. - and got some useful pointers. I would highly recommend the YouTube video or the SlideShare link below (both are for the same presentation). https://www.youtube.com/watch?v=1Sz_K8UID6E http://www.slideshare.net/DataStax/rea

Re: [Cassandra 3.0.9] Cannot allocate memory

2017-03-23 Thread Thakrar, Jayesh
To: "user@cassandra.apache.org" Subject: RE: [Cassandra 3.0.9] Cannot allocate memory JVM config is as below: -Xms16G -Xmx16G -Xmn3000M What do I need to check in dmesg? From: Thakrar, Jayesh [mailto:jthak...@conversantmedia.com] Sent: 23 March 2017 03:39 To: Abhishek Kumar Maheshwari ; user@

RE: [Cassandra 3.0.9] Cannot allocate memory

2017-03-22 Thread Thakrar, Jayesh
And what is the configured max heap? Sometimes you may also be able to see some useful messages in "dmesg" output. Jayesh From: Abhishek Kumar Maheshwari Sent: Wednesday, March 22, 2017 5:05:14 PM To: Thakrar, Jayesh; user@cassandra.apache.org S

Re: [Cassandra 3.0.9] Cannot allocate memory

2017-03-22 Thread Thakrar, Jayesh
Is/are the Cassandra server(s) shared? E.g. do they run Mesos + Spark? From: Abhishek Kumar Maheshwari Date: Wednesday, March 22, 2017 at 12:45 AM To: "user@cassandra.apache.org" Subject: [Cassandra 3.0.9] Cannot allocate memory Hi all, I am using Cassandra 3.0.9. While I am adding new server

Re: repair performance

2017-03-20 Thread Thakrar, Jayesh
on threads being used during repair (according to compactionstats). Thank you also for your link recommendations. I will go through them. On Sat, 2017-03-18 at 16:54 +0000, Thakrar, Jayesh wrote: You changed compaction_throughput_mb_per_sec, but did you also increase concurrent_compactors? In re

Re: repair performance

2017-03-18 Thread Thakrar, Jayesh
You changed compaction_throughput_mb_per_sec, but did you also increase concurrent_compactors? In reference to the reaper and some other info I received on the user forum to my question on "nodetool repair", here are some useful links/slides - https://www.datastax.com/dev/blog/repair-in-cassa

Re: Does "nodetool repair" need to be run on each node for a given table?

2017-03-15 Thread Thakrar, Jayesh
Thank you Eric for helping out. The reason I sent the question a second time is that I did not see my question and the first reply from the user group. After I sent the question a second time, I got a personal flame from somebody else too and so examined my "spam" folders and that's where I fou

Does "nodetool repair" need to be run on each node for a given table?

2017-03-14 Thread Thakrar, Jayesh
I understand that the nodetool command connects to a specific server and for many of the commands, e.g. "info", "compactionstats", etc, the information is for that specific node. While for some other commands like "status", the info is for the whole cluster. So is "nodetool repair" that operates

Does "nodetool repair" need to be run on each node for a given table?

2017-03-13 Thread Thakrar, Jayesh
I understand that the nodetool command connects to a specific server and for many of the commands, e.g. "info", "compactionstats", etc, the information is for that specific node. While for some other commands like "status", the info is for the whole cluster. So is "nodetool repair" that operates

Re: Any way to control/limit off-heap memory?

2017-03-06 Thread Thakrar, Jayesh
22:54, Thakrar, Jayesh <jthak...@conversantmedia.com> wrote: I have a situation where the off-heap memory is bloating the JVM process memory, making it a candidate to be killed by the oom_killer. My server has 256 GB RAM and Cassandra heap memory of 16 GB. Below is the output of "no

Any way to control/limit off-heap memory?

2017-03-04 Thread Thakrar, Jayesh
I have a situation where the off-heap memory is bloating the JVM process memory, making it a candidate to be killed by the oom_killer. My server has 256 GB RAM and Cassandra heap memory of 16 GB. Below is the output of "nodetool info" and "nodetool compactionstats" for a culprit table which causes

Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-04 Thread Thakrar, Jayesh
s , "user@cassandra.apache.org" Subject: Re: OOM on Apache Cassandra on 30 Plus node at the same time I was looking at nodetool info across all nodes. Consistently JVM heap used is ~ 12GB and off heap is ~ 4-5GB. ________ From: Thakrar, Jayesh Sent: Saturday, M

Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-04 Thread Thakrar, Jayesh
STCS. LCS is not an option for us as we have frequent updates. Thanks, Shravan ________ From: Thakrar, Jayesh Sent: Friday, March 3, 2017 3:47:27 PM To: Joaquin Casares; user@cassandra.apache.org Subject: Re: OOM on Apache Cassandra on 30 Plus node at the same time

Re: OOM on Apache Cassandra on 30 Plus node at the same time

2017-03-03 Thread Thakrar, Jayesh
Had been fighting a similar battle, but am now over the hump for the most part. Get info on the server config (e.g. memory, cpu, free memory (free -g), etc). Run "nodetool info" on the nodes to get heap and off-heap sizes. Run "nodetool tablestats" or "nodetool tablestats ." on the key large tables. Ess

Re: Is periodic manual repair necessary?

2017-02-28 Thread Thakrar, Jayesh
mamd All are basically best effort. Commit logs get corrupt and only flush periodically. Bits rot on disk and while crossing networks. Read repair is async and only happens randomly. Hinted handoff stops after some time and is not guaranteed. On Monday, February 27, 2017, Thakrar, Jayesh

Re: Is periodic manual repair necessary?

2017-02-27 Thread Thakrar, Jayesh
e gc_grace_seconds after the data has been TTL'ed, you won't need an extra repair. 2017-02-27 18:29 GMT+01:00 Oskar Kjellin <oskar.kjel...@gmail.com>: Are you running multi-DC? Sent from my iPad On 27 Feb 2017 at 16:08, Thakrar, Jayesh <jthak...@conversantmed

Is periodic manual repair necessary?

2017-02-27 Thread Thakrar, Jayesh
Suppose I have an application, where there are no deletes, only 5-10% of rows being occasionally updated (and that too only once) and a lot of reads. Furthermore, I have replication = 3 and both read and write are configured for local_quorum. Occasionally, servers do go into maintenance. I und