Re: which astyanax version to use?

2015-11-17 Thread Lijun Huang
Thank you Minh. So does that mean that if I want to use Cassandra 2.1+, no version of Astyanax will be compatible with it? Because we are already using Astyanax, it may be heavy work to switch from Astyanax to the Datastax Java Driver. On Wed, Nov 18, 2015 at 11:52 AM, Minh Do wrote: > The latest v

Re: which astyanax version to use?

2015-11-17 Thread Minh Do
The latest version of Astyanax won't work with Cassandra 2.1+, so you are better off using the Java Driver from Datastax. /Minh On Tue, Nov 17, 2015 at 7:29 PM, Lijun Huang wrote: > Hi All, > > I have a similar problem: if I use Cassandra 2.1, which > Astyanax version is the best one
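For anyone sizing up the migration Lijun mentions, a minimal sketch of a query through the DataStax Java Driver (2.x-era API) looks roughly like this; the contact point and query are placeholders, not details taken from this thread.

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;

    public class DriverQuickStart {
        public static void main(String[] args) {
            // Cluster and Session are Closeable in driver 2.x, so try-with-resources works.
            try (Cluster cluster = Cluster.builder()
                    .addContactPoint("127.0.0.1")   // placeholder contact point
                    .build();
                 Session session = cluster.connect()) {
                Row row = session.execute("SELECT release_version FROM system.local").one();
                System.out.println("Connected to Cassandra " + row.getString("release_version"));
            }
        }
    }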

Re: which astyanax version to use?

2015-11-17 Thread Lijun Huang
Hi All, I have a similar problem: if I use Cassandra 2.1, which Astyanax version is the best one for me? The versions listed on the Astyanax GitHub pages leave me a little confused, so I would appreciate hearing from anyone with experience here. Thanks in advance. Thanks, Lijun Huang -- Origina

Re: Repair Hangs while requesting Merkle Trees

2015-11-17 Thread Anuj Wadehra
Thanks Bryan !! The connection is in ESTABLISHED state on one end and completely missing at the other end (in another DC). Yes, we can revisit TCP tuning. But the problem is node specific, so I'm not sure whether tuning is the culprit. Thanks Anuj Sent from Yahoo Mail on Android From:"Bryan Cheng" Date

Re: Nodetool rebuild on vnodes enabled

2015-11-17 Thread Robert Coli
On Tue, Nov 17, 2015 at 3:24 PM, cass savy wrote: > I am exploring vnodes on a DSE Spark-enabled DC. I added new nodes with 64 > vnodes, stream throughput 100 MB instead of the default 200 MB, socket_timeout set > to 1 hr. > 1) What version of Cassandra (please give the version of Apache Cassandra, not DSE)? 2

unsubscribe

2015-11-17 Thread Johan Sandström

Nodetool rebuild on vnodes enabled

2015-11-17 Thread cass savy
I am exploring vnodes on a DSE Spark-enabled DC. I added new nodes with 64 vnodes, stream throughput 100 MB instead of the default 200 MB, and socket_timeout set to 1 hr. Nodetool rebuild on the new node with vnodes has been running for days and not completing. Nodetool netstats says it is streaming files from 4

Re: Repair Hangs while requesting Merkle Trees

2015-11-17 Thread Bryan Cheng
Ah OK, I might have misunderstood you. The streaming socket should not be in play during Merkle tree generation (validation compaction). It may come into play during Merkle tree exchange - that I'm not sure about. You can read a bit more here: https://issues.apache.org/jira/browse/CASSANDRA-8611. Regardl

Re: handling down node cassandra 2.0.15

2015-11-17 Thread Robert Coli
On Tue, Nov 17, 2015 at 4:33 AM, Anuj Wadehra wrote: > Only if gc_grace_seconds hasn't passed since the failure. If your machine > is down for more than gc_grace_seconds you need to delete the data > directory and go with auto_bootstrap = true. > Since CASSANDRA-6961 you can: 1) bring up the

Re: Ingesting Large Number of files

2015-11-17 Thread Robert Coli
On Tue, Nov 17, 2015 at 6:32 AM, Tushar Agrawal wrote: > We get a periodic bulk load (twice a month) in the form of delimited data files. > We get about 10K files with an average size of 50 MB. Each record is a row in a > Cassandra table. > http://www.pythian.com/blog/bulk-loading-options-for-cassandra/ =

Re: Help diagnosing performance issue

2015-11-17 Thread Robert Coli
On Tue, Nov 17, 2015 at 11:08 AM, Sebastian Estevez < sebastian.este...@datastax.com> wrote: > Your SSTables are probably falling out of the page cache on the smaller > nodes and your slow disks are killing your latencies. > +1 most likely. Are the heaps the same size on both machines? =Rob

Re: Help diagnosing performance issue

2015-11-17 Thread Sebastian Estevez
Hi, Your SSTables are probably falling out of the page cache on the smaller nodes and your slow disks are killing your latencies. Check whether this is the case with pcstat: https://github.com/tobert/pcstat All the best, Sebastián Estéve

Re: Help diagnosing performance issue

2015-11-17 Thread Antoine Bonavita
Hello, As I have not heard from anybody on the list, I guess I did not provide the right kind of information or did not ask the right question. Things I forgot to mention in my previous email: * I checked the logs without noticing anything out of the ordinary. Memtable flushes occur ever

Re: Ingesting Large Number of files

2015-11-17 Thread areddyraja
This is about 5 GB at a time. Let's say the network speed is 200 Mb/sec and you have a 10-node cluster. Choose your partition key in such a way that writes land on all nodes. That means about 0.5 GB per node. At 200 Mb/sec network speed, 500 MB takes 500*8/200 = 20 secs total time for

Ingesting Large Number of files

2015-11-17 Thread Tushar Agrawal
We get a periodic bulk load (twice a month) in the form of delimited data files. We get about 10K files with an average size of 50 MB. Each record is a row in a Cassandra table. What is the fastest way to ingest this data into Cassandra? Thank you, Tushar
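One common answer from that era is to build SSTables offline with CQLSSTableWriter and then stream them in with sstableloader, which avoids pushing every row through the normal CQL write path. The sketch below is only illustrative: the keyspace, table, delimiter, and output directory are assumptions, not details from this thread.

    import org.apache.cassandra.dht.Murmur3Partitioner;
    import org.apache.cassandra.io.sstable.CQLSSTableWriter;

    import java.io.BufferedReader;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class DelimitedFileLoader {
        public static void main(String[] args) throws Exception {
            String schema = "CREATE TABLE demo.records (id text PRIMARY KEY, payload text)";
            String insert = "INSERT INTO demo.records (id, payload) VALUES (?, ?)";

            CQLSSTableWriter writer = CQLSSTableWriter.builder()
                    .inDirectory("/tmp/sstables/demo/records")   // directory must already exist
                    .forTable(schema)
                    .using(insert)
                    .withPartitioner(new Murmur3Partitioner())
                    .build();

            // Each line of the delimited file becomes one row; "|" is an assumed delimiter.
            try (BufferedReader reader =
                     Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    String[] fields = line.split("\\|");
                    writer.addRow(fields[0], fields[1]);
                }
            }
            writer.close();
            // The generated SSTables can then be streamed into the cluster, e.g.:
            //   sstableloader -d <contact_point> /tmp/sstables/demo/records
        }
    }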

Re: Hotspots on Time Series based Model

2015-11-17 Thread areddyraja
13 MB seems to be perfectly fine in our experience; we have keys that can take more than 100 MB. Sent from my iPhone > On 17-Nov-2015, at 7:47 PM, Yuri Shkuro wrote: > > You can also subdivide the hourly partition further by adding an artificial > "bucket" field to the partition key, which you populate with a

Re: Hotspots on Time Series based Model

2015-11-17 Thread Yuri Shkuro
You can also subdivide the hourly partition further by adding an artificial "bucket" field to the partition key, which you populate with a random number, say between 0 and 10. When you query, you fan out 10 queries, one for each bucket, and you need to do a manual merge of the results. This way you pay
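A rough sketch of this bucket technique with the DataStax Java Driver is below; the table name, columns, and bucket count of 10 are illustrative assumptions, not taken from the thread.

    import com.datastax.driver.core.ResultSetFuture;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;

    import java.util.ArrayList;
    import java.util.Date;
    import java.util.List;
    import java.util.concurrent.ThreadLocalRandom;

    public class BucketedTimeSeries {
        private static final int BUCKETS = 10;

        // Write path: a random bucket spreads one hour's writes across 10 partitions.
        static void writeEvent(Session session, int year, int month, int day, int hour,
                               Date logTs, String logId) {
            int bucket = ThreadLocalRandom.current().nextInt(BUCKETS);
            session.execute(
                "INSERT INTO event_log_by_date (year, month, day, hour, bucket, log_ts, log_id) " +
                "VALUES (?, ?, ?, ?, ?, ?, ?)",
                year, month, day, hour, bucket, logTs, logId);
        }

        // Read path: fan out one query per bucket and merge the results client-side.
        static List<Row> readHour(Session session, int year, int month, int day, int hour) {
            List<ResultSetFuture> futures = new ArrayList<>();
            for (int bucket = 0; bucket < BUCKETS; bucket++) {
                futures.add(session.executeAsync(
                    "SELECT log_ts, log_id FROM event_log_by_date " +
                    "WHERE year = ? AND month = ? AND day = ? AND hour = ? AND bucket = ?",
                    year, month, day, hour, bucket));
            }
            List<Row> merged = new ArrayList<>();
            for (ResultSetFuture f : futures) {
                merged.addAll(f.getUninterruptibly().all());   // re-sort by log_ts if ordering matters
            }
            return merged;
        }
    }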

Re: Hotspots on Time Series based Model

2015-11-17 Thread Jack Krupansky
I'd be more comfortable keeping partition size below 10 MB, but the more critical factor is the write rate. In a technical sense, a single node (and its replicas) and a single partition will be a hotspot, since all writes for an extended period of time will go to that single node and partition (for on

Re: handling down node cassandra 2.0.15

2015-11-17 Thread Anuj Wadehra
Only if gc_grace_seconds hasn't passed since the failure. If your machine is down for more than gc_grace_seconds you need to delete the data directory and go with auto_bootstrap = true. Thanks Anuj Sent from Yahoo Mail on Android From:"Anishek Agarwal" Date:Tue, 17 Nov, 2015 at 10:52 am Su

Re: Hotspots on Time Series based Model

2015-11-17 Thread DuyHai Doan
"Will the partition on PRIMARY KEY ((YEAR, MONTH, DAY, HOUR) cause any hotspot issues on a node given the hourly data size is ~13MB ?" 13MB/partition is quite small, you should be fine. One thing to be careful is the memtable flush frequency and appropriate compaction tuning to avoid having one p

Hotspots on Time Series based Model

2015-11-17 Thread Chandra Sekar KR
Hi, I have a time-series based table with the below structure and partition size/volumetrics. The purpose of this table is to enable range-based scans on log_ts and filtering on log_id, so it can then be used against the main table (EVENT_LOG) for checking the actual data. The EVENT_LOG_BY_DATE ac
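The actual DDL is truncated from this digest, but based on the description (hourly partitions, range scans on log_ts, log_id as the lookup into EVENT_LOG) a hypothetical reconstruction of the lookup table might look like the statements below, issued here through the DataStax Java Driver; every name and value is an assumption.

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    import java.util.Date;

    public class EventLogByDateSketch {
        public static void main(String[] args) {
            try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
                 Session session = cluster.connect("demo")) {
                session.execute(
                    "CREATE TABLE IF NOT EXISTS event_log_by_date (" +
                    "  year int, month int, day int, hour int," +
                    "  log_ts timestamp, log_id text," +
                    "  PRIMARY KEY ((year, month, day, hour), log_ts, log_id))");
                // Range scan within one hourly partition; the returned log_ids would then be
                // used to fetch the full rows from the main EVENT_LOG table.
                // Bind values cover 2015-11-17 10:00 to 10:30 UTC (epoch millis).
                session.execute(
                    "SELECT log_ts, log_id FROM event_log_by_date " +
                    "WHERE year = ? AND month = ? AND day = ? AND hour = ? " +
                    "AND log_ts >= ? AND log_ts < ?",
                    2015, 11, 17, 10,
                    new Date(1447754400000L), new Date(1447756200000L));
            }
        }
    }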

Re: Devcenter & C* 3.0 Connection Error.

2015-11-17 Thread Alexandre Dutra
Hello, Unfortunately, even with their most recent versions, both the Java driver and DevCenter are incompatible with C* 3.0. Both teams are actively working to release compatible versions in the next few days. Regards, Alexandre Dutra On Tue, Nov 17, 2015 at 12:16 AM Michael Shuler wrote: > On 1