Re: Cassandra crashes....

2017-08-22 Thread Thakrar, Jayesh
Yep, similar symptoms - but no, there's no OOM killer
Also, if you look in the gc log around the time of failure, heap usage was 
well below the 16 GB limit.

And if I look at the second-to-last GC entry before the crash, here's what we see.
And you will notice that cleaning up the 4 GB Eden (along with the other 
cleanup = full GC) took 3.48 seconds.

Hence I reduced the New space to a tiny amount (5% of the 16 GB heap = 800 MB).
With this setting, there have been no crashes so far and all STW GC pauses have 
been under 200 ms.
We borrowed this "approach" from doing something similar with HBase (where we 
had a lot of read/write requests too).
As part of the tuning, we have also reduced the tenure from the default of 15 to 2.
This pushes all medium/long-lived objects into old-gen very early in their 
lifecycle.
Having them around in Eden/survivor space would just have them playing 
hopscotch until they are done with 15 iterations.
And each of those copies from survivor 0 to survivor 1 is a STW operation, so 
having a smaller new space and a shorter tenure seems to be helping so far.
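For reference, the relevant knobs look roughly like this in jvm.options (a 
sketch of what we converged on for our 16 GB heap; the G1 new-size percentage 
flags are experimental and need unlocking, so adjust for your own setup):

-Xms16G
-Xmx16G
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200
-XX:+UnlockExperimentalVMOptions
-XX:G1NewSizePercent=5
-XX:G1MaxNewSizePercent=5
-XX:MaxTenuringThreshold=2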

  region size 8192K, 544 young (4456448K), 5 survivors (40960K)
Metaspace   used 43707K, capacity 45602K, committed 45696K, reserved 
1089536K
  class space used 5872K, capacity 6217K, committed 6272K, reserved 1048576K
2017-08-22T07:11:46.594+: 96808.824: [Full GC (System.gc())  
8196M->1670M(16G), 3.4842221 secs]
   [Eden: 4312.0M(4760.0M)->0.0B(4800.0M) Survivors: 40.0M->0.0B Heap: 
8196.3M(16.0G)->1670.6M(16.0G)], [Metaspace: 43707K->43488K(1089536K)]
Heap after GC invocations=20378 (full 3):
garbage-first heap   total 16777216K, used 1710741K [0x0003c000, 
0x0007c000, 0x0007c000)
  region size 8192K, 0 young (0K), 0 survivors (0K)
Metaspace   used 43488K, capacity 45128K, committed 45696K, reserved 
1089536K
  class space used 5798K, capacity 6098K, committed 6272K, reserved 1048576K
}
[Times: user=5.48 sys=0.54, real=3.48 secs]


From: kurt greaves 
Date: Tuesday, August 22, 2017 at 5:40 PM
To: User 
Subject: Re: Cassandra crashes

sounds like Cassandra is being killed by the oom killer. can you check dmesg to 
see if this is the case? sounds a bit absurd with 256g of memory but could be a 
config problem.


Re: Cassandra crashes....

2017-08-22 Thread Thakrar, Jayesh
The reason for the large number of prepared statements is the nature of the 
application.
One of the periodic jobs does lookups with a partial key (key prefix, not 
filtered queries) for thousands of rows.
Hence the large number of prepared statements.

Almost all of the queries are not needed once executed, so it's OK for the 
older prepared statements to be purged.

All the same, I will do some analysis on the prepared statements table.
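A first pass will probably be something like this from cqlsh (a sketch, 
against the table Alain mentions below):

SELECT count(*) FROM system.prepared_statements;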

Thanks for the tip/pointer!

On 8/22/17, 5:17 PM, "Alain Rastoul"  wrote:

On 08/22/2017 05:39 PM, Thakrar, Jayesh wrote:
> Surbhi and Fay,
>
> I agree we have plenty of RAM to spare.
>

Hi

At the very beginning of system.log there is a
INFO  [CompactionExecutor:487] 2017-08-21 23:21:01,684 
NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot 
allocate chunk of 1.000MiB
which comes from BufferPool exhaustion (several messages)
 From the source
file_cache_size_in_mb
 (Default: Smaller of 1/4 heap or 512) Total memory to use for 
SSTable-reading buffers.

So here in your configuration it is 512M; maybe you should set it to a 
higher value in your cassandra.yaml (1/4 of the heap => 4G)?
(also see https://issues.apache.org/jira/browse/CASSANDRA-11681, the 
default value may not be accurate)

Another strange thing is the number of prepared statements, which also 
gives errors: lots of messages like
WARN  [ScheduledTasks:1] 2017-08-22 07:09:25,009 QueryProcessor.java:105 
- 1 prepared statements discarded in the last minute because cache limit 
reached (64 MB)
...
on startup you see:
INFO  [main] 2017-08-22 12:50:13,787 QueryProcessor.java:162 - Preloaded 
13357 prepared statements

13K different prepared statements sounds like a lot...
an issue about that seems to be fixed in 3.11 
https://issues.apache.org/jira/browse/CASSANDRA-13641
Maybe you should truncate your system.prepared_statements and restart 
your node


HTH


-- 
best,
Alain




-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org


Re: Cassandra isn't compacting old files

2017-08-22 Thread Sotirios Delimanolis
I issued another major compaction just now and a brand new SSTable in Level 2 
has an Estimated droppable tombstone value of 0.64. I don't know how accurate 
that is.

On Tuesday, August 22, 2017, 9:33:34 PM PDT, Sotirios Delimanolis 
 wrote:

What do you mean by "a single SSTable"? SSTable size is set to 200MB and there 
are ~ 100 SSTables in that previous example in Level 3.
This previous example table doesn't have a TTL, but we do delete rows. I've 
since compacted the table so I can't provide the previous "Estimated droppable 
tombstones", but it was > 0.3. I've set the threshold to 0.25, but 
unchecked_tombstone_compaction is false. Perhaps setting it to true would 
eventually compact individual SSTables.
I agree that I am probably not creating enough data at the moment, but what got 
me into this situation in the first place? All (+/- a couple) SSTables in each 
level are last modified on the same date. 
On Tuesday, August 22, 2017, 5:16:27 PM PDT, kurt greaves 
 wrote:

LCS major compaction on 2.2 should compact each level to have a single SSTable. 
It seems more likely to me that you are simply not generating enough data to 
require compactions in L3 and most data is TTL'ing before it gets there. Out of 
curiosity, what does sstablemetadata report for  Estimated droppable tombstones 
on one of those tables, and what is your TTL?​

Re: Cassandra isn't compacting old files

2017-08-22 Thread Sotirios Delimanolis
What do you mean by "a single SSTable"? SSTable size is set to 200MB and there 
are ~ 100 SSTables in that previous example in Level 3.
This previous example table doesn't have a TTL, but we do delete rows. I've 
since compacted the table so I can't provide the previous "Estimated droppable 
tombstones", but it was > 0.3. I've set the threshold to 0.25, but 
unchecked_tombstone_compaction is false. Perhaps setting it to true would 
eventually compact individual SSTables.
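If I try it, the change would look something like this (a sketch; keyspace and 
table names are placeholders, the threshold and size values mirror the ones 
above):

ALTER TABLE my_keyspace.my_table
WITH compaction = {
    'class': 'LeveledCompactionStrategy',
    'sstable_size_in_mb': '200',
    'tombstone_threshold': '0.25',
    'unchecked_tombstone_compaction': 'true'
};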
I agree that I am probably not creating enough data at the moment, but what got 
me into this situation in the first place? All (+/- a couple) SSTables in each 
level are last modified on the same date. 
On Tuesday, August 22, 2017, 5:16:27 PM PDT, kurt greaves 
 wrote:

LCS major compaction on 2.2 should compact each level to have a single SSTable. 
It seems more likely to me that you are simply not generating enough data to 
require compactions in L3 and most data is TTL'ing before it gets there. Out of 
curiosity, what does sstablemetadata report for  Estimated droppable tombstones 
on one of those tables, and what is your TTL?​

Re: Cassandra isn't compacting old files

2017-08-22 Thread kurt greaves
LCS major compaction on 2.2 should compact each level to have a single
SSTable. It seems more likely to me that you are simply not generating
enough data to require compactions in L3 and most data is TTL'ing before it
gets there. Out of curiosity, what does sstablemetadata report for
 Estimated droppable tombstones on one of those tables, and what is your
TTL?​


Re: Bootstrapping a node fails because of compactions not keeping up

2017-08-22 Thread kurt greaves
What version are you running? 2.2 has an improvement that will retain
levels when streaming, so this shouldn't really happen. If you're on 2.1,
your best bet is to upgrade.


Re: Cassandra crashes....

2017-08-22 Thread kurt greaves
sounds like Cassandra is being killed by the oom killer. can you check
dmesg to see if this is the case? sounds a bit absurd with 256g of memory
but could be a config problem.
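For reference, something like this should show it (a sketch; run on the
Cassandra host):

dmesg | grep -i -E 'killed process|out of memory'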


Re: ExceptionInInitializerError encountered during startup

2017-08-22 Thread Russell Bateman

Reporting back...

I gave up and did this work in a stand-alone project where 
/EmbeddedCassandraServerHelper.startEmbeddedCassandra()/ works fine. 
I think now that Cassandra's dependency upon /slf4j/ clashes with what 
we've had to do in our greater product to regulate which version of 
/slf4j/ is included by the myriad, disparate components (see error log 
in original post). Maybe what we're doing, mostly, requiring 1.7.25 and 
excluding (via Maven) any linked-in /slf4j/ from any of many components, 
is getting us into trouble with Cassandra. You can't mix and match 
/slf4j/ versions. There has been lots of hair-pulling over /slf4j/ as it 
is and this is not a welcome development.


Thanks.


On 08/22/2017 12:34 PM, Russell Bateman wrote:


Thanks, Myrle. This confirms what I've tried so far. The problem may 
be an assumed requirement, such as the YAML file and perhaps 
/log4j-embedded-cassandra.properties/. But, I'm supplying both of 
those. This has something to do with /slf4j/ logging, a logger that 
has no name when it goes to get it. It's unclear how it was supposed 
to get a name.



On 08/22/2017 08:48 AM, Myrle Krantz wrote:

On Tue, Aug 22, 2017 at 4:21 PM, Russell Bateman  wrote:

As this was my first post to this forum, I wonder if someone would reply to
it if only to prove to myself that I've not posted to /dev/null as it were
even if there's no answer or the question is stupid, etc. (Note: I am
getting other forum posts, but maybe what I've posted didn't reach the
forum?)

Profuse thanks,

Russ

This will be my second post to this forum : o).  We're using embedded
Cassandra in our component tests as a junit ExternalResource, together
with datastax.  Here's some of what our start code looks like:
The original code can be found here:
https://github.com/mifosio/test/blob/develop/src/main/java/io/mifos/core/test/fixture/cassandra/CassandraInitializer.java

An example yaml file with the properties requested here can be found:
https://github.com/mifosio/portfolio/blob/develop/service/src/main/resources/application.yml

I use this hundreds of times a day and it works, but because our use
case is kind of special (multi-tenancy via keyspaces and multiple data
stores initialized as TestRules), you may have to noodle through what
we've done a bit to get your stuff working.

Greets,
Myrle

import java.util.concurrent.TimeUnit;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Cluster.Builder;
import org.cassandraunit.utils.EmbeddedCassandraServerHelper;

// ContactPointUtils and createKeyspaceSeshat() live in the linked repo;
// the cluster/useExistingDB fields are implied by the methods below.
public final class CassandraInitializer {
    private Cluster cluster;
    private boolean useExistingDB;

    public void initialize() throws Exception {
        // Cluster name and contact points are passed in as system properties.
        Builder clusterBuilder = (new Builder())
            .withClusterName(System.getProperty("cassandra.clusterName"));
        ContactPointUtils.process(clusterBuilder,
            System.getProperty("cassandra.contactPoints"));
        this.cluster = clusterBuilder.build();

        this.setup();
    }

    private void setup() throws Exception {
        if (!this.useExistingDB) {
            this.startEmbeddedCassandra();
            this.createKeyspaceSeshat();
        }
    }

    private void startEmbeddedCassandra() throws Exception {
        // Wait up to 30 seconds for the embedded server to come up.
        EmbeddedCassandraServerHelper.startEmbeddedCassandra(
            TimeUnit.SECONDS.toMillis(30L));
    }
}



On 08/18/2017 05:49 PM, Russell Bateman wrote:

Cassandra version 3.9, -unit version 3.1.3.2.

In my (first ever) unit test, I've coded:

@BeforeClass public static void initFakeCassandra() throws
InterruptedException, IOException, TTransportException
{
 EmbeddedCassandraServerHelper.startEmbeddedCassandra( 2L );
}

Execution crashes down inside at

 at org.apache.cassandra.transport.Server.start(Server.java:128)
 at java.util.Collections$SingletonSet.forEach(Collections.java:4767)
 at
org.apache.cassandra.service.NativeTransportService.start(NativeTransportService.java:128)
 at
org.apache.cassandra.service.CassandraDaemon.startNativeTransport(CassandraDaemon.java:649)
 at
org.apache.cassandra.service.CassandraDaemon.start(CassandraDaemon.java:511)
 at
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:616)
 at
org.cassandraunit.utils.EmbeddedCassandraServerHelper$1.run(EmbeddedCassandraServerHelper.java:129)
 at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException: name
 at
io.netty.util.internal.logging.AbstractInternalLogger.<init>(AbstractInternalLogger.java:39)
 at
io.netty.util.internal.logging.Slf4JLogger.<init>(Slf4JLogger.java:30)
 at
io.netty.util.internal.logging.Slf4JLoggerFactory.newInstance(Slf4JLoggerFactory.java:73)
 at
io.netty.util.internal.logging.InternalLoggerFactory.getInstance(InternalLoggerFactory.java:84)
 at
io.netty.util.internal.logging.InternalLoggerFactory.getInstance(InternalLoggerFactory.java:77)
 at io.netty.bootstrap.ServerBootstrap.<init>(ServerBootstrap.java:46)
 ... 10 more

I am following the tutorial at Baeldung. Not sure where to go from here.
The Stackoverflow response was not helpful to me; I probably don't know enough
yet.

Re: Cassandra crashes....

2017-08-22 Thread Alain Rastoul

On 08/22/2017 05:39 PM, Thakrar, Jayesh wrote:

Surbhi and Fay,

I agree we have plenty of RAM to spare.



Hi

At the very beginning of system.log there is a
INFO  [CompactionExecutor:487] 2017-08-21 23:21:01,684 
NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot 
allocate chunk of 1.000MiB

which comes from BufferPool exhaustion (several messages)
From the source
file_cache_size_in_mb
(Default: Smaller of 1/4 heap or 512) Total memory to use for 
SSTable-reading buffers.


So here in your configuration it is 512M; maybe you should set it to a 
higher value in your cassandra.yaml (1/4 of the heap => 4G)?
(also see https://issues.apache.org/jira/browse/CASSANDRA-11681, the 
default value may not be accurate)
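If it helps, the change is a one-liner in cassandra.yaml (a sketch; 4096 
assumes the 1/4-of-16G-heap sizing mentioned above):

file_cache_size_in_mb: 4096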


Another strange thing is the number of prepared statements, which also 
gives errors: lots of messages like
WARN  [ScheduledTasks:1] 2017-08-22 07:09:25,009 QueryProcessor.java:105 
- 1 prepared statements discarded in the last minute because cache limit 
reached (64 MB)

...
on startup you see:
INFO  [main] 2017-08-22 12:50:13,787 QueryProcessor.java:162 - Preloaded 
13357 prepared statements


13K different prepared statements sounds like a lot...
an issue about that seems to be fixed in 3.11 
https://issues.apache.org/jira/browse/CASSANDRA-13641
Maybe you should truncate your system.prepared_statements and restart 
your node
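Something like this from cqlsh, followed by a node restart (a sketch; verify 
the table exists on your version first):

TRUNCATE system.prepared_statements;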



HTH


--
best,
Alain

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Upgrade requirements for upgrading from cassandra 2.1.x to 2.2.x

2017-08-22 Thread Jon Haddad
NEWS.txt is the go-to spot for upgrade instructions, caveats, etc.
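For example, to read it for the 2.2 line without checking out the repo (URL 
assumes the Apache GitHub mirror):

curl -s https://raw.githubusercontent.com/apache/cassandra/cassandra-2.2/NEWS.txt | less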

Jon

> On Aug 22, 2017, at 2:46 PM, Chuck Reynolds  wrote:
> 
> Anyone?
>  
> From: "Chuck (me) Reynolds" 
> Reply-To: "user@cassandra.apache.org" 
> Date: Tuesday, August 22, 2017 at 9:40 AM
> To: "user@cassandra.apache.org" 
> Subject: Upgrade requirements for upgrading from cassandra 2.1.x to 2.2.x
>  
> Where can I find requirements to upgrade from Cassandra 2.1.x to 2.2.x?
>  
> I would like to know things like do I have to do an SStable upgrade or not.
>  
>  
> Thanks



Re: Upgrade requirements for upgrading from cassandra 2.1.x to 2.2.x

2017-08-22 Thread Chuck Reynolds
Anyone?

From: "Chuck (me) Reynolds" 
Reply-To: "user@cassandra.apache.org" 
Date: Tuesday, August 22, 2017 at 9:40 AM
To: "user@cassandra.apache.org" 
Subject: Upgrade requirements for upgrading from cassandra 2.1.x to 2.2.x

Where can I find requirements to upgrade from Cassandra 2.1.x to 2.2.x?

I would like to know things like do I have to do an SStable upgrade or not.


Thanks


Re: ExceptionInInitializerError encountered during startup

2017-08-22 Thread Russell Bateman
Thanks, Myrle. This confirms what I've tried so far. The problem may be 
an assumed requirement, such as the YAML file and perhaps 
/log4j-embedded-cassandra.properties/. But, I'm supplying both of those. 
This has something to do with /slf4j/ logging, a logger that has no name 
when it goes to get it. It's unclear how it was supposed to get a name.



On 08/22/2017 08:48 AM, Myrle Krantz wrote:

On Tue, Aug 22, 2017 at 4:21 PM, Russell Bateman  wrote:

As this was my first post to this forum, I wonder if someone would reply to
it if only to prove to myself that I've not posted to /dev/null as it were
even if there's no answer or the question is stupid, etc. (Note: I am
getting other forum posts, but maybe what I've posted didn't reach the
forum?)

Profuse thanks,

Russ

This will be my second post to this forum : o).  We're using embedded
Cassandra in our component tests as a junit ExternalResource, together
with datastax.  Here's some of what our start code looks like:
The original code can be found here:
https://github.com/mifosio/test/blob/develop/src/main/java/io/mifos/core/test/fixture/cassandra/CassandraInitializer.java

An example yaml file with the properties requested here can be found:
https://github.com/mifosio/portfolio/blob/develop/service/src/main/resources/application.yml

I use this hundreds of times a day and it works, but because our use
case is kind of special (multi-tenancy via keyspaces and multiple data
stores initialized as TestRules), you may have to noodle through what
we've done a bit to get your stuff working.

Greets,
Myrle

import java.util.concurrent.TimeUnit;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Cluster.Builder;
import org.cassandraunit.utils.EmbeddedCassandraServerHelper;

// ContactPointUtils and createKeyspaceSeshat() live in the linked repo;
// the cluster/useExistingDB fields are implied by the methods below.
public final class CassandraInitializer {
    private Cluster cluster;
    private boolean useExistingDB;

    public void initialize() throws Exception {
        // Cluster name and contact points are passed in as system properties.
        Builder clusterBuilder = (new Builder())
            .withClusterName(System.getProperty("cassandra.clusterName"));
        ContactPointUtils.process(clusterBuilder,
            System.getProperty("cassandra.contactPoints"));
        this.cluster = clusterBuilder.build();

        this.setup();
    }

    private void setup() throws Exception {
        if (!this.useExistingDB) {
            this.startEmbeddedCassandra();
            this.createKeyspaceSeshat();
        }
    }

    private void startEmbeddedCassandra() throws Exception {
        // Wait up to 30 seconds for the embedded server to come up.
        EmbeddedCassandraServerHelper.startEmbeddedCassandra(
            TimeUnit.SECONDS.toMillis(30L));
    }
}



On 08/18/2017 05:49 PM, Russell Bateman wrote:

Cassandra version 3.9, -unit version 3.1.3.2.

In my (first ever) unit test, I've coded:

@BeforeClass public static void initFakeCassandra() throws
InterruptedException, IOException, TTransportException
{
 EmbeddedCassandraServerHelper.startEmbeddedCassandra( 2L );
}

Execution crashes down inside at

 at org.apache.cassandra.transport.Server.start(Server.java:128)
 at java.util.Collections$SingletonSet.forEach(Collections.java:4767)
 at
org.apache.cassandra.service.NativeTransportService.start(NativeTransportService.java:128)
 at
org.apache.cassandra.service.CassandraDaemon.startNativeTransport(CassandraDaemon.java:649)
 at
org.apache.cassandra.service.CassandraDaemon.start(CassandraDaemon.java:511)
 at
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:616)
 at
org.cassandraunit.utils.EmbeddedCassandraServerHelper$1.run(EmbeddedCassandraServerHelper.java:129)
 at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException: name
 at
io.netty.util.internal.logging.AbstractInternalLogger.<init>(AbstractInternalLogger.java:39)
 at
io.netty.util.internal.logging.Slf4JLogger.<init>(Slf4JLogger.java:30)
 at
io.netty.util.internal.logging.Slf4JLoggerFactory.newInstance(Slf4JLoggerFactory.java:73)
 at
io.netty.util.internal.logging.InternalLoggerFactory.getInstance(InternalLoggerFactory.java:84)
 at
io.netty.util.internal.logging.InternalLoggerFactory.getInstance(InternalLoggerFactory.java:77)
 at io.netty.bootstrap.ServerBootstrap.<init>(ServerBootstrap.java:46)
 ... 10 more

I am following the tutorial at Baeldung. Not sure where to go from here.
Stackoverflow response was not helpful to me, I probably don't know enough
yet.

Thanks.



-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org





Re: Cassandra isn't compacting old files

2017-08-22 Thread Sotirios Delimanolis
Ignore the files missing those other components, that was confirmation bias :( 
I was sorting by date instead of by name and just assumed that something was 
wrong with Cassandra.
Here's an example table's SSTables, sorted by level, then by repaired status:

Level 0
SSTable [name=lb-432055-big-Data.db, level=0, repaired=false, instant=2017-08-22]

Level 1
SSTable [name=lb-431497-big-Data.db, level=1, repaired=false, instant=2017-08-17]
SSTable [name=lb-431496-big-Data.db, level=1, repaired=false, 
instant=2017-08-17]
SSTable [name=lb-431495-big-Data.db, level=1, repaired=false, 
instant=2017-08-17]
SSTable [name=lb-431498-big-Data.db, level=1, repaired=false, 
instant=2017-08-17]
SSTable [name=lb-431499-big-Data.db, level=1, repaired=false, 
instant=2017-08-17]
SSTable [name=lb-431501-big-Data.db, level=1, repaired=false, 
instant=2017-08-17]
SSTable [name=lb-431503-big-Data.db, level=1, repaired=false, 
instant=2017-08-17]
SSTable [name=lb-431500-big-Data.db, level=1, repaired=false, 
instant=2017-08-17]
SSTable [name=lb-431502-big-Data.db, level=1, repaired=false, 
instant=2017-08-17]
SSTable [name=lb-431504-big-Data.db, level=1, repaired=false, 
instant=2017-08-17]
SSTable [name=lb-426107-big-Data.db, level=1, repaired=true, instant=2017-07-07]
SSTable [name=lb-426105-big-Data.db, level=1, repaired=true, instant=2017-07-07]
SSTable [name=lb-426090-big-Data.db, level=1, repaired=true, instant=2017-07-07]
SSTable [name=lb-426092-big-Data.db, level=1, repaired=true, instant=2017-07-07]
SSTable [name=lb-426094-big-Data.db, level=1, repaired=true, instant=2017-07-07]
SSTable [name=lb-426096-big-Data.db, level=1, repaired=true, instant=2017-07-07]
SSTable [name=lb-426104-big-Data.db, level=1, repaired=true, instant=2017-07-07]
SSTable [name=lb-426102-big-Data.db, level=1, repaired=true, instant=2017-07-07]
SSTable [name=lb-426100-big-Data.db, level=1, repaired=true, instant=2017-07-07]

Level 2
SSTable [name=lb-423829-big-Data.db, level=2, repaired=false, instant=2017-06-23]
SSTable [name=lb-431505-big-Data.db, level=2, repaired=false, 
instant=2017-08-17]
SSTable [name=lb-423830-big-Data.db, level=2, repaired=false, 
instant=2017-06-23]
SSTable [name=lb-424559-big-Data.db, level=2, repaired=false, 
instant=2017-06-29]
SSTable [name=lb-424568-big-Data.db, level=2, repaired=false, 
instant=2017-06-29]
SSTable [name=lb-424567-big-Data.db, level=2, repaired=false, 
instant=2017-06-29]
SSTable [name=lb-424566-big-Data.db, level=2, repaired=false, 
instant=2017-06-29]
SSTable [name=lb-424561-big-Data.db, level=2, repaired=false, 
instant=2017-06-29]
SSTable [name=lb-424563-big-Data.db, level=2, repaired=false, 
instant=2017-06-29]
SSTable [name=lb-424562-big-Data.db, level=2, repaired=false, 
instant=2017-06-29]
SSTable [name=lb-423825-big-Data.db, level=2, repaired=false, 
instant=2017-06-23]
SSTable [name=lb-423823-big-Data.db, level=2, repaired=false, 
instant=2017-06-23]
SSTable [name=lb-424560-big-Data.db, level=2, repaired=false, 
instant=2017-06-29]
SSTable [name=lb-423824-big-Data.db, level=2, repaired=false, 
instant=2017-06-23]
SSTable [name=lb-423828-big-Data.db, level=2, repaired=false, 
instant=2017-06-23]
SSTable [name=lb-423826-big-Data.db, level=2, repaired=false, 
instant=2017-06-23]
SSTable [name=lb-423380-big-Data.db, level=2, repaired=false, 
instant=2017-06-20]
SSTable [name=lb-426057-big-Data.db, level=2, repaired=true, instant=2017-07-07]
SSTable [name=lb-426058-big-Data.db, level=2, repaired=true, instant=2017-07-07]
[...~60 more from 2017-07-07...]
SSTable [name=lb-425991-big-Data.db, level=2, repaired=true, instant=2017-07-07]
SSTable [name=lb-426084-big-Data.db, level=2, repaired=true, instant=2017-07-07]

Level 3
SSTable [name=lb-383142-big-Data.db, level=3, repaired=false, instant=2016-11-19]
SSTable [name=lb-383143-big-Data.db, level=3, repaired=false, instant=2016-11-19]
[...~40 more from 2016-11-19...]
SSTable [name=lb-383178-big-Data.db, level=3, repaired=false, 
instant=2016-11-19]
SSTable [name=lb-383188-big-Data.db, level=3, repaired=false, 
instant=2016-11-19]
SSTable [name=lb-425948-big-Data.db, level=3, repaired=false, 
instant=2017-07-07]
SSTable [name=lb-383179-big-Data.db, level=3, repaired=false, 
instant=2016-11-19]
SSTable [name=lb-383175-big-Data.db, level=3, repaired=false, instant=2016-11-19]
[...~30 more from 2016-11-19...]
SSTable [name=lb-383160-big-Data.db, level=3, repaired=false, 
instant=2016-11-19]
SSTable [name=lb-383181-big-Data.db, level=3, repaired=false, 
instant=2016-11-19]
SSTable [name=lb-383258-big-Data.db, level=3, repaired=true, instant=2016-11-19]
SSTable [name=lb-383256-big-Data.db, level=3, repaired=true, instant=2016-11-19]
SSTable [name=lb-386829-big-Data.db, level=3, repaired=true, instant=2016-11-30]
SSTable [name=lb-383251-big-Data.db, level=3, repaired=true, instant=2016-11-19]
SSTable [name=lb-383259-big-Data.db, level=3, repaired=true, instant=2016-11-19]
SSTable 

Re: Limit on having number of nodes in C* cluster

2017-08-22 Thread Jeff Jirsa
You can't, you typically need to add a whole new datacenter, replicate your 
data and traffic there, and decommission the old one 

-- 
Jeff Jirsa


> On Aug 22, 2017, at 8:56 AM, techpyaasa .  wrote:
> 
> How can I decrease tokens for existing nodes?
> Doesn't it create a problem?
> 
> 
> On Aug 22, 2017 7:22 PM, "Vladimir Yudovin"  wrote:
> Probably decreasing the number of tokens can help to manage a big cluster?
> 
> Best regards, Vladimir Yudovin, 
> Winguzone - Cloud Cassandra Hosting
> 
> 
>  On Mon, 21 Aug 2017 19:38:37 -0400 Eduard Tudenhoefner 
>  wrote 
> 
> We've been doing successful testing with multi-DC setups and 500 nodes per 
> DC. However, I agree with Jon here. Certain things are easier/faster with 
> e.g. 5x100 node clusters than 1x500 node cluster.
> 
> Cheers
> 
> On Mon, Aug 21, 2017 at 10:16 AM, Jon Haddad  
> wrote:
> As far as I know, those 75K nodes are not in a single cluster.  If memory 
> serves correctly (and this article seems to indicate that it does 
> http://www.techrepublic.com/article/apples-secret-nosql-sauce-includes-a-hefty-dose-of-cassandra/),
>  you’ll see clusters of 1,000 nodes.  
> 
> Things start to get a little hairy once you go above a couple hundred nodes.  
> I would rather run 5 100 node clusters than a single 500 node cluster.  In 
> theory, once you’ve built out the tooling to manage 2 clusters you should be 
> able to apply it to manage 20 (reality always gets in the way though…)
> 
> Jon
> 
> On Aug 21, 2017, at 9:15 AM, techpyaasa .  wrote:
> 
> Thanks a lot for the reply :)
> 
> On Aug 21, 2017 6:44 PM, "Vladimir Yudovin"  wrote:
> 
> Actually there are clusters of thousands of nodes: Some of the largest 
> production deployments include Apple's, with over 75,000 nodes storing over 
> 10 PB of data
> 
> Best regards, Vladimir Yudovin, 
> Winguzone - Cloud Cassandra Hosting
> 
> 
>  On Mon, 21 Aug 2017 08:35:37 -0400 techpyaasa .  
> wrote 
> 
> Hi 
> 
> Is there any limit on having number of nodes in c* cluster.
> Right now we have c*-2.1.17 cluster with 3 DCs each DC with 3 groups & each 
> group has 21 nodes.
> 
> We wanted to increase the cluster capacity by adding 6 nodes per group as 
> many of nodes disk usage crossed 65%.
> 
> So just wanted to clarify is there any limit/drawback having huge cluster/too 
> many nodes in a c* cluster
> 
> Thanks in advance
> TechPyaasa
> 
> 
> 
> 
> 
> 
> 


Re: Limit on having number of nodes in C* cluster

2017-08-22 Thread techpyaasa .
How can I decrease tokens for existing nodes?
Doesn't it create a problem?


On Aug 22, 2017 7:22 PM, "Vladimir Yudovin"  wrote:

Probably decreasing the number of tokens can help to manage a big cluster?

Best regards, Vladimir Yudovin,
*Winguzone  - Cloud Cassandra Hosting*


 On Mon, 21 Aug 2017 19:38:37 -0400 *Eduard Tudenhoefner
>*
wrote 

We've been doing successful testing with multi-DC setups and 500 nodes per
DC. However, I agree with Jon here. Certain things are easier/faster with
e.g. 5x100 node clusters than 1x500 node cluster.

Cheers

On Mon, Aug 21, 2017 at 10:16 AM, Jon Haddad 
wrote:

As far as I know, those 75K nodes are not in a single cluster.  If memory
serves correctly (and this article seems to indicate that it does
http://www.techrepublic.com/article/apples-secret-
nosql-sauce-includes-a-hefty-dose-of-cassandra/), you’ll see clusters of
1,000 nodes.

Things start to get a little hairy once you go above a couple hundred
nodes.  I would rather run 5 100 node clusters than a single 500 node
cluster.  In theory, once you’ve built out the tooling to manage 2 clusters
you should be able to apply it to manage 20 (reality always gets in the way
though…)

Jon

On Aug 21, 2017, at 9:15 AM, techpyaasa .  wrote:

Thanks a lot for the reply :)

On Aug 21, 2017 6:44 PM, "Vladimir Yudovin"  wrote:


Actually there are clusters of thousands of nodes: Some of the largest
production deployments include Apple's, with over 75,000 nodes storing over
10 PB of data 

Best regards, Vladimir Yudovin,
*Winguzone  - Cloud Cassandra Hosting*


 On Mon, 21 Aug 2017 08:35:37 -0400 *techpyaasa . >* wrote 

Hi

Is there any limit on having number of nodes in c* cluster.
Right now we have c*-2.1.17 cluster with 3 DCs each DC with 3 groups & each
group has 21 nodes.

We wanted to increase the cluster capacity by adding 6 nodes per group as
many of nodes disk usage crossed 65%.

So just wanted to clarify is there any limit/drawback having huge
cluster/too many nodes in a c* cluster

Thanks in advance
TechPyaasa


Re: Cassandra crashes....

2017-08-22 Thread Thakrar, Jayesh
We are using TWCS compaction.

Here's one sample table

CREATE TABLE ae.raw_logs_by_user (
dtm_id bigint,
company_id int,
source text,
status_id int,
log_date bigint,
uuid_least bigint,
uuid_most bigint,
profile_system_id int,
parent_message_id int,
parent_template_id int,
record text,
PRIMARY KEY (dtm_id, company_id, source, status_id, log_date, uuid_least, 
uuid_most, profile_system_id)
) WITH CLUSTERING ORDER BY (company_id ASC, source ASC, status_id ASC, log_date 
DESC, uuid_least ASC, uuid_most ASC, profile_system_id ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 
'compaction_window_size': '7', 'compaction_window_unit': 'DAYS', 
'max_threshold': '4', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
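For reference, the min/max compaction sstables = 4 change mentioned earlier in 
the thread was applied with an ALTER along these lines (a sketch; the values 
are the ones visible in the schema above):

ALTER TABLE ae.raw_logs_by_user
WITH compaction = {'class': 
'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 
'compaction_window_size': '7', 'compaction_window_unit': 'DAYS', 
'max_threshold': '4', 'min_threshold': '4'};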



From: "Fay Hou [Storage Service] ­" 
Date: Tuesday, August 22, 2017 at 10:52 AM
To: "Thakrar, Jayesh" 
Cc: "user@cassandra.apache.org" , Surbhi Gupta 

Subject: Re: Cassandra crashes

what kind compaction? LCS ?

On Aug 22, 2017 8:39 AM, "Thakrar, Jayesh" wrote:

Surbhi and Fay,



I agree we have plenty of RAM to spare.

However, our data load and compaction churn is so high (partially thanks to 
SSDs!) that it's causing too much GC pressure.

And as you know, the Eden space and survivor space cleanup is a STW operation - 
hence a larger heap will increase the gc pauses.



As for "what happens" during the crash - nothing.

It seems that the daemon just dies silently.



If you are interested, attached are the Cassandra system.log and the detailed 
gc log files.



system.log = Cassandra log (see line 424 - it’s the last line before the crash)



cassandra-gc.log.8.current = last gc log at the time of crash

Cassandra-gc.log.0 = gc log after startup



If you want to compare the "gc pauses", grep the gc log files for the word "stopped"

(e.g. grep stopped cassandra-gc.log.*)



Thanks for the quick replies!



Jayesh





From: Surbhi Gupta
Date: Tuesday, August 22, 2017 at 10:19 AM
To: "Thakrar, Jayesh", "user@cassandra.apache.org"
Subject: Re: Cassandra crashes



16GB heap is too small for G1GC . Try at least 32GB of heap size

On Tue, Aug 22, 2017 at 7:58 AM Fay Hou [Storage Service] wrote:

What errors do you see?

16gb of 256 GB . Heap is too small. I would give heap at least 160gb.





On Aug 22, 2017 7:42 AM, "Thakrar, Jayesh" wrote:

Hi All,

We are somewhat new users to Cassandra 3.10 on Linux and wanted to ping the 
user group for their experiences.

Our usage profile is batch jobs that load millions of rows to Cassandra every 
hour.
And there are similar periodic batch jobs that read millions of rows and do 
some processing, outputting the result to HDFS (no issues with HDFS).

We often see Cassandra daemons crash.
Key points of our environment are:
Pretty good servers: 54 cores (with hyperthreading), 256 GB RAM, 3.2 TB SSD 
drive
Compaction: TWCS compaction with 7 day windows as the data retention period is 
limited - about 120 days.
JDK: Java 1.8.0.71 and G1 GC
Heap Size: 16 GB
Large SSTables: 50 GB to 300+ GB

We see the daemons crash after some back-to-back long GCs (1.5 to 3.5 seconds).
Note that we had set the target for GC pauses to be 200 ms.

We have been somewhat able to tame the crashes by updating the TWCS compaction 
properties to have min/max compaction sstables = 4 and by drastically reducing 
the size of the New/Eden space (to 5% of heap space = 800 MB).
It's been about 12 hours and our stop-the-world gc pauses are under 90 ms.
Since the servers have more than sufficient resources, we are not seeing any 
noticeable performance impact.

Is this kind of tuning normal/expected?

Thanks,
Jayesh


Re: Cassandra crashes....

2017-08-22 Thread Fay Hou [Storage Service]
what kind compaction? LCS ?

On Aug 22, 2017 8:39 AM, "Thakrar, Jayesh" 
wrote:

Surbhi and Fay,



I agree we have plenty of RAM to spare.

However, our data load and compaction churn is so high (partially thanks to
SSDs!) that it's causing too much GC pressure.

And as you know, the Eden space and survivor space cleanup is a STW operation -
hence a larger heap will increase the gc pauses.



As for "what happens" during the crash - nothing.

It seems that the daemon just dies silently.



If you are interested, attached are the Cassandra system.log and the
detailed gc log files.



system.log = Cassandra log (see line 424 - it’s the last line before the
crash)



cassandra-gc.log.8.current = last gc log at the time of crash

Cassandra-gc.log.0 = gc log after startup



If you want to compare the "gc pauses", grep the gc log files for the word "stopped"

(e.g. grep stopped cassandra-gc.log.*)



Thanks for the quick replies!



Jayesh





*From: *Surbhi Gupta 
*Date: *Tuesday, August 22, 2017 at 10:19 AM
*To: *"Thakrar, Jayesh" , "
user@cassandra.apache.org" 
*Subject: *Re: Cassandra crashes



16GB heap is too small for G1GC . Try at least 32GB of heap size

On Tue, Aug 22, 2017 at 7:58 AM Fay Hou [Storage Service] <
fay...@coupang.com> wrote:

What errors do you see?

16gb of 256 GB . Heap is too small. I would give heap at least 160gb.





On Aug 22, 2017 7:42 AM, "Thakrar, Jayesh" wrote:

Hi All,

We are somewhat new users to Cassandra 3.10 on Linux and wanted to ping the
user group for their experiences.

Our usage profile is batch jobs that load millions of rows to Cassandra
every hour.

And there are similar periodic batch jobs that read millions of rows and do
some processing, outputting the result to HDFS (no issues with HDFS).

We often see Cassandra daemons crash.

Key points of our environment are:

*Pretty good servers:* 54 cores (with hyperthreading), 256 GB RAM, 3.2 TB
SSD drive

*Compaction:* TWCS compaction with 7 day windows as the data retention
period is limited - about 120 days.

*JDK:* Java 1.8.0.71 and G1 GC

*Heap Size:* 16 GB

*Large SSTables:* 50 GB to 300+ GB

We see the daemons crash after some back-to-back long GCs (1.5 to 3.5
seconds).

Note that we had set the target for GC pauses to be 200 ms.

We have been somewhat able to tame the crashes by updating the TWCS
compaction properties to have min/max compaction sstables = 4 and by
drastically reducing the size of the New/Eden space (to 5% of heap space =
800 MB).

It's been about 12 hours and our stop-the-world gc pauses are under 90 ms.

Since the servers have more than sufficient resources, we are not seeing
any noticeable performance impact.

Is this kind of tuning normal/expected?

Thanks,

Jayesh


Upgrade requirements for upgrading from cassandra 2.1.x to 2.2.x

2017-08-22 Thread Chuck Reynolds
Where can I find requirements to upgrade from Cassandra 2.1.x to 2.2.x?

I would like to know things like do I have to do an SStable upgrade or not.


Thanks


Re: Cassandra crashes....

2017-08-22 Thread Surbhi Gupta
16GB heap is too small for G1GC . Try at least 32GB of heap size
On Tue, Aug 22, 2017 at 7:58 AM Fay Hou [Storage Service] <
fay...@coupang.com> wrote:

> What errors do you see?
> 16gb of 256 GB . Heap is too small. I would give heap at least 160gb.
>
> On Aug 22, 2017 7:42 AM, "Thakrar, Jayesh" wrote:
>
> Hi All,
>
> We are somewhat new users to Cassandra 3.10 on Linux and wanted to ping
> the user group for their experiences.
>
> Our usage profile is batch jobs that load millions of rows to Cassandra
> every hour.
>
> And there are similar periodic batch jobs that read millions of rows and
> do some processing, outputting the result to HDFS (no issues with HDFS).
>
> We often see Cassandra daemons crash.
>
> Key points of our environment are:
>
> *Pretty good servers:* 54 cores (with hyperthreading), 256 GB RAM, 3.2 TB
> SSD drive
>
> *Compaction:* TWCS compaction with 7 day windows as the data retention
> period is limited - about 120 days.
>
> *JDK:* Java 1.8.0.71 and G1 GC
>
> *Heap Size:* 16 GB
>
> *Large SSTables:* 50 GB to 300+ GB
>
> We see the daemons crash after some back-to-back long GCs (1.5 to 3.5
> seconds).
>
> Note that we had set the target for GC pauses to be 200 ms.
>
> We have been somewhat able to tame the crashes by updating the TWCS
> compaction properties to have min/max compaction sstables = 4 and by
> drastically reducing the size of the New/Eden space (to 5% of heap space =
> 800 MB).
>
> It's been about 12 hours and our stop-the-world gc pauses are under 90 ms.
>
> Since the servers have more than sufficient resources, we are not seeing
> any noticeable performance impact.
>
> Is this kind of tuning normal/expected?
>
> Thanks,
>
> Jayesh
>


Re: Cassandra crashes....

2017-08-22 Thread Fay Hou [Storage Service]
What errors do you see?
16gb of 256 GB . Heap is too small. I would give heap at least 160gb.

On Aug 22, 2017 7:42 AM, "Thakrar, Jayesh" 
wrote:

Hi All,



We are somewhat new users to Cassandra 3.10 on Linux and wanted to ping the
user group for their experiences.



Our usage profile is batch jobs that load millions of rows to Cassandra
every hour.

And there are similar periodic batch jobs that read millions of rows and do
some processing, outputting the result to HDFS (no issues with HDFS).



We often see Cassandra daemons crash.

Key points of our environment are:

*Pretty good servers:* 54 cores (with hyperthreading), 256 GB RAM, 3.2 TB
SSD drive

*Compaction:* TWCS compaction with 7 day windows as the data retention
period is limited - about 120 days.

*JDK: *Java 1.8.0.71 and G1 GC

*Heap Size:* 16 GB

*Large SSTables:* 50 GB to 300+ GB

We see the daemons crash after some back-to-back long GCs (1.5 to 3.5
seconds).

Note that we had set the target for GC pauses to be 200 ms



We have been somewhat able to tame the crashes by updating the TWCS
compaction properties

to have min/max compaction sstables = 4 and by drastically reducing the
size of the New/Eden space (to 5% of heap space = 800 MB).

It's been about 12 hours and our stop-the-world gc pauses are under 90 ms.

Since the servers have more than sufficient resources, we are not seeing
any noticeable performance impact.



Is this kind of tuning normal/expected?



Thanks,

Jayesh


Re: Cassandra crashes....

2017-08-22 Thread Jeff Jirsa
You typically don't want to set the eden space when you're using G1

-- 
Jeff Jirsa


> On Aug 22, 2017, at 7:42 AM, Thakrar, Jayesh  
> wrote:
> 
> Hi All,
>  
> We are somewhat new users to Cassandra 3.10 on Linux and wanted to ping the 
> user group for their experiences.
>  
> Our usage profile is batch jobs that load millions of rows to Cassandra 
> every hour.
> And there are similar periodic batch jobs that read millions of rows and do 
> some processing, outputting the result to HDFS (no issues with HDFS).
>  
> We often see Cassandra daemons crash.
> Key points of our environment are:
> Pretty good servers: 54 cores (with hyperthreading), 256 GB RAM, 3.2 TB SSD 
> drive
> Compaction: TWCS compaction with 7 day windows as the data retention period 
> is limited - about 120 days.
> JDK: Java 1.8.0.71 and G1 GC
> Heap Size: 16 GB
> Large SSTables: 50 GB to 300+ GB
> 
> We see the daemons crash after some back-to-back long GCs (1.5 to 3.5 
> seconds).
> Note that we had set the target for GC pauses to be 200 ms
>  
> We have been somewhat able to tame the crashes by updating the TWCS 
> compaction properties
> to have min/max compaction sstables = 4 and by drastically reducing the size 
> of the New/Eden space (to 5% of heap space = 800 MB).
> It's been about 12 hours and our stop-the-world gc pauses are under 90 ms.
> Since the servers have more than sufficient resources, we are not seeing any 
> noticeable performance impact.
>  
> Is this kind of tuning normal/expected?
>  
> Thanks,
> Jayesh
>  


Re: ExceptionInInitializerError encountered during startup

2017-08-22 Thread Myrle Krantz
On Tue, Aug 22, 2017 at 4:21 PM, Russell Bateman  wrote:
> As this was my first post to this forum, I wonder if someone would reply to
> it if only to prove to myself that I've not posted to /dev/null as it were
> even if there's no answer or the question is stupid, etc. (Note: I am
> getting other forum posts, but maybe what I've posted didn't reach the
> forum?)
>
> Profuse thanks,
>
> Russ

This will be my second post to this forum : o).  We're using embedded
Cassandra in our component tests as a junit ExternalResource, together
with datastax.  Here's some of what our start code looks like:
The original code can be found here:
https://github.com/mifosio/test/blob/develop/src/main/java/io/mifos/core/test/fixture/cassandra/CassandraInitializer.java

An example yaml file with the properties requested here can be found:
https://github.com/mifosio/portfolio/blob/develop/service/src/main/resources/application.yml

I use this hundreds of times a day and it works, but because our use
case is kind of special (multi-tenancy via keyspaces and multiple data
stores initialized as TestRules), you may have to noodle through what
we've done a bit to get your stuff working.

Greets,
Myrle

import java.util.concurrent.TimeUnit;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Cluster.Builder;
import org.cassandraunit.utils.EmbeddedCassandraServerHelper;

// ContactPointUtils and createKeyspaceSeshat() live in the linked repo;
// the cluster/useExistingDB fields are implied by the methods below.
public final class CassandraInitializer {
  private Cluster cluster;
  private boolean useExistingDB;

  public void initialize() throws Exception {
    // Cluster name and contact points are passed in as system properties.
    Builder clusterBuilder = (new Builder())
        .withClusterName(System.getProperty("cassandra.clusterName"));
    ContactPointUtils.process(clusterBuilder,
        System.getProperty("cassandra.contactPoints"));
    this.cluster = clusterBuilder.build();

    this.setup();
  }

  private void setup() throws Exception {
    if (!this.useExistingDB) {
      this.startEmbeddedCassandra();
      this.createKeyspaceSeshat();
    }
  }

  private void startEmbeddedCassandra() throws Exception {
    // Wait up to 30 seconds for the embedded server to come up.
    EmbeddedCassandraServerHelper.startEmbeddedCassandra(
        TimeUnit.SECONDS.toMillis(30L));
  }
}


> On 08/18/2017 05:49 PM, Russell Bateman wrote:
>
> Cassandra version 3.9, -unit version 3.1.3.2.
>
> In my (first ever) unit test, I've coded:
>
> @BeforeClass public static void initFakeCassandra() throws
> InterruptedException, IOException, TTransportException
> {
> EmbeddedCassandraServerHelper.startEmbeddedCassandra( 2L );
> }
>
> Execution crashes down inside at
>
> at org.apache.cassandra.transport.Server.start(Server.java:128)
> at java.util.Collections$SingletonSet.forEach(Collections.java:4767)
> at
> org.apache.cassandra.service.NativeTransportService.start(NativeTransportService.java:128)
> at
> org.apache.cassandra.service.CassandraDaemon.startNativeTransport(CassandraDaemon.java:649)
> at
> org.apache.cassandra.service.CassandraDaemon.start(CassandraDaemon.java:511)
> at
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:616)
> at
> org.cassandraunit.utils.EmbeddedCassandraServerHelper$1.run(EmbeddedCassandraServerHelper.java:129)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException: name
> at
> io.netty.util.internal.logging.AbstractInternalLogger.<init>(AbstractInternalLogger.java:39)
> at
> io.netty.util.internal.logging.Slf4JLogger.<init>(Slf4JLogger.java:30)
> at
> io.netty.util.internal.logging.Slf4JLoggerFactory.newInstance(Slf4JLoggerFactory.java:73)
> at
> io.netty.util.internal.logging.InternalLoggerFactory.getInstance(InternalLoggerFactory.java:84)
> at
> io.netty.util.internal.logging.InternalLoggerFactory.getInstance(InternalLoggerFactory.java:77)
> at io.netty.bootstrap.ServerBootstrap.<init>(ServerBootstrap.java:46)
> ... 10 more
>
> I am following the tutorial at Baeldung. Not sure where to go from here.
> Stackoverflow response was not helpful to me, I probably don't know enough
> yet.
>
> Thanks.
>
>

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Cassandra crashes....

2017-08-22 Thread Thakrar, Jayesh
Hi All,

We are somewhat new users to Cassandra 3.10 on Linux and wanted to ping the 
user group for their experiences.

Our usage profile is batch jobs that load millions of rows to Cassandra every 
hour.
And there are similar periodic batch jobs that read millions of rows and do some 
processing, outputting the result to HDFS (no issues with HDFS).

We often see Cassandra daemons crash.
Key points of our environment are:
Pretty good servers: 54 cores (with hyperthreading), 256 GB RAM, 3.2 TB SSD 
drive
Compaction: TWCS compaction with 7 day windows as the data retention period is 
limited - about 120 days.
JDK: Java 1.8.0.71 and G1 GC
Heap Size: 16 GB
Large SSTables: 50 GB to 300+ GB

We see the daemons crash after some back-to-back long GCs (1.5 to 3.5 seconds).
Note that we had set the target for GC pauses to be 200 ms.

We have been somewhat able to tame the crashes by updating the TWCS compaction 
properties
to have min/max compaction sstables = 4 and by drastically reducing the size of 
the New/Eden space (to 5% of heap space = 800 MB).
It's been about 12 hours and our stop-the-world gc pauses are under 90 ms.
Since the servers have more than sufficient resources, we are not seeing any 
noticeable performance impact.

Is this kind of tuning normal/expected?

Thanks,
Jayesh



Re: ExceptionInInitializerError encountered during startup

2017-08-22 Thread Russell Bateman
As this was my first post to this forum, I wonder if someone would reply 
to it if only to prove to myself that I've not posted to /dev/null as 
it were, even if there's no answer or the question is stupid, etc. (Note: 
I am getting other forum posts, but maybe what I've posted didn't reach 
the forum?)


Profuse thanks,

Russ


On 08/18/2017 05:49 PM, Russell Bateman wrote:


Cassandra version 3.9, -unit version 3.1.3.2.

In my (first ever) unit test, I've coded:

@BeforeClass public static void initFakeCassandra() throws 
InterruptedException, IOException, TTransportException

{
EmbeddedCassandraServerHelper.startEmbeddedCassandra( 2L );
}

Execution crashes down inside at

at org.apache.cassandra.transport.Server.start(Server.java:128)
at java.util.Collections$SingletonSet.forEach(Collections.java:4767)
at 
org.apache.cassandra.service.NativeTransportService.start(NativeTransportService.java:128)
at 
org.apache.cassandra.service.CassandraDaemon.startNativeTransport(CassandraDaemon.java:649)
at 
org.apache.cassandra.service.CassandraDaemon.start(CassandraDaemon.java:511)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:616)
at 
org.cassandraunit.utils.EmbeddedCassandraServerHelper$1.run(EmbeddedCassandraServerHelper.java:129)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException: name
at 
io.netty.util.internal.logging.AbstractInternalLogger.<init>(AbstractInternalLogger.java:39)
at 
io.netty.util.internal.logging.Slf4JLogger.<init>(Slf4JLogger.java:30)
at 
io.netty.util.internal.logging.Slf4JLoggerFactory.newInstance(Slf4JLoggerFactory.java:73)
at 
io.netty.util.internal.logging.InternalLoggerFactory.getInstance(InternalLoggerFactory.java:84)
at 
io.netty.util.internal.logging.InternalLoggerFactory.getInstance(InternalLoggerFactory.java:77)
at 
io.netty.bootstrap.ServerBootstrap.<init>(ServerBootstrap.java:46)

... 10 more

I am following the tutorial at Baeldung. Not sure where to go from 
here. The Stackoverflow response was not helpful to me; I probably don't 
know enough yet.


Thanks.





Re: Limit on having number of nodes in C* cluster

2017-08-22 Thread Vladimir Yudovin
Probably decreasing the number of tokens can help to manage a big cluster?



Best regards, Vladimir Yudovin, 

Winguzone - Cloud Cassandra Hosting






 On Mon, 21 Aug 2017 19:38:37 -0400 Eduard Tudenhoefner 
eduard.tudenhoef...@datastax.com wrote 




We've been doing successful testing with multi-DC setups and 500 nodes per DC. 
However, I agree with Jon here. Certain things are easier/faster with e.g. 
5x100 node clusters than 1x500 node cluster.



Cheers



On Mon, Aug 21, 2017 at 10:16 AM, Jon Haddad jonathan.had...@gmail.com 
wrote:

As far as I know, those 75K nodes are not in a single cluster.  If memory 
serves correctly (and this article seems to indicate that it does 
http://www.techrepublic.com/article/apples-secret-nosql-sauce-includes-a-hefty-dose-of-cassandra/),
 you’ll see clusters of 1,000 nodes.  



Things start to get a little hairy once you go above a couple hundred nodes.  I 
would rather run 5 100 node clusters than a single 500 node cluster.  In 
theory, once you’ve built out the tooling to manage 2 clusters you should be 
able to apply it to manage 20 (reality always gets in the way though…)



Jon



On Aug 21, 2017, at 9:15 AM, techpyaasa . techpya...@gmail.com wrote:



Thanks a lot for the reply :)



On Aug 21, 2017 6:44 PM, "Vladimir Yudovin" vla...@winguzone.com wrote:



Actually there are clusters of thousands of nodes: Some of the largest 
production deployments include Apple's, with over 75,000 nodes storing over 10 
PB of data



Best regards, Vladimir Yudovin, 

Winguzone - Cloud Cassandra Hosting






 On Mon, 21 Aug 2017 08:35:37 -0400 techpyaasa . 
techpya...@gmail.com wrote 




Hi 



Is there any limit on having number of nodes in c* cluster.

Right now we have a c*-2.1.17 cluster with 3 DCs, each DC with 3 groups & each 
group has 21 nodes.



We wanted to increase the cluster capacity by adding 6 nodes per group as many 
of nodes disk usage crossed 65%.



So just wanted to clarify is there any limit/drawback having huge cluster/too 
many nodes in a c* cluster



Thanks in advance

TechPyaasa




Bootstrapping a node fails because of compactions not keeping up

2017-08-22 Thread Stefano Ortolani
Hi all,

I am trying to bootstrap a node without success due to running out of space.
Average node size is 260GB with lots of LCS tables (overall data ~3.5 TB)
Each node is also configured with a 1TB disk, including the bootstrapping
node.

After 12 hours the bootstrapping node fails with more than 2000 compactions
pending.
I have tried to increase the compaction threads, un-throttle the
throughput, but no luck.
Also I've tried reducing the streaming throughput (on all nodes) as much
as possible (1 Mb/sec).
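For reference, these are the standard nodetool knobs I've been using (values
mirror what I described above; 0 means unthrottled):

nodetool setcompactionthroughput 0
nodetool setstreamthroughput 1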

The thing is that even if I manage to reduce the streaming throughput,
compactions are still piling up, leaving me with no other choice than
pausing the whole streaming process altogether, which AFAIK is not possible
:/

I have searched the mailing list and this
https://groups.google.com/forum/#!topic/nosql-databases/GbcFMUUJ7XU reminds
me of my current situation. Unfortunately I am still left with the
following two questions:

1) Is it possible to pause all streams so to give the boostrapping node
enough time to catch up?
2) I don't understand why the disk fills up so fast. Considering the amount
of LCS tables I was even ready to blame LCS and the fact that the first
compaction at L0 is done with STCS, but 1 TB is way more than twice the
amount of data the node should own in theory, so something else might be
responsible for the over streaming.

Thanks in advance!
Stefano Ortolani


Commit log archive configuration

2017-08-22 Thread Spyros Tzovairis
Hello there,

in 
https://docs.datastax.com/en/cassandra/3.0/cassandra/configuration/configLogArchive.html
 it is mentioned that:
“The commit log is archived at node startup and when a commit log is written to 
disk, or at a specified point-in-time”

How do I configure the archive mechanism to run at a specified point-in-time?

Many thanks
Spyros
-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Commit log archiver

2017-08-22 Thread Spyros Tzovairis
Hello there,

I’m running cassandra 3.11.0 and I have configured 
commitlog_archiving.properties to run a script.
That script rsyncs the commit log files to a backup server. If the rsync is 
successful then the script exits with status 0 and the commit log file is 
deleted from the cassandra node. If the rsync is NOT successful then the script 
exits with status 1 and the commit log file is NOT deleted from the cassandra 
node.
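For context, the wiring is roughly this (a sketch; the script path, backup
host, and directories are placeholders for the real ones):

# commitlog_archiving.properties
archive_command=/usr/local/bin/archive_commitlog.sh %path %name

# /usr/local/bin/archive_commitlog.sh
#!/bin/sh
# $1 = %path (full path to the segment), $2 = %name (file name)
exec rsync -a "$1" backup-host:/backup/commitlogs/"$2"
# rsync's exit status becomes the script's: 0 lets Cassandra delete the
# segment, non-zero keeps it on the node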

Therefore when the backup server is down the commit log files start to pile up. 
Once the backup server is up again new commit log files will be synced to the 
backup server and deleted locally. The old commit log files (the ones that 
failed to sync earlier) do not get synced at all while cassandra is up. They 
will only get synced after the cassandra node is restarted.

Is this what is expected? I would personally expect that all commit log syncs 
would resume at some point and the archiving status would return to normal.

Many thanks
Spyros
-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Getting all unique keys

2017-08-22 Thread Avi Levi
Thanks Christophe, we will definitely consider that in the future.

On Mon, Aug 21, 2017 at 3:01 PM, Christophe Schmitz <
christo...@instaclustr.com> wrote:

> Hi Avi,
>
> The spark-project documentation is quite good, as well as the
> spark-cassandra-connector github project, which contains some basic
> examples you can easily get inspired from. A few random advice you might
> find usefull:
> - You will want one spark worker on each node, and a spark master on
> either one of the node, or on a separate node.
> - Pay close attention at your port configuration (firewall) as the spark
> error log does not always give you the right hint.
> - Pay close attention at your heap size. Make sure to configure your heap
> size such as Cassandra heap size + spark heap size < your node memory (take
> into account Cassandra off heap usage if enabled, OS etc...)
> - If your Cassandra data center is used in production, make sure you
> throttle read / write from Spark, pay attention to your latencies, and
> consider using a separate analytic cassandra data center if you get serious
> with Spark.
> - More or less everyone I know find that writing spark jobs in scala is
> natural, while writing them in java is painful :D
>
> Getting spark running will be a bit of an investment at the beginning, but
> overall you will find out it allows you to run queries you can't naturally
> run in Cassandra, like the one you described.
>
> Cheers,
>
> Christophe
>
> On 21 August 2017 at 16:16, Avi Levi  wrote:
>
>> Thanks Christophe,
>> we didn't want to add too many moving parts but is sound like a good
>> solution. do you have any reference / link that I can look at ?
>>
>> Cheers
>> Avi
>>
>> On Mon, Aug 21, 2017 at 3:43 AM, Christophe Schmitz <
>> christo...@instaclustr.com> wrote:
>>
>>> Hi Avi,
>>>
>>> Have you thought of using Spark for that work? If you collocate the
>>> spark workers on each Cassandra nodes, the spark-cassandra connector will
>>> split automatically the token range for you in such a way that each spark
>>> worker only hit the Cassandra local node. This will also be done in
>>> parallel. Should be much faster that way.
>>>
>>> Cheers,
>>> Christophe
>>>
>>>
>>> On 21 August 2017 at 01:34, Avi Levi  wrote:
>>>
 Thank you very much, one question. You wrote that I do not need
 distinct here since it's part of the primary key, but only the
 combination is unique (*PRIMARY KEY (id, timestamp)*). Also, if I
 take the last token and feed it back as you showed, wouldn't I get
 overlapping boundaries?

 On Sun, Aug 20, 2017 at 6:18 PM, Eric Stevens 
 wrote:

> You should be able to fairly efficiently iterate all the partition
> keys like:
>
> select id, token(id) from table where token(id) >=
> -9204925292781066255 limit 1000;
>  id | system.token(id)
> +--
> ...
>  0xb90ea1db5c29f2f6d435426dccf77cca6320fac9 | -7821793584824523686
>
> Take the last token you receive and feed it back in, skipping
> duplicates from the previous page (on the unlikely chance that you have 
> two
> ID's with a token collision on the page boundary):
>
> select id, token(id) from table where token(id) >=
> -7821793584824523686 limit 1000;
>  id | system.token(id)
> +-
> ...
>  0xc6289d729c9087fb5a1fe624b0b883ab82a9bffe | -434806781044590339
>
> Continue until you have no more results.  You don't really need
> distinct here: it's part of your primary key, it must already be distinct.
>
> If you want to parallelize it, split the ring into *n* ranges and
> include it as an upper bound for each segment.
>
> select id, token(id) from table where token(id) >=
> -9204925292781066255 AND token(id) < $rangeUpperBound limit 1000;
>
>
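> A sketch of that full-ring iteration with the DataStax Java driver (3.x
> API; assumes Murmur3Partitioner, whose tokens start at Long.MIN_VALUE;
> keyspace, contact point, and page size are placeholders):
>
> import java.util.HashSet;
> import java.util.Set;
> import com.datastax.driver.core.Cluster;
> import com.datastax.driver.core.ResultSet;
> import com.datastax.driver.core.Row;
> import com.datastax.driver.core.Session;
>
> public final class UniqueKeyScanner {
>   public static void main(String[] args) {
>     try (Cluster cluster = Cluster.builder()
>             .addContactPoint("127.0.0.1").build();
>          Session session = cluster.connect("my_keyspace")) {
>       long lastToken = Long.MIN_VALUE;
>       Set<String> seenAtBoundary = new HashSet<>();
>       while (true) {
>         ResultSet rs = session.execute(
>             "SELECT id, token(id) FROM my_table"
>                 + " WHERE token(id) >= ? LIMIT 1000",
>             lastToken);
>         int newIds = 0;
>         for (Row row : rs) {
>           String id = row.getString("id");
>           long t = row.getLong(1);
>           if (t == lastToken && seenAtBoundary.contains(id)) {
>             continue; // duplicate carried over from the page boundary
>           }
>           newIds++;
>           // ... aggregate the records for this id here ...
>           if (t != lastToken) {
>             lastToken = t;          // boundary moved to a new token
>             seenAtBoundary.clear();
>           }
>           seenAtBoundary.add(id);
>         }
>         if (newIds == 0) {
>           break; // a page with nothing new means the ring is exhausted
>         }
>       }
>     }
>   }
> }
>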
> On Sun, Aug 20, 2017 at 12:33 AM Avi Levi  wrote:
>
>> I need to get all unique keys (not the complete primary key, just the
>> partition key) in order to aggregate all the relevant records of that key
>> and apply some calculations on it.
>>
>> *CREATE TABLE my_table (
>>
>> id text,
>>
>> timestamp bigint,
>>
>> value double,
>>
>> PRIMARY KEY (id, timestamp) )*
>>
>> I know that to query like this
>>
>> *SELECT DISTINCT id FROM my_table *
>>
>> is not very efficient, but how about the approach presented here of
>> sending queries in parallel and using the token:
>>
>> *SELECT DISTINCT id FROM my_table WHERE token(id) >= 
>> -9204925292781066255 AND token(id) <= -9223372036854775808; *