Re: nodetool repair fails after expansion

2013-10-04 Thread Dave Cowen
I should clarify that we are running Cassandra 1.1.12.

Dave


On Fri, Oct 4, 2013 at 2:08 PM, Dave Cowen  wrote:

> We're testing expanding a 4-node cluster into an 8-node cluster, and we
> keep running into issues with the repair process near the end.
>
> We're bringing up nodes 1-by-1 into the cluster, retokening nodes for an
> 8-node configuration, running nodetool cleanup on the nodes after each
> retokening, and then increasing the replication factor to 5. This all works
> without issue, and the cluster appears to be healthy in that 8-node
> configuration with a replication factor of 5.
>
> However, when we then run nodetool repair on the nodes, it will at some
> point stall, even when being run on one of the new nodes.
>
> It doesn't appear to stall while it's performing a compaction or
> transferring CF data. We've monitored compactionstats and netstats closely,
> and things always stall when a repair command is started, i.e.:
>
> [2013-10-02 23:19:39,254] Starting repair command #9, repairing 5 ranges
> for keyspace ourkeyspace
>
> The last message from AntiEntropyService is usually something to the
> effect of:
>
> <190>Oct  3 00:01:02 myhost.com 1970947950 [AntiEntropySessions:24] INFO
>  org.apache.cassandra.service.AntiEntropyService  - [repair
> #9b17d310-2bbd-11e3--e06ec6c436ff] session completed successfully
>
> ... and then the next repair never starts. There's nothing in the logs
> that looks related.
>
> Where this occurs is arbitrary. If I run repair on individual CFs within
> ourkeyspace, some will succeed and some will fail, but if we start over
> and do the 4-node to 8-node expansion again, things will fail at a
> different place.
>
> Advice as to what to look at next?
>
> Thanks,
>
> Dave
>


nodetool repair fails after expansion

2013-10-04 Thread Dave Cowen
We're testing expanding a 4-node cluster into an 8-node cluster, and we
keep running into issues with the repair process near the end.

We're bringing up nodes 1-by-1 into the cluster, retokening nodes for an
8-node configuration, running nodetool cleanup on the nodes after each
retokening, and then increasing the replication factor to 5. This all works
without issue, and the cluster appears to be healthy in that 8-node
configuration with a replication factor of 5.

However, when we then run nodetool repair on the nodes, it will at some
point stall, even when being run on one of the new nodes.

It doesn't appear to stall while it's performing a compaction or
transferring CF data. We've monitored compactionstats and netstats closely,
and things always stall when a repair command is started, i.e.:

[2013-10-02 23:19:39,254] Starting repair command #9, repairing 5 ranges
for keyspace ourkeyspace

The last message from AntiEntropyService is usually something to the effect
of:

<190>Oct  3 00:01:02 myhost.com 1970947950 [AntiEntropySessions:24] INFO
 org.apache.cassandra.service.AntiEntropyService  - [repair
#9b17d310-2bbd-11e3--e06ec6c436ff] session completed successfully

... and then the next repair never starts. There's nothing in the logs
that looks related.

Where this occurs is arbitrary. If I run repair on individual CFs within
ourkeyspace, some will succeed and some will fail, but if we start over
and do the 4-node to 8-node expansion again, things will fail at a
different place.

Advice as to what to look at next?

Thanks,

Dave


Re: Facebook Cassandra

2013-10-04 Thread Blair Zajac

On 10/04/2013 11:46 AM, Baskar Duraikannu wrote:

Good evening. We have been using Cassandra for a while. I have been asked
the question "Why did Facebook drop Cassandra?" over and over again, and I
could not find a good answer to it on the internet.

Could you please help me with this question?


Quora seems to be a good place to ask these questions:

http://www.quora.com/Why-did-Facebook-pick-HBase-instead-of-Cassandra-for-the-new-messaging-platform
http://www.quora.com/search?q=cassandra+facebook

Blair



Re: DataStax driver with Scala/Akka

2013-10-04 Thread Richard Rodseth
For the record, I was depending on a project that was pulling in an older
r09 version of Guava. Thanks again for the help.
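
For anyone who hits the same conflict, here is a minimal build.sbt sketch of
pinning Guava so the driver compiles against a version that still satisfies
it; the guava 14.0.1 coordinates below are an assumption for illustration,
not taken from this thread, so check what cassandra-driver-core actually
requires:

// build.sbt (sketch, assumed versions): force a single recent Guava so the
// DataStax driver's ListenableFuture-based API resolves correctly.
libraryDependencies ++= Seq(
  "com.datastax.cassandra" % "cassandra-driver-core" % "1.0.3",
  "com.google.guava"       % "guava"                 % "14.0.1"   // assumed version
)

// Evict any transitive r09-era Guava pulled in by other dependencies.
dependencyOverrides += "com.google.guava" % "guava" % "14.0.1"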


On Thu, Oct 3, 2013 at 7:29 PM, Richard Rodseth  wrote:

> Thanks very much. Your pom works for me too, so that gives me a good
> reference point.
>
>
> On Thu, Oct 3, 2013 at 11:30 AM, Giancarlo Silvestrin <
> giancar...@gmail.com> wrote:
>
>> I created a sample pom.xml that successfully compiles cassandra.scala
>> using Maven 3; it might be useful to compare with your own:
>> https://gist.github.com/gsilvestrin/6814624
>>
>>
>> On Thu, Oct 3, 2013 at 2:16 PM, Richard Rodseth wrote:
>>
>>> Thanks for the offer. I wouldn't be able to share the whole pom, and
>>> this task has been de-prioritized, but if I can find the time I will try to
>>> create a simpler test case.
>>>
>>> I just tried adding the exclusion to the pom dependency, but it didn't
>>> make a difference.
>>>
>>> sbt:
>>>
>>>   "com.datastax.cassandra"  % "cassandra-driver-core" % "1.0.1"
>>>  exclude("org.xerial.snappy", "snappy-java"),
>>>"org.xerial.snappy"   % "snappy-java"   % "1.0.5", //
>>> https://github.com/ptaoussanis/carmine/issues/5
>>>
>>> maven:
>>>
>>> <dependency>
>>>   <groupId>com.datastax.cassandra</groupId>
>>>   <artifactId>cassandra-driver-core</artifactId>
>>>   <version>1.0.1</version>
>>>   <exclusions>
>>>     <exclusion>
>>>       <groupId>org.xerial.snappy</groupId>
>>>       <artifactId>snappy-java</artifactId>
>>>     </exclusion>
>>>   </exclusions>
>>> </dependency>
>>>
>>> <dependency>
>>>   <groupId>org.xerial.snappy</groupId>
>>>   <artifactId>snappy-java</artifactId>
>>>   <version>1.0.5</version>
>>> </dependency>
>>>
>>>
>>> On Thu, Oct 3, 2013 at 10:36 AM, Giancarlo Silvestrin <
>>> giancar...@gmail.com> wrote:
>>>
 Richard,

 I'm using akka + cassandra as well. I copied cassandra.scala to my
 local project and it compiled fine using SBT.

 If you can share your pom.xml I can try to help you.

 -- Giancarlo

 On Thu, Oct 3, 2013 at 12:14 PM, Richard Rodseth wrote:

> I wanted to try the async Cassandra driver from DataStax, in a
> Scala/Akka app, so I took a look at the Akka Cassandra Activator template.
>
> https://github.com/eigengo/activator-akka-cassandra
>
>  I copied cassandra.scala from the template (it contains a conversion
> from ResultSetFuture to a Scala future).
>
> Unfortunately that file doesn't compile in my project - the error is:
>
> value addListener is not a member of
> com.datastax.driver.core.ResultSetFuture
>
>
> I have the following in my Maven pom.xml
>
> <dependency>
>   <groupId>com.datastax.cassandra</groupId>
>   <artifactId>cassandra-driver-core</artifactId>
>   <version>1.0.3</version>
> </dependency>
>
> Probably something silly. Any pointers appreciated. Any other
> non-blocking Akka/Cassandra samples out there for me to look at?
>
> Thanks in advance
>
> Richard
>


>>>
>>
>
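
For reference, a minimal sketch of the kind of ResultSetFuture-to-Scala-Future
bridge that the activator template's cassandra.scala provides; the object and
method names below are illustrative, not the template's exact code. It relies
on driver 1.0.x exposing ResultSetFuture as a Guava ListenableFuture, whose
addListener(Runnable, Executor) is exactly what goes missing when an old Guava
wins on the classpath:

import java.util.concurrent.Executor
import scala.concurrent.{ Future, Promise }
import scala.language.implicitConversions
import scala.util.Try
import com.datastax.driver.core.{ ResultSet, ResultSetFuture }

object CassandraFutures {
  // addListener(Runnable, Executor) comes from Guava's ListenableFuture,
  // which ResultSetFuture extends in driver 1.0.x.
  implicit def toScalaFuture(rsf: ResultSetFuture): Future[ResultSet] = {
    val promise = Promise[ResultSet]()
    rsf.addListener(new Runnable {
      def run(): Unit = promise.complete(Try(rsf.getUninterruptibly))
    }, new Executor { def execute(command: Runnable): Unit = command.run() })
    promise.future
  }
}

With this implicit in scope, the result of session.executeAsync(...) can be
mapped and composed like any other Scala Future.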


Facebook Cassandra

2013-10-04 Thread Baskar Duraikannu
Good evening. We have been using Cassandra for a while. I have been asked the
question "Why did Facebook drop Cassandra?" over and over again, and I could
not find a good answer to it on the internet.
Could you please help me with this question?
--
Regards,
Baskar Duraikannu
  

Re: Increased read timeouts during rolling upgrade to C* 1.2

2013-10-04 Thread Paulo Motta
One more piece of information to help troubleshoot the issue:

During the "nodetool drain" operation just before the upgrade, instead of
just stopping accepting new writes, the node actually shuts itself down.
This bug was also reported in this other thread:
http://mail-archives.apache.org/mod_mbox/cassandra-user/201303.mbox/%3CCAFDWQMTrYm7hBxXKoW8+eVKfNE6zvjW2h8_BSVGmOL7=grd...@mail.gmail.com%3E

Since I started Cassandra 1.2 only a few seconds before Cassandra 1.1 died
(after the nodetool drain), I'm afraid there wasn't sufficient time for the
remaining nodes to update the metadata about the "downed" node. So when the
upgraded node was restarted, the metadata in the other nodes was still
referring to the previous version of the same node, so this may have caused
the handshake problem, and consequently the read timeout. Does that theory
make sense?


2013/10/4 Robert Coli 

> On Fri, Oct 4, 2013 at 9:09 AM, Paulo Motta wrote:
>
>> I manually tried to insert and retrieve some data into both the newly
>> upgraded nodes and the old nodes, and the behavior was very unstable:
>> sometimes it worked, sometimes it didn't (TimedOutException), so I don't
>> think it was a network problem.
>>
>> The number of read timeouts diminished as the number of upgraded nodes
>> increased, until it reached stability. The logs were showing the following
>> messages periodically:
>>
>> ...
>
>> Two similar issues were reported, but without satisfactory responses:
>>
>> -
>> http://stackoverflow.com/questions/15355115/rolling-upgrade-for-cassandra-1-0-9-cluster-to-1-2-1
>> - https://issues.apache.org/jira/browse/CASSANDRA-5740
>>
>
> Both of these issues relate to upgrading from 1._0_.x to 1.2.x, which is
> not supported.
>
> Were I you, I would summarize the above experience in a JIRA ticket, as
> 1.1.x to 1.2.x should be a supported operation and should not unexpectedly
> result in decreased availability during the upgrade.
>
> =Rob
>



-- 
Paulo Ricardo

-- 
European Master in Distributed Computing
Royal Institute of Technology - KTH
Instituto Superior Técnico - IST
http://paulormg.com


Re: Increased read timeouts during rolling upgrade to C* 1.2

2013-10-04 Thread Robert Coli
On Fri, Oct 4, 2013 at 9:09 AM, Paulo Motta wrote:

> I manually tried to insert and retrieve some data into both the newly
> upgraded nodes and the old nodes, and the behavior was very unstable:
> sometimes it worked, sometimes it didn't (TimedOutException), so I don't
> think it was a network problem.
>
> The number of read timeouts diminished as the number of upgraded nodes
> increased, until it reached stability. The logs were showing the following
> messages periodically:
>
> ...

> Two similar issues were reported, but without satisfactory responses:
>
> -
> http://stackoverflow.com/questions/15355115/rolling-upgrade-for-cassandra-1-0-9-cluster-to-1-2-1
> - https://issues.apache.org/jira/browse/CASSANDRA-5740
>

Both of these issues relate to upgrading from 1._0_.x to 1.2.x, which is
not supported.

Were I you, I would summarize the above experience in a JIRA ticket, as
1.1.x to 1.2.x should be a supported operation and should not unexpectedly
result in decreased availability during the upgrade.

=Rob


Increased read timeouts during rolling upgrade to C* 1.2

2013-10-04 Thread Paulo Motta
Hello,

I have isolated one of our data centers to simulate a rolling restart
upgrade from C* 1.1.10 to 1.2.10. We replayed our production traffic to the
C* nodes during the upgrade and observed an increased number of read
timeouts during the upgrade process.

I executed nodetool drain before upgrading each node, and during the
upgrade "nodetool ring" was showing that node as DOWN, as expected. After
each upgrade all nodes were showing the upgraded node as UP, so apparently
all nodes were communicating fine.

I manually tried to insert and retrieve some data into both the newly
upgraded nodes and the old nodes, and the behavior was very unstable:
sometimes it worked, sometimes it didn't (TimedOutException), so I don't
think it was a network problem.

The number of read timeouts diminished as the number of upgraded nodes
increased, until it reached stability. The logs were showing the following
messages periodically:

 INFO [HANDSHAKE-/10.176.249.XX] 2013-10-03 17:36:16,948
OutboundTcpConnection.java (line 399) Handshaking version with
/10.176.249.XX
 INFO [HANDSHAKE-/10.176.182.YY] 2013-10-03 17:36:17,280
OutboundTcpConnection.java (line 408) Cannot handshake version with
/10.176.182.YY
 INFO [HANDSHAKE-/10.176.182.YY] 2013-10-03 17:36:17,280
OutboundTcpConnection.java (line 399) Handshaking version with
/10.176.182.YY
 INFO [HANDSHAKE-/10.188.13.ZZ] 2013-10-03 17:36:17,510
OutboundTcpConnection.java (line 408) Cannot handshake version with
/10.188.13.ZZ
 INFO [HANDSHAKE-/10.188.13.ZZ] 2013-10-03 17:36:17,511
OutboundTcpConnection.java (line 399) Handshaking version with /10.188.13.ZZ
DEBUG [WRITE-/54.215.70.YY] 2013-10-03 18:01:50,237
OutboundTcpConnection.java (line 338) Target max version is -2147483648; no
version information yet, will retry
TRACE [HANDSHAKE-/10.177.14.XX] 2013-10-03 18:01:50,237
OutboundTcpConnection.java (line 406) Cannot handshake version with
/10.177.14.XX
java.nio.channels.AsynchronousCloseException
at
java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:185)
 at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:272)
 at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:176)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:86)
 at java.io.InputStream.read(InputStream.java:82)
 at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:64)
at java.io.DataInputStream.readInt(DataInputStream.java:370)
at
org.apache.cassandra.net.OutboundTcpConnection$1.run(OutboundTcpConnection.java:400)

Another fact is that the number of completed compaction tasks decreased as
the number of upgraded nodes increased. I don't know if that's related to
the increased number of read timeouts or just a coincidence. The timeout
configuration is the default (1ms).

Two similar issues were reported, but without satisfactory responses:

-
http://stackoverflow.com/questions/15355115/rolling-upgrade-for-cassandra-1-0-9-cluster-to-1-2-1
- https://issues.apache.org/jira/browse/CASSANDRA-5740

Is that an expected behavior or is there something that might be going
wrong during the upgrade? Has anyone faced similar issues?

Any help would be very much appreciated.

Thanks,

Paulo


Keystore password in yaml is in plain text

2013-10-04 Thread Shahryar Sedghi
Hi

Is there a way to obfuscate the keystore/truststore password?

Thanks

Shahryar
--


Re: Why Solandra stores Solr data in Cassandra? Isn't Solr a complete solution?

2013-10-04 Thread Ertio Lew
Yes, what is SolrCloud for, then? It already provides clustering support,
so what's the need for Cassandra?


On Tue, Oct 1, 2013 at 2:06 AM, Sávio Teles wrote:

>
> Solr's index sitting on a single machine, even if that single machine can
>> vertically scale, is a single point of failure.
>>
>
> And what about SolrCloud?
>
>
> 2013/9/30 Ken Hancock 
>
>> Yes.
>>
>>
>> On Mon, Sep 30, 2013 at 1:57 PM, Andrey Ilinykh wrote:
>>
>>>
>>> Also, be aware that while Cassandra has knobs to allow you to get
 consistent read results (CL=QUORUM), DSE Search does not. If a node drops
 messages for whatever reason (outage, mutation, etc.), its Solr indexes will
 be inconsistent with other nodes in its replication group.

 Will repair fix it?
>>>
>>
>>
>>
>> --
>> Ken Hancock | System Architect, Advanced Advertising
>> SeaChange International
>> 50 Nagog Park
>> Acton, Massachusetts 01720
>> ken.hanc...@schange.com | www.schange.com | 
>> NASDAQ:SEAC
>>
>> Office: +1 (978) 889-3329 | Google Talk: ken.hanc...@schange.com |
>> Skype: hancockks | Yahoo IM: hancockks | LinkedIn
>>
>>
>
>
>
> --
> Regards,
> Sávio S. Teles de Oliveira
> voice: +55 62 9136 6996
> http://br.linkedin.com/in/savioteles
>  Master's student in Computer Science - UFG
> Software Architect
> Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG
>


Re: Cassandra Heap Size for data more than 1 TB

2013-10-04 Thread Alain RODRIGUEZ
Here.

We have 1.5 TB running smoothly, with index_interval: 1024, an 8 GB JVM heap
and default bloom filters.
The only problem we have is that our SSDs are 2 TB, so they are almost full
and C* starts crashing. It looks like Cassandra considers there is no more
space available when there is still 500 GB free (you're not supposed to use
more than ~50% of the disk).

All operations are slower of course with these loads (Bootstrap, Repair,
cleanup, ...).

Yet I read on the DataStax website that the recommended MAX size is around
300-500 GB per node for C* < 1.2.x, and 3 to 5 TB afterwards (under certain
conditions, taking advantage of off-heap bloom filters / caches etc.). Vnodes
should also help reduce the time needed for some operations.

Hope that helps somehow





2013/10/3 Michał Michalski 

> Currently we have 480-520 GB of data per node, so it's not even close to
> 1TB, but I'd bet that reaching 700-800GB shouldn't be a problem in terms of
> "everyday performance" - heap space is quite low, no GC issues etc. (to
> give you a comparison: when working on 1.1 and having ~300-400GB per node
> we had a huge problem with bloom filters and heap space, so we had to bump
> it to 12-16 GB; on 1.2 it's not an issue anymore).
>
> However, our main concern is the time that we'll need to rebuild broken
> node, so we are going to extend the cluster soon to avoid such problems and
> keep our nodes about 50% smaller.
>
> M.
>
>
> On 03.10.2013 at 15:02, srmore wrote:
>
>  Thanks Mohit and Michael,
>> That's what I thought. I have tried all the avenues, will give ParNew a
>> try. With the 1.0.xx I have issues when data sizes go up, hopefully that
>> will not be the case with 1.2.
>>
>> Just curious, has anyone tried 1.2 with large data set, around 1 TB ?
>>
>>
>> Thanks !
>>
>>
>> On Thu, Oct 3, 2013 at 7:20 AM, Michał Michalski 
>> wrote:
>>
>>  I was experimenting with 128 vs. 512 some time ago and I was unable to
>>> see
>>> any difference in terms of performance. I'd probably check 1024 too, but
>>> we
>>> migrated to 1.2 and heap space was not an issue anymore.
>>>
>>> M.
>>>
>>> On 02.10.2013 at 16:32, srmore wrote:
>>>
>>>   I changed my index_interval from 128 to 512; does it make sense to
>>> increase it more than this?


 On Wed, Oct 2, 2013 at 9:30 AM, cem  wrote:

   Have a look at index_interval.

>
> Cem.
>
>
> On Wed, Oct 2, 2013 at 2:25 PM, srmore  wrote:
>
>   The version of Cassandra I am using is 1.0.11, we are migrating to
> 1.2.X
>
>> though. We had tuned bloom filters (0.1) and AFAIK making it lower
>> than
>> this won't matter.
>>
>> Thanks !
>>
>>
>> On Tue, Oct 1, 2013 at 11:54 PM, Mohit Anchlia <
>> mohitanch...@gmail.com
>>
>>> wrote:
>>>
>>
>>>   Which Cassandra version are you on? Essentially heap size is a
>>> function of the number of keys/metadata. In Cassandra 1.2 a lot of the
>>> metadata, like bloom filters, was moved off heap.
>>> filters were moved off heap.
>>>
>>>
>>> On Tue, Oct 1, 2013 at 9:34 PM, srmore  wrote:
>>>
>>>   Does anyone know what would roughly be the heap size for Cassandra
>>> with 1 TB of data? We started with about 200 GB and now one of the
>>> nodes is already at 1 TB. We were using 8 GB of heap and that served us
>>> well up until we reached 700 GB, where we started seeing failures and
>>> nodes flipping.
>>>
>>> With 1 TB of data the node refuses to come back due to lack of memory.
>>> Needless to say, repairs and compactions take a lot of time. We upped
>>> the heap from 8 GB to 12 GB and suddenly everything (the repair tasks
>>> and the compaction tasks) started moving rapidly. But soon (in about
>>> 9-10 hrs) we started seeing the same symptoms as we were seeing with
>>> 8 GB.
>>>
>>> So my question is: how do I determine the optimal heap size for data
>>> around 1 TB?
>>>
>>> Following are some of my JVM settings:
>>>
>>> -Xms8G
>>> -Xmx8G
>>> -Xmn800m
>>> -XX:NewSize=1200M
>>> -XX:MaxTenuringThreshold=2
>>> -XX:SurvivorRatio=4
>>>
>>> Thanks !



>>>
>>>
>>
>

>>>
>>
>
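
To make the "heap is a function of keys/metadata" point concrete, here is a
rough Scala sketch of the per-key arithmetic for pre-1.2 nodes, where bloom
filters and index samples live on the heap; the key count, false-positive
chance and per-sample byte cost below are assumptions for illustration, not
figures from this thread:

object HeapEstimate {
  // Standard Bloom filter sizing: bits per key ~= -ln(p) / (ln 2)^2
  // for a target false-positive chance p.
  def bloomBitsPerKey(p: Double): Double =
    -math.log(p) / math.pow(math.log(2), 2)

  // Rough on-heap cost: bloom filter bits plus one index sample
  // per index_interval keys.
  def heapGB(keys: Double, fpChance: Double, indexInterval: Int,
             bytesPerSample: Double): Double = {
    val bloomBytes = keys * bloomBitsPerKey(fpChance) / 8
    val sampleBytes = keys / indexInterval * bytesPerSample
    (bloomBytes + sampleBytes) / math.pow(1024, 3)
  }

  def main(args: Array[String]): Unit = {
    // Assumed: 2 billion keys, fp_chance 0.1, ~64 bytes per index sample.
    println(f"index_interval 128:  ${heapGB(2e9, 0.1, 128, 64)}%.1f GB")
    println(f"index_interval 1024: ${heapGB(2e9, 0.1, 1024, 64)}%.1f GB")
  }
}

The sketch shows why raising index_interval (as discussed above) trims the
heap, and why moving bloom filters off-heap in 1.2 removes the larger of the
two costs.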


Re: Minimum row size / minimum data point size

2013-10-04 Thread Robert Važan
That spreadsheet doesn't take compression into account, which is very 
important in my case. Uncompressed, my data is going to require a 
petabyte of storage according to the spreadsheet. I am pretty sure I 
won't get that much storage to play with.


The spreadsheet also shows that Cassandra wastes an unbelievable amount of
space on compaction. My experiments with LevelDB, however, show that it is
possible for a write-optimized database to use negligible compaction space.
I am not sure how LevelDB does it. I guess it splits the larger sstables
into smaller chunks and merges them incrementally.


Anyway, does anybody know how densely I can store the data with Cassandra
when compression is enabled? Would I have to implement some smart adaptive
grouping to fit lots of records in one row, or is there a simpler solution?


On 4. 10. 2013 at 1:56, Andrey Ilinykh wrote:

It may help.
https://docs.google.com/spreadsheet/ccc?key=0Atatq_AL3AJwdElwYVhTRk9KZF9WVmtDTDVhY0xPSmc#gid=0


On Thu, Oct 3, 2013 at 1:31 PM, Robert Važan wrote:


I need to store one trillion data points. The data is highly
compressible down to 1 byte per data point using simple custom
compression combined with standard dictionary compression. What's
the most space-efficient way to store the data in Cassandra? How
much per-row overhead is there if I store one data point per row?

The data is particularly hard to group. It's a large number of
time series with highly variable density. That makes it hard to
pack subsets of the data into meaningful column families / wide
rows. Is there a table layout scheme that would allow me to
approach 1 byte per data point without forcing me to implement
complex abstraction layer on application level?
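
As a rough illustration of why grouping points into wide rows matters here, a
back-of-the-envelope sketch; the per-row and per-column overhead figures below
are assumptions about the uncompressed sstable layout, not numbers from this
thread or the spreadsheet:

object SizeEstimate {
  val points          = 1e12   // one trillion data points
  val payloadPerPoint = 1.0    // bytes, after the custom 1-byte encoding
  val rowOverheadB    = 23.0   // assumed per-row key/index cost
  val columnOverheadB = 15.0   // assumed per-column name/timestamp/flags cost

  // Total size before sstable compression, in TB, for a given row width.
  def uncompressedTB(pointsPerRow: Long): Double = {
    val rows = points / pointsPerRow
    rows * (rowOverheadB + pointsPerRow * (columnOverheadB + payloadPerPoint)) / 1e12
  }

  def main(args: Array[String]): Unit = {
    println(f"1 point per row      : ${uncompressedTB(1)}%.0f TB")      // ~39 TB
    println(f"10,000 points per row: ${uncompressedTB(10000)}%.0f TB")  // ~16 TB
  }
}

Once a row holds a few thousand columns the fixed per-row overhead stops
mattering, and sstable compression then works mostly on the repetitive column
names and timestamps, so even a coarse, non-adaptive grouping (for example one
row per series per time bucket) should get much closer to the raw payload size
than one point per row.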