Re: Replace dead node in non-vnode 1.2 cluster

2015-09-17 Thread Paulo Motta
The DataStax documentation was fixed after the initial confusion with
vnodes vs non-vnodes, so you should be safe to follow the procedure
described there. Make sure to set the
-Dcassandra.replace_address=address_of_dead_node JVM option (don't worry
about the initial token).
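
For reference, a minimal sketch of how that option is usually passed on the
replacement node, via cassandra-env.sh (the address below is only an example --
use the dead node's IP, and remove the line again once the node has finished
streaming):

    # cassandra-env.sh on the replacement node
    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=10.0.0.12"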

2015-09-17 21:21 GMT-03:00 John Wong :

> Hi
>
> Can the community help to confirm that
> http://docs.datastax.com/en/cassandra/1.2/cassandra/operations/ops_replace_node_t.html
> will work for a non-vnode cluster in Cassandra 1.2?
>
> It looks like I don't have to set the initial token for the replacement
> node (which keeps the same IP) at all if I pass the JVM option during startup.
>
> I read a couple of mailing list threads:
>
> [1]:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Bootstrap-failure-on-C-1-2-13-td7592664.html
>
> [2]:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Replacing-a-dead-node-in-Cassandra-2-0-8-td7596245.html
>
> They mention that, at the time, the 1.2 doc deferred users to the 1.1 doc,
> but the current 1.2 doc doesn't mention anything about that.
>
> Also, these for reference:
>
> [3]:
> https://wiki.apache.org/cassandra/Operations#For_versions_1.2.0_and_above
> [4]:
> http://blog.alteroot.org/articles/2014-03-12/replace-a-dead-node-in-cassandra.html
>
> Thank you.
>
> John
>


Re: What is your backup strategy for Cassandra?

2015-09-17 Thread Marc Tamsky
This seems like an apt time to quote [1]:

> Remember that you get 1 point for making a backup and 10,000 points for
> restoring one.

Restoring from backups is my goal.

The commonly recommended tools (tablesnap, cassandra_snapshotter) all seem
to leave the restore operation as a pretty complicated exercise for the
operator.

Do any include a working way to restore, on a different host, all of node
X's data from backups to the correct directories, such that the restored
files are in the proper places and the node restart method [2] "just works"?
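
For context (and from memory -- see [2] for the authoritative steps), the node
restart method boils down to roughly the following per node, which is what a
restore tool would need to automate for every table; the paths are illustrative:

    sudo service cassandra stop
    rm -f /var/lib/cassandra/commitlog/*        # clear the commit log
    # copy the backed-up sstables for each table into its data directory, e.g.:
    cp /backups/my_keyspace/my_table/* /var/lib/cassandra/data/my_keyspace/my_table/
    chown -R cassandra:cassandra /var/lib/cassandra/data
    sudo service cassandra start
    nodetool repair                             # usually recommended after a restore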


On Thu, Sep 17, 2015 at 6:47 PM, Robert Coli  wrote:

> tl;dr - tablesnap works. There are awkward aspects to its use, but if you
> are operating Cassandra in AWS, it's probably the best off-the-shelf
> off-node backup.
>

Have folks here ever used tableslurp to restore a backup taken with
tablesnap?
How would you rate the difficulty of restore?

From my limited testing, tableslurp looks like it can only restore a single
table within a keyspace per execution.

I have hundreds of tables... so without automation around tableslurp, that
doesn't seem like a reliable path toward a full restore.

Perhaps someone has written a tool that drives tableslurp so it "just
works" ?
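
If it helps, a rough sketch of what that automation might look like: a shell
loop over one keyspace's table directories, invoking tableslurp once per table.
The bucket name, keyspace and data directory are made-up examples, and the
tableslurp call itself is left as a placeholder because its exact arguments
vary between versions -- check tableslurp --help for yours:

    BUCKET=my-backup-bucket              # assumption: the bucket tablesnap uploads to
    DATA_DIR=/var/lib/cassandra/data
    KEYSPACE=my_keyspace

    for dir in "$DATA_DIR/$KEYSPACE"/*/; do
        table_path="${dir%/}"
        echo "restoring $table_path"
        # tableslurp ... "$table_path"   # placeholder: fill in per your tableslurp version
    done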


[1] http://serverfault.com/a/277092/218999

[2]
http://docs.datastax.com/en/cassandra/1.2/cassandra/operations/ops_backup_noderestart_t.html


Using UDT Collection to store user aggregated data

2015-09-17 Thread IPVP
Hi all,

Could someone please let me know if the following modeling is on the right
track?

The requirement is to query, by user_id, the summarized data of
transaction_status_per_user.
-
### UDT
CREATE TYPE core_analytics.status_summary (
    name text,
    count int,
    hexcolor text
);

### TABLE
CREATE TABLE core_analytics.transaction_status_per_user (
    user_id uuid,
    statuses list<frozen<status_summary>>,
    PRIMARY KEY (user_id)
);

###Java Mapping

// Assumes the DataStax Java driver object mapper is on the classpath.
import com.datastax.driver.mapping.annotations.UDT;

@UDT(keyspace = "core_analytics", name = "status_summary")
public class StatusSummary {

    private String name;
    private Integer count;
    private String hexcolor;

    // getters and setters omitted
}
-
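
For what it's worth, this layout does support the stated access pattern: the
whole summary for a user comes back in a single-partition read. A quick
illustration with made-up values (the uuid and statuses below are only examples):

    INSERT INTO core_analytics.transaction_status_per_user (user_id, statuses)
    VALUES (c37d661d-7e61-49ea-96a5-68c34e83db3a,
            [{name: 'approved', count: 42, hexcolor: '#2ecc71'},
             {name: 'declined', count: 3, hexcolor: '#e74c3c'}]);

    SELECT statuses
    FROM core_analytics.transaction_status_per_user
    WHERE user_id = c37d661d-7e61-49ea-96a5-68c34e83db3a;

One thing to keep in mind is that UDTs inside a collection are frozen, so
updating a single status means rewriting that whole element (or the whole list).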


Thanks


IP



Replace dead node in non-vnode 1.2 cluster

2015-09-17 Thread John Wong
Hi

Can the community help to confirm that
http://docs.datastax.com/en/cassandra/1.2/cassandra/operations/ops_replace_node_t.html
will work for a non-vnode cluster in Cassandra 1.2?

It looks like I don't have to set the initial token for the replacement
node (which keeps the same IP) at all if I pass the JVM option during startup.

I read a couple of mailing list threads:

[1]:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Bootstrap-failure-on-C-1-2-13-td7592664.html

[2]:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Replacing-a-dead-node-in-Cassandra-2-0-8-td7596245.html

They mention that, at the time, the 1.2 doc deferred users to the 1.1 doc,
but the current 1.2 doc doesn't mention anything about that.

Also, these for reference:

[3]:
https://wiki.apache.org/cassandra/Operations#For_versions_1.2.0_and_above
[4]:
http://blog.alteroot.org/articles/2014-03-12/replace-a-dead-node-in-cassandra.html

Thank you.

John


RE: Cassandra shutdown during large number of compactions - now fails to start with OOM Exception

2015-09-17 Thread Walsh, Stephen
Some more info.

Looking at the Java heap dump file, I see about 400 SSTableScanners, one for
each of our column families. Each is about 200MB in size, and (from what I can
see) all of them are reading from a
"compactions_in_progress-ka-00-Data.db" file:

dfile  org.apache.cassandra.io.compress.CompressedRandomAccessReader path = 
"/var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-71661-Data.db"
 131840 104
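
(If anyone else hits this: a quick way to gauge how much compactions_in_progress
data the node has to read back at startup is to look at that table's directory
on disk -- the path below is the one from the dump above; adjust it if your
data_file_directories setting differs.)

    du -sh /var/lib/cassandra/data/system/compactions_in_progress-*/
    ls /var/lib/cassandra/data/system/compactions_in_progress-*/*-Data.db | wc -l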

Steve


From: Walsh, Stephen
Sent: 17 September 2015 15:33
To: user@cassandra.apache.org
Subject: Cassandra shutdown during large number of compactions - now fails to 
start with OOM Exception

Hey all, I was hoping someone had run into a similar issue.
We're using 2.1.6 and shut down a testbed in AWS thinking we were finished with
it. We started it back up today and saw that only 2 of 4 nodes came up.

It seems there was a lot of compaction happening at the time it was shut down;
Cassandra tries to start up and we get an OutOfMemoryError:


INFO  13:45:57 Initializing system.range_xfers
INFO  13:45:57 Initializing system.schema_keyspaces
INFO  13:45:57 Opening 
/var/lib/cassandra/data/system/schema_keyspaces-b0f2235744583cdb9631c43e59ce3676/system-schema_keyspaces-ka-21807
 (19418 bytes)
java.lang.OutOfMemoryError: Java heap space
Dumping heap to /var/log/cassandra/java_pid3011.hprof ...
Heap dump file created [7751760805 bytes in 52.439 secs]
ERROR 13:47:11 Exception encountered during startup
java.lang.OutOfMemoryError: Java heap space


It's not related to the key_cache; we removed it and the issue is still present.
So we believe it's retrying all the compactions that were in place when it went
down.

We've modified the heap size to be half of the system's RAM (8GB in this case).
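
For reference, this is roughly where that change goes with a packaged 2.1
install (the values are illustrative for a 16GB machine; adjust for yours):

    # conf/cassandra-env.sh
    MAX_HEAP_SIZE="8G"
    HEAP_NEWSIZE="800M"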

At the moment the only workaround we have is to empty the data / saved_caches /
commit_log folders and let the node re-sync with the other nodes.

Has anyone seen this before and what have they done to solve it?
Can we remove unfinished compactions?

Steve



Cassandra shutdown during large number of compactions - now fails to start with OOM Exception

2015-09-17 Thread Walsh, Stephen
Hey all, I was hoping someone had run into a similar issue.
We're using 2.1.6 and shut down a testbed in AWS thinking we were finished with
it. We started it back up today and saw that only 2 of 4 nodes came up.

It seems there was a lot of compaction happening at the time it was shut down;
Cassandra tries to start up and we get an OutOfMemoryError:


INFO  13:45:57 Initializing system.range_xfers
INFO  13:45:57 Initializing system.schema_keyspaces
INFO  13:45:57 Opening 
/var/lib/cassandra/data/system/schema_keyspaces-b0f2235744583cdb9631c43e59ce3676/system-schema_keyspaces-ka-21807
 (19418 bytes)
java.lang.OutOfMemoryError: Java heap space
Dumping heap to /var/log/cassandra/java_pid3011.hprof ...
Heap dump file created [7751760805 bytes in 52.439 secs]
ERROR 13:47:11 Exception encountered during startup
java.lang.OutOfMemoryError: Java heap space


It's not related to the key_cache; we removed it and the issue is still present.
So we believe it's retrying all the compactions that were in place when it went
down.

We've modified the heap size to be half of the system's RAM (8GB in this case).

At the moment the only workaround we have is to empty the data / saved_caches /
commit_log folders and let the node re-sync with the other nodes.

Has anyone seen this before and what have they done to solve it?
Can we remove unfinished compactions?

Steve



Re: Upgrade Limitations Question

2015-09-17 Thread Vasileios Vlachos
Thank you very much for pointing this out, Victor. Really useful to know.

On Wed, Sep 16, 2015 at 4:55 PM, Victor Chen 
wrote:

> Yes, you can examine the actual sstables in your cassandra data dir. That
> will tell you what version sstables you have on that node.
>
> You can refer to this link:
> http://www.bajb.net/2013/03/cassandra-sstable-format-version-numbers/
> which I found via the Google search phrase "sstable versions", to see which
> version you need to look for -- the relevant section of the link says:
>
>> Cassandra stores the version of the SSTable within the filename,
>> following the format *Keyspace-ColumnFamily-(optional tmp
>> marker-)SSTableFormat-generation*
>>
>
> FYI -- at least in the cassandra-2.1 branch of the source code -- you
> can find sstable format version descriptions in the comments of
> Descriptor.java. Looks like for your old and new versions, you'd be looking
> for something like:
>
> for 1.2.1:
> $ find  -name "*-ib-*" -ls
>
> for 2.0.1:
> $ find  -name "*-jb-*" -ls
>
>
> On Wed, Sep 16, 2015 at 10:02 AM, Vasileios Vlachos <
> vasileiosvlac...@gmail.com> wrote:
>
>>
>> Hello Rob and thanks for your reply,
>>
>> In the end we had to wait for upgradesstables to finish on every node,
>> just to eliminate the possibility of it being the reason for any weird
>> behaviour after the upgrade. However, this process might take a long time
>> in a cluster with a large number of nodes, which means no new work can be
>> done for that period.
>>
>>> 1) TRUNCATE requires all known nodes to be available to succeed; if you
>>> are restarting one, it won't be available.
>>>
>>
>> I suppose "all" means all nodes, not just all replicas here, is that right?
>> Not directly related to the original question, but that might explain why we
>> end up with peculiar behaviour sometimes when we run TRUNCATE. We've now
>> taken the approach of DROPping the CF and recreating it when possible (even
>> though this is still problematic when reusing the same CF name).
>>
>>
>>> 2) in theory, the newly upgraded nodes might not get the DDL schema
>>> update properly due to some incompatible change
>>>
>>> To check for 2, do:
>>> "
>>> nodetool gossipinfo | grep SCHEMA | sort | uniq -c | sort -n
>>> "
>>>
>>> Before and after, and make sure the schema propagates correctly. There
>>> should be a new version on all nodes between each DDL change; if there is,
>>> you will likely be able to see the new schema on all the new nodes.
>>>
>>>
>> Yes, this makes perfect sense. We monitor schema changes every minute
>> across the cluster with Nagios by checking JMX. It is an important thing
>> to monitor in several situations (running migrations, for example, or
>> during upgrades like you describe here).
>>
>> Is there a way to find out whether upgradesstables has been run against a
>> particular node or not?
>>
>> Many Thanks,
>> Vasilis
>>
>
>
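
Following up on Victor's find commands above: a minimal per-node check for
whether upgradesstables has finished rewriting everything (the data directory
path is an assumption -- use whatever your data_file_directories points at):

    # count sstables still in the old 1.2 ('ib') format; 0 means the node
    # has no pre-upgrade sstables left
    find /var/lib/cassandra/data -name '*-ib-*-Data.db' | wc -l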