Which SSTable caused CorruptSSTableException?

2014-02-25 Thread Ike Walker
I'm running Cassandra 1.2.9 in AWS with a 12-host cluster, and I'm getting lots of
CorruptSSTableExceptions in system.log on one of my hosts.

Is it possible to find out which SSTable(s) are corrupt?

I'm currently running "nodetool scrub" on the relevant host, but that
doesn't seem like an efficient way to fix the problem (if it fixes it at
all).
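
For what it's worth, the exceptions themselves don't seem to name the offending
file. One thing I may try is grepping for any SSTable paths that appear near the
errors (just a sketch; the log and data paths below are the defaults and may
differ on your install):

grep -B 3 -A 3 'CorruptSSTableException' /var/log/cassandra/system.log \
  | grep -o '/var/lib/cassandra/data/[^ ]*-Data\.db' | sort -u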

Here's an example error:

ERROR [ReplicateOnWriteStage:1070] 2014-02-25 17:43:03,518 CassandraDaemon.java (line 192) Exception in thread Thread[ReplicateOnWriteStage:1070,5,main]
java.lang.RuntimeException: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1597)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:65)
    at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:81)
    at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:68)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:272)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1391)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
    at org.apache.cassandra.db.Table.getRow(Table.java:347)
    at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:64)
    at org.apache.cassandra.db.CounterMutation.makeReplicationMutation(CounterMutation.java:90)
    at org.apache.cassandra.service.StorageProxy$7$1.runMayThrow(StorageProxy.java:772)
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1593)
    ... 3 more
Caused by: java.io.EOFException
    at java.io.RandomAccessFile.readFully(RandomAccessFile.java:416)
    at java.io.RandomAccessFile.readFully(RandomAccessFile.java:394)
    at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:380)
    at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
    at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:371)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:116)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:60)
    ... 15 more


Re: Conference or training recommendations

2013-12-16 Thread Ike Walker
DataStax recently started offering free virtual training. You may want to
try that first:

http://www.datastax.com/what-we-offer/products-services/training/virtual-training

There are also many Cassandra meetups around the world:

http://cassandra.meetup.com/

DataStax also offers classroom training, but it is not cheap:

http://www.datastax.com/what-we-offer/products-services/training

As for conferences, this year's Cassandra Summits were in San Francisco in
June and London in October. I have not seen an announcement of next year's
summit(s).

-Ike


On Sun, Dec 15, 2013 at 12:07 PM, Robert Wille  wrote:

> I’d like to attend a conference or some form of training to become more
> proficient and knowledgeable about Cassandra. Any suggestions?
>


CQL workaround for modifying a primary key

2013-12-03 Thread Ike Walker
What is the best practice for modifying the primary key definition of a table 
in Cassandra 1.2.9?

Say I have this table:

CREATE TABLE temperature (
   weatherstation_id text,
   event_time timestamp,
   temperature text,
   PRIMARY KEY (weatherstation_id,event_time)
);

I want to add a new column named version and include that column in the primary 
key.

CQL will let me add the column, but it won't let me change the primary key of an
existing table.
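
That is, adding the column itself is easy enough:

ALTER TABLE temperature ADD version int;

but there's no ALTER TABLE variant that changes the PRIMARY KEY.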

So I drop the table and recreate it:

DROP TABLE temperature;

CREATE TABLE temperature (
   weatherstation_id text,
   version int,
   event_time timestamp,
   temperature text,
   PRIMARY KEY (weatherstation_id,version,event_time)
);

But then I start getting errors like this:

java.io.FileNotFoundException: /var/lib/cassandra/data/test/temperature/test-temperature-ic-8316-Data.db (No such file or directory)

So I guess DROP TABLE doesn't actually delete the data files, and I end up with a
problem like this:

https://issues.apache.org/jira/browse/CASSANDRA-4857

What's a good workaround for this, assuming I don't want to change the name of
my table? Should I just truncate the table first, then drop it and recreate it?
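
In other words, something like this (just a sketch, assuming none of the existing
data needs to survive):

TRUNCATE temperature;
DROP TABLE temperature;
-- then re-run the new CREATE TABLE statement from above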

Thanks.

-Ike Walker

Re: Output of "nodetool ring" with virtual nodes

2013-10-15 Thread Ike Walker
Hi Paulo,
Yes, that is expected. Now that you are using virtual nodes, use "nodetool
status" instead; it gives you a per-node summary similar to what "nodetool
ring" showed before you enabled virtual nodes.

        -Ike Walker

On Oct 15, 2013, at 11:45 AM, Paulo Motta  wrote:

> Hello,
> 
> I recently did the "Enabling virtual nodes on an existing production cluster" 
> procedure 
> (http://www.datastax.com/documentation/cassandra/1.2/webhelp/cassandra/configuration/configVnodesProduction_t.html),
>  and noticed that the output of the command "nodetool ring" changes 
> significantly when virtual nodes are enabled in a new data center.
> 
> Before, it showed only one token per node; now it shows 256 tokens per node 
> (output below). That means 256*N entries, which makes the output unreadable, 
> whereas before it was a useful, human-readable check of the cluster status. 
> The command also takes much longer to execute.
> 
> Is this expected behavior, or did I make any mistake during the procedure?
> 
> Cassandra version: 1.2.10
> 
> Before it was like this:
> 
> Datacenter: VNodesDisabled
> ==
> Replicas: 3
> 
> Address         Rack  Status  State   Load       Owns     Token
>                                                           28356863910078205239614050619314017619
> AAA.BBB.CCC.1   x     Up      Normal  236.49 GB  20.83%   113427455640312821154458002477256070480
> AAA.BBB.CCC.2   x     Up      Normal  347.6 GB   29.17%   77981375752715064543690004203113548455
> AAA.BBB.CCC.3   x     Up      Normal  332.46 GB  37.50%   106338614526609105785626408013334622686
> AAA.BBB.CCC.4   x     Up      Normal  198.94 GB  20.83%   141784319550391026443072753090570088104
> AAA.BBB.CCC.5   x     Up      Normal  330.68 GB  33.33%   92159807707754167187997289512070557265
> AAA.BBB.CCC.6   x     Up      Normal  268.64 GB  25.00%   155962751505430129087380028400227096915
> AAA.BBB.CCC.7   x     Up      Normal  262.43 GB  25.00%   163051967482949680409533666060055601314
> AAA.BBB.CCC.8   x     Up      Normal  200.18 GB  16.67%   1
> AAA.BBB.CCC.9   x     Up      Normal  189.13 GB  16.67%   120516671617832372476611040132084574885
> AAA.BBB.CCC.10  x     Up      Normal  220.7 GB   25.00%   42535295865117307932921025928971026429
> AAA.BBB.CCC.11  x     Up      Normal  259.36 GB  25.00%   35446079887597756610768088274142522024
> AAA.BBB.CCC.12  x     Up      Normal  270.32 GB  25.00%   28356863910078205088614550619314017619
> 
> 
> Now it is like this:
> Datacenter: VNodesEnabled
> ==
> Replicas: 3
> 
> Address         Rack  Status  State   Load       Owns    Token
>                                                          168998414504718061309167200639854699955
> XXX.YYY.ZZZ.1   y     Up      Normal  122.84 KB  0.00%   4176479009577065052560790400565254
> XXX.YYY.ZZZ.1   y     Up      Normal  122.84 KB  0.00%   291517050854558940844583227825291566
> XXX.YYY.ZZZ.1   y     Up      Normal  122.84 KB  0.00%   389126351568277133928956802249918052
> XXX.YYY.ZZZ.1   y     Up      Normal  122.84 KB  0.00%   504218791605899949008255495493335240
> XXX.YYY.ZZZ.2   y     Up      Normal  122.84 KB  0.00%   4176479009577065052560790400565254
> XXX.YYY.ZZZ.2   y     Up      Normal  122.84 KB  0.00%   291517050854558940844583227825291566
> XXX.YYY.ZZZ.2   y     Up      Normal  122.84 KB  0.00%   389126351568277133928956802249918052
> XXX.YYY.ZZZ.2   y     Up      Normal  122.84 KB  0.00%   504218791605899949008255495493335240
> XXX.YYY.ZZZ.3   y     Up      Normal  122.84 KB  0.00%   4176479009577065052560790400565254
> XXX.YYY.ZZZ.3   y     Up      Normal  122.84 KB  0.00%   291517050854558940844583227825291566
> XXX.YYY.ZZZ.3   y     Up      Normal  122.84 KB  0.00%

Re: Long running nodetool move operation

2013-09-11 Thread Ike Walker
The restart worked.

Thanks, Rob!

After the restart I ran 'nodetool move' again, used 'nodetool netstats | grep 
-v "0%"' to verify that data was actively streaming, and the move completed 
successfully.
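
For what it's worth, the check was roughly the following (the interval is
arbitrary):

watch -n 60 'nodetool netstats | grep -v "0%"'

Note that grep -v "0%" also hides percentages that merely end in 0, but it was
good enough to confirm the streams were advancing.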

-Ike

On Sep 10, 2013, at 11:04 AM, Ike Walker  wrote:

> Below is the output of "nodetool netstats".
> 
> I've never run that before, but from what I can read it shows no incoming 
> streams, and a bunch of outgoing streams to two other nodes, all at 0%.
> 
> I'll try the restart.
> 
> Thanks.
> 
> nodetool netstats
> Mode: MOVING
> Streaming to: /10.xxx.xx.xx
> 
> ...
> Streaming to: /10.xxx.xx.xxx
> 
> ...
> Not receiving any streams.
> Pool Name    Active   Pending  Completed
> Commands     n/a      0        243401039
> Responses    n/a      0        295522535
> 
> On Sep 9, 2013, at 10:54 PM, Robert Coli  wrote:
> 
>>   On Mon, Sep 9, 2013 at 7:08 PM, Ike Walker  wrote:
>> I've been using nodetool move to rebalance my cluster. Most of the moves 
>> take under an hour, or a few hours at most. The current move has taken 4+ 
>> days so I'm afraid it will never complete. What's the best way to cancel it 
>> and try again?
>> 
>> What does "nodetool netstats" say? If it shows no streams in progress, the 
>> move is probably hung...
>> 
>> Restart the affected node. If that doesn't work, restart other nodes which 
>> might have been receiving a stream. I think in the case of "move" it should 
>> work to just restart the affected node. Restart the move, you will re-stream 
>> anything you already streamed once.
>> 
>> https://issues.apache.org/jira/browse/CASSANDRA-3486
>> 
>> If this ticket were completed, it would presumably include the ability to 
>> stop other hung streaming operations, like "move".
>> 
>> =Rob
> 



Re: Long running nodetool move operation

2013-09-10 Thread Ike Walker
Below is the output of "nodetool netstats".

I've never run that before, but from what I can read it shows no incoming 
streams, and a bunch of outgoing streams to two other nodes, all at 0%.

I'll try the restart.

Thanks.

nodetool netstats
Mode: MOVING
Streaming to: /10.xxx.xx.xx

...
Streaming to: /10.xxx.xx.xxx

...
Not receiving any streams.
Pool Name    Active   Pending  Completed
Commands     n/a      0        243401039
Responses    n/a      0        295522535

On Sep 9, 2013, at 10:54 PM, Robert Coli  wrote:

>   On Mon, Sep 9, 2013 at 7:08 PM, Ike Walker  wrote:
> I've been using nodetool move to rebalance my cluster. Most of the moves take 
> under an hour, or a few hours at most. The current move has taken 4+ days so 
> I'm afraid it will never complete. What's the best way to cancel it and try 
> again?
> 
> What does "nodetool netstats" say? If it shows no streams in progress, the 
> move is probably hung...
> 
> Restart the affected node. If that doesn't work, restart other nodes which 
> might have been receiving a stream. I think in the case of "move" it should 
> work to just restart the affected node. Restart the move, you will re-stream 
> anything you already streamed once.
> 
> https://issues.apache.org/jira/browse/CASSANDRA-3486
> 
> If this ticket were completed, it would presumably include the ability to 
> stop other hung streaming operations, like "move".
> 
> =Rob



Long running nodetool move operation

2013-09-09 Thread Ike Walker
I've been using nodetool move to rebalance my cluster. Most of the moves take 
under an hour, or a few hours at most. The current move has taken 4+ days so 
I'm afraid it will never complete. What's the best way to cancel it and try 
again?

I'm running a cluster of 12 nodes at AWS. Each node runs Cassandra 1.2.5 on an 
m1.xlarge EC2 instance, and they are spread across 3 availability zones within 
a single region.

I've seen some of these errors in the log. I'm not sure whether they're related:

ERROR [CompactionExecutor:4092] 2013-09-10 01:31:49,783 CassandraDaemon.java (line 175) Exception in thread Thread[CompactionExecutor:4092,1,main]
java.lang.IndexOutOfBoundsException: index (1) must be less than size (1)
    at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:305)
    at com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:284)
    at com.google.common.collect.SingletonImmutableList.get(SingletonImmutableList.java:45)
    at org.apache.cassandra.db.marshal.CompositeType.getComparator(CompositeType.java:94)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:76)
    at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
    at org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:128)
    at org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:114)
    at org.apache.cassandra.db.AbstractColumnContainer.addColumn(AbstractColumnContainer.java:109)
    at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:219)
    at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumnsFromSSTable(ColumnFamilySerializer.java:149)
    at org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:234)
    at org.apache.cassandra.db.compaction.PrecompactedRow.merge(PrecompactedRow.java:114)
    at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:98)
    at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:160)
    at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:76)
    at org.apache.cassandra.db.compaction.CompactionIterable$Reducer.getReduced(CompactionIterable.java:57)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:114)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:97)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:134)
    at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
    at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
    at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:211)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

Here's the status of the cluster as reported by nodetool ring, showing the one 
node in "Moving" state:

Datacenter: us-east
==
Address         Rack  Status  State   Load       Owns     Token
                                                          127605887595351923798765477786913079290
10.xxx.xxx.xxx  1c    Up      Normal  224.53 GB  25.00%   0
10.xxx.xxx.xxx  1d    Up      Moving  297.46 GB  2.44%    4150051970709140963435425752946440221
10.xxx.xxx.xxx  1d    Up      Normal  107.75 GB  5.89%    14178431955039102644307275309657008810
10.xxx.xxx.xxx  1e    Up      Normal  82.75 GB   8.33%    28356863910078205288614550619314017620
10.xxx.xxx.xxx  1e    Up      Normal  173.83 GB  2.99%    33451586107772559423309548485325625873
10.xxx.xxx.xxx  1c    Up      Normal  64.4 GB

How many seed nodes should I use?

2013-08-28 Thread Ike Walker
What is the best practice for how many seed nodes to have in a Cassandra 
cluster? I remember reading a recommendation of two seeds per datacenter in the 
DataStax documentation for 0.7, but I'm interested to know what other people 
are doing these days, especially in AWS.

I'm running a cluster of 12 nodes at AWS. Each node runs Cassandra 1.2.5 on an 
m1.xlarge EC2 instance, and they are spread across 3 availability zones within 
a single region.

To keep things simple I currently have all 12 nodes listed as seeds. That seems 
like overkill to me, but I don't know the pros and cons of too many or too few 
seeds.
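
For reference, here's the cassandra.yaml setting in question, sketched with
placeholder addresses (one seed per availability zone, purely for illustration):

seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "10.0.1.10,10.0.2.10,10.0.3.10"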

Any advice is appreciated.

Thanks!

-Ike Walker