Re: Running multiple sstable loaders

2012-03-28 Thread Sanjeev Kulkarni
Hi,

Here is the stack trace that we get from sstableloader

org.apache.thrift.transport.TTransportException: java.net.ConnectException:
Connection refused
java.lang.RuntimeException:
org.apache.thrift.transport.TTransportException: java.net.ConnectException:
Connection refused
at
org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:229)
at
org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:104)
at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:61)
Caused by: org.apache.thrift.transport.TTransportException:
java.net.ConnectException: Connection refused
at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
at
org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
at
org.apache.cassandra.tools.BulkLoader$ExternalClient.createThriftClient(BulkLoader.java:249)
at
org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:197)
... 2 more
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:529)
at org.apache.thrift.transport.TSocket.open(TSocket.java:178)


Thanks!

On Wed, Mar 28, 2012 at 2:57 PM, Sanjeev Kulkarni wrote:

> Hey guys,
> We have a fresh 4 node 0.8.10 cluster that we want to pump lots of data
> into.
> The data resides on 5 data machines that are different from Cassandra
> nodes. Each of these data nodes has 7 disks where the data resides.
> In order to get maximum load performance, we are assigning 7 IPs to
> each data node on the same interface (eth2:0, eth2:1, ...). I also make
> multiple copies of the Cassandra conf directory (/etc/cassandra). Each
> conf is identical except for the listen and rpc addresses. We then
> start multiple simultaneous sstable loaders, each pointing to a different
> config. We do this on all data nodes.
> What we are seeing is that the first sstableloader on each machine starts
> loading. However, the rest fail with a connection-refused error. They log
> this message 8 times and then bail out. Any idea what could be wrong?
> Thanks!
>
>
> Sent from my iPhone
>


Running multiple sstable loaders

2012-03-28 Thread Sanjeev Kulkarni
Hey guys,
We have a fresh 4 node 0.8.10 cluster that we want to pump lots of data into.
The data resides on 5 data machines that are different from Cassandra
nodes. Each of these data nodes has 7 disks where the data resides.
In order to get maximum load performance, we are assigning 7 IPs to
each data node on the same interface (eth2:0, eth2:1, ...). I also make
multiple copies of the Cassandra conf directory (/etc/cassandra). Each
conf is identical except for the listen and rpc addresses. We then
start multiple simultaneous sstable loaders, each pointing to a different
config. We do this on all data nodes.
What we are seeing is that the first sstableloader on each machine starts
loading. However, the rest fail with a connection-refused error. They log
this message 8 times and then bail out. Any idea what could be wrong?
Thanks!


Sent from my iPhone
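
For reference, a minimal sketch of the setup described above. The interface
aliases, addresses, and conf paths are made up, and the CASSANDRA_CONF
override is an assumption: if your copy of cassandra.in.sh hard-codes
CASSANDRA_CONF, each loader instead needs its own copy of the startup
scripts pointing at its own conf directory.

  # one IP alias per disk on the data node
  ifconfig eth2:0 10.0.1.11 netmask 255.255.255.0 up
  ifconfig eth2:1 10.0.1.12 netmask 255.255.255.0 up

  # one conf copy per alias, differing only in listen_address/rpc_address
  cp -r /etc/cassandra /etc/cassandra-loader1
  cp -r /etc/cassandra /etc/cassandra-loader2

  # one loader per disk, each picking up its own conf
  CASSANDRA_CONF=/etc/cassandra-loader1 sstableloader /disk1/Keyspace1 &
  CASSANDRA_CONF=/etc/cassandra-loader2 sstableloader /disk2/Keyspace1 &
  wait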


Re: Question regarding secondary indices

2012-03-16 Thread Sanjeev Kulkarni
Thanks Aaron for the response. I see those logs.
I had one more question. It looks like sstableloader takes only one directory
at a time. Is it possible to load multiple directories in one call?
Something like sstableloader /drive1/keyspace1 /drive2/keyspace1...
This way one can take advantage of the speedup that you get from reading across
multiple drives.
Or alternatively is it possible to run multiple instances of sstableloader
on the same machine concurrently?
Thanks!
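
For reference, the sequential fallback is just a shell loop with one
sstableloader invocation per directory (paths assumed; whether several loaders
can run concurrently on one box is exactly the question above):

  for dir in /drive1/keyspace1 /drive2/keyspace1; do
      sstableloader "$dir"
  done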

On Thu, Mar 15, 2012 at 6:54 PM, aaron morton wrote:

> You should see a log line with "Index build of {} complete".
>
> You can also see which indexes are built using the describe command in
> cassandra-cli.
>
>
> [default@XX] describe;
> Keyspace: XX:
> ...
>   Column Families:
>     ColumnFamily: XXX
>     ...
>       Built indexes: []
>
> Cheers
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 16/03/2012, at 10:04 AM, Sanjeev Kulkarni wrote:
>
> Hi,
> I'm using a 4-node Cassandra cluster running 0.8.10 with rf=3. It's a brand
> new setup.
> I have a single column family which contains about 10 columns. I have enabled
> secondary indices on 3 of them. I used sstableloader to bulk load some data
> into this cluster.
> I poked around the logs and saw the following message
> Submitting index build of attr_001 ..
> which indicates that Cassandra has started building indices.
> How will I know when the building of the indices is done? Are there some
> log messages that I should look for?
> Thanks!
>
>
>


Question regarding secondary indices

2012-03-15 Thread Sanjeev Kulkarni
Hi,
I'm using a 4-node Cassandra cluster running 0.8.10 with rf=3. It's a brand
new setup.
I have a single column family which contains about 10 columns. I have enabled
secondary indices on 3 of them. I used sstableloader to bulk load some data
into this cluster.
I poked around the logs and saw the following message
Submitting index build of attr_001 ..
which indicates that Cassandra has started building indices.
How will I know when the building of the indices is done? Are there some log
messages that I should look for?
Thanks!
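
For reference, the completion message mentioned in the reply above can be
watched for with a simple grep (log location assumed; the Debian packages
write to /var/log/cassandra/system.log):

  grep "Index build" /var/log/cassandra/system.log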


Re: nodetools cfstats question

2011-09-29 Thread Sanjeev Kulkarni
Hi Thamizh,
Thanks for the answer.
I understand the part about the Key cache capacity being 20, which is the
default value.
But what about the Key cache size being 99k? Does this mean that Cassandra has
allocated 99k for the key cache even though the actual number of keys is far smaller?

On Thu, Sep 29, 2011 at 3:47 AM, Thamizh  wrote:

> Please check [default@unknown] help create column family;
> These are the default values, used until you explicitly set them at CF creation.
>
> Regards,
> Thamizhannal
> ------
> *From:* Sanjeev Kulkarni 
> *To:* user@cassandra.apache.org
> *Sent:* Thursday, 29 September 2011 10:33 AM
> *Subject:* nodetools cfstats question
>
> Hey guys,
> I'm using a three-node cluster running 0.8.6 with an rf of 3. It's a freshly
> installed cluster with no upgrade history.
> I have 6 cfs and only one of them is written into. That cf has around one
> thousand keys. A quick key_range_scan verifies this.
> However when I do cfstats, I see the following for this cf.
>
> Number of Keys (estimate): 5248
> Key cache capacity: 20
> Key cache size: 99329
>
> What is the definition of these three output values? Both the Number of
> Keys and Key Cache size are way over what they should be.
> Thanks!
>
>
>


nodetools cfstats question

2011-09-28 Thread Sanjeev Kulkarni
Hey guys,
I'm using a three-node cluster running 0.8.6 with an rf of 3. It's a freshly
installed cluster with no upgrade history.
I have 6 cfs and only one of them is written into. That cf has around one
thousand keys. A quick key_range_scan verifies this.
However when I do cfstats, I see the following for this cf.

Number of Keys (estimate): 5248
Key cache capacity: 20
Key cache size: 99329

What is the definition of these three output values? Both the Number of Keys
and Key Cache size are way over what they should be.
Thanks!
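
For reference, the output quoted above comes from nodetool's cfstats command,
run against one node (host assumed):

  nodetool -h 127.0.0.1 cfstats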


Increasing thrift_framed_transport_size_in_mb

2011-09-23 Thread Sanjeev Kulkarni
Hey guys,
Are there any side-effects of increasing
the thrift_framed_transport_size_in_mb and thrift_max_message_length_in_mb
settings from their default values to something like 100 MB?
Thanks!
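
For reference, both settings live in cassandra.yaml; a sketch with the values
from the question (the stock 0.7/0.8 config ships them at 15 and 16 MB, and
thrift_max_message_length_in_mb is generally kept at least as large as the
frame size):

  thrift_framed_transport_size_in_mb: 100
  thrift_max_message_length_in_mb: 100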


get_range_slices efficiency question

2011-09-05 Thread Sanjeev Kulkarni
Hey guys,
We are designing our data model for our app and this question came up.
Let's say that I have a large number of rows (say 1M) and just one column
family.
Each row contains either columns (A, B, C) or (X, Y, Z). I want to run a
get_range_slices query to fetch columns (A, B, C).
Does Cassandra actually iterate over all 1M rows and pass me the half of them
which contain (A, B, C)? Or does it somehow magically iterate over only the half
million rows containing (A, B, C)? The latter case should run twice as fast
as the former, since it avoids iterating over half the data.
Thanks in advance!


Re: Commitlog Disk Full

2011-05-17 Thread Sanjeev Kulkarni
Here is a snapshot of the logs from one of the machines, covering the time
from when I mailed to when it finally started to look at the commitlogs. At the
end you can see that it is discarding obsolete commitlogs. Not sure what
that means.


INFO [HintedHandoff:1] 2011-05-17 01:17:10,510 ColumnFamilyStore.java (line
1070) Enqueuing flush of Memtable-HintsColumnFamily@866759847(153737 bytes,
3271 operations)
 INFO [FlushWriter:309] 2011-05-17 01:17:10,511 Memtable.java (line 158)
Writing Memtable-HintsColumnFamily@866759847(153737 bytes, 3271 operations)
 INFO [CompactionExecutor:1] 2011-05-17 01:17:10,511 CompactionManager.java
(line 395) Compacting
[SSTableReader(path='/mnt/cassandra/data/system/HintsColumnFamily-f-92-Data.db'),SSTableReader(path='/mnt/cassandra/data/system/HintsColumnFamily-f-91-Data.db')]
 INFO [FlushWriter:309] 2011-05-17 01:17:10,652 Memtable.java (line 165)
Completed flushing /mnt/cassandra/data/system/HintsColumnFamily-f-93-Data.db
(375072 bytes)
 INFO [CompactionExecutor:1] 2011-05-17 01:17:10,654 CompactionManager.java
(line 482) Compacted to
/mnt/cassandra/data/system/HintsColumnFamily-tmp-f-94-Data.db.  514,078 to
511,862 (~99% of original) bytes for 2 keys.  Time: 142ms.
 INFO [HintedHandoff:1] 2011-05-17 01:17:10,654 HintedHandOffManager.java
(line 360) Finished hinted handoff of 3271 rows to endpoint /10.32.6.238
 INFO [COMMIT-LOG-WRITER] 2011-05-17 01:17:10,753 CommitLog.java (line 440)
Discarding obsolete commit
log:CommitLogSegment(/mnt/cassandra/commitlog/CommitLog-1305593058271.log)
 INFO [COMMIT-LOG-WRITER] 2011-05-17 01:17:10,754 CommitLog.java (line 440)
Discarding obsolete commit
log:CommitLogSegment(/mnt/cassandra/commitlog/CommitLog-1305593776786.log)
 INFO [COMMIT-LOG-WRITER] 2011-05-17 01:17:10,754 CommitLog.java (line 440)
Discarding obsolete commit
log:CommitLogSegment(/mnt/cassandra/commitlog/CommitLog-1305593808049.log)
 INFO [COMMIT-LOG-WRITER] 2011-05-17 01:17:10,754 CommitLog.java (line 440)
Discarding obsolete commit
log:CommitLogSegment(/mnt/cassandra/commitlog/CommitLog-1305593840228.log)
 INFO [HintedHandoff:1] 2011-05-17 01:17:58,010 HintedHandOffManager.java
(line 304) Started hinted handoff for endpoint /10.32.6.238
 INFO [HintedHandoff:1] 2011-05-17 01:17:58,111 ColumnFamilyStore.java (line
1070) Enqueuing flush of Memtable-HintsColumnFamily@1025764186(47 bytes, 1
operations)
 INFO [FlushWriter:309] 2011-05-17 01:17:58,111 Memtable.java (line 158)
Writing Memtable-HintsColumnFamily@1025764186(47 bytes, 1 operations)
 INFO [CompactionExecutor:1] 2011-05-17 01:17:58,112 CompactionManager.java
(line 395) Compacting
[SSTableReader(path='/mnt/cassandra/data/system/HintsColumnFamily-f-94-Data.db'),SSTableReader(path='/mnt/cassandra/data/system/HintsColumnFamily-f-93-Data.db')]
 INFO [FlushWriter:309] 2011-05-17 01:17:58,119 Memtable.java (line 165)
Completed flushing /mnt/cassandra/data/system/HintsColumnFamily-f-95-Data.db
(127 bytes)
 INFO [CompactionExecutor:1] 2011-05-17 01:17:58,192 CompactionManager.java
(line 482) Compacted to
/mnt/cassandra/data/system/HintsColumnFamily-tmp-f-96-Data.db.  886,934 to
123,830 (~13% of original) bytes for 2 keys.  Time: 80ms.
 INFO [HintedHandoff:1] 2011-05-17 01:17:58,192 HintedHandOffManager.java
(line 360) Finished hinted handoff of 1 rows to endpoint /10.32.6.238
 INFO [NonPeriodicTasks:1] 2011-05-17 02:58:51,856 SSTable.java (line 147)
Deleted /mnt/cassandra/data/system/HintsColumnFamily-f-93
 INFO [NonPeriodicTasks:1] 2011-05-17 02:58:51,857 SSTable.java (line 147)
Deleted /mnt/cassandra/data/system/HintsColumnFamily-f-94
 INFO [CompactionExecutor:1] 2011-05-17 04:39:16,679 CacheWriter.java (line
96) Saved ObjectKeySpace-Profile-KeyCache (20 items) in 352 ms
 INFO [CompactionExecutor:1] 2011-05-17 04:42:24,773 CacheWriter.java (line
96) Saved ObjectKeySpace-Location-KeyCache (20 items) in 392 ms
 INFO [CompactionExecutor:1] 2011-05-17 04:43:05,117 CacheWriter.java (line
96) Saved ObjectKeySpace-SpaceLocation-KeyCache (186874 items) in 358 ms
 INFO [CompactionExecutor:1] 2011-05-17 04:43:35,001 CacheWriter.java (line
96) Saved ObjectKeySpace-PartitionInfo-KeyCache (29073 items) in 38 ms
 INFO [MutationStage:886] 2011-05-17 06:32:02,024 ColumnFamilyStore.java
(line 1070) Enqueuing flush of Memtable-PartitionInfo@1500472048(136213003
bytes, 2081174 operations)
 INFO [FlushWriter:310] 2011-05-17 06:32:02,036 Memtable.java (line 158)
Writing Memtable-PartitionInfo@1500472048(136213003 bytes, 2081174
operations)
 INFO [FlushWriter:310] 2011-05-17 06:32:11,001 Memtable.java (line 165)
Completed flushing
/mnt/cassandra/data/ObjectKeySpace/PartitionInfo-f-350-Data.db (137487577
bytes)
 INFO [NonPeriodicTasks:1] 2011-05-17 06:35:47,390 SSTable.java (line 147)
Deleted /mnt/cassandra/data/system/HintsColumnFamily-f-92
 INFO [NonPeriodicTasks:1] 2011-05-17 06:35:47,391 SSTable.java (line 147)
Deleted /mnt/cassandra/data/system/HintsColumnFamily-f-91
 INFO [MutationStage:899] 2011-05-17 06:41:37,170 ColumnFamilyStore.java

Re: Commitlog Disk Full

2011-05-16 Thread Sanjeev Kulkarni
It's now been almost 4 hours. I still see commitlogs worth 1.2G on the machines. I
see no activity.

On Mon, May 16, 2011 at 6:33 PM, Sanjeev Kulkarni wrote:

> After I updated the memtable_throughput, I stopped all my writing
> processes. I did a du /commitlog to find how big the Cassandra commitlog was at
> that time. For the three nodes it was around 1.4G each.
> I waited for about 30 minutes to see whether Cassandra would flush things. When
> I look at du now, it is still around 1.4G.
> The ls -l on one of the machines shows the following
>
> -rw------- 1 cassandra cassandra 147190162 2011-05-12 17:36
> CommitLog-1305221517682.log
> -rw------- 1 cassandra cassandra        28 2011-05-17 01:26
> CommitLog-1305221517682.log.header
> -rw-r--r-- 1 cassandra cassandra 134217815 2011-05-17 00:09
> CommitLog-1305590456606.log
> -rw-r--r-- 1 cassandra cassandra        28 2011-05-17 01:26
> CommitLog-1305590456606.log.header
> -rw-r--r-- 1 cassandra cassandra 134217757 2011-05-17 00:18
> CommitLog-1305590957399.log
> -rw-r--r-- 1 cassandra cassandra        28 2011-05-17 01:26
> CommitLog-1305590957399.log.header
> -rw-r--r-- 1 cassandra cassandra 134217757 2011-05-17 00:26
> CommitLog-1305591492565.log
> -rw-r--r-- 1 cassandra cassandra        28 2011-05-17 01:26
> CommitLog-1305591492565.log.header
> -rw-r--r-- 1 cassandra cassandra 134218024 2011-05-17 00:34
> CommitLog-1305591987515.log
> -rw-r--r-- 1 cassandra cassandra        36 2011-05-17 01:26
> CommitLog-1305591987515.log.header
> -rw-r--r-- 1 cassandra cassandra 137919712 2011-05-17 00:43
> CommitLog-1305592441509.log
> -rw-r--r-- 1 cassandra cassandra        36 2011-05-17 01:26
> CommitLog-1305592441509.log.header
> -rw-r--r-- 1 cassandra cassandra 136446581 2011-05-17 00:59
> CommitLog-1305593006344.log
> -rw-r--r-- 1 cassandra cassandra        36 2011-05-17 01:26
> CommitLog-1305593006344.log.header
> -rw-r--r-- 1 cassandra cassandra 193306617 2011-05-17 01:09
> CommitLog-1305594484986.log
> -rw-r--r-- 1 cassandra cassandra        28 2011-05-17 01:26
> CommitLog-1305594484986.log.header
> -rw-r--r-- 1 cassandra cassandra 134986562 2011-05-17 01:21
> CommitLog-1305595243108.log
> -rw-r--r-- 1 cassandra cassandra        28 2011-05-17 01:26
> CommitLog-1305595243108.log.header
> -rw-r--r-- 1 cassandra cassandra 134754264 2011-05-17 01:26
> CommitLog-1305595537828.log
> -rw-r--r-- 1 cassandra cassandra        28 2011-05-17 01:26
> CommitLog-1305595537828.log.header
> -rw-r--r-- 1 cassandra cassandra  10616832 2011-05-17 01:26
> CommitLog-1305595602692.log
> -rw-r--r-- 1 cassandra cassandra        28 2011-05-17 01:26
> CommitLog-1305595602692.log.header
>
> There are a couple of things that strike me as odd.
> 1. The first file, CommitLog-1305221517682.log, is dated 2011/5/12. I wonder
> why it's still lingering around.
> 2. The times on all the other files range from current to about 1.5 hours
> ago. Shouldn't this be a smaller list?
>
> Thanks!
>
> On Mon, May 16, 2011 at 5:44 PM, Sanjeev Kulkarni 
> wrote:
>
>> Hey guys,
>> I have updated all my column families with 32 as the memtable_throughput.
>> I will let you know how cassandra behaves.
>> Thanks!
>>
>>
>> On Mon, May 16, 2011 at 3:52 PM, mcasandra wrote:
>>
>>> You can try to update the column family using cassandra-cli. Try to set
>>> memtable_throughput to 32 first.
>>>
>>> [default@unknown] help update column family;
>>> update column family Bar;
>>> update column family Bar with <att1>=<value1>;
>>> update column family Bar with <att1>=<value1> and <att2>=<value2>...;
>>>
>>> Update a column family with the specified values for the given set of
>>> attributes. Note that you must be using a keyspace.
>>>
>>> valid attributes are:
>>>- column_type: Super or Standard
>>>- comment: Human-readable column family description. Any string is
>>> acceptable
>>>- rows_cached: Number or percentage of rows to cache
>>>- row_cache_save_period: Period with which to persist the row cache,
>>> in
>>> seconds
>>>- keys_cached: Number or percentage of keys to cache
>>>- key_cache_save_period: Period with which to persist the key cache,
>>> in
>>> seconds
>>>- read_repair_chance: Probability (0.0-1.0) with which to perform read
>>> repairs on CL.ONE reads
>>>- gc_grace: Discard tombstones after this many seconds
>>>- column_metadata: null
>>>- memtable_operations: Flush memtables after this many operations (in
>>> millions)
>>>- memtable_throughput: ... or after this many MB have been written
>>>

Re: Commitlog Disk Full

2011-05-16 Thread Sanjeev Kulkarni
After I updated the memtable_throughput, I stopped all my writing processes.
I did a du /commitlog to find how big the Cassandra commitlog was at that time.
For the three nodes it was around 1.4G each.
I waited for about 30 minutes to see whether Cassandra would flush things. When
I look at du now, it is still around 1.4G.
The ls -l on one of the machines shows the following

-rw------- 1 cassandra cassandra 147190162 2011-05-12 17:36
CommitLog-1305221517682.log
-rw------- 1 cassandra cassandra        28 2011-05-17 01:26
CommitLog-1305221517682.log.header
-rw-r--r-- 1 cassandra cassandra 134217815 2011-05-17 00:09
CommitLog-1305590456606.log
-rw-r--r-- 1 cassandra cassandra        28 2011-05-17 01:26
CommitLog-1305590456606.log.header
-rw-r--r-- 1 cassandra cassandra 134217757 2011-05-17 00:18
CommitLog-1305590957399.log
-rw-r--r-- 1 cassandra cassandra        28 2011-05-17 01:26
CommitLog-1305590957399.log.header
-rw-r--r-- 1 cassandra cassandra 134217757 2011-05-17 00:26
CommitLog-1305591492565.log
-rw-r--r-- 1 cassandra cassandra        28 2011-05-17 01:26
CommitLog-1305591492565.log.header
-rw-r--r-- 1 cassandra cassandra 134218024 2011-05-17 00:34
CommitLog-1305591987515.log
-rw-r--r-- 1 cassandra cassandra        36 2011-05-17 01:26
CommitLog-1305591987515.log.header
-rw-r--r-- 1 cassandra cassandra 137919712 2011-05-17 00:43
CommitLog-1305592441509.log
-rw-r--r-- 1 cassandra cassandra        36 2011-05-17 01:26
CommitLog-1305592441509.log.header
-rw-r--r-- 1 cassandra cassandra 136446581 2011-05-17 00:59
CommitLog-1305593006344.log
-rw-r--r-- 1 cassandra cassandra        36 2011-05-17 01:26
CommitLog-1305593006344.log.header
-rw-r--r-- 1 cassandra cassandra 193306617 2011-05-17 01:09
CommitLog-1305594484986.log
-rw-r--r-- 1 cassandra cassandra        28 2011-05-17 01:26
CommitLog-1305594484986.log.header
-rw-r--r-- 1 cassandra cassandra 134986562 2011-05-17 01:21
CommitLog-1305595243108.log
-rw-r--r-- 1 cassandra cassandra        28 2011-05-17 01:26
CommitLog-1305595243108.log.header
-rw-r--r-- 1 cassandra cassandra 134754264 2011-05-17 01:26
CommitLog-1305595537828.log
-rw-r--r-- 1 cassandra cassandra        28 2011-05-17 01:26
CommitLog-1305595537828.log.header
-rw-r--r-- 1 cassandra cassandra  10616832 2011-05-17 01:26
CommitLog-1305595602692.log
-rw-r--r-- 1 cassandra cassandra        28 2011-05-17 01:26
CommitLog-1305595602692.log.header

There are a couple of things that strike me as odd.
1. The first file, CommitLog-1305221517682.log, is dated 2011/5/12. I wonder
why it's still lingering around.
2. The times on all the other files range from current to about 1.5 hours
ago. Shouldn't this be a smaller list?

Thanks!

On Mon, May 16, 2011 at 5:44 PM, Sanjeev Kulkarni wrote:

> Hey guys,
> I have updated all my column families with 32 as the memtable_throughput. I
> will let you know how cassandra behaves.
> Thanks!
>
>
> On Mon, May 16, 2011 at 3:52 PM, mcasandra  wrote:
>
>> You can try to update the column family using cassandra-cli. Try to set
>> memtable_throughput to 32 first.
>>
>> [default@unknown] help update column family;
>> update column family Bar;
>> update column family Bar with <att1>=<value1>;
>> update column family Bar with <att1>=<value1> and <att2>=<value2>...;
>>
>> Update a column family with the specified values for the given set of
>> attributes. Note that you must be using a keyspace.
>>
>> valid attributes are:
>>- column_type: Super or Standard
>>- comment: Human-readable column family description. Any string is
>> acceptable
>>- rows_cached: Number or percentage of rows to cache
>>- row_cache_save_period: Period with which to persist the row cache, in
>> seconds
>>- keys_cached: Number or percentage of keys to cache
>>- key_cache_save_period: Period with which to persist the key cache, in
>> seconds
>>- read_repair_chance: Probability (0.0-1.0) with which to perform read
>> repairs on CL.ONE reads
>>- gc_grace: Discard tombstones after this many seconds
>>- column_metadata: null
>>- memtable_operations: Flush memtables after this many operations (in
>> millions)
>>- memtable_throughput: ... or after this many MB have been written
>>- memtable_flush_after: ... or after this many minutes
>>- default_validation_class: null
>>- min_compaction_threshold: Avoid minor compactions of less than this
>> number of sstable files
>>- max_compaction_threshold: Compact no more than this number of sstable
>> files at once
>>- column_metadata: Metadata which describes columns of column family.
>>Supported format is [{ k:v, k:v, ... }, { ... }, ...]
>>Valid attributes: column_name, validation_class (see comparator),
>>  index_type (integer), index_name.
>>
>>
>> --
>> View this message in context:
>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Commitlog-Disk-Full-tp6356797p6370913.html
>> Sent from the cassandra-u...@incubator.apache.org mailing list archive at
>> Nabble.com.
>>
>
>


Re: Commitlog Disk Full

2011-05-16 Thread Sanjeev Kulkarni
Hey guys,
I have updated all my column families with 32 as the memtable_throughput. I
will let you know how cassandra behaves.
Thanks!
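
For reference, the cassandra-cli form used, following the help text quoted
below (keyspace and column family names are placeholders):

  [default@unknown] use MyKeyspace;
  [default@MyKeyspace] update column family Bar with memtable_throughput=32;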

On Mon, May 16, 2011 at 3:52 PM, mcasandra  wrote:

> You can try to update the column family using cassandra-cli. Try to set
> memtable_throughput to 32 first.
>
> [default@unknown] help update column family;
> update column family Bar;
> update column family Bar with <att1>=<value1>;
> update column family Bar with <att1>=<value1> and <att2>=<value2>...;
>
> Update a column family with the specified values for the given set of
> attributes. Note that you must be using a keyspace.
>
> valid attributes are:
>- column_type: Super or Standard
>- comment: Human-readable column family description. Any string is
> acceptable
>- rows_cached: Number or percentage of rows to cache
>- row_cache_save_period: Period with which to persist the row cache, in
> seconds
>- keys_cached: Number or percentage of keys to cache
>- key_cache_save_period: Period with which to persist the key cache, in
> seconds
>- read_repair_chance: Probability (0.0-1.0) with which to perform read
> repairs on CL.ONE reads
>- gc_grace: Discard tombstones after this many seconds
>- column_metadata: null
>- memtable_operations: Flush memtables after this many operations (in
> millions)
>- memtable_throughput: ... or after this many MB have been written
>- memtable_flush_after: ... or after this many minutes
>- default_validation_class: null
>- min_compaction_threshold: Avoid minor compactions of less than this
> number of sstable files
>- max_compaction_threshold: Compact no more than this number of sstable
> files at once
>- column_metadata: Metadata which describes columns of column family.
>Supported format is [{ k:v, k:v, ... }, { ... }, ...]
>Valid attributes: column_name, validation_class (see comparator),
>  index_type (integer), index_name.
>
>
> --
> View this message in context:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Commitlog-Disk-Full-tp6356797p6370913.html
> Sent from the cassandra-u...@incubator.apache.org mailing list archive at
> Nabble.com.
>


Re: Commitlog Disk Full

2011-05-16 Thread Sanjeev Kulkarni
Hi,
Are you referring to binary_memtable_throughput_in_mb, which is a global
parameter, or to the per-column-family memtable_throughput_in_mb? The former
is set to 256 and we don't override the default per-column-family value. Would
just re-setting the global binary_memtable_throughput_in_mb to something
like 64 be enough?
Thanks!

On Fri, May 13, 2011 at 10:30 PM, mcasandra  wrote:

> 5G in one hour is actually very low. Something else is wrong. Peter pointed
> to something related to memtable size could be causing this problem, can
> you
> turn down memtable_throughput and see if that helps.
>
> --
> View this message in context:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Commitlog-Disk-Full-tp6356797p6362301.html
> Sent from the cassandra-u...@incubator.apache.org mailing list archive at
> Nabble.com.
>


Re: Commitlog Disk Full

2011-05-13 Thread Sanjeev Kulkarni
Our writes happen in bursts. So oftentimes clients write data as fast as
they can. Conceivably one can write 5G in one hour.
The other setting that we have is that our replication factor is 3 and we
write using QUORUM. Not sure if that will affect things.

On Fri, May 13, 2011 at 12:04 AM, Peter Schuller <
peter.schul...@infidyne.com> wrote:

> > I haven't explicitly set a value for the memtable_flush_after_mins
> > parameter.
> > Looks like the default is 60 minutes.
> > I will try to play around with this value to see if that fixes things.
>
> Is the amount of data in the commit log consistent with what you might
> have been writing during 60 minutes? Including overwrites. If not, I'm
> not sure what's going on. Since you said it took about a day of
> traffic it feels fishy.
>
> --
> / Peter Schuller
>


Re: Commitlog Disk Full

2011-05-12 Thread Sanjeev Kulkarni
Hi Peter,
Thanks for the response.
I haven't explicitly set a value for the memtable_flush_after_mins parameter.
Looks like the default is 60 minutes.
I will try to play around with this value to see if that fixes things.
Thanks again!

On Thu, May 12, 2011 at 11:41 AM, Peter Schuller <
peter.schul...@infidyne.com> wrote:

> > I understand that Cassandra periodically cleans up the commitlog
> > directories
> > by generating sstables in the datadir. Is there any way to speed up this
> > movement from commitlog to the datadir?
>
> commitlog_rotation_threshold_in_mb could cause problems if it was set
> very very high, but with the default of 128mb it should not be an
> issue.
>
> I suspect the most likely reason is that you have a column family
> whose memtable flush settings are extreme. A commit log segment cannot
> be removed until the corresponding data has been flushed to an
> sstable. For high-throughput memtables where you flush regularly this
> should happen often. For idle or almost idle memtables you may be
> waiting on the timeout criteria to trigger. So in general, having a
> memtable with a long expiry time will have the potential to generate
> commit logs of whatever size is implied by the write traffic during
> that period.
>
> The memtable setting in question is the "memtable_flush_after"
> setting. Do you have that set to something very high on one of your
> column families?
>
> You can use "describe keyspace name_of_keyspace" in cassandra-cli to
> check current settings.
>
> --
> / Peter Schuller
>
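
For reference, a sketch of checking and tightening the setting Peter describes,
via cassandra-cli (keyspace and column family names are placeholders;
memtable_flush_after is in minutes):

  [default@unknown] describe keyspace MyKeyspace;
  [default@unknown] use MyKeyspace;
  [default@MyKeyspace] update column family MyCF with memtable_flush_after=60;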


Commitlog Disk Full

2011-05-12 Thread Sanjeev Kulkarni
Hey guys,
I have an EC2 Debian cluster consisting of several nodes running 0.7.5 on
ephemeral disks.
These are fresh installs, not upgrades.
The commitlog is set to the smaller of the disks, which is around 10G in size,
and the datadir is set to the bigger disk.
The config file is basically the same as the one supplied by the default
installation.
Our applications write to the cluster. After about a day of writing we
started noticing the commitlog disk filling up. Soon we went over the disk
limit and writes started failing. At this point we stopped the cluster.
Over the course of the day we inserted around 25G of data. Our column
values are pretty small.
I understand that Cassandra periodically cleans up the commitlog directories
by generating sstables in the datadir. Is there any way to speed up this
movement from commitlog to the datadir?
Thanks!


Re: Memory Usage During Read

2011-05-09 Thread Sanjeev Kulkarni
Hi Adam,
We have been facing some similar issues of late. Wondering if Jonathan's
suggestions worked for you.
Thanks!

On Sat, May 7, 2011 at 6:37 PM, Jonathan Ellis  wrote:

> The live:serialized size ratio depends on what your data looks like
> (small columns will be less efficient than large blobs) but using the
> rule of thumb of 10x, around 1G * (1 + memtable_flush_writers +
> memtable_flush_queue_size).
>
> So first thing I would do is drop writers and queue to 1 and 1.
>
> Then I would drop the max heap to 1G, memtable size to 8MB so the heap
> dump is easier to analyze. Then let it OOM and look at the dump with
> http://www.eclipse.org/mat/
>
> On Sat, May 7, 2011 at 3:54 PM, Serediuk, Adam
>  wrote:
> > How much memory should a single hot cf with a 128mb memtable take with
> row and key caching disabled during read?
> >
> > Because I'm seeing heap go from 3.5gb skyrocketing straight to max
> (regardless of the size, 8gb and 24gb both do the same) at which time the
> jvm will do nothing but full gc and is unable to reclaim any meaningful
> amount of memory. Cassandra then becomes unusable.
> >
> > I see the same behavior with smaller memtables, eg 64mb.
> >
> > This happens well into the read operation and only on a small number of
> > nodes in the cluster (1-4 out of a total of 60 nodes).
> >
> > Sent from my iPhone
> >
> > On May 6, 2011, at 22:45, "Jonathan Ellis"  wrote:
> >
> >> You don't GC storm without legitimately having a too-full heap.  It's
> >> normal to see occasional full GCs from fragmentation, but that will
> >> actually compact the heap and everything goes back to normal IF you
> >> had space actually freed up.
> >>
> >> You say you've played w/ memtable size but that would still be my bet.
> >> Most people severely underestimate how much space this takes (10x in
> >> memory over serialized size), which will bite you when you have lots
> >> of CFs defined.
> >>
> >> Otherwise, force a heap dump after a full GC and take a look to see
> >> what's referencing all the memory.
> >>
> >> On Fri, May 6, 2011 at 12:25 PM, Serediuk, Adam
> >>  wrote:
> >>> We're troubleshooting a memory usage problem during batch reads. We've
> spent the last few days profiling and trying different GC settings. The
> symptoms are that after a certain amount of time during reads one or more
> nodes in the cluster will exhibit extreme memory pressure followed by a gc
> storm. We've tried every possible JVM setting and different GC methods and
> the issue persists. This is pointing towards something instantiating a lot
> of objects and keeping references so that they can't be cleaned up.
> >>>
> >>> Typically nothing is ever logged other than the GC failures however
> just now one of the nodes emitted logs we've never seen before:
> >>>
> >>>  INFO [ScheduledTasks:1] 2011-05-06 15:04:55,085 StorageService.java
> (line 2218) Unable to reduce heap usage since there are no dirty column
> families
> >>>
> >>> We have tried increasing the heap on these nodes to large values, eg
> 24GB and still run into the same issue. We're running 8GB of heap normally
> and only one or two nodes will ever exhibit this issue, randomly. We don't
> use key/row caching and our memtable sizing is 64mb/0.3. Larger or smaller
> memtables make no difference in avoiding the issue. We're on 0.7.5, mmap,
> jna and jdk 1.6.0_24
> >>>
> >>> We've somewhat hit the wall in troubleshooting and any advice is
> greatly appreciated.
> >>>
> >>> --
> >>> Adam
> >>>
> >>
> >>
> >>
> >> --
> >> Jonathan Ellis
> >> Project Chair, Apache Cassandra
> >> co-founder of DataStax, the source for professional Cassandra support
> >> http://www.datastax.com
> >>
> >
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>
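
As a rough worked example of the rule of thumb quoted above, assuming the
defaults of 1 memtable_flush_writer and a memtable_flush_queue_size of 4, a
128 MB memtable translates to roughly:

  10 x 128 MB = ~1.3 GB live per memtable copy
  ~1.3 GB x (1 + 1 + 4) = ~7.7 GB of heap

which is in the same ballpark as the 8 GB heap described in the thread.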


Re: New node not joining

2011-05-09 Thread Sanjeev Kulkarni
Thanks!

On Sun, May 8, 2011 at 3:40 PM, aaron morton wrote:

> Ah, I see the case you are talking about.
>
> The node will auto bootstrap on startup if, when it joins the ring: it is
> not already bootstrapped, auto bootstrap is enabled, and the node is not in
> its own seed list.
>
> The auto bootstrap process then finds the token it wants, but aborts the
> process if there are no non-system tables defined. That may happen because
> the bootstrap code finds the node with the highest load and splits its
> range; if all the nodes have zero load (no user data) then that process is
> unreliable. But it's also unreliable if there is a schema and no data.
>
> Created https://issues.apache.org/jira/browse/CASSANDRA-2625 to see if it
> can be changed.
>
> Thanks
>
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 7 May 2011, at 05:25, Len Bucchino wrote:
>
> While I agree that what you suggested is a very good idea, the bootstrapping
> process _*should*_ work properly.
>
> Here is some additional detail on the original problem.  If the current
> node that you are trying to bootstrap has itself listed in seeds in its yaml,
> then it will be able to bootstrap on an empty schema.  If it does not have
> itself listed in seeds in its yaml and you have an empty schema, then the
> bootstrap process will not complete and no errors will be reported in the
> logs, even with debug enabled.
>
> *From:* aaron morton [mailto:aa...@thelastpickle.com]
> *Sent:* Thursday, May 05, 2011 6:51 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: New node not joining
>
> When adding nodes it is a *very* good idea to manually set the tokens, see
> http://wiki.apache.org/cassandra/Operations#Load_balancing
>
> bootstrap is a process that happens only once on a node, where as well as
> telling the other nodes it's around it asks them to stream over the data it
> will now be responsible for.
>
> nodetool loadbalance is an old utility that should have better warnings not
> to use it. The best way to load balance the cluster is manually creating the
> tokens and assigning them either using the initial_token config param or
> using nodetool move.
>
> Hope that helps.
>
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 6 May 2011, at 08:37, Sanjeev Kulkarni wrote:
>
>
> Here is what I did.
> I booted up the first one. After that I started the second one with
> bootstrap turned off.
> Then I did a nodetool loadbalance on the second node.
> After which I added the third node again with bootstrap turned off. Then
> did the loadbalance again on the third node.
> This seems to have successfully completed and I am now able to read/write
> into my system.
>
> Thanks!
> On Thu, May 5, 2011 at 1:22 PM, Len Bucchino 
> wrote:
> I just rebuilt the cluster in the same manner as I did originally except
> after I setup the first node I added a keyspace and column family before
> adding any new nodes.  This time the 3rd node auto bootstrapped
> successfully.
>
> *From:* Len Bucchino [mailto:len.bucch...@veritix.com]
> *Sent:* Thursday, May 05, 2011 1:31 PM
>
> *To:* user@cassandra.apache.org
> *Subject:* RE: New node not joining
>
>
> Also, setting auto_bootstrap to false and setting token to the one that it
> said it would use in the logs allows the new node to join the ring.
>
> *From:* Len Bucchino [mailto:len.bucch...@veritix.com]
> *Sent:* Thursday, May 05, 2011 1:25 PM
> *To:* user@cassandra.apache.org
> *Subject:* RE: New node not joining
>
> Adding the fourth node to the cluster with an empty schema using
> auto_bootstrap was not successful.  A nodetool netstats on the new node
> shows “Mode: Joining: getting bootstrap token” similar to what the third
> node did before it was manually added.  Also, there are no exceptions in the
> logs but it never joins the ring.
>
> *From:* Sanjeev Kulkarni [mailto:sanj...@locomatix.com]
> *Sent:* Thursday, May 05, 2011 11:47 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: New node not joining
>
> Hi Len,
> This looks like a decent workaround. I would be very interested to see how
> the addition of the 4th node went. Please post it whenever you get a chance.
> Thanks!
>
> On Thu, May 5, 2011 at 6:47 AM, Len Bucchino 
> wrote:
> I have the same problem on 0.7.5 auto bootstrapping a 3rd node onto an
> empty 2-node test cluster (the two nodes were manually added) and it
> currently has an empty schema.  My log entries look similar to yours.  I
> took the new token it says it's going to use from the 

Re: New node not joining

2011-05-05 Thread Sanjeev Kulkarni
Here is what I did.
I booted up the first one. After that I started the second one with
bootstrap turned off.
Then I did a nodetool loadbalance on the second node.
After which I added the third node again with bootstrap turned off. Then did
the loadbalance again on the third node.
This seems to have successfully completed and I am now able to read/write
into my system.
Thanks!
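
For reference, the manual alternative discussed elsewhere in this thread
(Aaron's initial_token advice and Len's workaround) comes down to two
cassandra.yaml settings on the joining node; the token shown is the one the
bootstrap log further down reports it would have used:

  auto_bootstrap: false
  initial_token: 24952271262852174037699496069317526837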

On Thu, May 5, 2011 at 1:22 PM, Len Bucchino wrote:

>  I just rebuilt the cluster in the same manner as I did originally except
> after I setup the first node I added a keyspace and column family before
> adding any new nodes.  This time the 3rd node auto bootstrapped
> successfully.
>
>
>
> *From:* Len Bucchino [mailto:len.bucch...@veritix.com]
> *Sent:* Thursday, May 05, 2011 1:31 PM
>
> *To:* user@cassandra.apache.org
> *Subject:* RE: New node not joining
>
>
>
> Also, setting auto_bootstrap to false and setting token to the one that it
> said it would use in the logs allows the new node to join the ring.
>
>
>
> *From:* Len Bucchino [mailto:len.bucch...@veritix.com]
> *Sent:* Thursday, May 05, 2011 1:25 PM
> *To:* user@cassandra.apache.org
> *Subject:* RE: New node not joining
>
>
>
> Adding the fourth node to the cluster with an empty schema using
> auto_bootstrap was not successful.  A nodetool netstats on the new node
> shows “Mode: Joining: getting bootstrap token” similar to what the third
> node did before it was manually added.  Also, there are no exceptions in the
> logs but it never joins the ring.
>
>
>
> *From:* Sanjeev Kulkarni [mailto:sanj...@locomatix.com]
> *Sent:* Thursday, May 05, 2011 11:47 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: New node not joining
>
>
>
> Hi Len,
>
> This looks like a decent workaround. I would be very interested to see how
> the addition of the 4th node went. Please post it whenever you get a chance.
>
> Thanks!
>
>
>
> On Thu, May 5, 2011 at 6:47 AM, Len Bucchino 
> wrote:
>
> I have the same problem on 0.7.5 auto bootstrapping a 3rd node onto an
> empty 2-node test cluster (the two nodes were manually added) and it
> currently has an empty schema.  My log entries look similar to yours.  I
> took the new token it says it's going to use from the log file, added it to
> the yaml, and turned off auto bootstrap, and the node added fine.  I'm
> bringing up a 4th node now and will see if it has the same problem auto
> bootstrapping.
>
>
>  --
>
> *From:* Sanjeev Kulkarni [sanj...@locomatix.com]
> *Sent:* Thursday, May 05, 2011 2:18 AM
> *To:* user@cassandra.apache.org
> *Subject:* New node not joining
>
> Hey guys,
>
> I'm running into what seems like a very basic problem.
>
> I have a one node cassandra instance. Version 0.7.5. Freshly installed.
> Contains no data.
>
> The cassandra.yaml is the same as the default one that is supplied, except
> for data/commitlog/saved_caches directories.
>
> I also changed the addresses to point to an externally visible IP address.
>
> Cassandra comes up nicely and is ready to accept thrift connections.
>
> I do a nodetool ring and this is what I get.
>
>
>
> 10.242.217.124  Up Normal  6.54 KB 100.00%
> 110022862993086789903543147927259579701
>
>
>
> Which seems right to me.
>
>
>
> Now I start another node. Its configuration is almost identical to the first one,
> except that bootstrap is set to true and the seeds are set appropriately.
>
> When I start the second, I notice that the second one contacts the first
> node to get the new token.
>
> I see the following lines in the first machine(the seed machine).
>
>
>
> INFO [GossipStage:1] 2011-05-05 07:00:20,427 Gossiper.java (line 628) Node
> /10.83.111.80 has restarted,
>
> now UP again
>
>  INFO [HintedHandoff:1] 2011-05-05 07:00:55,162 HintedHandOffManager.java
> (line 304) Started hinted handoff for endpoint /10.83.111.80
>
>  INFO [HintedHandoff:1] 2011-05-05 07:00:55,164 HintedHandOffManager.java
> (line 360) Finished hinted hand
>
> off of 0 rows to endpoint /10.83.111.80
>
>
>
> However, when I do a nodetool ring, I still get
>
>
>
> 10.242.217.124  Up Normal  6.54 KB 100.00%
> 110022862993086789903543147927259579701
>
>
>
> Even though the second node has come up. On the second machine the logs say
>
>
>
> INFO [main] 2011-05-05 07:00:19,124 StorageService.java (line 504) Joining:
> getting load information
>
>  INFO [main] 2011-05-05 07:00:19,124 StorageLoadBalancer.java (line 351)
> Sleeping 9 ms to wait for load information...
>
>  INFO [GossipStage:1] 2011-05-05 07:00:20,828 Gossiper.java (line 628) Node
> /10.242

Re: New node not joining

2011-05-05 Thread Sanjeev Kulkarni
Hi Len,
This looks like a decent workaround. I would be very interested to see how
the addition of the 4th node went. Please post it whenever you get a chance.
Thanks!

On Thu, May 5, 2011 at 6:47 AM, Len Bucchino wrote:

>  I have the same problem on 0.7.5 auto bootstrapping a 3rd node onto an
> empty 2-node test cluster (the two nodes were manually added) and it
> currently has an empty schema.  My log entries look similar to yours.  I
> took the new token it says it's going to use from the log file, added it to
> the yaml, and turned off auto bootstrap, and the node added fine.  I'm
> bringing up a 4th node now and will see if it has the same problem auto
> bootstrapping.
>
>  ------
> *From:* Sanjeev Kulkarni [sanj...@locomatix.com]
> *Sent:* Thursday, May 05, 2011 2:18 AM
> *To:* user@cassandra.apache.org
> *Subject:* New node not joining
>
>  Hey guys,
> I'm running into what seems like a very basic problem.
> I have a one node cassandra instance. Version 0.7.5. Freshly installed.
> Contains no data.
> The cassandra.yaml is the same as the default one that is supplied, except
> for data/commitlog/saved_caches directories.
> I also changed the addresses to point to an externally visible IP address.
> Cassandra comes up nicely and is ready to accept thrift connections.
> I do a nodetool ring and this is what I get.
>
>  10.242.217.124  Up Normal  6.54 KB 100.00%
> 110022862993086789903543147927259579701
>
>  Which seems right to me.
>
>  Now I start another node. Its configuration is almost identical to the first
> one, except that bootstrap is set to true and the seeds are set appropriately.
> When I start the second, I notice that the second one contacts the first
> node to get the new token.
> I see the following lines in the first machine(the seed machine).
>
>  INFO [GossipStage:1] 2011-05-05 07:00:20,427 Gossiper.java (line 628)
> Node /10.83.111.80 has restarted,
> now UP again
>  INFO [HintedHandoff:1] 2011-05-05 07:00:55,162 HintedHandOffManager.java
> (line 304) Started hinted handoff for endpoint /10.83.111.80
>  INFO [HintedHandoff:1] 2011-05-05 07:00:55,164 HintedHandOffManager.java
> (line 360) Finished hinted hand
> off of 0 rows to endpoint /10.83.111.80
>
>  However, when I do a nodetool ring, I still get
>
>  10.242.217.124  Up Normal  6.54 KB 100.00%
> 110022862993086789903543147927259579701
>
>  Even though the second node has come up. On the second machine the logs
> say
>
>  INFO [main] 2011-05-05 07:00:19,124 StorageService.java (line 504)
> Joining: getting load information
>  INFO [main] 2011-05-05 07:00:19,124 StorageLoadBalancer.java (line 351)
> Sleeping 9 ms to wait for load information...
>  INFO [GossipStage:1] 2011-05-05 07:00:20,828 Gossiper.java (line 628) Node
> /10.242.217.124 has restarted, now UP again
>  INFO [HintedHandoff:1] 2011-05-05 07:00:29,548 HintedHandOffManager.java
> (line 304) Started hinted handoff for endpoint /10.242.217.124
>  INFO [HintedHandoff:1] 2011-05-05 07:00:29,550 HintedHandOffManager.java
> (line 360) Finished hinted handoff of 0 rows to endpoint /10.242.217.124
>  INFO [main] 2011-05-05 07:01:49,137 StorageService.java (line 504)
> Joining: getting bootstrap token
>  INFO [main] 2011-05-05 07:01:49,148 BootStrapper.java (line 148) New token
> will be 24952271262852174037699496069317526837 to assume load from /
> 10.242.217.124
>  INFO [main] 2011-05-05 07:01:49,150 Mx4jTool.java (line 72) Will not load
> MX4J, mx4j-tools.jar is not in the classpath
>  INFO [main] 2011-05-05 07:01:49,259 CassandraDaemon.java (line 112)
> Binding thrift service to /10.83.111.80:9160
>  INFO [main] 2011-05-05 07:01:49,262 CassandraDaemon.java (line 126) Using
> TFastFramedTransport with a max frame size of 15728640 bytes.
>  INFO [Thread-5] 2011-05-05 07:01:49,266 CassandraDaemon.java (line 154)
> Listening for thrift clients...
>
>  This seems to indicate that the second node has joined the ring. And has
> gotten its key range.
> Am I missing anything?
>
> Thanks!
>
>


New node not joining

2011-05-05 Thread Sanjeev Kulkarni
Hey guys,
I'm running into what seems like a very basic problem.
I have a one node cassandra instance. Version 0.7.5. Freshly installed.
Contains no data.
The cassandra.yaml is the same as the default one that is supplied, except
for data/commitlog/saved_caches directories.
I also changed the addresses to point to an externally visible IP address.
Cassandra comes up nicely and is ready to accept thrift connections.
I do a nodetool ring and this is what I get.

10.242.217.124  Up Normal  6.54 KB 100.00%
110022862993086789903543147927259579701

Which seems right to me.

Now I start another node. Its configuration is almost identical to the first one,
except that bootstrap is set to true and the seeds are set appropriately.
When I start the second, I notice that the second one contacts the first
node to get the new token.
I see the following lines in the first machine(the seed machine).

INFO [GossipStage:1] 2011-05-05 07:00:20,427 Gossiper.java (line 628) Node /
10.83.111.80 has restarted,
now UP again
 INFO [HintedHandoff:1] 2011-05-05 07:00:55,162 HintedHandOffManager.java
(line 304) Started hinted handoff for endpoint /10.83.111.80
 INFO [HintedHandoff:1] 2011-05-05 07:00:55,164 HintedHandOffManager.java
(line 360) Finished hinted hand
off of 0 rows to endpoint /10.83.111.80

However, when I do a nodetool ring, I still get

10.242.217.124  Up Normal  6.54 KB 100.00%
110022862993086789903543147927259579701

Even though the second node has come up. On the second machine the logs say

INFO [main] 2011-05-05 07:00:19,124 StorageService.java (line 504) Joining:
getting load information
 INFO [main] 2011-05-05 07:00:19,124 StorageLoadBalancer.java (line 351)
Sleeping 9 ms to wait for load information...
 INFO [GossipStage:1] 2011-05-05 07:00:20,828 Gossiper.java (line 628) Node
/10.242.217.124 has restarted, now UP again
 INFO [HintedHandoff:1] 2011-05-05 07:00:29,548 HintedHandOffManager.java
(line 304) Started hinted handoff for endpoint /10.242.217.124
 INFO [HintedHandoff:1] 2011-05-05 07:00:29,550 HintedHandOffManager.java
(line 360) Finished hinted handoff of 0 rows to endpoint /10.242.217.124
 INFO [main] 2011-05-05 07:01:49,137 StorageService.java (line 504) Joining:
getting bootstrap token
 INFO [main] 2011-05-05 07:01:49,148 BootStrapper.java (line 148) New token
will be 24952271262852174037699496069317526837 to assume load from /
10.242.217.124
 INFO [main] 2011-05-05 07:01:49,150 Mx4jTool.java (line 72) Will not load
MX4J, mx4j-tools.jar is not in the classpath
 INFO [main] 2011-05-05 07:01:49,259 CassandraDaemon.java (line 112) Binding
thrift service to /10.83.111.80:9160
 INFO [main] 2011-05-05 07:01:49,262 CassandraDaemon.java (line 126) Using
TFastFramedTransport with a max frame size of 15728640 bytes.
 INFO [Thread-5] 2011-05-05 07:01:49,266 CassandraDaemon.java (line 154)
Listening for thrift clients...

This seems to indicate that the second node has joined the ring. And has
gotten its key range.
Am I missing anything?

Thanks!


Re: 0.7.4 Bad sstables?

2011-04-25 Thread Sanjeev Kulkarni
BTW, where do I download 0.7.5? I went to
http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.7.5/apache-cassandra-0.7.5-bin.tar.gz
but all the links there are broken.
I was thinking that if I just skip 0.7.5 and go with 0.8-beta1, would that be
more advisable?
Thanks!
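
For reference, the scrub suggested in the reply quoted below is run per node
through nodetool (host and keyspace names are placeholders):

  nodetool -h 127.0.0.1 scrub MyKeyspace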

On Mon, Apr 25, 2011 at 9:30 PM, Jonathan Ellis  wrote:

> No. You'll need to run scrub.
>
> On Mon, Apr 25, 2011 at 11:19 PM, Sanjeev Kulkarni
>  wrote:
> > Hi,
> > Thanks for pointing out the fix. My follow-up question: if I install
> > 0.7.5,
> > will the problem go away with the current data?
> > Thanks!
> > On Mon, Apr 25, 2011 at 8:25 PM, Jonathan Ellis 
> wrote:
> >>
> >> Ah...  could be https://issues.apache.org/jira/browse/CASSANDRA-2349
> >> (fixed for 0.7.5)
> >>
> >> On Mon, Apr 25, 2011 at 9:47 PM, Sanjeev Kulkarni <
> sanj...@locomatix.com>
> >> wrote:
> >> > The only other interesting information is that the columns of these
> rows
> >> > all
> >> > had some ttl attached to them. Not sure if that matters.
> >> > Thanks!
> >> >
> >> > On Mon, Apr 25, 2011 at 5:27 PM, Terje Marthinussen
> >> >  wrote:
> >> >>
> >> >> First column in the row has offset in the file of 190226525, last
> valid
> >> >> column is at 380293592, about 181MB from first column to last.
> >> >> in_memory_compaction_limit was 128MB, so almost certainly above the
> >> >> limit.
> >> >> Terje
> >> >>
> >> >> On Tue, Apr 26, 2011 at 8:53 AM, Terje Marthinussen
> >> >>  wrote:
> >> >>>
> >> >>> In my case, probably yes. From thw rows I have looked at, I think I
> >> >>> have
> >> >>> only seen this on rows with 1 million plus columns/supercolumns.
> >> >>>
> >> >>> May very well been larger than in memory limit. I think the
> compacted
> >> >>> row
> >> >>> I looked closer at was about 200MB and the in memory limit may have
> >> >>> been
> >> >>> 256MB.
> >> >>>
> >> >>> I will see if we still got files around to verify.
> >> >>>
> >> >>> Regards,
> >> >>> Terje
> >> >>>
> >> >>> On 26 Apr 2011, at 02:08, Jonathan Ellis  wrote:
> >> >>>
> >> >>> > Was it on a "large" row?  (> in_memory_compaction_limit?)
> >> >>> >
> >> >>> > I'm starting to suspect that LazilyCompactedRow is computing row
> >> >>> > size
> >> >>> > incorrectly in some cases.
> >> >>> >
> >> >>> > On Mon, Apr 25, 2011 at 11:47 AM, Terje Marthinussen
> >> >>> >  wrote:
> >> >>> >> I have been hunting similar looking corruptions, especially in
> the
> >> >>> >> hints
> >> >>> >> column family, but I believe it occurs somewhere while
>  compacting.
> >> >>> >> I looked in greater detail on one sstable and the row length was
> >> >>> >> longer than
> >> >>> >> the actual data in the row, and as far as I could see, either the
> >> >>> >> length was
> >> >>> >> wrong or the row was missing data as there was was no extra data
> in
> >> >>> >> the row
> >> >>> >> after the last column.
> >> >>> >> This was however on a somewhat aging dataset, so suspected it
> could
> >> >>> >> be
> >> >>> >> related to 2376.
> >> >>> >>
> >> >>> >> Playing around with 0.8 at the moment and not seen it there
> yet
> >> >>> >> (bet it
> >> >>> >> will show up tomorrow once I wrote that.. :))
> >> >>> >> Terje
> >> >>> >>
> >> >>> >> On Tue, Apr 26, 2011 at 12:44 AM, Sanjeev Kulkarni
> >> >>> >> 
> >> >>> >> wrote:
> >> >>> >>>
> >> >>> >>> Hi Sylvain,
> >> >>> >>> I started it from 0.7.4 with the patch 2376. No upgrade.
> >> >>> >>> Thanks!
> >> >>> >>>
> >> >>> >>> On Mon, Apr 25, 2

Re: 0.7.4 Bad sstables?

2011-04-25 Thread Sanjeev Kulkarni
Hi,
Thanks for pointing out the fix. My follow-up question: if I install 0.7.5,
will the problem go away with the current data?
Thanks!

On Mon, Apr 25, 2011 at 8:25 PM, Jonathan Ellis  wrote:

> Ah...  could be https://issues.apache.org/jira/browse/CASSANDRA-2349
> (fixed for 0.7.5)
>
> On Mon, Apr 25, 2011 at 9:47 PM, Sanjeev Kulkarni 
> wrote:
> > The only other interesting information is that the columns of these rows
> all
> > had some ttl attached to them. Not sure if that matters.
> > Thanks!
> >
> > On Mon, Apr 25, 2011 at 5:27 PM, Terje Marthinussen
> >  wrote:
> >>
> >> First column in the row has offset in the file of 190226525, last valid
> >> column is at 380293592, about 181MB from first column to last.
> >> in_memory_compaction_limit was 128MB, so almost certainly above the
> limit.
> >> Terje
> >>
> >> On Tue, Apr 26, 2011 at 8:53 AM, Terje Marthinussen
> >>  wrote:
> >>>
> >>> In my case, probably yes. From thw rows I have looked at, I think I
> have
> >>> only seen this on rows with 1 million plus columns/supercolumns.
> >>>
> >>> May very well been larger than in memory limit. I think the compacted
> row
> >>> I looked closer at was about 200MB and the in memory limit may have
> been
> >>> 256MB.
> >>>
> >>> I will see if we still got files around to verify.
> >>>
> >>> Regards,
> >>> Terje
> >>>
> >>> On 26 Apr 2011, at 02:08, Jonathan Ellis  wrote:
> >>>
> >>> > Was it on a "large" row?  (> in_memory_compaction_limit?)
> >>> >
> >>> > I'm starting to suspect that LazilyCompactedRow is computing row size
> >>> > incorrectly in some cases.
> >>> >
> >>> > On Mon, Apr 25, 2011 at 11:47 AM, Terje Marthinussen
> >>> >  wrote:
> >>> >> I have been hunting similar looking corruptions, especially in the
> >>> >> hints
> >>> >> column family, but I believe it occurs somewhere while  compacting.
> >>> >> I looked in greater detail on one sstable and the row length was
> >>> >> longer than
> >>> >> the actual data in the row, and as far as I could see, either the
> >>> >> length was
> >>> >> wrong or the row was missing data as there was was no extra data in
> >>> >> the row
> >>> >> after the last column.
> >>> >> This was however on a somewhat aging dataset, so suspected it could
> be
> >>> >> related to 2376.
> >>> >>
> >>> >> Playing around with 0.8 at the moment and not seen it there yet
> >>> >> (bet it
> >>> >> will show up tomorrow once I wrote that.. :))
> >>> >> Terje
> >>> >>
> >>> >> On Tue, Apr 26, 2011 at 12:44 AM, Sanjeev Kulkarni
> >>> >> 
> >>> >> wrote:
> >>> >>>
> >>> >>> Hi Sylvain,
> >>> >>> I started it from 0.7.4 with the patch 2376. No upgrade.
> >>> >>> Thanks!
> >>> >>>
> >>> >>> On Mon, Apr 25, 2011 at 7:48 AM, Sylvain Lebresne
> >>> >>> 
> >>> >>> wrote:
> >>> >>>>
> >>> >>>> Hi Sanjeev,
> >>> >>>>
> >>> >>>> What's the story of the cluster ? Did you started with 0.7.4, or
> is
> >>> >>>> it
> >>> >>>> upgraded from
> >>> >>>> some earlier version ?
> >>> >>>>
> >>> >>>> On Mon, Apr 25, 2011 at 5:54 AM, Sanjeev Kulkarni
> >>> >>>> 
> >>> >>>> wrote:
> >>> >>>>> Hey guys,
> >>> >>>>> Running a one node cassandra server with version 0.7.4 patched
> >>> >>>>> with https://issues.apache.org/jira/browse/CASSANDRA-2376
> >>> >>>>> The system was running fine for a couple of days when we started
> >>> >>>>> noticing
> >>> >>>>> something strange with cassandra. I stopped all applications and
> >>> >>>>> restarted
> >>> >>>>> cassandra. And then did a scrub. During scrub, I noticed these in
> >>> >&g

Re: 0.7.4 Bad sstables?

2011-04-25 Thread Sanjeev Kulkarni
The only other interesting information is that the columns of these rows all
had a TTL attached to them. Not sure if that matters.
Thanks!

On Mon, Apr 25, 2011 at 5:27 PM, Terje Marthinussen  wrote:

> First column in the row has offset in the file of 190226525, last valid
> column is at 380293592, about 181MB from first column to last.
>
> in_memory_compaction_limit was 128MB, so almost certainly above the limit.
>
> Terje
>
>
> On Tue, Apr 26, 2011 at 8:53 AM, Terje Marthinussen <
> tmarthinus...@gmail.com> wrote:
>
>> In my case, probably yes. From the rows I have looked at, I think I have
>> only seen this on rows with 1 million plus columns/supercolumns.
>>
>> May very well been larger than in memory limit. I think the compacted row
>> I looked closer at was about 200MB and the in memory limit may have been
>> 256MB.
>>
>> I will see if we still got files around to verify.
>>
>> Regards,
>> Terje
>>
>> On 26 Apr 2011, at 02:08, Jonathan Ellis  wrote:
>>
>> > Was it on a "large" row?  (> in_memory_compaction_limit?)
>> >
>> > I'm starting to suspect that LazilyCompactedRow is computing row size
>> > incorrectly in some cases.
>> >
>> > On Mon, Apr 25, 2011 at 11:47 AM, Terje Marthinussen
>> >  wrote:
>> >> I have been hunting similar looking corruptions, especially in the
>> hints
>> >> column family, but I believe it occurs somewhere while  compacting.
>> >> I looked in greater detail on one sstable and the row length was longer
>> than
>> >> the actual data in the row, and as far as I could see, either the
>> length was
>> >> wrong or the row was missing data as there was no extra data in the
>> row
>> >> after the last column.
>> >> This was however on a somewhat aging dataset, so suspected it could be
>> >> related to 2376.
>> >>
>> >> Playing around with 0.8 at the moment and not seen it there yet
>> (bet it
>> >> will show up tomorrow once I wrote that.. :))
>> >> Terje
>> >>
>> >> On Tue, Apr 26, 2011 at 12:44 AM, Sanjeev Kulkarni <
>> sanj...@locomatix.com>
>> >> wrote:
>> >>>
>> >>> Hi Sylvain,
>> >>> I started it from 0.7.4 with the patch 2376. No upgrade.
>> >>> Thanks!
>> >>>
>> >>> On Mon, Apr 25, 2011 at 7:48 AM, Sylvain Lebresne <
>> sylv...@datastax.com>
>> >>> wrote:
>> >>>>
>> >>>> Hi Sanjeev,
>> >>>>
>> >>>> What's the story of the cluster ? Did you started with 0.7.4, or is
>> it
>> >>>> upgraded from
>> >>>> some earlier version ?
>> >>>>
>> >>>> On Mon, Apr 25, 2011 at 5:54 AM, Sanjeev Kulkarni <
>> sanj...@locomatix.com>
>> >>>> wrote:
>> >>>>> Hey guys,
>> >>>>> Running a one node cassandra server with version 0.7.4 patched
>> >>>>> with https://issues.apache.org/jira/browse/CASSANDRA-2376
>> >>>>> The system was running fine for a couple of days when we started
>> >>>>> noticing
>> >>>>> something strange with cassandra. I stopped all applications and
>> >>>>> restarted
>> >>>>> cassandra. And then did a scrub. During scrub, I noticed these in
>> the
>> >>>>> logs
>> >>>>> WARN [CompactionExecutor:1] 2011-04-24 23:37:07,561
>> >>>>> CompactionManager.java
>> >>>>> (line 607) Non-fatal error reading row (stacktrace follows)
>> >>>>> java.io.IOError: java.io.IOException: Impossible row size
>> >>>>> 1516029079813320210
>> >>>>> at
>> >>>>>
>> >>>>>
>> org.apache.cassandra.db.CompactionManager.doScrub(CompactionManager.java:589)
>> >>>>> at
>> >>>>>
>> >>>>>
>> org.apache.cassandra.db.CompactionManager.access$600(CompactionManager.java:56)
>> >>>>>at
>> >>>>>
>> >>>>>
>> org.apache.cassandra.db.CompactionManager$3.call(CompactionManager.java:195)
>> >>>>> at
>> >>>>> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>> >>>>&

Re: 0.7.4 Bad sstables?

2011-04-25 Thread Sanjeev Kulkarni
I pepper my objects based on a hash, so without reading the row I can't
tell how big it is.
Thanks!

Sent from my iPhone

On Apr 25, 2011, at 10:08 AM, Jonathan Ellis  wrote:

> Was it on a "large" row?  (> in_memory_compaction_limit?)
>
> I'm starting to suspect that LazilyCompactedRow is computing row size
> incorrectly in some cases.
>
> On Mon, Apr 25, 2011 at 11:47 AM, Terje Marthinussen
>  wrote:
>> I have been hunting similar looking corruptions, especially in the hints
>> column family, but I believe it occurs somewhere while  compacting.
>> I looked in greater detail on one sstable and the row length was longer than
>> the actual data in the row, and as far as I could see, either the length was
>> wrong or the row was missing data as there was no extra data in the row
>> after the last column.
>> This was however on a somewhat aging dataset, so suspected it could be
>> related to 2376.
>>
>> Playing around with 0.8 at the moment and not seen it there yet (bet it
>> will show up tomorrow once I wrote that.. :))
>> Terje
>>
>> On Tue, Apr 26, 2011 at 12:44 AM, Sanjeev Kulkarni 
>> wrote:
>>>
>>> Hi Sylvain,
>>> I started it from 0.7.4 with the patch 2376. No upgrade.
>>> Thanks!
>>>
>>> On Mon, Apr 25, 2011 at 7:48 AM, Sylvain Lebresne 
>>> wrote:
>>>>
>>>> Hi Sanjeev,
>>>>
>>>> What's the story of the cluster ? Did you started with 0.7.4, or is it
>>>> upgraded from
>>>> some earlier version ?
>>>>
>>>> On Mon, Apr 25, 2011 at 5:54 AM, Sanjeev Kulkarni 
>>>> wrote:
>>>>> Hey guys,
>>>>> Running a one node cassandra server with version 0.7.4 patched
>>>>> with https://issues.apache.org/jira/browse/CASSANDRA-2376
>>>>> The system was running fine for a couple of days when we started
>>>>> noticing
>>>>> something strange with cassandra. I stopped all applications and
>>>>> restarted
>>>>> cassandra. And then did a scrub. During scrub, I noticed these in the
>>>>> logs
>>>>> WARN [CompactionExecutor:1] 2011-04-24 23:37:07,561
>>>>> CompactionManager.java
>>>>> (line 607) Non-fatal error reading row (stacktrace follows)
>>>>> java.io.IOError: java.io.IOException: Impossible row size
>>>>> 1516029079813320210
>>>>> at
>>>>>
>>>>> org.apache.cassandra.db.CompactionManager.doScrub(CompactionManager.java:589)
>>>>> at
>>>>>
>>>>> org.apache.cassandra.db.CompactionManager.access$600(CompactionManager.java:56)
>>>>>at
>>>>>
>>>>> org.apache.cassandra.db.CompactionManager$3.call(CompactionManager.java:195)
>>>>> at
>>>>> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>>>at
>>>>> java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>>> at
>>>>>
>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>>at
>>>>>
>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>>at java.lang.Thread.run(Thread.java:662)
>>>>> Caused by: java.io.IOException: Impossible row size 1516029079813320210
>>>>>... 8 more
>>>>>  INFO [CompactionExecutor:1] 2011-04-24 23:37:07,640
>>>>> CompactionManager.java
>>>>> (line 613) Retrying from row index; data is -1768177699 bytes starting
>>>>> at
>>>>> 2626524914
>>>>>  WARN [CompactionExecutor:1] 2011-04-24 23:37:07,641
>>>>> CompactionManager.java
>>>>> (line 633) Retry failed too.  Skipping to next row (retry's stacktrace
>>>>> follows)
>>>>> java.io.IOError: java.io.EOFException: bloom filter claims to be
>>>>> 1868982636
>>>>> bytes, longer than entire row size -1768177699at
>>>>>
>> >>>>> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:117)
>>>>> at
>>>>>
>>>>> org.apache.cassandra.db.CompactionManager.doScrub(CompactionManager.java:618)
>>>>>at
>>>>>
>>>>> org.apache.cassandra.db.CompactionManager.access$600(CompactionMana

Re: 0.7.4 Bad sstables?

2011-04-25 Thread Sanjeev Kulkarni
Hi Sylvain,
I started it from 0.7.4 with the patch 2376. No upgrade.
Thanks!

On Mon, Apr 25, 2011 at 7:48 AM, Sylvain Lebresne wrote:

> Hi Sanjeev,
>
> What's the story of the cluster ? Did you started with 0.7.4, or is it
> upgraded from
> some earlier version ?
>
> On Mon, Apr 25, 2011 at 5:54 AM, Sanjeev Kulkarni 
> wrote:
> > Hey guys,
> > Running a one node cassandra server with version 0.7.4 patched
> > with https://issues.apache.org/jira/browse/CASSANDRA-2376
> > The system was running fine for a couple of days when we started noticing
> > something strange with cassandra. I stopped all applications and
> restarted
> > cassandra. And then did a scrub. During scrub, I noticed these in the
> logs
> > WARN [CompactionExecutor:1] 2011-04-24 23:37:07,561
> CompactionManager.java
> > (line 607) Non-fatal error reading row (stacktrace follows)
> > java.io.IOError: java.io.IOException: Impossible row size
> > 1516029079813320210
> > at
> >
> org.apache.cassandra.db.CompactionManager.doScrub(CompactionManager.java:589)
> > at
> >
> org.apache.cassandra.db.CompactionManager.access$600(CompactionManager.java:56)
> >at
> >
> org.apache.cassandra.db.CompactionManager$3.call(CompactionManager.java:195)
> > at
> > java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>  at
> > java.util.concurrent.FutureTask.run(FutureTask.java:138)
> > at
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> >at
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> >at java.lang.Thread.run(Thread.java:662)
> > Caused by: java.io.IOException: Impossible row size 1516029079813320210
> >... 8 more
> >  INFO [CompactionExecutor:1] 2011-04-24 23:37:07,640
> CompactionManager.java
> > (line 613) Retrying from row index; data is -1768177699 bytes starting at
> > 2626524914
> >  WARN [CompactionExecutor:1] 2011-04-24 23:37:07,641
> CompactionManager.java
> > (line 633) Retry failed too.  Skipping to next row (retry's stacktrace
> > follows)
> > java.io.IOError: java.io.EOFException: bloom filter claims to be
> 1868982636
> > bytes, longer than entire row size -1768177699at
> >
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:117)
> > at
> >
> org.apache.cassandra.db.CompactionManager.doScrub(CompactionManager.java:618)
> >at
> >
> org.apache.cassandra.db.CompactionManager.access$600(CompactionManager.java:56)
> > at
> >
> org.apache.cassandra.db.CompactionManager$3.call(CompactionManager.java:195)
> >at
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> > at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> >  at
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > at
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> >at java.lang.Thread.run(Thread.java:662)
> > Caused by: java.io.EOFException: bloom filter claims to be 1868982636
> bytes,
> > longer than entire row size -1768177699at
> >
> org.apache.cassandra.io.sstable.IndexHelper.defreezeBloomFilter(IndexHelper.java:116)
> > at
> >
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:87)
> >... 8 more
> > WARN [CompactionExecutor:1] 2011-04-24 23:37:16,545
> CompactionManager.java
> > (line 607) Non-fatal error reading row (stacktrace follows)
> > java.io.IOError: java.io.EOFException
> > at
> >
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.next(SSTableIdentityIterator.java:144)
> > at
> >
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.next(SSTableIdentityIterator.java:40)
> > at
> >
> org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
> > at
> >
> org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
> > at
> >
> org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
> > at
> >
> org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:68)
> > at
> >
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
> > at
> >
> com.google.co

0.7.4 Bad sstables?

2011-04-24 Thread Sanjeev Kulkarni
Hey guys,
Running a one node cassandra server with version 0.7.4 patched with
https://issues.apache.org/jira/browse/CASSANDRA-2376

The system had been running fine for a couple of days when we started noticing
something strange with cassandra. I stopped all applications, restarted
cassandra, and then ran a scrub. During the scrub, I noticed these in the logs:

WARN [CompactionExecutor:1] 2011-04-24 23:37:07,561 CompactionManager.java (line 607) Non-fatal error reading row (stacktrace follows)
java.io.IOError: java.io.IOException: Impossible row size 1516029079813320210
    at org.apache.cassandra.db.CompactionManager.doScrub(CompactionManager.java:589)
    at org.apache.cassandra.db.CompactionManager.access$600(CompactionManager.java:56)
    at org.apache.cassandra.db.CompactionManager$3.call(CompactionManager.java:195)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Impossible row size 1516029079813320210
    ... 8 more
 INFO [CompactionExecutor:1] 2011-04-24 23:37:07,640 CompactionManager.java (line 613) Retrying from row index; data is -1768177699 bytes starting at 2626524914
 WARN [CompactionExecutor:1] 2011-04-24 23:37:07,641 CompactionManager.java (line 633) Retry failed too.  Skipping to next row (retry's stacktrace follows)
java.io.IOError: java.io.EOFException: bloom filter claims to be 1868982636 bytes, longer than entire row size -1768177699
    at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:117)
    at org.apache.cassandra.db.CompactionManager.doScrub(CompactionManager.java:618)
    at org.apache.cassandra.db.CompactionManager.access$600(CompactionManager.java:56)
    at org.apache.cassandra.db.CompactionManager$3.call(CompactionManager.java:195)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.EOFException: bloom filter claims to be 1868982636 bytes, longer than entire row size -1768177699
    at org.apache.cassandra.io.sstable.IndexHelper.defreezeBloomFilter(IndexHelper.java:116)
    at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:87)
    ... 8 more
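Those two sizes look like corrupt length fields rather than anything tunable:
1516029079813320210 bytes would be on the order of an exabyte, and the retry
from the row index reads a negative length (-1768177699). Just to illustrate
the kind of sanity check scrub is tripping over (a rough sketch, not the actual
Cassandra source; the file length below is a made-up placeholder):

public class ImpossibleRowSizeSketch {
    public static void main(String[] args) throws java.io.IOException {
        long dataSize = 1516029079813320210L; // row length field read from the sstable
        long fileLength = 2700000000L;        // placeholder: total size of the Data.db file

        // A row cannot be negative or longer than the file that contains it;
        // when this fails, scrub logs "Impossible row size ..." and falls back
        // to the row index, which in our case yields -1768177699 and fails too.
        if (dataSize < 0 || dataSize > fileLength) {
            throw new java.io.IOException("Impossible row size " + dataSize);
        }
    }
}

Scrub then moves on and hits the next bad row: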

WARN [CompactionExecutor:1] 2011-04-24 23:37:16,545 CompactionManager.java (line 607) Non-fatal error reading row (stacktrace follows)
java.io.IOError: java.io.EOFException
    at org.apache.cassandra.io.sstable.SSTableIdentityIterator.next(SSTableIdentityIterator.java:144)
    at org.apache.cassandra.io.sstable.SSTableIdentityIterator.next(SSTableIdentityIterator.java:40)
    at org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
    at org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
    at org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
    at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:68)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
    at com.google.common.collect.Iterators$7.computeNext(Iterators.java:604)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
    at org.apache.cassandra.db.ColumnIndexer.serializeInternal(ColumnIndexer.java:76)
    at org.apache.cassandra.db.ColumnIndexer.serialize(ColumnIndexer.java:50)
    at org.apache.cassandra.io.LazilyCompactedRow.<init>(LazilyCompactedRow.java:90)
    at org.apache.cassandra.db.CompactionManager.getCompactedRow(CompactionManager.java:778)
    at org.apache.cassandra.db.CompactionManager.doScrub(CompactionManager.java:591)
    at org.apache.cassandra.db.CompactionManager.access$600(CompactionManager.java:56)
    at org.apache.cassandra.db.CompactionManager$3.call(CompactionManager.java:195)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja

Re: Cassandra Crash upon restart from hard system crash

2011-03-25 Thread Sanjeev Kulkarni
Hey Jonathan,
Thanks for the response. I applied the patch to 0.7.4 and things have
started working again nicely.
It looks like this fix is going into 0.7.5. Any idea when 0.7.5 will be released?
Thanks again!

On Wed, Mar 23, 2011 at 9:56 PM, Jonathan Ellis  wrote:

> This looks like a bug
> (https://issues.apache.org/jira/browse/CASSANDRA-2376), but not one
> that would cause a crash. Actual process death is only caused by (a)
> running out of memory or (2) JVM bugs.
>
> On Wed, Mar 23, 2011 at 9:17 PM, Sanjeev Kulkarni 
> wrote:
> > Hey guys,
> > I have a one node system(with replication factor of 1) running cassandra.
> > The machine has two disks. One is used as the commitlog and the other as
> > cassandra's data directory. The node just had gotten unresponsive and had
> to
> > be hard rebooted.
> > After restart, cassandra started off fine. But when I run a process that
> > reads from it, after a while cassandra crashes. The log is attached.
> > This is a new 0.7.4 installation and not a upgrade. I have run nodetool
> > scrub and clean which ran fine. fsck on disks is also fine.
> > Any help would be appreciated.
> > ERROR [ReadStage:5] 2011-03-23 22:14:47,779 AbstractCassandraDaemon.java
> > (line 112) Fatal exception in thread Thread[ReadStage:5,5,main]
> > java.lang.AssertionError
> > at
> >
> org.apache.cassandra.db.columniterator.SSTableNamesIterator.readIndexedColumns(SSTableNamesIterator.java:176)
> > at
> >
> org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:130)
> > at
> >
> org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:72)
> > at
> >
> org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:58)
> > at
> >
> org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:80)
> > at
> >
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1353)
> > at
> >
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1245)
> > at
> >
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1173)
> > at org.apache.cassandra.db.Table.getRow(Table.java:333)
> > at
> >
> org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:60)
> > at
> >
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:453)
> > at
> > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> > at
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > at
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> > at java.lang.Thread.run(Thread.java:662)
> > ERROR [ReadStage:2] 2011-03-23 22:14:47,781 AbstractCassandraDaemon.java
> > (line 112) Fatal exception in thread Thread[ReadStage:2,5,main]
> > java.lang.AssertionError
> > at
> >
> org.apache.cassandra.db.columniterator.SSTableNamesIterator.readIndexedColumns(SSTableNamesIterator.java:176)
> > at
> >
> org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:130)
> > at
> >
> org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:72)
> > at
> >
> org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:58)
> > at
> >
> org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:80)
> > at
> >
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1353)
> > at
> >
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1245)
> > at
> >
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1173)
> > at org.apache.cassandra.db.Table.getRow(Table.java:333)
> > at
> >
> org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:60)
> > at
> >
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:453)
> > at
> > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
> > at
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > at
> >
>

Cassandra Crash upon restart from hard system crash

2011-03-23 Thread Sanjeev Kulkarni
Hey guys,
I have a one-node system (with a replication factor of 1) running cassandra.
The machine has two disks: one is used for the commitlog and the other as
cassandra's data directory. The node had become unresponsive and had to
be hard rebooted.
After the restart, cassandra started off fine, but when I run a process that
reads from it, cassandra crashes after a while. The log is attached.
This is a fresh 0.7.4 installation, not an upgrade. I have run nodetool
scrub and clean, which ran fine. fsck on the disks is also fine.
Any help would be appreciated.

ERROR [ReadStage:5] 2011-03-23 22:14:47,779 AbstractCassandraDaemon.java (line 112) Fatal exception in thread Thread[ReadStage:5,5,main]
java.lang.AssertionError
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.readIndexedColumns(SSTableNamesIterator.java:176)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:130)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:72)
    at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:58)
    at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:80)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1353)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1245)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1173)
    at org.apache.cassandra.db.Table.getRow(Table.java:333)
    at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:60)
    at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:453)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
ERROR [ReadStage:2] 2011-03-23 22:14:47,781 AbstractCassandraDaemon.java (line 112) Fatal exception in thread Thread[ReadStage:2,5,main]
java.lang.AssertionError
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.readIndexedColumns(SSTableNamesIterator.java:176)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:130)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:72)
    at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:58)
    at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:80)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1353)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1245)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1173)
    at org.apache.cassandra.db.Table.getRow(Table.java:333)
    at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:60)
    at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:453)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
ERROR [ReadStage:28] 2011-03-23 22:14:47,779 AbstractCassandraDaemon.java (line 112) Fatal exception in thread Thread[ReadStage:28,5,main]
java.lang.AssertionError
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.readIndexedColumns(SSTableNamesIterator.java:176)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:130)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:72)
    at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:58)
    at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:80)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1353)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1245)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1173)
    at org.apache.cassandra.db.Table.getRow(Table.java:333)
    at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:60)
    at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:453)

Re: Cassandra Crash

2011-03-15 Thread Sanjeev Kulkarni
Hey Jonathan,
Thanks for the reply.
I was earlier running 0.7.2 and upgraded to 0.7.3. It looks like I had to
run the nodetool scrub command to sanitize the sstables because of the
bloom filter bug. I did that and the assertion error went away, but now I'm
getting a Java heap space OutOfMemory error. I upgraded again, to the
just-released 0.7.4, but the OOM crash remains.
The log is attached below. Row caching is disabled and key caching is set to
the default (20). The max heap space that I'm giving is pretty large (6G). Do
you think reducing the key caching will help?
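In case it helps with the experiment, the sketch below is roughly how keys_cached
could be lowered through the 0.7 Thrift schema calls (the keyspace, column family
name and cache value are placeholders, and I am not sure it will change the OOM
if the heap is really being eaten by that one big row):

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.CfDef;
import org.apache.cassandra.thrift.KsDef;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class ShrinkKeyCacheSketch {
    public static void main(String[] args) throws Exception {
        TFramedTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        client.set_keyspace("Keyspace1");                        // placeholder keyspace

        // Fetch the current definition so only the key cache setting changes.
        KsDef ks = client.describe_keyspace("Keyspace1");
        for (CfDef cf : ks.getCf_defs()) {
            if (cf.getName().equals("MyColumnFamily")) {         // placeholder column family
                cf.setKey_cache_size(10000);                     // placeholder: far below the default
                client.system_update_column_family(cf);
            }
        }
        transport.close();
    }
}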
Thanks again!

java.lang.OutOfMemoryError: Java heap space
    at org.apache.cassandra.io.util.BufferedRandomAccessFile.readBytes(BufferedRandomAccessFile.java:269)
    at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:315)
    at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:272)
    at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:76)
    at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.readIndexedColumns(SSTableNamesIterator.java:180)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:130)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:72)
    at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:58)
    at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:80)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1353)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1245)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1173)
    at org.apache.cassandra.db.Table.getRow(Table.java:333)
    at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:60)
    at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:453)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)



On Tue, Mar 15, 2011 at 2:27 PM, Jonathan Ellis  wrote:

> Did you upgrade from an earlier version?  Did you read NEWS.txt?
>
> On Tue, Mar 15, 2011 at 4:21 PM, Sanjeev Kulkarni 
> wrote:
> > Hey guys,
> > Have started facing a crash in my cassandra while reading. Here are the
> > details.
> > 1. single node. replication factor of 1
> > 2. Cassandra version 0.7.3
> > 3. Single keyspace. 5 column families.
> > 4. No super columns
> > 5. My data model is a little bit skewed. It results in having several
> small
> > rows and one really big row(lots of columns). cfstats say that on one
> column
> > family the Compacted row maximum size is 5960319812. Not sure if this is
> the
> > problem.
> > 6. Starting cassandra has no issues. I give a max heap size of 6G.
> > 7. I then start reading a bunch of rows including the really long row.
> After
> > some point cassandra starts crashing.
> > The log is
> > ERROR [ReadStage:30] 2011-03-15 16:52:27,598 AbstractCassandraDaemon.java
> > (line 114) Fatal exception in thread Thread[ReadStage:30,5,main]
> > java.lang.AssertionError
> > at
> >
> org.apache.cassandra.db.columniterator.SSTableNamesIterator.readIndexedColumns(SSTableNamesIterator.java:180)
> > at
> >
> org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:134)
> > at
> >
> org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:72)
> > at
> >
> org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:59)
> > at
> >
> org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:80)
> > at
> >
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1311)
> > at
> >
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1203)
> > at
> >
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1131)
> > at org.apache.cassandra.db.Table.getRow(Table.java:333)
> > at
> >
> org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:60)

Cassandra Crash

2011-03-15 Thread Sanjeev Kulkarni
Hey guys,
I have started facing a crash in cassandra while reading. Here are the
details.
1. Single node, replication factor of 1.
2. Cassandra version 0.7.3.
3. Single keyspace, 5 column families.
4. No super columns.
5. My data model is a little skewed: it results in several small rows and one
really big row (lots of columns). cfstats says that on one column family the
Compacted row maximum size is 5960319812 bytes. Not sure if this is the
problem.
6. Starting cassandra has no issues. I give it a max heap size of 6G.
7. I then start reading a bunch of rows, including the really long row (a
rough paging sketch is at the end of this message). After some point
cassandra starts crashing.
The log is:
ERROR [ReadStage:30] 2011-03-15 16:52:27,598 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[ReadStage:30,5,main]
java.lang.AssertionError
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.readIndexedColumns(SSTableNamesIterator.java:180)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:134)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:72)
    at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:59)
    at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:80)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1311)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1203)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1131)
    at org.apache.cassandra.db.Table.getRow(Table.java:333)
    at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:60)
    at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:453)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
ERROR [ReadStage:13] 2011-03-15 16:52:27,600 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[ReadStage:13,5,main]
java.lang.AssertionError
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.readIndexedColumns(SSTableNamesIterator.java:180)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:134)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:72)
    at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:59)
    at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:80)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1311)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1203)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1131)
    at org.apache.cassandra.db.Table.getRow(Table.java:333)
    at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:60)
    at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:453)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

and several more of these errors

Thanks for the help. Let me know if you need more info.
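
P.S. Regarding points 5 and 7: one way to avoid asking the server for the whole
multi-GB row in a single request is to page through it with get_slice, roughly
like the sketch below (0.7 Thrift API; the host, keyspace, column family, row key
and page size are placeholders, not my real schema):

import java.nio.ByteBuffer;
import java.util.List;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class WideRowPagerSketch {
    public static void main(String[] args) throws Exception {
        TFramedTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        client.set_keyspace("Keyspace1");                          // placeholder keyspace

        ByteBuffer key = ByteBuffer.wrap("the-big-row".getBytes("UTF-8")); // placeholder key
        ColumnParent parent = new ColumnParent("MyColumnFamily");          // placeholder CF
        ByteBuffer start = ByteBuffer.wrap(new byte[0]);           // empty = start of the row
        ByteBuffer finish = ByteBuffer.wrap(new byte[0]);          // empty = end of the row
        int pageSize = 1000;                                       // placeholder page size

        while (true) {
            SliceRange range = new SliceRange(start, finish, false, pageSize);
            SlicePredicate predicate = new SlicePredicate();
            predicate.setSlice_range(range);
            List<ColumnOrSuperColumn> page =
                    client.get_slice(key, parent, predicate, ConsistencyLevel.ONE);

            for (ColumnOrSuperColumn cosc : page) {
                // process cosc.getColumn() here; only pageSize columns are held at a time
            }

            if (page.size() < pageSize) {
                break;                                             // reached the end of the row
            }
            // Start the next page at the last column seen; that column is returned again
            // as the first element of the next page, so skip it or tolerate the duplicate.
            start = page.get(page.size() - 1).getColumn().name;
        }
        transport.close();
    }
}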