Re: What is the correct way of changing a partitioner?

2010-10-18 Thread Dan Washusen
http://wiki.apache.org/cassandra/DistributedDeletes

From the http://wiki.apache.org/cassandra/StorageConfiguration page:

> Achtung! Changing this parameter requires wiping your data directories,
> since the partitioner can modify the sstable on-disk format.


So delete your data and commit log dirs after shutting down Cassandra...
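The steps above can be sketched as a small script. This is an illustration, not a Cassandra tool: the directory names are placeholders for your storage-conf.xml's DataFileDirectory and CommitLogDirectory values, and it must only ever run with Cassandra stopped. The demo deliberately targets throwaway temp directories rather than a live install:

```python
import shutil
import tempfile
from pathlib import Path

def wipe_cassandra_state(data_dirs, commitlog_dir):
    """Delete sstable and commit log directories, then recreate them empty."""
    for d in [*data_dirs, commitlog_dir]:
        p = Path(d)
        if p.exists():
            shutil.rmtree(p)  # removes the system keyspace too, which is the point
        p.mkdir(parents=True, exist_ok=True)  # empty dirs, ready for restart

# Demo against throwaway directories rather than a real data directory:
root = Path(tempfile.mkdtemp())
data, clog = root / "data", root / "commitlog"
data.mkdir(), clog.mkdir()
(data / "Keyspace1-1-Data.db").write_text("old sstable bytes")
wipe_cassandra_state([data], clog)
print(list(data.iterdir()))  # -> []
```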

On Tue, Oct 19, 2010 at 4:09 PM, Wicked J  wrote:

> Hi,
> I deleted all the data (programmatically). Then I changed the partitioner
> from RandomPartitioner to OrderPreservingPartitioner, and when I started
> Cassandra I got the following error. What is the correct way to change
> the partitioner, and how can I get past this error?
>
> ERROR 17:28:28,985 Fatal exception during initialization
> java.io.IOException: Found system table files, but they couldn't be loaded.
> Did you change the partitioner?
> at
> org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:154)
> at
> org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:94)
> at
> org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:211)
>
> Thanks!


RE: Preventing an update of a CF row

2010-10-18 Thread Viktor Jevdokimov
Nice and simple!

-Original Message-
From: Oleg Anastasyev [mailto:olega...@gmail.com] 
Sent: Tuesday, October 19, 2010 9:00 AM
To: user@cassandra.apache.org
Subject: Re: Preventing an update of a CF row

kannan chandrasekaran  yahoo.com> writes:

> Hi All,
> I have a query regarding the insert operation. By default, an insert adds a
> new row or updates an existing row. Is it possible to prevent updates and
> allow only inserts (especially when multiple clients are writing to
> Cassandra)? I was wondering if there is any flag in Cassandra that will
> enforce this for me automatically (something like a unique key constraint)?
> If not, is it non-trivial to implement? Any suggestions would be helpful.
>
> Thanks,
> Kannan

Always specify the same constant value for the timestamp. Only the first
insertion with that timestamp will succeed; later writes will be ignored,
because Cassandra will consider them duplicates.




Re: Preventing an update of a CF row

2010-10-18 Thread Oleg Anastasyev
kannan chandrasekaran  yahoo.com> writes:

> Hi All,
> I have a query regarding the insert operation. By default, an insert adds a
> new row or updates an existing row. Is it possible to prevent updates and
> allow only inserts (especially when multiple clients are writing to
> Cassandra)? I was wondering if there is any flag in Cassandra that will
> enforce this for me automatically (something like a unique key constraint)?
> If not, is it non-trivial to implement? Any suggestions would be helpful.
>
> Thanks,
> Kannan

Always specify the same constant value for the timestamp. Only the first
insertion with that timestamp will succeed; later writes will be ignored,
because Cassandra will consider them duplicates.
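A toy in-process model of the reconciliation rule being relied on here (this is a sketch, not a Cassandra client; note that real Cassandra breaks an exact timestamp tie by comparing the column values, so the "ignored" behavior below is the simplified version described above):

```python
# Simplified column reconciliation: the higher timestamp wins; on a tie
# the existing column is kept, so with a constant timestamp only the
# first insert "sticks" (real Cassandra tie-breaks on the value instead).
CONSTANT_TS = 0

class Column:
    def __init__(self, value, timestamp):
        self.value, self.timestamp = value, timestamp

def reconcile(existing, incoming):
    return incoming if incoming.timestamp > existing.timestamp else existing

store = {}

def insert(key, value, ts=CONSTANT_TS):
    col = Column(value, ts)
    store[key] = reconcile(store[key], col) if key in store else col

insert("user:42", "first")
insert("user:42", "second")      # same timestamp: treated as a duplicate
print(store["user:42"].value)    # -> first
insert("user:42", "third", ts=1) # a higher timestamp still overwrites
print(store["user:42"].value)    # -> third
```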



What is the correct way of changing a partitioner?

2010-10-18 Thread Wicked J
Hi,
I deleted all the data (programmatically). Then I changed the partitioner
from RandomPartitioner to OrderPreservingPartitioner, and when I started
Cassandra I got the following error. What is the correct way to change
the partitioner, and how can I get past this error?

ERROR 17:28:28,985 Fatal exception during initialization
java.io.IOException: Found system table files, but they couldn't be loaded.
Did you change the partitioner?
at
org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:154)
at
org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:94)
at
org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:211)

Thanks!


Re: KeysCached - clarification and recommendations on settings

2010-10-18 Thread Aaron Morton
The lack of swearing did seem odd.

Aaron

On 19 Oct, 2010, at 12:41 PM, Brandon Williams wrote:

On Mon, Oct 18, 2010 at 3:21 PM, Aaron Morton wrote:

> Also have a read of this slide deck from Ben Black, caches are talked about
> from slide 35 on
> http://www.slideshare.net/driftx/cassandra-summit-2010-performance-tuning
>
> There is also a video linked here
> http://www.riptano.com/blog/slides-and-videos-cassandra-summit-2010

Ben's decks are cooler.  That one is mine :)

-Brandon


Re: Thrift version

2010-10-18 Thread Jeremy Hanna
Check the source distro's lib directory - 
http://svn.apache.org/repos/asf/cassandra/tags/cassandra-0.6.6/lib/ - it has 
libthrift-r917130.jar, so it's a build of Thrift at revision 917130.  There 
has been talk of stabilizing on a release of Thrift, but that's not yet 
established.  Generally, you shouldn't need to build the Thrift 
bindings yourself.  However, if you do, see 
http://wiki.apache.org/cassandra/InstallThrift

On Oct 18, 2010, at 9:50 PM, JKnight JKnight wrote:

> Dear all, 
> 
> Which Thrift version is Cassandra 0.6.6 using?
> Thanks a lot for your support.
> 
> -- 
> Best regards,
> JKnight



Thrift version

2010-10-18 Thread JKnight JKnight
Dear all,

Which Thrift version is Cassandra 0.6.6 using?
Thanks a lot for your support.

-- 
Best regards,
JKnight


Re: KeysCached - clarification and recommendations on settings

2010-10-18 Thread Brandon Williams
On Mon, Oct 18, 2010 at 3:21 PM, Aaron Morton wrote:

> Also have a read of this slide deck from Ben Black, caches are talked about
> from slide 35 on
>
> http://www.slideshare.net/driftx/cassandra-summit-2010-performance-tuning
>
> There is also a video linked here
> http://www.riptano.com/blog/slides-and-videos-cassandra-summit-2010
>

Ben's decks are cooler.  That one is mine :)

-Brandon


Re: KeysCached - clarification and recommendations on settings

2010-10-18 Thread Peter Schuller
> Is your entire keyset active? If not, set a sane starting point (the default
> for the key cache is 200,000,
> http://wiki.apache.org/cassandra/StorageConfiguration ) and see what the
> cache hit rates are like. How many keys do you have? What was your hit rate
> with 100% key cache?

Also, keep in mind that the key cache will only eliminate one seek
(finding row position in the index is exactly one seek, unless cached
by the OS). Even if you dedicate your entire memory to JVM heap and
fill it with key cache, you will never do better than avoiding the
*one* seek per read. If your entire memory is wasted on key cache,
you'll take the row seek anyway so you only eliminated at most half
the overhead.

In the best case, the row is cached and the key saved you from going
to disk. In such a case, the key cache gave you quite a lot. But keep
in mind that if your data size is such that most row reads are cached
by the OS, then probably most index accesses would be too assuming the
rows are significantly bigger than the index (which is normal).

I'd say the key cache is most effective when your active set is small
enough that a reasonably sized key cache will eliminate the majority
of seeks on reads without blowing away significant amounts of memory.
Especially now in 0.7 (is it backported to 0.6.x?) where the key cache
is efficiently saved and re-loaded on start, giving you guaranteed
hotness of the key cache. Also the bigger discrepancy between row size
and key size, the more useful I would expect the key cache to be
(i.e., the fatter the rows, the more useful the key cache).

Ok, that was a bit unclearly stated... I'm not sure how to phrase it
sensibly. I guess the bottom line is that unless you specifically know
you need to and there are special circumstances, the key cache should
likely not be huge in comparison to available memory.
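The one-seek arithmetic above can be written out as a tiny illustrative model (a sketch, not anything from Cassandra; it ignores multi-sstable reads and treats each uncached access as exactly one seek, per the assumption above):

```python
# Model: an uncached read = 1 index seek + 1 row seek. A key-cache hit
# removes only the index seek; the row seek remains unless the row
# itself is cached (by the row cache or the OS page cache).
def avg_seeks_per_read(key_cache_hit_rate, row_cache_hit_rate=0.0):
    index_seeks = 1.0 - key_cache_hit_rate  # key cache skips the index lookup seek
    row_seeks = 1.0 - row_cache_hit_rate    # data still has to come off disk
    return index_seeks + row_seeks

print(avg_seeks_per_read(0.0))  # -> 2.0 (no caching at all)
print(avg_seeks_per_read(1.0))  # -> 1.0 (perfect key cache: half the seeks, never better)
```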

-- 
/ Peter Schuller


Re: Read Latency

2010-10-18 Thread Jonathan Ellis
v3 attached to fix assertion failure.

Those last seconds after "sending to client" aren't fixable except
perhaps by using a faster client language.  That is, the client is
probably doing read-and-deserialize as one step rather than two, so
being slow in the deserialize part means the server sits there
waiting to send the rest of a large response.  But while that's
happening it uses negligible resources on the server side, so it's
not the end of the world for a multithreaded app.

On Mon, Oct 18, 2010 at 2:37 PM, Wayne  wrote:
> It is much faster again, but the last 3 seconds still linger. There are a lot
> more "logout complete" messages being logged all of the time, but the one
> below matches the time the client reports it takes to get the data back. It
> also sometimes produces errors; an example is listed below.
>
> DEBUG [pool-1-thread-64] 2010-10-18 19:25:28,866 CassandraServer.java (line
> 219) get_slice
> DEBUG [pool-1-thread-64] 2010-10-18 19:25:28,867 StorageProxy.java (line
> 471) strongread reading data for SliceFromReadCommand(table='table',
> key='key1', column_parent='QueryPath(columnFamilyName='fact',
> superColumnName='null', columnName='null')', start='503a', finish='503a7c',
> reversed=false, count=1000) from 698@/x.x.x.6
> DEBUG [pool-1-thread-64] 2010-10-18 19:25:28,867 StorageProxy.java (line
> 471) strongread reading digest for SliceFromReadCommand(table='table',
> key='key1', column_parent='QueryPath(columnFamilyName='fact',
> superColumnName='null', columnName='null')', start='503a', finish='503a7c',
> reversed=false, count=1000) from 699@/x.x.x.7
> DEBUG [pool-1-thread-64] 2010-10-18 19:25:28,867 StorageProxy.java (line
> 471) strongread reading digest for SliceFromReadCommand(table='table',
> key='key1', column_parent='QueryPath(columnFamilyName='fact',
> superColumnName='null', columnName='null')', start='503a', finish='503a7c',
> reversed=false, count=1000) from 699@/x.x.x.8
> DEBUG [pool-1-thread-63] 2010-10-18 19:25:29,410 CassandraServer.java (line
> 667) logout complete
> DEBUG [RESPONSE-STAGE:5] 2010-10-18 19:25:30,864 ResponseVerbHandler.java
> (line 42) Processing response on a callback from
> 0952ED39-07CE-7971-8F06-0D611FCB5F34@/x.x.x.6
> DEBUG [RESPONSE-STAGE:6] 2010-10-18 19:25:31,449 ResponseVerbHandler.java
> (line 42) Processing response on a callback from
> 0952ED39-07CE-7971-8F06-0D611FCB5F34@/x.x.x.8
> DEBUG [pool-1-thread-64] 2010-10-18 19:25:31,449 ReadResponseResolver.java
> (line 71) resolving 2 responses
> DEBUG [pool-1-thread-64] 2010-10-18 19:25:31,608 ReadResponseResolver.java
> (line 116) digests verified
> DEBUG [pool-1-thread-64] 2010-10-18 19:25:31,609 ReadResponseResolver.java
> (line 133) resolve: 160 ms.
> DEBUG [pool-1-thread-64] 2010-10-18 19:25:31,609 StorageProxy.java (line
> 494) quorumResponseHandler: 2742 ms.
> DEBUG [pool-1-thread-64] 2010-10-18 19:25:31,667 CassandraServer.java (line
> 191) Slice converted to thrift; sending to client
> DEBUG [pool-1-thread-63] 2010-10-18 19:25:32,044 CassandraServer.java (line
> 667) logout complete
> DEBUG [pool-1-thread-63] 2010-10-18 19:25:32,238 CassandraServer.java (line
> 667) logout complete
> DEBUG [pool-1-thread-63] 2010-10-18 19:25:32,377 CassandraServer.java (line
> 667) logout complete
> DEBUG [pool-1-thread-63] 2010-10-18 19:25:33,890 CassandraServer.java (line
> 667) logout complete
> DEBUG [pool-1-thread-63] 2010-10-18 19:25:34,401 CassandraServer.java (line
> 667) logout complete
> DEBUG [pool-1-thread-63] 2010-10-18 19:25:34,538 CassandraServer.java (line
> 667) logout complete
> DEBUG [pool-1-thread-64] 2010-10-18 19:25:34,925 CassandraServer.java (line
> 667) logout complete
>
>
>
> Error:
>
> DEBUG [RESPONSE-STAGE:4] 2010-10-18 19:22:48,355 ResponseVerbHandler.java
> (line 42) Processing response on a callback from
> DDF1D615-BF1E-2CD1-34C6-96FA47D63259@/x.x.x.8
> ERROR [RESPONSE-STAGE:4] 2010-10-18 19:22:48,356 CassandraDaemon.java (line
> 87) Uncaught exception in thread Thread[RESPONSE-STAGE:4,5,main]
> java.lang.AssertionError
>     at
> org.apache.cassandra.service.ReadResponseResolver.getResult(ReadResponseResolver.java:218)
>     at
> org.apache.cassandra.service.ReadResponseResolver.isDataPresent(ReadResponseResolver.java:209)
>     at
> org.apache.cassandra.service.QuorumResponseHandler.response(QuorumResponseHandler.java:93)
>     at
> org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:44)
>     at
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:49)
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:619)
> DEBUG [pool-1-thread-33] 2010-10-18 19:22:48,380 ReadResponseResolver.java
> (line 71) resolving 2 responses
> DEBUG [pool-1-thread-33] 2010-10-18 19:22:48,542 ReadResponseResolver.java
> (line 116) digests 


Re: KeysCached - clarification and recommendations on settings

2010-10-18 Thread Aaron Morton
Also have a read of this slide deck from Ben Black; caches are talked about
from slide 35 on:
http://www.slideshare.net/driftx/cassandra-summit-2010-performance-tuning

There is also a video linked here:
http://www.riptano.com/blog/slides-and-videos-cassandra-summit-2010

Aaron

On 19 Oct, 2010, at 09:10 AM, Aaron Morton wrote:

AFAIK the caches do not expire entries under memory pressure; they just hold
the number of entries you specify, or are unbounded in the case of 100%. The
caches are based on the ConcurrentLinkedHashMap
(http://code.google.com/p/concurrentlinkedhashmap/) and use the SecondChance
eviction policy.

"to keep all keys in memory on all boxes in the cluster" - Not sure exactly
what you are saying, but only the keys and rows stored on a node are kept in
that node's caches.

Is your entire keyset active? If not, set a sane starting point (the default
for the key cache is 200,000,
http://wiki.apache.org/cassandra/StorageConfiguration ) and see what the
cache hit rates are like. How many keys do you have? What was your hit rate
with 100% key cache?

There is a discussion here about cache and heap sizing:
http://www.mail-archive.com/user@cassandra.apache.org/msg04704.html

With 16GB on the box, a 3GB heap seems a bit small.

The cache settings are an easy way to shoot yourself in the foot. The best
approach is to start small and increase as needed. Using 100% will mean your
shoe budget will soon be cut in half :)

Hope that helps.

Aaron

On 19 Oct, 2010, at 02:02 AM, Jedd Rashbrooke wrote:

 Greetings,

 I would like to check my understanding is accurate on how
 KeysCached is understood by Cassandra (0.6.5), and then
 get some suggestions on settings / OS FS cache interplay.

 First - my initial understanding was that if you set KeysCached
 to 100%, Cassandra would do a best effort to keep all keys
 in memory on all boxes in the cluster - but it would let some
 expire if it was going to run out of memory.  *Now* I understand
 that it actually just exhausts all its memory and falls over if you
 set this value unrealistically high.  Is this right?

 Assuming this is right ... I'm curious on the optimum settings
 for KeysCached.  Previously we've had problems here with
 boxes doing massive STW GC's, among other things - and
 we've been working towards a configuration that is a bit more
 stable and predictable, if not as blazingly fast as a brand new
 cluster unburdened by much data but with a configuration that
 means it'll drive itself into oblivion after several days.  ;)

 With that in mind, we're trying to keep JVM heap small - about
 3GB is our sweet spot so far - after experimenting with numbers
 as big as 30GB.   Even at 30GB we'd be unable to have 100%
 KeysCached anyway - so it was always going to require a
 tradeoff - and now we're trying to guess at the best number.

 We're using boxes with 16GB, and have an expectation that
 the OS will cache a lot of this stuff pretty intelligently - but I
 fear that compactions and other activities will mean that the
 keys won't always have priority for FS cache.  Short of catting
 the index files to /dev/null periodically (not a serious suggestion)
 has anyone gleaned some insights on the best tradeoffs for
 KeysCached where the raw size of your Keys is going to be
 at least 10x the size of your available memory?  Do you go
 small and hope the OS saves you, or do you try to go as big
 (but finite) as your JVM will usefully let you and hope
 Cassandra caches the best set of keys for your usage profile?

 taa,
 Jedd.


Re: TimeUUID makes me crazy

2010-10-18 Thread Aaron Morton
What's the call stack for the error, and what client are you using? Is the
error client side or server side? Is there an error in the server-side log?

My guess is there is something wrong in the way you are creating the UUID for
the column and super column names, rather than for the key. You can use a UUID
for the key as a string (or byte array in 0.7), so a badly formatted value
will not matter.

Aaron

On 19 Oct, 2010, at 05:25 AM, "cbert...@libero.it" wrote:

I am going crazy using TimeUUID in Cassandra via Java. I've read the FAQ but
it didn't help.
Can I use a TimeUUID as a ROW identifier (if converted to a string)?

I have a CF and an SCF like these:


TIMEUUID OPECID (ROW) {
 phone: 123
 address: street xyz
}


CompareSubcolumnsWith="BytesType" />
String USERID (ROW) {
TIMEUUID OPECID (SuperColumnName)  {
collection of columns;
 }
}

In one situation the TimeUUID is a ROW identifier, while in the other it is
the SuperColumn name. I get many "UUID must be a 16 byte" errors when I try to
read data that did not raise any exception when it was saved.

At time T0 this one works: mutator.writeColumns(UuidHelper.timeUuidFromBytes
(OpecID).toString(), opecfamily, notNull); // (notNull contains a list of 
columns, including opecstatus)

Immediately afterwards, this one raises an exception: selector.getColumnFromRow
(UuidHelper.timeUuidFromBytes(OpecID).toString(), opecfamily, "opecstatus", 
ConsistencyLevel.ONE)

I hope someone can help me understand it.



Re: KeysCached - clarification and recommendations on settings

2010-10-18 Thread Aaron Morton
AFAIK the caches do not expire entries under memory pressure; they just hold
the number of entries you specify, or are unbounded in the case of 100%. The
caches are based on the ConcurrentLinkedHashMap
(http://code.google.com/p/concurrentlinkedhashmap/) and use the SecondChance
eviction policy.

"to keep all keys in memory on all boxes in the cluster" - Not sure exactly
what you are saying, but only the keys and rows stored on a node are kept in
that node's caches.

Is your entire keyset active? If not, set a sane starting point (the default
for the key cache is 200,000,
http://wiki.apache.org/cassandra/StorageConfiguration ) and see what the
cache hit rates are like. How many keys do you have? What was your hit rate
with 100% key cache?

There is a discussion here about cache and heap sizing:
http://www.mail-archive.com/user@cassandra.apache.org/msg04704.html

With 16GB on the box, a 3GB heap seems a bit small.

The cache settings are an easy way to shoot yourself in the foot. The best
approach is to start small and increase as needed. Using 100% will mean your
shoe budget will soon be cut in half :)

Hope that helps.

Aaron

On 19 Oct, 2010, at 02:02 AM, Jedd Rashbrooke wrote:

 Greetings,

 I would like to check my understanding is accurate on how
 KeysCached is understood by Cassandra (0.6.5), and then
 get some suggestions on settings / OS FS cache interplay.

 First - my initial understanding was that if you set KeysCached
 to 100%, Cassandra would do a best effort to keep all keys
 in memory on all boxes in the cluster - but it would let some
 expire if it was going to run out of memory.  *Now* I understand
 that it actually just exhausts all its memory and falls over if you
 set this value unrealistically high.  Is this right?

 Assuming this is right ... I'm curious on the optimum settings
 for KeysCached.  Previously we've had problems here with
 boxes doing massive STW GC's, among other things - and
 we've been working towards a configuration that is a bit more
 stable and predictable, if not as blazingly fast as a brand new
 cluster unburdened by much data but with a configuration that
 means it'll drive itself into oblivion after several days.  ;)

 With that in mind, we're trying to keep JVM heap small - about
 3GB is our sweet spot so far - after experimenting with numbers
 as big as 30GB.   Even at 30GB we'd be unable to have 100%
 KeysCached anyway - so it was always going to require a
 tradeoff - and now we're trying to guess at the best number.

 We're using boxes with 16GB, and have an expectation that
 the OS will cache a lot of this stuff pretty intelligently - but I
 fear that compactions and other activities will mean that the
 keys won't always have priority for FS cache.  Short of catting
 the index files to /dev/null periodically (not a serious suggestion)
 has anyone gleaned some insights on the best tradeoffs for
 KeysCached where the raw size of your Keys is going to be
 at least 10x the size of your available memory?  Do you go
 small and hope the OS saves you, or do you try to go as big
 (but finite) as your JVM will usefully let you and hope
 Cassandra caches the best set of keys for your usage profile?

 taa,
 Jedd.


Re: Read Latency

2010-10-18 Thread Wayne
It is much faster again, but the last 3 seconds still linger. There are a lot
more "logout complete" messages being logged all of the time, but the one
below matches the time the client reports it takes to get the data back. It
also sometimes produces errors; an example is listed below.

DEBUG [pool-1-thread-64] 2010-10-18 19:25:28,866 CassandraServer.java (line
219) get_slice
DEBUG [pool-1-thread-64] 2010-10-18 19:25:28,867 StorageProxy.java (line
471) strongread reading data for SliceFromReadCommand(table='table',
key='key1', column_parent='QueryPath(columnFamilyName='fact',
superColumnName='null', columnName='null')', start='503a', finish='503a7c',
reversed=false, count=1000) from 698@/x.x.x.6
DEBUG [pool-1-thread-64] 2010-10-18 19:25:28,867 StorageProxy.java (line
471) strongread reading digest for SliceFromReadCommand(table='table',
key='key1', column_parent='QueryPath(columnFamilyName='fact',
superColumnName='null', columnName='null')', start='503a', finish='503a7c',
reversed=false, count=1000) from 699@/x.x.x.7
DEBUG [pool-1-thread-64] 2010-10-18 19:25:28,867 StorageProxy.java (line
471) strongread reading digest for SliceFromReadCommand(table='table',
key='key1', column_parent='QueryPath(columnFamilyName='fact',
superColumnName='null', columnName='null')', start='503a', finish='503a7c',
reversed=false, count=1000) from 699@/x.x.x.8
DEBUG [pool-1-thread-63] 2010-10-18 19:25:29,410 CassandraServer.java (line
667) logout complete
DEBUG [RESPONSE-STAGE:5] 2010-10-18 19:25:30,864 ResponseVerbHandler.java
(line 42) Processing response on a callback from
0952ED39-07CE-7971-8F06-0D611FCB5F34@/x.x.x.6
DEBUG [RESPONSE-STAGE:6] 2010-10-18 19:25:31,449 ResponseVerbHandler.java
(line 42) Processing response on a callback from
0952ED39-07CE-7971-8F06-0D611FCB5F34@/x.x.x.8
DEBUG [pool-1-thread-64] 2010-10-18 19:25:31,449 ReadResponseResolver.java
(line 71) resolving 2 responses
DEBUG [pool-1-thread-64] 2010-10-18 19:25:31,608 ReadResponseResolver.java
(line 116) digests verified
DEBUG [pool-1-thread-64] 2010-10-18 19:25:31,609 ReadResponseResolver.java
(line 133) resolve: 160 ms.
DEBUG [pool-1-thread-64] 2010-10-18 19:25:31,609 StorageProxy.java (line
494) quorumResponseHandler: 2742 ms.
DEBUG [pool-1-thread-64] 2010-10-18 19:25:31,667 CassandraServer.java (line
191) Slice converted to thrift; sending to client
DEBUG [pool-1-thread-63] 2010-10-18 19:25:32,044 CassandraServer.java (line
667) logout complete
DEBUG [pool-1-thread-63] 2010-10-18 19:25:32,238 CassandraServer.java (line
667) logout complete
DEBUG [pool-1-thread-63] 2010-10-18 19:25:32,377 CassandraServer.java (line
667) logout complete
DEBUG [pool-1-thread-63] 2010-10-18 19:25:33,890 CassandraServer.java (line
667) logout complete
DEBUG [pool-1-thread-63] 2010-10-18 19:25:34,401 CassandraServer.java (line
667) logout complete
DEBUG [pool-1-thread-63] 2010-10-18 19:25:34,538 CassandraServer.java (line
667) logout complete
DEBUG [pool-1-thread-64] 2010-10-18 19:25:34,925 CassandraServer.java (line
667) logout complete



Error:

DEBUG [RESPONSE-STAGE:4] 2010-10-18 19:22:48,355 ResponseVerbHandler.java
(line 42) Processing response on a callback from
DDF1D615-BF1E-2CD1-34C6-96FA47D63259@/x.x.x.8
ERROR [RESPONSE-STAGE:4] 2010-10-18 19:22:48,356 CassandraDaemon.java (line
87) Uncaught exception in thread Thread[RESPONSE-STAGE:4,5,main]
java.lang.AssertionError
at
org.apache.cassandra.service.ReadResponseResolver.getResult(ReadResponseResolver.java:218)
at
org.apache.cassandra.service.ReadResponseResolver.isDataPresent(ReadResponseResolver.java:209)
at
org.apache.cassandra.service.QuorumResponseHandler.response(QuorumResponseHandler.java:93)
at
org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:44)
at
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:49)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
DEBUG [pool-1-thread-33] 2010-10-18 19:22:48,380 ReadResponseResolver.java
(line 71) resolving 2 responses
DEBUG [pool-1-thread-33] 2010-10-18 19:22:48,542 ReadResponseResolver.java
(line 116) digests verified
DEBUG [pool-1-thread-33] 2010-10-18 19:22:48,542 ReadResponseResolver.java
(line 133) resolve: 162 ms.
DEBUG [pool-1-thread-33] 2010-10-18 19:22:48,542 StorageProxy.java (line
494) quorumResponseHandler: 2688 ms.


On Sat, Oct 16, 2010 at 9:18 PM, Jonathan Ellis  wrote:

> Thanks.  Take 2 attached.
>
> On Sat, Oct 16, 2010 at 3:37 PM, Wayne  wrote:
> > ERROR [pool-1-thread-64] 2010-10-16 20:27:55,396 Cassandra.java (line
> 1280)
> > Internal error processing get_slice
> > java.lang.AssertionError
> > at
> >
> org.apache.cassandra.service.ReadResponseResolver.resolve(ReadResponseResolver.java:88)
> > at
> >
> org.apache.cassandra.service.ReadResponseResolver.resolve(ReadResponseR

Re: Cassandra security model? ( or, authentication docs ?)

2010-10-18 Thread Eric Evans
On Sun, 2010-10-17 at 21:26 -0700, Yang wrote:
> I searched around, it seems that this is not clearly documented yet;
> the closest I found is:
> http://www.riptano.com/docs/0.6.5/install/auth-config
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Authentication-td5285013.html#a5285013
> 
> I did start cassandra with the args mentioned above:
> 
> bin/cassandra -Dpasswd.properties=mypasswd.properties
> -Daccess.properties=myaccess.properties -f

Try
http://www.riptano.com/docs/0.6.5/install/storage-config#Authenticator


-- 
Eric Evans
eev...@rackspace.com
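For reference, the two files passed on the command line look roughly like this under 0.6's SimpleAuthenticator. The user name and keyspace below are made-up examples; the linked storage-config page is the authoritative source for the exact format:

```properties
# passwd.properties: one user per line, username=password
jsmith=secret

# access.properties: keyspace=comma-separated users allowed to use it
Keyspace1=jsmith
```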



TimeUUID makes me crazy

2010-10-18 Thread cbert...@libero.it
I am going crazy using TimeUUID in Cassandra via Java. I've read the FAQ but 
it didn't help.
Can I use a TimeUUID as a ROW identifier (if converted to a string)?

I have a CF and an SCF like these:


TIMEUUID OPECID (ROW) {
 phone: 123
 address: street xyz
}


String USERID (ROW) {
TIMEUUID OPECID (SuperColumnName)  {
collection of columns;
 }
}

In one situation the TimeUUID is a ROW identifier, while in the other it is
the SuperColumn name. I get many "UUID must be a 16 byte" errors when I try to
read data that did not raise any exception when it was saved.

At time T0 this one works: mutator.writeColumns(UuidHelper.timeUuidFromBytes
(OpecID).toString(), opecfamily, notNull); // (notNull contains a list of 
columns, including opecstatus)

Immediately afterwards, this one raises an exception: selector.getColumnFromRow
(UuidHelper.timeUuidFromBytes(OpecID).toString(), opecfamily, "opecstatus", 
ConsistencyLevel.ONE)

I hope someone can help me understand it.
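As a language-neutral illustration of the 16-byte complaint (Python's uuid module here; the Java UuidHelper calls in the message play the same roles): a TimeUUID has a raw 16-byte form and a 36-character text form, and a TimeUUIDType column name needs the raw bytes, while a 0.6 row key is a string:

```python
import uuid

opec_id = uuid.uuid1()   # a version-1 (time-based) UUID
raw = opec_id.bytes      # what a TimeUUIDType comparator expects: 16 bytes
text = str(opec_id)      # fine as a row key, but not as a TimeUUID column name

print(len(raw))   # -> 16
print(len(text))  # -> 36

# Round-trip: parse the string back before using it where bytes are required
assert uuid.UUID(text).bytes == raw
```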



KeysCached - clarification and recommendations on settings

2010-10-18 Thread Jedd Rashbrooke
 Greetings,

 I would like to check my understanding is accurate on how
 KeysCached is understood by Cassandra (0.6.5), and then
 get some suggestions on settings / OS FS cache interplay.

 First - my initial understanding was that if you set KeysCached
 to 100%, Cassandra would do a best effort to keep all keys
 in memory on all boxes in the cluster - but it would let some
 expire if it was going to run out of memory.  *Now* I understand
 that it actually just exhausts all its memory and falls over if you
 set this value unrealistically high.  Is this right?

 Assuming this is right ... I'm curious on the optimum settings
 for KeysCached.  Previously we've had problems here with
 boxes doing massive STW GC's, among other things - and
 we've been working towards a configuration that is a bit more
 stable and predictable, if not as blazingly fast as a brand new
 cluster unburdened by much data but with a configuration that
 means it'll drive itself into oblivion after several days.  ;)

 With that in mind, we're trying to keep JVM heap small - about
 3GB is our sweet spot so far - after experimenting with numbers
 as big as 30GB.   Even at 30GB we'd be unable to have 100%
 KeysCached anyway - so it was always going to require a
 tradeoff - and now we're trying to guess at the best number.

 We're using boxes with 16GB, and have an expectation that
 the OS will cache a lot of this stuff pretty intelligently - but I
 fear that compactions and other activities will mean that the
 keys won't always have priority for FS cache.  Short of catting
 the index files to /dev/null periodically (not a serious suggestion)
 has anyone gleaned some insights on the best tradeoffs for
 KeysCached where the raw size of your Keys is going to be
 at least 10x the size of your available memory?  Do you go
 small and hope the OS saves you, or do you try to go as big
 (but finite) as your JVM will usefully let you and hope
 Cassandra caches the best set of keys for your usage profile?

 taa,
 Jedd.