Re: Confused about get_slice SliceRange behavior with bloom filter

2011-02-14 Thread Sylvain Lebresne
As Aaron said, if the whole row is under 64k, it won't matter. But since you spoke of very wide rows, I'll assume the whole row will be much more than 64k. If so, the row is indexed by blocks (of 64k, configurable). Then the read performance depends on how many of those blocks are needed for the
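A rough sketch of the block-indexing point above, under a simplified model: each column is assigned to the block in which it starts, and blocks are a fixed 64k (the default mentioned for column_index_size_in_kb). The function name blocks_touched and the column sizes are hypothetical, chosen for illustration; this is not Cassandra's actual index code.

```python
def blocks_touched(column_sizes, first, last, block_bytes=64 * 1024):
    """Count the index blocks spanned by columns first..last (inclusive).

    column_sizes: serialized size in bytes of each column, in row order.
    Simplification (an assumption): a column belongs to the block in
    which its first byte falls.
    """
    blocks = set()
    offset = 0
    for i, size in enumerate(column_sizes):
        if first <= i <= last:
            blocks.add(offset // block_bytes)
        offset += size
    return len(blocks)


# A hypothetical wide row: 1000 columns of 1 KB each, so 64 columns per block.
sizes = [1024] * 1000
narrow = blocks_touched(sizes, 0, 63)    # slice stays inside one block
wide = blocks_touched(sizes, 0, 999)     # slice spans the whole row
```

Under this model a slice confined to one block reads one block regardless of how wide the row is, while a slice over the whole row reads every block.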

Re: Confused about get_slice SliceRange behavior with bloom filter

2011-02-14 Thread Aditya Narayan
Thanks Sylvain, I guess I might have misunderstood the meaning of column_index_size_in_kb. My previous understanding was that it is the threshold size a row must pass before its columns are indexed. If I have understood it correctly, it implies the size of the blocks

RE: Question about seeds in tow node cluster.

2011-02-14 Thread nicolas lattuada
Hi, I have a two-node cluster running and have both nodes in the seeds list. Regards. Date: Sun, 13 Feb 2011 16:04:41 +0800 Subject: Question about seeds in tow node cluster. From: guxiaobo1...@gmail.com To: user@cassandra.apache.org Hi, If the cluster only has two nodes, should

Re: Question about seeds in tow node cluster.

2011-02-14 Thread Xiaobo Gu
Thanks. On Mon, Feb 14, 2011 at 7:59 PM, nicolas lattuada nicolaslattu...@hotmail.fr wrote: Hi, I have a two-node cluster running and have both nodes in the seeds list. Regards. Date: Sun, 13 Feb 2011 16:04:41 +0800 Subject: Question about seeds in tow node cluster. From:

Re: Confused about get_slice SliceRange behavior with bloom filter

2011-02-14 Thread Sylvain Lebresne
On Mon, Feb 14, 2011 at 11:27 AM, Aditya Narayan ady...@gmail.com wrote: Thanks Sylvain, I guess I might have misunderstood the meaning of column_index_size_in_kb. My previous understanding was that it is the threshold size a row must pass before its columns are indexed.

Re: Confused about get_slice SliceRange behavior with bloom filter

2011-02-14 Thread Aditya Narayan
Thanks for the clarifications. On Mon, Feb 14, 2011 at 6:13 PM, Sylvain Lebresne sylv...@datastax.com wrote: On Mon, Feb 14, 2011 at 11:27 AM, Aditya Narayan ady...@gmail.com wrote: Thanks Sylvain, I guess I might have misunderstood the meaning of column_index_size_in_kb. My previous

Re: cassandra solaris x64 support

2011-02-14 Thread Xiaobo Gu
On Sun, Feb 13, 2011 at 2:28 AM, Sylvain Lebresne sylv...@datastax.com wrote: On Sat, Feb 12, 2011 at 2:52 AM, Xiaobo Gu guxiaobo1...@gmail.com wrote: On Fri, Feb 11, 2011 at 11:54 PM, Sylvain Lebresne sylv...@datastax.com wrote: On Fri, Feb 11, 2011 at 4:27 PM, Xiaobo Gu

Re: time to live rows

2011-02-14 Thread Kallin Nagelberg
Huh... I usually insert, compact, then flush. Apparently I've been doing it wrong my whole life. So it needs like a courtesy flush. Let me try that :) -Kal On Thu, Feb 10, 2011 at 3:06 AM, Sylvain Lebresne sylv...@datastax.com wrote: Kal, you may have to flush before compacting. If you insert

Re: Cassandra documentation

2011-02-14 Thread Sameer Farooqui
Here is a blog my team is working on at Accenture which is intended to be a complete beginner's guide to Cassandra. I'm still updating a few posts based on DataStax's recommendations and I need to add the last three posts (will get this done soon), but you can start checking it out via this link:

Re: cassandra as session store

2011-02-14 Thread Sasha Dolgy
hi, a few weeks back this topic had some discussion (cassandra as a session store). subsequently, i threw together a quick hack to have PHP use Cassandra as a session store. A benefit I quickly found is that I could rely on Cassandra to expire the sessions and not PHP session garbage

[RELEASE] 0.7.1

2011-02-14 Thread Eric Evans
Today is Valentine's Day[1] in many parts of the world, an annual commemoration of love and affection typically celebrated with candy, stuffed animals, and floral arrangements. She may seem a bit fickle at times, but Cassandra loves you, and since most people would rather receive a gift of the

JNA.jar

2011-02-14 Thread mcasandra
The Cassandra documentation recommends downloading jna.jar. However, I am unable to see any jar files on the mentioned website http://java.net/projects/jna/ Do I need to download the source instead and then compile it? -- View this message in context:

Re: Extra Large Memtables

2011-02-14 Thread Robert Coli
On Sat, Feb 12, 2011 at 11:17 PM, E S tr1skl...@yahoo.com wrote: While experimenting with this, I found a bug where you can't have memtable throughput configured past 2 gigs without an integer overflow screwing up the flushes.  That makes me feel like I'm in uncharted territory :). I am sure
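The kind of overflow described above can be illustrated with a small sketch. It assumes, purely for illustration, that a throughput value in megabytes is converted to bytes using signed 32-bit arithmetic; to_int32 and throughput_bytes_int32 are hypothetical helpers, not Cassandra code, and the actual fix is the one tracked in the JIRA ticket cited later in this thread.

```python
def to_int32(n):
    """Wrap an integer to signed 32-bit range, as a Java int would."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


def throughput_bytes_int32(mb):
    """Convert megabytes to bytes with (assumed) 32-bit int arithmetic."""
    return to_int32(mb * 1024 * 1024)


below_limit = throughput_bytes_int32(2047)  # still fits in a signed 32-bit int
at_limit = throughput_bytes_int32(2048)     # 2**31 wraps to a negative value
```

A negative byte threshold would make any size comparison against it misbehave, which is consistent with flushes going wrong once the setting passes 2 gigs.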

Re: JNA.jar

2011-02-14 Thread Aaron Morton
Ouch, that redesign is a bit nasty. jna.jar in this folder is the same as the one I last got (3.2.7): http://java.net/projects/jna/sources/svn/show/trunk/jnalib/dist?rev=1182 Aaron On 15 Feb, 2011, at 09:48 AM, mcasandra mohitanch...@gmail.com wrote: In Cassandra documentation it recommends downloading

RandomPartitioner

2011-02-14 Thread mcasandra
I am trying to understand, at least to some level of detail, how the random partitioner works. From the text I have seen on the website I am not able to understand it clearly. Is there a place where it's described with an example, e.g. how nodes are assigned random tokens? Is the range picked

Re: [RELEASE] 0.7.1

2011-02-14 Thread Norman Maurer
Huh, isn't that what mirrors are supposed to be for? Bye, Norman 2011/2/14 Frank LoVecchio fr...@isidorey.com: Did the site get hacked? http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.7.1/apache-cassandra-0.7.1-bin.tar.gz Sources keep changing... On Mon, Feb 14, 2011 at 1:13 PM,

Re: RandomPartitioner

2011-02-14 Thread Dan Kuebrich
You may find this part of the wiki helpful: http://wiki.apache.org/cassandra/Operations#Range_changes If you explicitly specify an InitialToken in the configuration, the new node will bootstrap to that position on the ring. Otherwise, it will pick a Token that will give it half the keys from the

Re: [RELEASE] 0.7.1

2011-02-14 Thread Jake Luciani
It can take some time for the files to propagate to the mirrors. It's Eventually Consistent though :) On Mon, Feb 14, 2011 at 4:20 PM, Frank LoVecchio fr...@isidorey.com wrote: Ah, I meant quite a few of the mirror links keep showing up as links to gossip sites and whatnot. On Feb 14, 2011

Re: [RELEASE] 0.7.1

2011-02-14 Thread Eric Evans
On Mon, 2011-02-14 at 14:20 -0700, Frank LoVecchio wrote: Ah, I meant quite a few of the mirror links keep showing up as links to gossip sites and whatnot. I suspect those mirrors are broken. Can you submit a ticket to https://issues.apache.org/jira/browse/INFRA for any like that you find?

Re: [RELEASE] 0.7.1

2011-02-14 Thread Eric Evans
On Mon, 2011-02-14 at 16:50 -0500, Jake Luciani wrote: It can take some time for the files to propagate to the mirrors. It's Eventually Consistent though :) Preferably they'd 404 when that happens though. :) -- Eric Evans eev...@rackspace.com

Re: RandomPartitioner

2011-02-14 Thread mcasandra
Dan Kuebrich wrote: You may find this part of the wiki helpful: http://wiki.apache.org/cassandra/Operations#Range_changes If you explicitly specify an InitialToken in the configuration, the new node will bootstrap to that position on the ring. Otherwise, it will pick a Token that will

Fwd: SPA2011 - June 12th-15th - BCS London, UK - Call for Sessions

2011-02-14 Thread Jonathan Ellis
In case any of the London crowd is interested: -- Forwarded message -- From: Mike Hill mikewh...@gmail.com Date: Mon, Feb 14, 2011 at 4:14 PM Subject: SPA2011 - June 12th-15th - BCS London, UK - Call for Sessions To: mikewh...@gmail.com mikewh...@gmail.com SPA2011 - June

Re: RandomPartitioner

2011-02-14 Thread mcasandra
I installed Cassandra and started it in multi-node. I set the InitialToken to 0. I ran nodetool and see: $ nodetool -h localhost ring Address Status State Load Owns Token 152896308109140433971537345591636551711

Re: RandomPartitioner

2011-02-14 Thread Matthew Dennis
Nodes contain data for (prevTokenInRing, nodesOwnToken] (i.e. exclusive of the previous token, inclusive of the node's own token). So .179 will contain things that hash in the range (152896308109140433971537345591636551711,0] and .12 will contain things that hash in range
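The (prevTokenInRing, nodesOwnToken] rule above can be sketched as a lookup over a sorted ring. This is a minimal illustration, not Cassandra's implementation: owner_for is a hypothetical helper, the node labels ".179" and ".12" are the abbreviated addresses used in the message, and the key's token is taken as given rather than computed from a hash.

```python
from bisect import bisect_left


def owner_for(token, ring):
    """Return the node owning `token`.

    ring: list of (node_token, node_label) pairs.
    A node owns the range (previous_token, own_token], so the owner is
    the first node whose token is >= the key's token; tokens past the
    largest node token wrap around to the node with the smallest token.
    """
    ordered = sorted(ring)
    tokens = [t for t, _ in ordered]
    i = bisect_left(tokens, token)
    return ordered[i % len(ordered)][1]


# The two-node ring from the thread.
ring = [
    (0, ".179"),
    (152896308109140433971537345591636551711, ".12"),
]
owner_of_one = owner_for(1, ring)  # falls in (0, 1528...711], owned by .12
```

With these two tokens, anything hashing above 152896308109140433971537345591636551711 wraps around and lands on .179, along with token 0 itself.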

Internal error processing insert

2011-02-14 Thread mcasandra
I am following example posted on http://www.datastax.com/dev/tutorials/getting_started_0.7/using_cli#cassandra-cli cli I am seeing: $ set users['jsmith']['password']='ch@ngem3'; Internal error processing insert In the logs I see: java.lang.AssertionError: invalid response count 1 for

Re: Internal error processing insert

2011-02-14 Thread Matthew Dennis
Is your ReplicationFactor (RF) really set to 0? Don't do that, it needs to be at least 1 and probably needs to be 3 in production if you care about your data. It must be greater than 0 and less than the number of nodes in your ring. It represents the number of nodes to copy/replicate data to.

Re: Internal error processing insert

2011-02-14 Thread mcasandra
No it's not set to 0. I am just following the example on datastax getting started site. Here are all the commands: [default@unknown] create keyspace twissandra with replication_factor=1 ... and placement_strategy='org.apache.cassandra.locator.NetworkTopologyStrategy';

Re: Internal error processing insert

2011-02-14 Thread Aaron Morton
Not sure why the docs suggest using the NetworkTopologyStrategy; if there are no data centres configured, the NetworkTopologyStrategy will say the replication factor is 0. I think this is the source of the "invalid response count 1 for replication factor 0" message. Can you try with the

Re: Internal error processing insert

2011-02-14 Thread mcasandra
That's what I thought might be happening, since network topology will try to find one node in the other data center. The message is a little confusing though. [default@unknown] update keyspace twissandra placement_strategy='org.apache.cassandra.locator.SimpleStrategy'; Syntax error at position 28:

Re: Internal error processing insert

2011-02-14 Thread Aaron Morton
Will take a closer look at the code tonight; perhaps we should return an error if you try to use Network Topology and it cannot detect any DCs. Cheers, Aaron On 15 Feb, 2011, at 01:22 PM, mcasandra mohitanch...@gmail.com wrote: That's what I thought might be happening since network topology will try to

Re: Internal error processing insert

2011-02-14 Thread Eric Gilmore
For now, I have committed a change in the misleading documentation, substituting SimpleStrategy for NTS. Sorry you ran into trouble due to that, mcasandra. On Mon, Feb 14, 2011 at 4:28 PM, Aaron Morton aa...@thelastpickle.com wrote: Will take a closer look at the code tonight, perhaps we should

Re: Internal error processing insert

2011-02-14 Thread Matthew Dennis
On Mon, Feb 14, 2011 at 6:28 PM, Aaron Morton aa...@thelastpickle.com wrote: Will take a closer look at the code tonight; perhaps we should return an error if you try to use Network Topology and it cannot detect any DCs. +1

Data distribution

2011-02-14 Thread mcasandra
Couple of questions: 1) If I insert a key and want to verify which node it went to, how do I do that? 2) How can I verify that replication is working? That is, how do I check that a CF row got inserted on 2 nodes if the replication factor is set to 2? 3) What happens if I just update the keyspace

Re: Internal error processing insert

2011-02-14 Thread mcasandra
In an earlier post in this thread you mentioned that the replication factor should be set to less than N. Currently I am testing on a 2 node cluster and I was able to set replication_factor to 2 (=N), and when I ran cfstats (I don't quite understand cfstats in detail) I see some activity on both nodes

Re: Internal error processing insert

2011-02-14 Thread mcasandra
mcasandra wrote: In an earlier post in this thread you mentioned that the replication factor should be set to less than N. Currently I am testing on a 2 node cluster and I was able to set replication_factor to 2 (=N), and when I ran cfstats (I don't quite understand cfstats in detail) I see

Re: Fwd: SPA2011 - June 12th-15th - BCS London, UK - Call for Sessions

2011-02-14 Thread Courtney Robinson
Anyone else in London interested in this? -- From: Jonathan Ellis jbel...@gmail.com Sent: Monday, February 14, 2011 10:30 PM To: user user@cassandra.apache.org Subject: Fwd: SPA2011 - June 12th-15th - BCS London, UK - Call for Sessions In case

Re: Internal error processing insert

2011-02-14 Thread Aaron Morton
He probably meant in production. When playing around, and if you only have 2 nodes, you can set it to 2. From memory, an RF of 2 means the Quorum is also 2, so you cannot afford to lose one. That's fine for playing. Aaron On 15 Feb, 2011, at 01:51 PM, mcasandra mohitanch...@gmail.com wrote: mcasandra
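The arithmetic behind "RF of 2 means the Quorum is also 2" can be checked with a short sketch. It assumes the usual majority definition of quorum, floor(RF / 2) + 1; the function names quorum and tolerated_failures are illustrative only.

```python
def quorum(replication_factor):
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor):
    """Replicas that can be down while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


for rf in (1, 2, 3, 5):
    print(f"RF={rf}: quorum={quorum(rf)}, can lose {tolerated_failures(rf)}")
```

With RF=2 the quorum is 2 and no replica can be lost, whereas RF=3 still has a quorum of 2 and tolerates one replica being down, which is why 3 is the usual suggestion for production.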

RE: Data distribution

2011-02-14 Thread Dan Hendry
1) If I insert a key and want to verify which node it went to then how do I do that? I don't think you can and there should be no reason to care. Cassandra abstracts where data is being stored, think in terms of consistency levels not actual nodes. 2) How can I verify if the replication is

Re: Extra Large Memtables

2011-02-14 Thread Matthew Dennis
On Mon, Feb 14, 2011 at 2:54 PM, Robert Coli rc...@digg.com wrote: Regarding very large memtables, it is important to recognize that throughput refers only to the size of the COLUMN VALUES, and not, for example, their names. That would be a bug in its own right. There are lots of use cases

Request For 0.6.12 Release

2011-02-14 Thread Gregory Szorc
The latest official 0.6.x releases, 0.6.10 and 0.6.11, have a very serious bug/regression when performing some quorum reads (CASSANDRA-2081), which is fixed in the head of the 0.6 branch. If there aren't any plans to cut 0.6.12 any time soon, as an end user, I request that an official and blessed

Re: Internal error processing insert

2011-02-14 Thread Matthew Dennis
I had actually meant to (and thought I did) type greater than zero and less than *or equal* to number of nodes. That being said, you usually do want it less than the number of nodes in the cluster because otherwise your cluster essentially has the same performance as a single node. In general

Re: Data distribution

2011-02-14 Thread Matthew Dennis
On Mon, Feb 14, 2011 at 6:58 PM, Dan Hendry dan.hendry.j...@gmail.com wrote: 1) If I insert a key and want to verify which node it went to then how do I do that? I don't think you can and there should be no reason to care. Cassandra abstracts where data is being stored, think in terms of

Re: NFS instead of local storage

2011-02-14 Thread Matthew Dennis
No, it's actually worse to do that. 1) you're introducing single points of failure (your array). 2) you're introducing complexity and expense 3) you're introducing latency 4) you're introducing bottlenecks 5) some other reasons... You do want your commit log on a separate disk though. The

RE: Data distribution

2011-02-14 Thread mcasandra
When I increase the replication factor, does the repair happen automatically in the background when a client first tries to access data from a node where the data does not exist? Or does nodetool repair need to be run after increasing the replication factor? -- View this message in context:

Re: Cassandra documentation

2011-02-14 Thread mcasandra
Cassy Andra wrote: Here is a blog my team is working on at Accenture which is intended to be a complete beginner's guide to Cassandra. I'm still updating a few posts based on DataStax's recommendations and I need to add the last three posts (will get this done soon), but you can start

Re: Data distribution

2011-02-14 Thread Matthew Dennis
Regardless of whether you increase RF or not, RR happens based on the read_repair_chance setting. RR happens after the request has been replied to though, so it's possible that if you increase the RF and then read, the read might get stale/missing data. RR would then put the correct value on all the

Re: Extra Large Memtables

2011-02-14 Thread E S
Already submitted and fixed! Thanks Jonathan for your help on this. I really appreciate it! https://issues.apache.org/jira/browse/CASSANDRA-2158

Re: Backend application for Cassandra

2011-02-14 Thread Michal Augustýn
Hi, it depends on the complexity of your queries - maybe secondary indexes would be sufficient for you - http://www.datastax.com/dev/blog/whats-new-cassandra-07-secondary-indexes If your queries are too complex then you could use Pig (over Hadoop) -