Re: commodity server spec

2011-09-02 Thread Radim Kolar
Many smaller servers are way better.

Re: Limiting ColumnSlice range in second composite value

2011-09-02 Thread Anthony Ikeda
Okay, I reversed the composite and seem to have come up with a solution. Although the rows are sorted by the status, the statuses are sorted temporally which helps. I tell you this type of modeling really breaks the rules :) Anthony On Fri, Sep 2, 2011 at 3:54 PM, Anthony Ikeda wrote: > This is

commodity server spec

2011-09-02 Thread China Stoffen
Hi, Is there any recommendation about commodity server hardware specs if a 100TB database size is expected and it's a write-heavy application? Should I go with a high-powered CPU (12 cores), 48TB HDD and 640GB RAM, with a total of 3 servers of this spec? Or many smaller commodity servers are recomme

Re: Limiting ColumnSlice range in second composite value

2011-09-02 Thread Anthony Ikeda
This is what I'm trying to do: Sample of the data: RowKey: localhost => (column=e3f3c900-d5b0-11e0-aa6b-005056c8:ACTIVE, value=, timestamp=1315001665761000) => (column=e4515250-d5b0-11e0-aa6b-005056c8:INACTIVE, value=, timestamp=1315001654271000) => (column=e45549f0-d5b0-11e0-aa6b-005056c0

Import JSON sstable data

2011-09-02 Thread Zhong Li
Hi, I am trying to upload sstable data to a Cassandra 0.8.4 cluster with the json2sstable tool. Each time, I have to restart the node with the new file imported and run repair for the column family; otherwise the new data will not show. Any thoughts? Thanks, Zhong Li

Re: HUnavailableException: : May not be enough replicas present to handle consistency level.

2011-09-02 Thread Oleg Tsvinev
Yes, I think I get it now. "quorum of replicas" != "quorum of nodes" and I don't think quorum of nodes is ever defined. Thank you, Konstantin. Now, I believe I need to change my cluster to store data in two remaining nodes in DC1, keeping 3 nodes in DC2. I believe nodetool removetoken is what I ne

Re: HUnavailableException: : May not be enough replicas present to handle consistency level.

2011-09-02 Thread Oleg Tsvinev
And now, when I have one node down with no chance of bringing it back anytime soon, can I still change RF to 3 and restore the functionality of my cluster? Should I run 'nodetool repair', or will a simple keyspace update suffice? On Fri, Sep 2, 2011 at 1:55 PM, Nate McCall wrote: > Yes - you would n

Re: HUnavailableException: : May not be enough replicas present to handle consistency level.

2011-09-02 Thread Konstantin Naryshkin
I think that Oleg may have misunderstood how replicas are selected. If you have 3 nodes in your cluster and a RF of 2, Cassandra first selects what two nodes, out of the 3 will get data, then, and only then does it write it out. The selection is based on the row key, the token of the node, and y
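The point above (replicas are chosen first, independent of which nodes happen to be up) can be illustrated with a minimal sketch. It assumes a simplified single-datacenter ring where a key's replicas are the node owning the key's token plus the next nodes clockwise; the function names and the integer tokens are hypothetical, and NetworkTopologyStrategy placement is more involved than this.

```python
from bisect import bisect_left


def replicas_for_token(key_token, node_tokens, rf):
    """Return the tokens of the nodes that hold replicas for key_token.

    Simplified ring walk: the first replica is the node whose token is the
    first one >= key_token (wrapping around), followed by the next nodes
    clockwise. Node liveness plays no part in the selection.
    """
    ring = sorted(node_tokens)
    start = bisect_left(ring, key_token) % len(ring)
    return [ring[(start + i) % len(ring)] for i in range(min(rf, len(ring)))]


def live_replicas(key_token, node_tokens, rf, down_nodes):
    """Replicas for the key that are currently reachable."""
    chosen = replicas_for_token(key_token, node_tokens, rf)
    return [t for t in chosen if t not in down_nodes]


ring = [0, 100, 200]                      # three nodes, hypothetical tokens
chosen = replicas_for_token(50, ring, rf=2)
alive = live_replicas(50, ring, rf=2, down_nodes={100})
print(chosen, alive)
```

With three nodes and RF=2, a key that maps to the nodes at tokens 100 and 200 has only one live replica when the node at 100 is down, even though two nodes in the cluster are still up.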

Re: HUnavailableException: : May not be enough replicas present to handle consistency level.

2011-09-02 Thread Nate McCall
Yes - you would need at least 3 replicas per data center to use LOCAL_QUORUM and survive a node failure. On Fri, Sep 2, 2011 at 3:51 PM, Oleg Tsvinev wrote: > Do you mean I need to configure 3 replicas in each DC and keep using > LOCAL_QUORUM? In which case, if I'm following your logic, even one

Re: HUnavailableException: : May not be enough replicas present to handle consistency level.

2011-09-02 Thread Oleg Tsvinev
Do you mean I need to configure 3 replicas in each DC and keep using LOCAL_QUORUM? In which case, if I'm following your logic, even if one of the 3 goes down I'll still have 2 to ensure LOCAL_QUORUM succeeds? On Fri, Sep 2, 2011 at 1:44 PM, Nate McCall wrote: > In your options, you have configured 2

Re: HUnavailableException: : May not be enough replicas present to handle consistency level.

2011-09-02 Thread Nate McCall
In your options, you have configured 2 replicas for each data center: Options: [DC2:2, DC1:2] If one of those replicas is down, then LOCAL_QUORUM will fail as there is only one replica left 'locally.' On Fri, Sep 2, 2011 at 3:35 PM, Oleg Tsvinev wrote: > from http://www.datastax.com/docs/0.8/co

Re: HUnavailableException: : May not be enough replicas present to handle consistency level.

2011-09-02 Thread Oleg Tsvinev
from http://www.datastax.com/docs/0.8/consistency/index: I have RF=2, so majority of replicas is 2/2+1=2 which I have after 3rd node goes down? On Fri, Sep 2, 2011 at 1:22 PM, Nate McCall wrote: > It looks like you only have 2 replicas configured in each data center? > > If so, LOCAL_QUORUM ca
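The quorum arithmetic discussed in this thread can be written out as a small sketch. It assumes the usual majority formula, floor(RF / 2) + 1, counted over replicas (per data center for LOCAL_QUORUM); the function names are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond: a strict majority of RF."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """How many replicas can be down while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


for rf in (2, 3):
    print(rf, quorum(rf), tolerated_failures(rf))
```

For RF=2 the quorum is 2, so no replica may be down; for RF=3 the quorum is still 2, so one replica may be down. This is consistent with the advice in the thread to use 3 replicas per data center with LOCAL_QUORUM.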

Re: HUnavailableException: : May not be enough replicas present to handle consistency level.

2011-09-02 Thread Oleg Tsvinev
Well, this is the part I don't understand then. I thought that if I configure 2 replicas with 3 nodes and one of 3 nodes goes down, I'll still have 2 nodes to store 3 replicas. Is my logic flawed somewhere? On Fri, Sep 2, 2011 at 1:22 PM, Nate McCall wrote: > It looks like you only have 2 replicas

Re: HUnavailableException: : May not be enough replicas present to handle consistency level.

2011-09-02 Thread Nate McCall
It looks like you only have 2 replicas configured in each data center? If so, LOCAL_QUORUM cannot be achieved with a host down same as with QUORUM on RF=2 in a single DC cluster. On Fri, Sep 2, 2011 at 1:40 PM, Oleg Tsvinev wrote: > I believe I don't quite understand semantics of this exception:

Re: Replicate On Write behavior

2011-09-02 Thread David Hawthorne
Does it always pick the node with the lowest IP address? All of my hosts are in the same /24. The fourth node in the 5 node cluster has the lowest value in the 4th octet (54). I erased the cluster and rebuilt it from scratch as a 3 node cluster using the first 3 nodes, and now the ReplicateOn

Re: Trying to understand QUORUM and Strategies

2011-09-02 Thread Jonathan Ellis
Note that this is an implementation detail, not something that inherently can't work with other strategies. LOCAL_QUORUM and EACH_QUORUM are logically equivalent to QUORUM when there is a single datacenter. We tried briefly to add support for non-NTS strategies in https://issues.apache.org/jira/b

Re: Limiting ColumnSlice range in second composite value

2011-09-02 Thread Nate McCall
Instead of empty strings, try Character.[MAX|MIN-]_VALUE. On Thu, Sep 1, 2011 at 8:27 PM, Anthony Ikeda wrote: > My Column name is of Composite(TimeUUIDType, UTF8Type) and I can query > across the TimeUUIDs correctly, but now I want to also range across the UTF8 > component. Is this possible? > >

Streaming stuck on one node during Repair

2011-09-02 Thread Jake Maizel
Hello, I have one node of a cluster that is stuck in a streaming out state sending to the node that is being repaired. If I look at the AE thread in jconsole I see this trace: Name: AE-SERVICE-STAGE:1 State: WAITING on java.util.concurrent.FutureTask$Sync@7e3e0044 Total blocked: 0 Total waited:

HUnavailableException: : May not be enough replicas present to handle consistency level.

2011-09-02 Thread Oleg Tsvinev
I believe I don't quite understand the semantics of this exception: me.prettyprint.hector.api.exceptions.HUnavailableException: : May not be enough replicas present to handle consistency level. Does it mean there *might be* enough? Does it mean there *is not* enough? My case is as follows - I have

JMX TotalReadLatencyMicros sanity check

2011-09-02 Thread David Hawthorne
I've graphed the rate of change of the TotalReadLatencyMicros counter over the last 12 hours, and divided by 1,000,000 to get it in seconds. I'm grabbing it every 10 seconds, so I divided by another 10 to get per-second rates. The result is that I have a CF doing 10 seconds of read *every secon
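The calculation described above can be sketched as follows. This is only an illustration of the arithmetic (counter delta, microseconds to seconds, divided by the sampling interval); the function name and the sample values are hypothetical.

```python
def read_seconds_per_second(prev_micros: int, curr_micros: int,
                            interval_seconds: float = 10.0) -> float:
    """Convert two samples of a cumulative latency counter (in microseconds)
    into seconds of read latency accumulated per wall-clock second."""
    delta_micros = curr_micros - prev_micros
    return delta_micros / 1_000_000 / interval_seconds


# Hypothetical samples taken 10 seconds apart.
rate = read_seconds_per_second(5_000_000_000, 5_100_000_000)
print(rate)
```

If the counter sums the latency of every read, including reads served concurrently, a value above 1 is not necessarily an error: many reads in flight at once can together accumulate more than one second of latency per second.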

Re: Trying to understand QUORUM and Strategies

2011-09-02 Thread Anthony Ikeda
Okay, great, I just wanted to confirm that LOCAL_QUORUM will not work with SimpleStrategy. There was somewhat of a debate amongst my devs that said it should work. Anthony On Fri, Sep 2, 2011 at 9:55 AM, Evgeniy Ryabitskiy < evgeniy.ryabits...@wikimart.ru> wrote: > So. > You have created keyspace

Re: Replicate On Write behavior

2011-09-02 Thread Ian Danforth
That ticket explains a lot, looking forward to a resolution on it. (Sorry I don't have a patch to offer) Ian On Fri, Sep 2, 2011 at 12:30 AM, Sylvain Lebresne wrote: > On Thu, Sep 1, 2011 at 8:52 PM, David Hawthorne wrote: >> I'm curious... digging through the source, it looks like replicate on

Re: Trying to understand QUORUM and Strategies

2011-09-02 Thread Evgeniy Ryabitskiy
So. You have created keyspace with SimpleStrategy. If you want to use *LOCAL_QUORUM, *you should create keyspace (or change existing) with NetworkTopologyStrategy. I have provided CLI examples on how to do it. If you are creating keyspace from Hector, you have to do same via Java API. Evgeny.

Re: Cassandra prod environment

2011-09-02 Thread Jeremy Hanna
We moved off of ubuntu because of kernel issues in the AMIs we found in 10.04 and 10.10 in ec2. So we're now on debian squeeze with ext4. It's been great for us. One thing that bit us is we'd been using property file snitch and the availability zones as racks and had an equal number of nodes

Re: removing all column metadata via CLI

2011-09-02 Thread Jonathan Ellis
Then you'll want to create an issue: https://issues.apache.org/jira/browse/CASSANDRA On Fri, Sep 2, 2011 at 10:08 AM, Radim Kolar wrote: >> Is this 0.8.4? > yes > > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://w

Re: Cassandra prod environment

2011-09-02 Thread Eric Tamme
On 09/02/2011 11:30 AM, Sorin Julean wrote: Hey, Currently I'm running Cassandra on Ubuntu 10.4 x86_64 in EC2. I'm wondering if anyone observed a better performance / stability on other distros ( CentOS / RHEL / ...) or OS (eg. Solaris intel/SPARC) ? Is anyone running prod on VMs, not clou

Cassandra prod environment

2011-09-02 Thread Sorin Julean
Hey, Currently I'm running Cassandra on Ubuntu 10.4 x86_64 in EC2. I'm wondering if anyone observed a better performance / stability on other distros ( CentOS / RHEL / ...) or OS (eg. Solaris intel/SPARC) ? Is anyone running prod on VMs, not cloud, but ESXi or Solaris zones ? Is there love or

Re: removing all column metadata via CLI

2011-09-02 Thread Radim Kolar
> Is this 0.8.4? yes

Re: looking for information on composite columns

2011-09-02 Thread Yiming Sun
Thanks Edward. What's the link to your blog? On Fri, Sep 2, 2011 at 10:43 AM, Edward Capriolo wrote: > > On Fri, Sep 2, 2011 at 9:15 AM, Yiming Sun wrote: > >> Hi, >> >> I am looking for information/tutorials on the use of composite columns, >> including how to use it, what kind of indexing it

Re: looking for information on composite columns

2011-09-02 Thread Edward Capriolo
On Fri, Sep 2, 2011 at 9:15 AM, Yiming Sun wrote: > Hi, > > I am looking for information/tutorials on the use of composite columns, > including how to use it, what kind of indexing it can offer, and its > advantage over super columns. I googled but came up with very little > information. There

Re: removing all column metadata via CLI

2011-09-02 Thread Jonathan Ellis
Is this 0.8.4? 2011/9/2 Radim Kolar : > I can't find a way to remove all column definitions without CF > import/export. > > [default@int4] update column family sipdb with column_metadata = []; > Syntax error at position 51: required (...)+ loop did not match anything at > input ']' > > [default@

Re: 15 seconds to increment 17k keys?

2011-09-02 Thread Richard Low
On Thu, Sep 1, 2011 at 5:16 PM, Ian Danforth wrote: > Does this scale with multiples of the replication factor or directly > with number of nodes? Or more succinctly, to double the writes per > second into the cluster how many more nodes would I need? The write throughput scales with number of n

removing all column metadata via CLI

2011-09-02 Thread Radim Kolar
I can't find a way to remove all column definitions without CF import/export. [default@int4] update column family sipdb with column_metadata = []; Syntax error at position 51: required (...)+ loop did not match anything at input ']' [default@int4] update column family sipdb with column_meta

Re: Cassandra, CQL, Thrift Deprecation?? and Erlang

2011-09-02 Thread J T
Ok, that's good to know. If push came to shove I could probably write such a client myself after doing the necessary research but I'd prefer to save myself the hassle. Thanks. On Fri, Sep 2, 2011 at 1:59 PM, Jonathan Ellis wrote: > The Thrift API is not going anywhere any time soon. > > I'm not

looking for information on composite columns

2011-09-02 Thread Yiming Sun
Hi, I am looking for information/tutorials on the use of composite columns, including how to use it, what kind of indexing it can offer, and its advantage over super columns. I googled but came up with very little information. There is a blog article from high performance cassandra on the compos

RE: Removal of old data files

2011-09-02 Thread hiroyuki.watanabe
I see. Thank you for helpful information Yuki -Original Message- From: Sylvain Lebresne [mailto:sylv...@datastax.com] Sent: Friday, September 02, 2011 3:40 AM To: user@cassandra.apache.org Subject: Re: Removal of old data files On Fri, Sep 2, 2011 at 12:11 AM, wrote: > Yes, I see

Re: Cassandra, CQL, Thrift Deprecation?? and Erlang

2011-09-02 Thread Jonathan Ellis
The Thrift API is not going anywhere any time soon. I'm not aware of anyone working on an erlang CQL client. On Fri, Sep 2, 2011 at 7:39 AM, J T wrote: > Hi, > > I'm a fan of erlang, and have been using successive cassandra versions via > the erlang thrift interface for a couple of years now. >

Re: cassandra-cli describe / dump command

2011-09-02 Thread J T
That's brilliant, thanks. On Thu, Sep 1, 2011 at 7:07 PM, Jonathan Ellis wrote: > yes, cli "show schema" in 0.8.4+ > > On Thu, Sep 1, 2011 at 12:52 PM, J T wrote: > > Hi, > > > > I'm probably being blind .. but I can't see any way to dump the schema > > definition (and the data in it for that ma

Cassandra, CQL, Thrift Deprecation?? and Erlang

2011-09-02 Thread J T
Hi, I'm a fan of erlang, and have been using successive cassandra versions via the erlang thrift interface for a couple of years now. I see that cassandra seems to be moving to using CQL instead and so I was wondering if that means the thrift api will be deprecated and if so is there any effort u

Re: SSTableSimpleUnsortedWriter take long time when inserting big rows

2011-09-02 Thread Benoit Perroud
Thanks for your answer. 2011/9/2 Sylvain Lebresne : > On Fri, Sep 2, 2011 at 10:29 AM, Benoit Perroud wrote: >> Hi All, >> >> I started using SSTableSimpleUnsortedWriter to load data, and my data >> has a few rows but a lot of column name in each rows. >> >> I call SSTableSimpleUnsortedWriter.new

Re: SSTableSimpleUnsortedWriter take long time when inserting big rows

2011-09-02 Thread Sylvain Lebresne
On Fri, Sep 2, 2011 at 10:29 AM, Benoit Perroud wrote: > Hi All, > > I started using SSTableSimpleUnsortedWriter to load data, and my data > has a few rows but a lot of column name in each rows. > > I call SSTableSimpleUnsortedWriter.newRow every 10'000 columns inserted. > > But the time taken to

SSTableSimpleUnsortedWriter take long time when inserting big rows

2011-09-02 Thread Benoit Perroud
Hi All, I started using SSTableSimpleUnsortedWriter to load data, and my data has a few rows but a lot of column names in each row. I call SSTableSimpleUnsortedWriter.newRow every 10'000 columns inserted. But the time taken to insert columns is increasing as the column family is increasing. The

Re: Replicate On Write behavior

2011-09-02 Thread David Hawthorne
That's interesting. I did an experiment wherein I added some entropy to the row name based on the time when the increment came in, (e.g. row = row + "/" + (timestamp - (timestamp % 300))) and now not only is the load (in GB) on my cluster more balanced, the performance has not decayed and has s
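The row-name bucketing described above, row + "/" + (timestamp - (timestamp % 300)), can be sketched like this. It assumes the timestamp is in whole seconds, so that 300 gives 5-minute buckets; the function name and the sample row name are hypothetical.

```python
def bucketed_row_key(row: str, timestamp: int, bucket_seconds: int = 300) -> str:
    """Append the start of the time bucket to the row name so that
    increments for one logical row are spread over several physical rows."""
    bucket_start = timestamp - (timestamp % bucket_seconds)
    return f"{row}/{bucket_start}"


# Two increments in the same 5-minute window share a row key;
# one in the next window gets a new key.
print(bucketed_row_key("pageviews", 1315001665))
print(bucketed_row_key("pageviews", 1315001699))
print(bucketed_row_key("pageviews", 1315001700))
```

Reading a logical row back then means reading every bucket in the time range of interest and summing the counters, which is the trade-off for the more even load.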

Re: Removal of old data files

2011-09-02 Thread Sylvain Lebresne
On Fri, Sep 2, 2011 at 12:11 AM, wrote: > Yes, I see files with names like >     Orders-g-6517-Compacted > > However, all of those files have a size of 0. > > Starting from Monday to Thursday we have 5642 files for -Data.db, > -Filter.db and Statistics.db and only 128 -Compacted files. > and all o

Re: RF=1 w/ hadoop jobs

2011-09-02 Thread Patrik Modesto
On Fri, Sep 2, 2011 at 08:54, Mick Semb Wever wrote: > Patrik: is it possible to describe the use-case you have here? Sure. We use Cassandra as storage for web pages: we store the HTML, all URLs that have the same HTML data, and some computed data. We run Hadoop MR jobs to compute lexical and th

Re: Replicate On Write behavior

2011-09-02 Thread Sylvain Lebresne
On Thu, Sep 1, 2011 at 8:52 PM, David Hawthorne wrote: > I'm curious... digging through the source, it looks like replicate on write > triggers a read of the entire row, and not just the columns/supercolumns that > are affected by the counter update.  Is this the case?  It would certainly > exp