Re: Incr/Decr Counters in Cassandra

2010-03-06 Thread simon.reavely

I am very new to Cassandra, so I certainly don't have any credibility in
commenting on the implementation.
(FYI: reading posts from you and other committers like Eric Evans is
certainly helping. Cheers.)

However, having hacked on top of DBMSs and custom in-memory solutions, I
think we can probably look at the following:
1. Obviously the first is availability. Thank you, CAP theorem...duh! I feel
like I need to write something deeper to maintain my credibility, but instead
I'll just concur with "devil in the details" and move on until I understand
Cassandra well enough to provide even a smidgen of help!
2. Build very specialized support for specific use cases, e.g. counters that
can be used in applications like mine, where we just want to perform an
action when a threshold is exceeded. In my case we can probably tolerate a
client being slightly out of date, i.e. we can handle stale reads for
decision making, but we also want to (eventually) capture every increment,
i.e. the operation has to be INCR(column, incr_value) rather than
"SET(column)=new_value". This is easier because increments commute, so they
can be replayed in any order.
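
To illustrate the point about replay order, here is a minimal Python sketch.
The function names (replay_increments, replay_sets), the sample deltas, and
the threshold value are hypothetical and only for illustration; this says
nothing about how Cassandra would implement counters.

```python
import itertools

def replay_increments(deltas, start=0):
    """Apply INCR-style operations; addition commutes, so order is irrelevant."""
    total = start
    for delta in deltas:
        total += delta
    return total

def replay_sets(values, start=0):
    """Apply SET-style operations; the last write wins, so order matters."""
    current = start
    for value in values:
        current = value
    return current

# Hypothetical increments arriving at a replica in arbitrary order.
deltas = [5, -2, 7, 1]
incr_results = {replay_increments(p) for p in itertools.permutations(deltas)}
set_results = {replay_sets(p) for p in itertools.permutations(deltas)}

print(incr_results)      # a single value: every ordering agrees
print(len(set_results))  # more than one: the outcome depends on ordering

THRESHOLD = 10  # hypothetical application threshold
if replay_increments(deltas) > THRESHOLD:
    print("threshold exceeded, perform the action")
```

A stale read simply sees a prefix of the increments, so the threshold check
may fire a little late, but no increment is ever lost to an overwrite.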

I noticed some similar comments and thoughts about common lock-free
approaches that you've probably already reviewed:
http://www.mail-archive.com/cassandra-user@incubator.apache.org/msg01102.html

My justification for wanting some of this is really:
1. I want to minimize the number of database systems I use from a single
application. Naturally, I really need a lot of what Cassandra offers today
to get the availability and "big data" support needed in my application so
maximizing its use is a good thing as long as we avoid the "square peg,
round hole".
2. As a rule of thumb, I am not a big fan of trying to solve the difficult
problems at the expense of providing useful features that are easier to
implement in a shorter time frame.

Is there a place on the Cassandra wiki where the proposals/thinking on these
issues have been captured?

Cheers,
Simon
p.s. I don't want to go off on a tangent, but out of interest, given the
original Dynamo article and comments in posts like this:
http://www.allthingsdistributed.com/2008/12/eventually_consistent.html, what
makes you think SimpleDB is not based on Dynamo?

On Sat, Mar 6, 2010 at 9:19 AM, Jonathan Ellis-3 [via
cassandra-u...@incubator.apache.org] <
ml-node+4686374-1732917648-463...@n2.nabble.com
> wrote:

> First, SimpleDB is probably not built on Dynamo.
>
> And the devil is in the details.  I haven't seen anyone propose a
> reasonable model for how Conditional Puts work (that is the tough
> one).
>
> On Sat, Mar 6, 2010 at 8:11 AM, simon.reavely <[hidden email]> wrote:
> >
> > Werner Vogels had a recent post around Amazon's support for primitives in
> > SimpleDB that can be used to build counters. Given the historical
> > influences from Amazon's Dynamo to Cassandra I would think a similar
> > approach might work well.
> > http://www.allthingsdistributed.com/2010/02/strong_consistency_simpledb.html
> >
> > BTW...I would be VERY interested in such support.


-
Simon Reavely
simon.reav...@gmail.com
-- 
View this message in context: 
http://n2.nabble.com/Incr-Decr-Counters-in-Cassandra-tp3948361p4688165.html
Sent from the cassandra-user@incubator.apache.org mailing list archive at 
Nabble.com.


Re: How to model hierarchical structure?

2010-03-06 Thread Jean-Denis Greze
This really depends on the operations you want to optimize for.  What's
important to you?  Aggregate queries?  Finding children/siblings/ancestors?
 Reorganizing the tree/hierarchy?

For Cassandra, you really need to spend time thinking about how you'll be
accessing things and design for that.

If it's a 2-3 level hierarchy, then straightforward approaches like what
Jeff suggested seem logical.

Otherwise, I'd say if you've got an arbitrary-level hierarchy, then you'll
have to think about how to efficiently adapt one of the usual suspects for
this stuff (adjacency lists, nested sets, materialized paths, etc.).  I, for
one, would be interested in knowing if anyone else has experience with this
kind of stuff in Cassandra.

http://stackoverflow.com/questions/192220/what-is-the-most-efficient-elegant-way-to-parse-a-flat-table-into-a-tree/192462#192462
and the like might be good places to start.
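
As one concrete reading of the "materialized paths" option above, here is a
minimal Python sketch. It assumes an ordered key space (a plain sorted list
stands in for it); the sample paths and the helper names descendants and
ancestors are hypothetical, and nothing here is a Cassandra API.

```python
import bisect

# Each node is stored under its full path; sorting keeps subtrees contiguous.
paths = sorted([
    "/electronics",
    "/electronics/audio",
    "/electronics/audio/headphones",
    "/electronics/video",
    "/garden",
])

def descendants(sorted_paths, parent):
    """Return every path below `parent` with one contiguous range scan."""
    prefix = parent.rstrip("/") + "/"
    start = bisect.bisect_left(sorted_paths, prefix)
    result = []
    for path in sorted_paths[start:]:
        if not path.startswith(prefix):
            break  # left the subtree; keys with this prefix are contiguous
        result.append(path)
    return result

def ancestors(path):
    """Derive ancestor paths from the key itself, with no lookup needed."""
    parts = path.strip("/").split("/")
    return ["/" + "/".join(parts[:i]) for i in range(1, len(parts))]

print(descendants(paths, "/electronics"))
print(ancestors("/electronics/audio/headphones"))
```

The trade-off is the usual one for this layout: subtree reads and ancestor
lookups are cheap, while moving a subtree means rewriting every key under it.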

On Sat, Mar 6, 2010 at 2:13 AM, Jeff Zhang  wrote:

> Use the parent as the column family and the child as a column under that
> column family if this is two-level.
> You can use a super column if there are more than two levels.
>
> On Sat, Mar 6, 2010 at 1:31 AM, HubertChang  wrote:
>
>>
>> For example, tags: many parents to many children.
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>



-- 
jeande...@6coders.com
(917) 951-0636



Re: ColumnFamilies vs composite rows in one table.

2010-03-06 Thread Erik Holstad
Thanks David and Jonathan!

@David
Yes, rows don't have names. I'm just using the word "name" for anything,
like cluster name, table name, row name, etc. That is my bad.

Yes, I did change two things. That was probably stupid, but the reason for
the second change is space efficiency.

You are totally right that I'm choosing between scalability and performance
with the different structures. What I really want to do is just store
indices in rows with a composite key and do range queries. Jonathan has
firmly steered me away from this approach for now for performance reasons.
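
For readers unfamiliar with the idea, a rough Python sketch of "composite
key plus range query" follows. It assumes keys are kept in sorted order; the
tuple layout (index_name, indexed_value, row_key), the sample data, and the
range_query helper are all hypothetical, not anything from Cassandra itself.

```python
import bisect

# Hypothetical index entries: (index_name, indexed_value, row_key).
entries = sorted([
    ("age", 25, "user:3"),
    ("age", 31, "user:1"),
    ("age", 38, "user:7"),
    ("age", 44, "user:2"),
    ("city", 10, "user:1"),
])

def range_query(sorted_entries, index_name, lo, hi):
    """Return row keys whose indexed value v satisfies lo <= v < hi."""
    # A shorter tuple sorts before any longer tuple sharing its prefix,
    # so (index_name, lo) marks the first candidate entry.
    start = bisect.bisect_left(sorted_entries, (index_name, lo))
    stop = bisect.bisect_left(sorted_entries, (index_name, hi))
    return [row_key for _, _, row_key in sorted_entries[start:stop]]

print(range_query(entries, "age", 30, 40))
```

The scan touches only the contiguous slice of keys between the two bounds,
which is why this layout depends on an order-preserving key arrangement.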

Thanks a lot!
Erik


Re: Cassandra hardware - balancing CPU/memory/iops/disk space

2010-03-06 Thread Krishna Sankar
Eric,
Couple of thoughts:
1. Hardware
   · Definitely dual quad core
   · 12 X 4 DIMMs. This is the sweet spot for memory. I have many
     machines with this config and some with the 12 X 2 configs
   · I haven't found the need for SATA and the higher price
   · Make sure you get good NICs
   · Are you using any virtualization layer? I assume these are bare
     metal with Ubuntu or RedHat.
2. Scaling
   · Naturally you should look at horizontal scaling rather than vertical.
   · An estimate of the application characteristics and data
     properties would be helpful to get a first estimate
   · I think eventually you will end up with multiple boxes anyway,
     so my philosophy has been to buy multiple optimal boxes
   · We are working on scaling characteristics (memory, network and
     storage); unfortunately it is way too early to make any inferences
HTH.
Cheers

On Fri, Mar 5, 2010, "Rosenberry, Eric" wrote:

> I am looking for advice from others that are further along in deploying
> Cassandra in production environments than we are.  I want to know what you are
> finding your bottlenecks to be.  I would feel silly purchasing dual processor
> quad core 2.93 GHz Nehalem machines with 192 gigs of RAM just to find out that
> the two local SATA disks kept all that CPU and RAM from being useful (clearly
> that example would be dumb).
>
> I need to spec out hardware for an "optimal" Cassandra node (though our
> read/write characteristics are not yet fully defined so let's go with an
> "average" configuration).
>
> My main concern is finding the right balance of:
> · Available CPU
> · RAM amount
> · RAM speed (think Nehalem architecture where memory comes in a few
>   speeds, though I doubt this is much of a concern as it is mainly dictated by
>   which processor you buy and how many slots you populate)
> · Total iops available (i.e. number of disks)
> · Total disk space available (depending on the ratio of iops/space
>   deciding on SAS vs. SATA and various rotational speeds)
>  
> My current thinking is 1U boxes with four 3.5 inch disks since that seems to
> be a readily available config.  One big question is should I go with a single
> processor Nehalem system to go with those four disks, or would two CPUs be
> useful, and also, how much RAM is appropriate to match?  I am making the
> assumption that Cassandra nodes are going to be disk bound as they must do a
> random read to answer any given query (i.e. indexes in RAM, but all data lives
> on disk?).
>
> The other big decision is what type of hard disks others are finding to
> provide the optimal ratio of iops to available space?  SAS or SATA?  And what
> rotational speed?
>  
> Let me throw out here an actual hardware config and feel free to tell me the
> error of my ways:
> · A SuperMicro SuperServer 6016T-NTRF configured as follows:
>   o  2.26 GHz E5520 dual processor quad core hyperthreaded Nehalem architecture
>      (this proc provides a lot of bang for the buck, faster procs get more
>      expensive quickly)
>   o  Qty 12, 4 gig 1066 MHz DIMMs for a total of 48 gigs RAM (the 4 gig DIMMs
>      seem to be the price sweet spot)
>   o  Dual on-board 1 gigabit NICs (perhaps one for client connections and the
>      other for cluster communication?)
>   o  Dual power supplies (I don't want to lose half my cluster due to a failure
>      on one power leg)
>   o  4x 1TB SATA disks (this is a complete SWAG)
>   o  No RAID controller (all just single individual disks presented to the OS),
>      though is there any downside to using a RAID controller with RAID 0 (perhaps
>      one single disk for the log for sequential I/Os, and 3x disks in a stripe for
>      the random I/Os)?
>   o  The on-board IPMI based OOB controller (so we can kick the boxes remotely
>      if need be)
> · http://www.supermicro.com/products/system/1U/6016/SYS-6016T-NTRF.cfm
>
> I can't help but think the above config has way too much RAM and CPU and not
> enough iops capacity.  My understanding is that Cassandra does not cache much
> in RAM though?
>  
> Any thoughts are appreciated.  Thanks.
>  
> -Eric
> ___
> Eric Rosenberry
> Sr. Infrastructure Architect | Chief Bit Plumber
>  
>  
> iovation
> 111 SW Fifth Avenue
> Suite 3200
> Portland, OR 97204
> www.iovation.com 



Re: Unreliable transport layer

2010-03-06 Thread Avinash Lakshman
We have observed this, but in practice it doesn't cause any deleterious
effects. IMHO, false detection of node failures is the most dangerous thing
that could result from this kind of behavior, which is why we have an
Accrual FD that reacts and adjusts to these conditions. Having said
that, moving to TCP is not a bad option at all at relatively small scale.
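
For context on how an accrual failure detector can adjust to lossy
transport, here is a simplified Python sketch. It assumes heartbeat
inter-arrival times are roughly exponentially distributed, which is a
modeling assumption made for brevity and not a statement about Cassandra's
actual code; the class and method names are hypothetical.

```python
import math
from collections import deque

class AccrualFailureDetector:
    """Outputs a suspicion level (phi) instead of a binary up/down verdict."""

    def __init__(self, window_size=100):
        # Sliding window of observed gaps between heartbeats.
        self.intervals = deque(maxlen=window_size)
        self.last_heartbeat = None

    def heartbeat(self, now):
        """Record a heartbeat arrival at time `now` (seconds)."""
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now):
        """phi = -log10(P(no heartbeat for this long)) under the model."""
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        elapsed = now - self.last_heartbeat
        # Exponential model: P(gap > t) = exp(-t / mean).
        return elapsed / (mean * math.log(10))

fd = AccrualFailureDetector()
for t in (0.0, 1.0, 2.0, 3.0):
    fd.heartbeat(t)
print(fd.phi(3.5), fd.phi(13.0))  # suspicion grows with silence
```

When packets are dropped, the observed gaps get longer, the mean rises, and
the same silence produces a lower phi, so a fixed phi threshold becomes more
tolerant on a flaky network instead of declaring healthy nodes dead.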

Cheers
Avinash

On Fri, Mar 5, 2010 at 4:54 PM, Ashwin Jayaprakash <
ashwin.jayaprak...@gmail.com> wrote:

> Hey guys! I have a simple question. I'm a casual observer, not a real
> Cassandra user yet. So, excuse my ignorance.
>
> I see that the Gossip feature uses UDP. I was curious to know if you guys
> faced issues with unreliable transports in your production clusters? Like
> faulty switches, dropped packets etc during heavy network loads?
>
> If I'm not mistaken, are all client reads/writes done point-to-point over
> TCP?
>
> Thanks,
> Ashwin.


Re: Cassandra hardware - balancing CPU/memory/iops/disk space

2010-03-06 Thread Jonathan Ellis
I think http://wiki.apache.org/cassandra/CassandraHardware answers
most of your questions.

If possible, it's definitely useful to try out a small fraction of
your anticipated workload against a test cluster, even a single node,
before finalizing your production hardware purchase.



Re: Incr/Decr Counters in Cassandra

2010-03-06 Thread Jonathan Ellis
First, SimpleDB is probably not built on Dynamo.

And the devil is in the details.  I haven't seen anyone propose a
reasonable model for how Conditional Puts work (that is the tough
one).

On Sat, Mar 6, 2010 at 8:11 AM, simon.reavely  wrote:
>
> Werner Vogels had a recent post around Amazon's support for primitives in
> SimpleDB that can be used to build counters. Given the historical influences
> from Amazon's Dynamo to Cassandra I would think a similar approach might
> work well.
> http://www.allthingsdistributed.com/2010/02/strong_consistency_simpledb.html
>
> BTW...I would be VERY interested in such support.


Re: Incr/Decr Counters in Cassandra

2010-03-06 Thread simon.reavely

Werner Vogels had a recent post around Amazon's support for primitives in
SimpleDB that can be used to build counters. Given the historical influences
from Amazon's Dynamo to Cassandra I would think a similar approach might
work well.
http://www.allthingsdistributed.com/2010/02/strong_consistency_simpledb.html

BTW...I would be VERY interested in such support.
-- 
View this message in context: 
http://n2.nabble.com/Incr-Decr-Counters-in-Cassandra-tp3948361p4686353.html
Sent from the cassandra-user@incubator.apache.org mailing list archive at 
Nabble.com.


Re: How to model hierarchical structure?

2010-03-06 Thread Jeff Zhang
Use the parent as the column family and the child as a column under that
column family if this is two-level.
You can use a super column if there are more than two levels.




On Sat, Mar 6, 2010 at 1:31 AM, HubertChang  wrote:

>
> For example, tags: many parents to many children.



-- 
Best Regards

Jeff Zhang


Re: How to model hierarchical structure?

2010-03-06 Thread HubertChang

For example, tags: many parents to many children.
-- 
View this message in context: 
http://n2.nabble.com/How-to-model-hierarchical-structure-tp4685633p4685649.html
Sent from the cassandra-user@incubator.apache.org mailing list archive at 
Nabble.com.


How to model hierarchical structure?

2010-03-06 Thread Hubert Chang
For things like categories, taxonomies, or folders/files, there will be a
multi-level hierarchical relationship.
How should I model it in Cassandra?
Should I serialize all the parent ids and the item id together as the key?
How should I model it when one child has many parents?