Re: How to model hierarchical structure?

2010-03-06 Thread Jean-Denis Greze
This really depends on the operations you want to optimize for.  What's
important to you?  Aggregate queries?  Finding children/siblings/ancestors?
Reorganizing the tree/hierarchy?

For Cassandra, you really need to spend time thinking about how you'll be
accessing things and design for that.

If it's a 2-3 level hierarchy, then straightforward approaches like what
Jeff suggested seem logical.

Otherwise, I'd say if you've got an arbitrary-level hierarchy, then you'll
have to think about how to efficiently adapt one of the usual suspects for
this stuff (adjacency lists, nested sets, materialized paths, etc.).  I, for
one, would be interested in knowing if anyone else has experience with this
kind of stuff in Cassandra.

http://stackoverflow.com/questions/192220/what-is-the-most-efficient-elegant-way-to-parse-a-flat-table-into-a-tree/192462#192462


and the like might be good places to start.
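
To make one of those options concrete, here is a minimal sketch of
materialized paths in plain Python. It is not Cassandra code: the sorted list
only stands in for any store that keeps names in sorted order and can scan a
range of them, and PathIndex and its methods are hypothetical names made up
for this illustration.

```python
from bisect import bisect_left


class PathIndex:
    """Toy materialized-path index over a sorted list of path strings."""

    def __init__(self):
        # Kept sorted; stands in for an ordered set of names in a real store.
        self._paths = []

    def add(self, path):
        """Insert a path such as "/a/b", keeping the list sorted and unique."""
        i = bisect_left(self._paths, path)
        if i == len(self._paths) or self._paths[i] != path:
            self._paths.insert(i, path)

    def descendants(self, path):
        """All paths below `path`, found with a single prefix range scan."""
        prefix = path.rstrip("/") + "/"
        i = bisect_left(self._paths, prefix)
        found = []
        while i < len(self._paths) and self._paths[i].startswith(prefix):
            found.append(self._paths[i])
            i += 1
        return found

    def children(self, path):
        """Direct children only: descendants with no further separator."""
        prefix = path.rstrip("/") + "/"
        return [p for p in self.descendants(path) if "/" not in p[len(prefix):]]

    @staticmethod
    def ancestors(path):
        """Ancestors are computed from the path itself, with no lookup."""
        parts = path.strip("/").split("/")
        return ["/" + "/".join(parts[:n]) for n in range(1, len(parts))]


index = PathIndex()
for p in ["/a", "/a/b", "/a/b/c", "/a/d", "/ab"]:
    index.add(p)
```

The trade-off is the usual one for this layout: descendant and ancestor
lookups are cheap, but moving a subtree means rewriting every path under it.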

On Sat, Mar 6, 2010 at 2:13 AM, Jeff Zhang  wrote:

> use the parent as column family and the child as the column under the
> column family if this is two-level.
> And you can use the super-column if there are more than two-levels
>
>
>
>
>
> On Sat, Mar 6, 2010 at 1:31 AM, HubertChang  wrote:
>
>>
>> For examples, like tags, many parents to many children.
>> --
>> View this message in context:
>> http://n2.nabble.com/How-to-model-hierarchical-structure-tp4685633p4685649.html
>> Sent from the cassandra-user@incubator.apache.org mailing list archive at
>> Nabble.com.
>>
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>



-- 
jeande...@6coders.com
(917) 951-0636

This email and any files transmitted with it are confidential and intended
solely for the use of the individual to whom they are addressed. If you have
received this email in error please notify the system manager. This message
contains confidential information and is intended only for the individual
named. If you are not the named addressee you should not disseminate,
distribute or copy this e-mail. Please notify the sender immediately by
e-mail if you have received this e-mail by mistake and delete this e-mail
from your system. If you are not the intended recipient you are notified
that disclosing, copying, distributing or taking any action in reliance on
the contents of this information is strictly prohibited.


Re: How to retrieve keys from Cassandra ?

2010-02-02 Thread Jean-Denis Greze
Ok, so 0.6's https://issues.apache.org/jira/browse/CASSANDRA-745 permits
"someone using RandomPartitioner to pass start="" and finish="" to get all
of the rows in their cluster, although in an extremely inefficient way."

We are in a situation like Pierre's, where we need to know what's currently
in the DB so to speak -- except that we have hundreds of millions of rows
(and increasing) and that maintaining an index of the keys in another CF, as
Brandon suggests, is becoming difficult (we also don't like the double write
on initial key inserts, in terms of transactionality especially).

Also, every once in a while, we need to enhance our data as part of some
functionality upgrade or refactoring.  So far, we enhance on reads (i.e.,
whenever we read a particular record, we check whether it is up to the
latest version and, if not, enhance it), but there are many problems with
this approach. We've been considering doing background process enhancing by
running through all of the keys, which is why 745 is pretty exciting.  We'd
rather go through the inefficient operation once in a while as opposed to
doing a check on every read.
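
For anyone who wants the two approaches side by side, here is a rough sketch
in Python. Everything in it is made up for illustration: the dict stands in
for the datastore, the "version" field and the v1 -> v2 change (splitting a
name) are hypothetical, and it says nothing about how we actually store
records.

```python
CURRENT_VERSION = 2


def upgrade(record):
    """Return a copy of `record` brought up to CURRENT_VERSION."""
    rec = dict(record)
    if rec.get("version", 1) < 2:
        # Hypothetical v1 -> v2 change: split a full name into two fields.
        first, _, last = rec.pop("name", "").partition(" ")
        rec["first_name"], rec["last_name"] = first, last
    rec["version"] = CURRENT_VERSION
    return rec


def read(store, key):
    """Enhance-on-read: every read pays for a version check."""
    rec = store[key]
    if rec.get("version", 1) < CURRENT_VERSION:
        rec = upgrade(rec)
        store[key] = rec  # write the upgraded record back
    return rec


def upgrade_all(store, keys):
    """Background pass: needs a way to enumerate keys, but reads stay simple."""
    changed = 0
    for key in keys:
        if store[key].get("version", 1) < CURRENT_VERSION:
            store[key] = upgrade(store[key])
            changed += 1
    return changed


store = {
    "u1": {"name": "Ada Lovelace"},
    "u2": {"version": 2, "first_name": "Alan", "last_name": "Turing"},
    "u3": {"name": "Grace Hopper"},
}
```

The background version only works if something can hand it the full list of
keys, which is exactly the part that is missing today.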

Anyway, partially to address the efficiency concern, I've been playing
around with the idea of having 745-like functionality on a per-node basis: a
call to get all of the keys on a particular node as opposed to the entire
cluster.  It just seems like with a very large cluster with billions, tens
of billions, or hundreds of billions of keys 745 would just get overwhelmed.
 Just a thought.
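
To show the kind of loop I have in mind for walking all of the keys, here is
a small sketch in Python. It does not use the real client API: FakeCluster
and its get_range(start, count) method are hypothetical stand-ins, the MD5
ordering only imitates a hash-based partitioner, and the inclusive start key
is an assumption of the sketch.

```python
import hashlib


def token(key):
    # Stand-in for a hash partitioner: rows are ordered by MD5 of the key.
    return hashlib.md5(key.encode("utf-8")).hexdigest()


class FakeCluster:
    """In-memory stand-in for a store that returns keys in token order."""

    def __init__(self, keys):
        self._keys = sorted(keys, key=token)

    def get_range(self, start, count):
        """Up to `count` keys from `start` (inclusive); "" means the beginning."""
        i = 0 if start == "" else self._keys.index(start)
        return self._keys[i:i + count]


def iter_all_keys(cluster, page_size=100):
    """Page through every key, restarting each page at the last key seen."""
    if page_size < 2:
        raise ValueError("page_size must be at least 2")
    start = ""
    while True:
        page = cluster.get_range(start, page_size)
        if start != "":
            page = page[1:]  # the first key repeats the previous page's last key
        if not page:
            return
        yield from page
        start = page[-1]


keys = ["row%d" % i for i in range(25)]
seen = list(iter_all_keys(FakeCluster(keys), page_size=4))
```

A per-node variant would run the same loop against each node's own range
instead of the whole ring, so the passes could run in parallel.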







On Tue, Feb 2, 2010 at 7:31 AM, Jonathan Ellis  wrote:
>
> More or less (but see
> https://issues.apache.org/jira/browse/CASSANDRA-745, in 0.6).
>
> Think of it this way: when you have a few billion keys, how useful is
> it to list them?
>
> -Jonathan
>
> 2010/2/2 Sébastien Pierre :
> > Hi all,
> > I would like to know how to retrieve the list of available keys
> > available
> > for a specific column. There is the get_key_range method, but it is only
> > available when using the OrderPreservingPartitioner -- I use a
> > RandomPartitioner.
> > Does this mean that when using a RandomPartitioner, you cannot see which
> > keys are available in the database ?
> >  -- Sébastien



--
jeande...@6coders.com
(917) 951-0636



Re: [VOTE] Graduation

2010-01-25 Thread Jean-Denis Greze
+1

On Mon, Jan 25, 2010 at 1:11 PM, Eric Evans  wrote:

>
> There was some additional discussion[1] concerning Cassandra's
> graduation on the incubator list, and as a result we've altered the
> initial resolution to expand the size of the PMC by three to include our
> active mentors (new draft attached).
>
> I propose a vote for Cassandra's graduation to a top-level project.
>
> We'll leave this open for 72 hours, and assuming it passes, we can then
> take it to a vote with the Incubator PMC.
>
> +1 from me!
>
>
> [1] http://thread.gmane.org/gmane.comp.apache.incubator.general/24427
>
> --
> Eric Evans
> eev...@rackspace.com
>



-- 
jeande...@6coders.com
(917) 951-0636
