Re: How to model hierarchical structure?
This really depends on the operations you want to optimize for. What's important to you? Aggregate queries? Finding children/siblings/ancestors? Reorganizing the tree/hierarchy? For Cassandra, you really need to spend time thinking about how you'll be accessing things and design for that.

If it's a 2-3 level hierarchy, then straightforward approaches like what Jeff suggested seem logical. Otherwise, if you've got an arbitrary-level hierarchy, you'll have to think about how to efficiently adapt one of the usual suspects for this stuff (adjacency lists, nested sets, materialized paths, etc.).

I, for one, would be interested in knowing if anyone else has experience with this kind of stuff in Cassandra.

http://stackoverflow.com/questions/192220/what-is-the-most-efficient-elegant-way-to-parse-a-flat-table-into-a-tree/192462#192462 and the like might be good places to start.

On Sat, Mar 6, 2010 at 2:13 AM, Jeff Zhang wrote:
> use the parent as column family and the child as the column under the
> column family if this is two-level.
> And you can use the super-column if there are more than two-levels
>
> On Sat, Mar 6, 2010 at 1:31 AM, HubertChang wrote:
>
>> For examples, like tags, many parents to many children.
>> --
>> View this message in context:
>> http://n2.nabble.com/How-to-model-hierarchical-structure-tp4685633p4685649.html
>> Sent from the cassandra-user@incubator.apache.org mailing list archive at
>> Nabble.com.
>
> --
> Best Regards
>
> Jeff Zhang

--
jeande...@6coders.com
(917) 951-0636

This email and any files transmitted with it are confidential and intended solely for the use of the individual to whom they are addressed. If you have received this email in error please notify the system manager. This message contains confidential information and is intended only for the individual named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. If you are not the intended recipient you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited.
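To make the "materialized paths" option above a bit more concrete, here is a minimal in-memory sketch in Python. It is not Cassandra code: the PathIndex class and its method names are made up for illustration, and it assumes node names never contain "/". The idea is only that if every node is stored under its full path and the paths are kept sorted, then all descendants of a node sit in one contiguous range, which is the kind of access pattern an ordered key or column layout can serve. A child with several parents (the tags case) simply gets one path per parent.

```python
import bisect


class PathIndex:
    """Toy materialized-path index: one sorted list of '/'-joined paths.

    Hypothetical helper for illustration; assumes names contain no '/'.
    """

    def __init__(self):
        self._paths = []  # kept sorted so a prefix maps to a contiguous slice

    def add(self, *parts):
        """Store the node reached by following parts from the root."""
        path = "/".join(parts)
        i = bisect.bisect_left(self._paths, path)
        if i == len(self._paths) or self._paths[i] != path:
            self._paths.insert(i, path)
        return path

    def descendants(self, path):
        """All nodes below path, found with a single range scan."""
        lo = bisect.bisect_left(self._paths, path + "/")
        # '0' is the character right after '/', so path + '0' is an
        # exclusive upper bound for everything starting with path + '/'.
        hi = bisect.bisect_left(self._paths, path + "0")
        return self._paths[lo:hi]

    def children(self, path):
        """Direct children only: descendants exactly one level deeper."""
        depth = path.count("/") + 1
        return [p for p in self.descendants(path) if p.count("/") == depth]

    @staticmethod
    def ancestors(path):
        """Ancestors come from the path itself, with no lookup at all."""
        parts = path.split("/")
        return ["/".join(parts[:i]) for i in range(1, len(parts))]


idx = PathIndex()
idx.add("music")
idx.add("music", "jazz")
idx.add("music", "jazz", "bebop")
idx.add("music", "rock")
idx.add("mood", "jazz")  # same tag under a second parent
print(idx.descendants("music"))
print(idx.ancestors("music/jazz/bebop"))
```

The trade-off is the usual one for materialized paths: reads of subtrees and ancestors are cheap, but moving a subtree means rewriting every path under it, so it fits hierarchies that are read far more often than they are reorganized.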
Re: How to retrieve keys from Cassandra ?
Ok, so 0.6's https://issues.apache.org/jira/browse/CASSANDRA-745 permits "someone using RandomPartitioner to pass start="" and finish="" to get all of the rows in their cluster, although in an extremely inefficient way."

We are in a situation like Pierre's, where we need to know what's currently in the DB, so to speak. The difference is that we have hundreds of millions of rows (and increasing), and maintaining an index of the keys in another CF, as Brandon suggests, is becoming difficult. We also don't like the double write on initial key inserts, especially in terms of transactionality.

Also, every once in a while, we need to enhance our data as part of some functionality upgrade or refactoring. So far, what we do is enhance on reads (i.e., whenever we read a particular record, we check whether it's at the latest version, and if it isn't, we enhance it), but there are many problems with this approach. We've been considering enhancing in a background process that runs through all of the keys, which is why 745 is pretty exciting. We'd rather go through the inefficient operation once in a while than do a check on every read.

Anyway, partially to address the efficiency concern, I've been playing around with the idea of having 745-like functionality on a per-node basis: a call to get all of the keys on a particular node as opposed to the entire cluster. It just seems that on a very large cluster with billions, tens of billions, or hundreds of billions of keys, 745 would get overwhelmed. Just a thought.

On Tue, Feb 2, 2010 at 7:31 AM, Jonathan Ellis wrote:
>
> More or less (but see
> https://issues.apache.org/jira/browse/CASSANDRA-745, in 0.6).
>
> Think of it this way: when you have a few billion keys, how useful is
> it to list them?
>
> -Jonathan
>
> 2010/2/2 Sébastien Pierre :
> > Hi all,
> > I would like to know how to retrieve the list of available keys available
> > for a specific column. There is the get_key_range method, but it is only
> > available when using the OrderPreservingPartitioner -- I use a
> > RandomPartitioner.
> > Does this mean that when using a RandomPartitioner, you cannot see which
> > keys are available in the database ?
> > -- Sébastien

--
jeande...@6coders.com
(917) 951-0636
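For anyone weighing the two approaches described above, here is a rough Python sketch of enhance-on-read versus a background pass done one node at a time. Plain dicts stand in for the nodes, and every name in it (CURRENT_VERSION, upgrade, read_with_upgrade, sweep_node, sweep_cluster) is hypothetical, as is the v1 to v2 change itself. Nothing here is a Cassandra API; in particular, the sorted(node) call is only a stand-in for whatever per-node key listing might exist.

```python
CURRENT_VERSION = 2  # hypothetical schema version for this sketch


def upgrade(record):
    """Return a copy of record brought up to CURRENT_VERSION."""
    upgraded = dict(record)
    if upgraded.get("version", 1) < 2:
        # Made-up v1 -> v2 change: split "name" into first/last.
        first, _, last = upgraded.pop("name", "").partition(" ")
        upgraded.update(first=first, last=last, version=2)
    return upgraded


def read_with_upgrade(node, key):
    """Enhance-on-read: every single read pays for a version check."""
    record = node[key]
    if record.get("version", 1) < CURRENT_VERSION:
        record = node[key] = upgrade(record)
    return record


def sweep_node(node, batch_size=1000):
    """Background pass over the keys held by one node, in batches."""
    keys = sorted(node)  # stand-in for a per-node key listing
    upgraded = 0
    for start in range(0, len(keys), batch_size):
        for key in keys[start:start + batch_size]:
            if node[key].get("version", 1) < CURRENT_VERSION:
                node[key] = upgrade(node[key])
                upgraded += 1
    return upgraded


def sweep_cluster(nodes, batch_size=1000):
    """Run the per-node pass on each node; returns records upgraded."""
    return sum(sweep_node(node, batch_size) for node in nodes)


node_a = {"k1": {"name": "Ada Lovelace"},
          "k2": {"first": "Alan", "last": "Turing", "version": 2}}
node_b = {"k3": {"name": "Grace Hopper", "version": 1}}
print(sweep_cluster([node_a, node_b]))
```

The point of sweeping per node is that each pass only touches the keys that node holds, so the work can be spread out or throttled node by node instead of being one cluster-wide listing. Once a sweep has completed, the version check in the read path could in principle be dropped until the next upgrade.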
Re: [VOTE] Graduation
+1

On Mon, Jan 25, 2010 at 1:11 PM, Eric Evans wrote:
>
> There was some additional discussion[1] concerning Cassandra's
> graduation on the incubator list, and as a result we've altered the
> initial resolution to expand the size of the PMC by three to include our
> active mentors (new draft attached).
>
> I propose a vote for Cassandra's graduation to a top-level project.
>
> We'll leave this open for 72 hours, and assuming it passes, we can then
> take it to a vote with the Incubator PMC.
>
> +1 from me!
>
> [1] http://thread.gmane.org/gmane.comp.apache.incubator.general/24427
>
> --
> Eric Evans
> eev...@rackspace.com

--
jeande...@6coders.com
(917) 951-0636