On Thu, Aug 13, 2009 at 5:12 PM, Arin Sarkissian<a...@rspot.net> wrote:
> FWIW: I find that the only sane way to visually represent a data model
> is to use a JSON-ish notation.
> Picture type visualizations confuse me even more.
>
> I don't mean to be a downer but me and a lot of my peers found all the
> picture type visual aids even more confusing

I agree, it's generally easier and pretty much everyone understands
JSON-ish notation (though I find Ruby's => notation for hashes is easier
to follow ;))

Having said that, Evan's pictures were really useful:

http://blog.evanweaver.com/files/cassandra/twitter_small.jpg
http://blog.evanweaver.com/files/cassandra/twitter.jpg

> -arin
> aka: phatduckk
>
> On Wed, Aug 12, 2009 at 8:35 PM, Jonathan Ellis<jbel...@gmail.com> wrote:
>> Thanks for taking a stab at this, Mark.
>>
>> I'm not a fan of teaching this by showing CF-spanning rows. (The
>> bigtable paper does this IIRC but it's wrong. :)
>>
>> You can have data in different CFs with the same key, yes, but all
>> that means is they will be stored on the same nodes. Each CF is
>> stored separately on disk and queried separately and the common case
>> is that they _won't_ have keys in common, rather than the reverse.
>>
>> -Jonathan
>>
>> On Wed, Aug 12, 2009 at 10:24 PM, Mark McBride<mark.mcbr...@gmail.com> wrote:
>>> Is this clearer? I had the key names set up as <type>:<id> just to
>>> keep it simple and put everything in one keyspace. Ditto the super
>>> column, although I guess that could be spread out into three things,
>>> or you could spread it out into three keyspaces. Not sure what best
>>> practices there are.
>>>
>>> What I'd like to do (and I'll get started on this tonight) is start
>>> with a problem statement, and then go about building up a
>>> storage-conf.xml file with this structure, showing API examples along
>>> the way. So while this is a final picture, there would be simpler
>>> ones up front.
>>>
>>> ---Mark
>>>
>>> On Wed, Aug 12, 2009 at 5:35 PM, Ryan King<r...@twitter.com> wrote:
>>>> A few quick comments:
>>>>
>>>> * it's not clear what column family the super column you're using is
>>>> in.
>>>> * it might be useful to include the timestamps in the columns (since
>>>> they're user-supplied)
>>>> * given that the colon-delimited api has been removed, it might be
>>>> easier to explain the data model without such strings
>>>> * why would you mix different kinds of data in the same column family,
>>>> rather than having separate column families for each? (users,
>>>> bookmarks, tags)
>>>>
>>>> -ryan
>>>>
>>>> On Wed, Aug 12, 2009 at 4:57 PM, Mark McBride<mark.mcbr...@gmail.com>
>>>> wrote:
>>>>> While working on an updated data model wiki page I'm trying to put
>>>>> together a graphical representation of the data model. I threw this
>>>>> together based on Curt's goal of modeling delicious. The basic gist
>>>>> is descriptive data for tags, users, and bookmarks goes in the
>>>>> Description column family. The relationships between bookmarks, tags
>>>>> and users goes in the map supercolumn. I'm not sure this is how you
>>>>> would do it in production (I'm guessing at the very least you'd want
>>>>> separate supercolumns for bookmarks, tags and users), but it seems to
>>>>> be simple enough for a new user to digest, and covers all the bases of
>>>>> the data model (aside from ordering I guess). So two questions
>>>>>
>>>>> 1) did I get it right (I'm new to this as well)?
>>>>> 2) is this a useful representation?
>>>>>
>>>>> ---Mark
>>>>>
>>>>
>>>
>>
>

--
Cheers

Koz
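
For readers skimming the thread, here is a rough sketch of what the JSON-ish notation discussed above might look like for the delicious example, written as nested Python dicts. It follows Ryan's suggestions (separate column families for users and bookmarks, timestamps shown in the columns) and Jonathan's point that each CF is looked up on its own even when row keys coincide. Every name in it (`Users`, `Bookmarks`, `UserBookmarks`, `get_row`, `shared_keys`, the keys and values) is a made-up assumption for illustration, not anything taken from Mark's diagram or from the Cassandra API.

```python
import json

# Hypothetical layout, for illustration only:
#   column family -> row key -> column name -> (value, timestamp)
keyspace = {
    "Users": {
        "mark": {"name": ("Mark", 1250121600)},
    },
    "Bookmarks": {
        "b1": {"url": ("http://example.com/", 1250121601)},
    },
    # Hypothetical super column family:
    #   row key -> super column -> column name -> (value, timestamp)
    "UserBookmarks": {
        "mark": {
            "b1": {"tags": ("cassandra,nosql", 1250121602)},
        },
    },
}


def get_row(column_family, key):
    """Look up one row in one column family; each CF is queried on its own."""
    return keyspace[column_family].get(key, {})


def shared_keys(cf_a, cf_b):
    """Row keys present in both column families (not the common case)."""
    return sorted(set(keyspace[cf_a]) & set(keyspace[cf_b]))


if __name__ == "__main__":
    # Tuples are rendered as JSON arrays: [value, timestamp].
    print(json.dumps(keyspace, indent=2))
```

Here the key `mark` happens to appear in both `Users` and `UserBookmarks`, but the two rows are still fetched by separate `get_row` calls, which is the point about not drawing CF-spanning rows.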