Dean,

Great point. I hadn't considered that either. Per my other email, do you think we would need a custom partitioner for this? (A mix of OrderPreservingPartitioner and RandomPartitioner, with OPP for the prefix.)
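To make the idea concrete, here is a minimal sketch of how such a hybrid token could be built. It is only an illustration, not a real Cassandra partitioner: the `:` delimiter, the function names, and the sample keys are all assumptions. The prefix is kept as-is so rows of one virtual CF stay contiguous on the ring, and the rest of the key is MD5-hashed so rows spread randomly within that prefix.

```python
import bisect
import hashlib

SEP = b":"  # assumed delimiter between the virtual-CF prefix and the row key


def hybrid_token(row_key):
    """Token = raw prefix (order preserved) + MD5 of the remainder (randomized)."""
    prefix, sep, rest = row_key.partition(SEP)
    if not sep:
        raise ValueError("row key has no virtual-CF prefix")
    return prefix + SEP + hashlib.md5(rest).digest()


def prefix_range(prefix):
    """Inclusive token bounds that cover every row of one virtual CF."""
    # An MD5 digest is always 16 bytes, so these are the lowest and highest
    # possible tokens that start with this prefix.
    return prefix + SEP + b"\x00" * 16, prefix + SEP + b"\xff" * 16


def rows_in_virtual_cf(row_keys, prefix):
    """Return only the rows of one virtual CF by slicing the sorted token ring."""
    pairs = sorted((hybrid_token(k), k) for k in row_keys)
    tokens = [t for t, _ in pairs]
    start, end = prefix_range(prefix)
    lo = bisect.bisect_left(tokens, start)
    hi = bisect.bisect_right(tokens, end)
    return [k for _, k in pairs[lo:hi]]


# Hypothetical sample keys: three virtual CFs stuffed into one physical CF.
keys = [b"cf1:alice", b"cf2:bob", b"cf1:carol", b"cf10:dave"]
cf1_rows = rows_in_virtual_cf(keys, b"cf1")
```

With tokens laid out this way, a map/reduce job over one virtual CF would only need the token range for that prefix instead of scanning every row. The trade-off is that a large virtual CF is no longer spread evenly across the whole ring.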
-brian

---
Brian O'Neill
Lead Architect, Software Development
Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 <http://www.twitter.com/boneill42> • healthmarketscience.com

On 10/2/12 8:35 AM, "Hiller, Dean" <dean.hil...@nrel.gov> wrote:

>So basically, with moving towards the 1000's of CF all being put in one
>CF, our performance is going to tank on map/reduce, correct? I mean, from
>what I remember we could do map/reduce on a single CF, but by stuffing
>1000's of virtual Cf's into one CF, our map/reduce will have to read in
>all 999 virtual CF's rows that we don't want just to map/reduce the ONE
>CF.
>
>Map/reduce VERY VERY SLOW when reading in 1000 times more rows :( :(.
>
>Is this correct? This really sounds like highly undesirable behavior.
>There needs to be a way for people with 1000's of CF's to also run
>map/reduce on any one CF. Doing Map/reduce on 1000 times the number of
>rows will be 1000 times slower... and of course, we will most likely get up
>to 20,000 tables from my most recent projections... our last test load, we
>ended up with 8k+ CF's. Since I kept two other keyspaces, cassandra
>started getting really REALLY slow when we got up to 15k+ CF's in the
>system though I didn't look into why.
>
>I don't mind having 1000's of virtual CF's in ONE CF, BUT I need to
>map/reduce "just" the virtual CF!!!!! Ugh.
>
>Thanks,
>Dean
>
>On 10/1/12 3:38 PM, "Ben Hood" <0x6e6...@gmail.com> wrote:
>
>>On Mon, Oct 1, 2012 at 9:38 PM, Brian O'Neill <b...@alumni.brown.edu>
>>wrote:
>>> It's just a convenient way of prefixing:
>>>
>>>http://hector-client.github.com/hector/build/html/content/virtual_keyspaces.html
>>
>>So given that it is possible to use a CF per tenant, should we assume
>>that at sufficient scale there is less overhead to prefix
>>keys than there is to manage multiple CFs?
>>
>>Ben
>