Date: Wednesday, October 10, 2012 3:37 AM
To: "user@cassandra.apache.org"
Subject: Re: 1000's of CF's.
The main problem is that this "sweet spot" is very narrow. We can't have lots
of CFs and we can't have long rows, so we end up with an enormous number of
huge composite row keys and stored metadata about those keys (keep in mind
the overhead of such a scheme, though it looks like nobody really cares
about it anymore).
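To make that overhead concrete, here is a minimal sketch of the composite-row-key scheme being described (the table name, separator, and id are hypothetical, not anything from this thread):

public final class CompositeKeyExample {
    // One wide CF holds many logical tables; the logical table name is
    // folded into every physical row key.
    static String physicalKey(String logicalTable, String id) {
        return logicalTable + ":" + id; // e.g. "user_events:42"
    }

    public static void main(String[] args) {
        String key = physicalKey("user_events", "42");
        // The "user_events:" prefix is written into every row key on disk
        // (and into the SSTable key index), which is the per-key metadata
        // overhead being complained about.
        int prefixBytes = "user_events:".length();
        System.out.println(key + " repeats " + prefixBytes + " prefix bytes per row");
    }
}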
I'm not a Cassandra dev, so take what I say with a lot of salt, but
AFAICT there is a certain amount of overhead in maintaining a CF, so
when you have large numbers of CFs, this adds up. From a layperson's
perspective, this observation sounds reasonable, since zero-cost CFs
would be tantamount to a free lunch.
So what should the solution be for a Cassandra architecture when we need to
run Hadoop M/R jobs and not be restricted by the number of CFs?
What we have now is a fair number of CFs (> 2K), and this number is slowly
growing, so we are already planning to merge partitioned CFs. But our next
goal is to run Hadoop tasks over them.
Okay, so it only took me two solid days, not a week. PlayOrm in the master branch
now supports virtual CF's, or virtual tables, in ONE CF, so you can have 1000's
or millions of virtual CF's in one CF now. It works with all the Scalable-SQL,
works with the joins, and works with the PlayOrm command line tool.
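For anyone curious what "virtual CF's in ONE CF" means mechanically, here is a hedged sketch of the general technique (a client-side prefix registry); this is an illustration only, not PlayOrm's actual API or storage format:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public final class VirtualCfRegistry {
    // Maps each logical table to a row-key prefix inside ONE physical CF,
    // so adding a virtual CF is a map entry, not a cluster schema change.
    private final Map<String, String> prefixByTable = new ConcurrentHashMap<>();

    public void register(String logicalTable) {
        prefixByTable.put(logicalTable, logicalTable + ":");
    }

    public String rowKey(String logicalTable, String id) {
        String prefix = prefixByTable.get(logicalTable);
        if (prefix == null) {
            throw new IllegalArgumentException("unknown virtual CF: " + logicalTable);
        }
        return prefix + id;
    }
}

That is why 1000's (or millions) of virtual CF's are cheap to create: the cluster only ever sees one physical CF.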
If we combine 1000 virtual CF's into one CF, our map/reduce will have to read in
all 999 virtual CF's rows that we don't want just to map/reduce the ONE CF.

Map/reduce is VERY VERY SLOW when reading in 1000 times more rows :( :(.

Is this correct? This really sounds like highly undesirable behavior.
There needs to be a way for people with 1000's of CF's to avoid this.
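To make the concern concrete, here is a hedged sketch of what a mapper ends up doing when 1000 virtual CF's share one physical CF (the "<table>:<id>" key layout, input value type, and counter name are assumptions for illustration, not PlayOrm's actual format):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class OneVirtualCfMapper extends Mapper<ByteBuffer, Text, Text, Text> {

    private static final String TARGET_PREFIX = "orders:"; // hypothetical virtual CF

    @Override
    protected void map(ByteBuffer key, Text row, Context ctx)
            throws IOException, InterruptedException {
        String rowKey = StandardCharsets.UTF_8.decode(key.duplicate()).toString();

        // Every row in the physical CF reaches this point, so the job pays
        // full-scan cost even though ~999 of every 1000 rows are skipped.
        if (!rowKey.startsWith(TARGET_PREFIX)) {
            ctx.getCounter("virtualcf", "rows_skipped").increment(1);
            return;
        }
        ctx.write(new Text(rowKey), row);
    }
}

With a random partitioner you cannot range-scan by key prefix, so the filtering cannot be pushed down to the cluster; the scan really does read everything.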