[ https://issues.apache.org/jira/browse/CASSANDRA-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870889#action_12870889 ]
Jonathan Ellis edited comment on CASSANDRA-1050 at 5/24/10 6:45 PM:
--------------------------------------------------------------------

Belatedly realized that if we're changing the [thrift] method signature we should leave it alone in the stable 0.6 series. Let's just commit to trunk. Sorry about the extra work.

      was (Author: jbellis):
    Ugh, belatedly realized that if we're changing the [thrift] method signature we should leave it alone in the stable 0.6 series. let's just commit to trunk.

> Too many splits for ColumnFamily with only a few rows
> -----------------------------------------------------
>
>                 Key: CASSANDRA-1050
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1050
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 0.6
>            Reporter: Joost Ouwerkerk
>             Fix For: 0.6.3
>
>         Attachments: CASSANDRA-0.6-1050.patch, CASSANDRA-1050.patch
>
>
> ColumnFamilyInputFormat creates splits for the entire Keyspace. If one
> ColumnFamily has 100 million rows and another has only 100 rows, the number
> of splits will be 1,526 (assuming 64k rows per split) for either one,
> since it is based on the total number of unique keys across the whole
> keyspace, and not on the number of rows in the ColumnFamily.
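
The arithmetic in the description can be checked with a small sketch. This is only a simplified model of the reported behaviour, not the actual ColumnFamilyInputFormat code: the function name `splits_for` is hypothetical, "64k" is assumed to mean 65,536 rows, and the two ColumnFamilies are assumed to share no row keys.

```python
import math

# Assumption: "64k rows per split" means 64 * 1024 rows.
ROWS_PER_SPLIT = 64 * 1024


def splits_for(row_count, rows_per_split=ROWS_PER_SPLIT):
    """Hypothetical helper: number of splits needed to cover row_count rows."""
    return max(1, math.ceil(row_count / rows_per_split))


large_cf_rows = 100_000_000  # ColumnFamily with 100 million rows
small_cf_rows = 100          # ColumnFamily with only 100 rows

# Reported behaviour: the split count is derived from the keys of the
# whole keyspace, so both ColumnFamilies get the same number of splits.
keyspace_rows = large_cf_rows + small_cf_rows  # assumes disjoint keys
splits_reported = splits_for(keyspace_rows)

# Behaviour the report asks for: count rows per ColumnFamily instead.
splits_large = splits_for(large_cf_rows)
splits_small = splits_for(small_cf_rows)

print("keyspace-wide estimate:", splits_reported)
print("large ColumnFamily only:", splits_large)
print("small ColumnFamily only:", splits_small)
```

Under these assumptions the keyspace-wide estimate matches the 1,526 splits quoted in the description, while a per-ColumnFamily estimate for the 100-row ColumnFamily would need only a single split.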