[jira] [Created] (CASSANDRA-5679) Wide Row calls map method once per column in Hadoop MapReduce
Tony Zhao created CASSANDRA-5679:

Summary: Wide Row calls map method once per column in Hadoop MapReduce
Key: CASSANDRA-5679
URL: https://issues.apache.org/jira/browse/CASSANDRA-5679
Project: Cassandra
Issue Type: Bug
Components: Hadoop
Affects Versions: 1.2.4
Environment: CentOS 6.3, Hadoop 1.1.2, Cassandra 1.2.4
Reporter: Tony Zhao

When a Hadoop job reads from Cassandra without wide row support, each call to the mapper's map method receives a batch of columns whose size is limited by the SlicePredicate. With wide row support enabled, the map method is instead called once for every column. It seems that the limit in the SlicePredicate is ignored when wide row is set to true. This prevents in-map reducing code from working (e.g. emitting the top ten from a mapper).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
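To make the reported difference concrete, here is a minimal stand-alone simulation. It uses no Hadoop or Cassandra classes; the class `MapCallShapes` and its methods are hypothetical stand-ins for a mapper, and the integer values stand in for column values. It only illustrates why a "top N per call" mapper stops reducing anything once each call sees a single column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Stand-alone simulation; all names here are hypothetical, not Cassandra or Hadoop API.
class MapCallShapes {

    // Stand-in for a map() body that emits the n largest values it sees in ONE call.
    static List<Integer> topN(List<Integer> columns, int n) {
        return columns.stream()
                .sorted(Comparator.reverseOrder())
                .limit(n)
                .collect(Collectors.toList());
    }

    // Shape described for the non-wide-row case: one call receives a batch of columns.
    static List<Integer> emitWithBatchedCall(List<Integer> row, int n) {
        return topN(row, n);
    }

    // Shape reported for the wide-row case: one call per column, so each call
    // sees exactly one value and "top n" degenerates to emitting every value.
    static List<Integer> emitWithPerColumnCalls(List<Integer> row, int n) {
        List<Integer> emitted = new ArrayList<>();
        for (Integer column : row) {
            emitted.addAll(topN(Collections.singletonList(column), n));
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<Integer> row = Arrays.asList(5, 1, 9, 3, 7, 2);
        System.out.println(emitWithBatchedCall(row, 2));    // two values emitted
        System.out.println(emitWithPerColumnCalls(row, 2)); // all six values emitted
    }
}
```

In the batched shape the mapper emits two values for the row; in the per-column shape it emits all six, so no reduction happens inside the mapper.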
[jira] [Updated] (CASSANDRA-5679) Wide Row calls map method once per column in Hadoop MapReduce
[ https://issues.apache.org/jira/browse/CASSANDRA-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tony Zhao updated CASSANDRA-5679:

Labels: cassandra hadoop (was: )
[jira] [Commented] (CASSANDRA-5679) Wide Row calls map method once per column in Hadoop MapReduce
[ https://issues.apache.org/jira/browse/CASSANDRA-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689678#comment-13689678 ]

Tony Zhao commented on CASSANDRA-5679:

They are not the same problem. 4871 says that you can't bound the result returned from Cassandra with a SlicePredicate. I am saying that the result set I get back is bounded by my SlicePredicate, but it is not sliced correctly.

Without wide row, each call to the map method receives a chunk of columns limited to the number specified in the SlicePredicate. With wide row, each call to the map method receives only one column. So if the SlicePredicate returns 1000 columns with wide row, the map method gets called 1000 times, when ideally it should be called just once with 1000 columns.
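As an illustration only, and not something proposed in this issue, a job that receives one column per call could keep a bounded top-N per row key across calls and emit when the key changes or the input ends. The sketch below is stand-alone Java with hypothetical names (`RowRegrouper`, `onColumn`, `finish`); it assumes that all columns of a row arrive in consecutive calls, which the issue does not state.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Stand-alone sketch; names are hypothetical, not Cassandra or Hadoop API.
// Assumption: columns belonging to the same row key arrive in consecutive calls.
class RowRegrouper {
    private final int topN;
    private final List<String> output = new ArrayList<>();
    private final PriorityQueue<Integer> heap = new PriorityQueue<>(); // min-heap
    private String currentKey = null;

    RowRegrouper(int topN) {
        this.topN = topN;
    }

    // Plays the role of map() being called once per column.
    void onColumn(String rowKey, int value) {
        if (currentKey != null && !currentKey.equals(rowKey)) {
            flush(); // a new row started, so emit the previous row's result
        }
        currentKey = rowKey;
        heap.add(value);
        if (heap.size() > topN) {
            heap.poll(); // drop the smallest so at most topN values are retained
        }
    }

    // Plays the role of a cleanup step: emits the last row still held in memory.
    void finish() {
        if (currentKey != null) {
            flush();
        }
    }

    private void flush() {
        List<Integer> top = new ArrayList<>(heap);
        top.sort(Collections.reverseOrder());
        output.add(currentKey + "=" + top);
        heap.clear();
        currentKey = null;
    }

    List<String> output() {
        return output;
    }

    public static void main(String[] args) {
        RowRegrouper r = new RowRegrouper(2);
        r.onColumn("a", 5);
        r.onColumn("a", 9);
        r.onColumn("a", 1);
        r.onColumn("b", 4);
        r.finish();
        System.out.println(r.output());
    }
}
```

The bounded heap keeps memory use proportional to N rather than to the row width, which matters for wide rows; whether the contiguity assumption holds for a given input format would need to be checked.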