[ https://issues.apache.org/jira/browse/CASSANDRA-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689678#comment-13689678 ]
Tony Zhao commented on CASSANDRA-5679:
--
They are not the same problem. 4871 says that you can't bound the result
returned by Cassandra with a SlicePredicate. My point is that the result set I
get back is bounded by my SlicePredicate, but it is not sliced correctly.
Without wide row support, each call to the map method receives a chunk of
columns limited by the count specified in the SlicePredicate. With wide row
support, each call to the map method receives only one column. So if the
SlicePredicate returns 1000 columns with wide row support, the map method gets
called 1000 times when it should ideally be called just once with 1000 columns.
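
To make the difference in call counts concrete, here is a minimal plain-Java
simulation. It uses no Hadoop or Cassandra classes; the class name
MapCallSimulation and the method deliver are made up for illustration, and each
batch in the returned list stands for one call to the map method.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation only; nothing here is a Hadoop or Cassandra API.
class MapCallSimulation {
    // Splits `columns` into batches of at most `batchSize` (must be > 0).
    // Each returned batch stands for one call to map().
    static List<List<String>> deliver(List<String> columns, int batchSize) {
        List<List<String>> calls = new ArrayList<>();
        for (int i = 0; i < columns.size(); i += batchSize) {
            int end = Math.min(i + batchSize, columns.size());
            calls.add(new ArrayList<>(columns.subList(i, end)));
        }
        return calls;
    }

    public static void main(String[] args) {
        List<String> columns = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            columns.add("col" + i);
        }
        // Without wide row support: one call with up to 1000 columns.
        System.out.println(deliver(columns, 1000).size()); // prints 1
        // With wide row support (as observed): one call per column.
        System.out.println(deliver(columns, 1).size());    // prints 1000
    }
}
```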
> Wide Row calls map method once per column in Hadoop MapReduce
> -
>
> Key: CASSANDRA-5679
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5679
> Project: Cassandra
> Issue Type: Bug
> Components: Hadoop
> Affects Versions: 1.2.4
> Environment: CentOS 6.3, Hadoop 1.1.2, Cassandra 1.2.4
> Reporter: Tony Zhao
>
> When using Cassandra without wide row support in a Hadoop job, each call to
> the mapper's map method receives a number of columns limited by the
> SlicePredicate; but when using wide row support, the map method is called
> once for every column. It seems that the limit in the SlicePredicate is
> ignored when wide row is set to true.
> This prevents in-map reducing code from working (e.g. emitting the top ten
> from a mapper).
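
One possible way around the per-column calls, sketched under assumptions rather
than taken from this issue, is to keep the top-N state across map calls and emit
it once at the end of the task. In the org.apache.hadoop.mapreduce API that end
hook would be Mapper.cleanup(Context). The class TopNAccumulator below is a
hypothetical plain-Java helper, not part of Hadoop or Cassandra, and it assumes
the column values are already decoded to Long.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper: keeps the N largest values seen across any number
// of map() calls, regardless of how many columns each call carries.
class TopNAccumulator {
    private final int limit;
    // Min-heap: the smallest retained value sits at the head for eviction.
    private final PriorityQueue<Long> heap = new PriorityQueue<>();

    TopNAccumulator(int limit) {
        this.limit = limit;
    }

    // Call from map() with whatever column values that call received.
    void add(List<Long> values) {
        for (Long v : values) {
            heap.offer(v);
            if (heap.size() > limit) {
                heap.poll(); // drop the current smallest
            }
        }
    }

    // Call once at the end of the task; returns values in descending order.
    List<Long> result() {
        List<Long> out = new ArrayList<>(heap);
        out.sort(Collections.reverseOrder());
        return out;
    }

    public static void main(String[] args) {
        TopNAccumulator acc = new TopNAccumulator(10);
        for (long i = 0; i < 1000; i++) {
            acc.add(Collections.singletonList(i)); // one column per call
        }
        System.out.println(acc.result());
    }
}
```

Because the state lives in the accumulator rather than in a single map call, the
result is the same whether the 1000 values arrive in one call or in 1000 calls.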