[jira] [Commented] (CASSANDRA-5679) Wide Row calls map method once per column in Hadoop MapReduce

2013-06-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689706#comment-13689706
 ] 

Jonathan Ellis commented on CASSANDRA-5679:
---

(Use CqlPagedInputFormat instead for "wide rows done right.")

> Wide Row calls map method once per column in Hadoop MapReduce
> -
>
> Key: CASSANDRA-5679
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5679
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Affects Versions: 1.2.4
> Environment: CentOS 6.3, Hadoop 1.1.2, Cassandra 1.2.4
>Reporter: Tony Zhao
>  Labels: cassandra, hadoop
>
> When using Cassandra without wide row support in a Hadoop job, each call to 
> the map method receives a number of columns limited by the SlicePredicate; 
> but when using wide row support, the map method is called once for every 
> column. It seems that the limit in SlicePredicate is ignored when wide row 
> is set to true. 
> This prevents in-map reducing code from working (e.g., emitting the top ten from a mapper).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-5679) Wide Row calls map method once per column in Hadoop MapReduce

2013-06-20 Thread Tony Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689678#comment-13689678
 ] 

Tony Zhao commented on CASSANDRA-5679:
--

They are not the same problem. 4871 says that you can't bound the result 
returned from Cassandra with a SlicePredicate. I am saying that the result set 
I get back is bounded by my SlicePredicate, but it is not sliced correctly. 
Without wide row support, each call to the map method receives a chunk of 
columns limited by the count specified in the SlicePredicate. With wide row 
support, each call to the map method receives only one column. So if the 
SlicePredicate returns 1000 columns with wide row support, the map method gets 
called 1000 times when it should ideally be called just once with 1000 columns.
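
To illustrate the in-map reducing case described above, here is a minimal sketch of one possible workaround: buffer the single columns delivered per call and emit a per-row top N once the row key changes. It is plain Java with no Hadoop or Cassandra dependency; the class WideRowTopN, its Column type, and the accept/flush methods are hypothetical names invented for this example. It assumes that all columns of a row arrive in consecutive map calls within a split, which should be verified against the record reader in use. In a real job, accept would be called from the map method and flush from the mapper's cleanup method.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/**
 * Hypothetical helper (not part of Cassandra or Hadoop): regroups the
 * one-column-per-call stream seen in wide row mode into per-row top-N results.
 * Assumption: columns of the same row arrive in consecutive calls.
 */
public class WideRowTopN {

    /** One column as delivered to a single map call (illustrative type). */
    public static final class Column {
        final String name;
        final long value;

        public Column(String name, long value) {
            this.name = name;
            this.value = value;
        }
    }

    private static final Comparator<Column> BY_VALUE =
            Comparator.comparingLong((Column c) -> c.value);

    private final int n;
    // Min-heap on value: the smallest retained column is evicted first.
    private final PriorityQueue<Column> heap = new PriorityQueue<>(BY_VALUE);
    private final List<String> emitted = new ArrayList<>();
    private String currentKey;

    public WideRowTopN(int n) {
        this.n = n;
    }

    /** Call once per map invocation with the row key and its single column. */
    public void accept(String rowKey, Column column) {
        if (currentKey != null && !currentKey.equals(rowKey)) {
            flush(); // row boundary reached: emit the previous row's top N
        }
        currentKey = rowKey;
        heap.add(column);
        if (heap.size() > n) {
            heap.poll(); // keep at most n columns in memory
        }
    }

    /** Call at the end of the input (e.g. from cleanup) to emit the last row. */
    public void flush() {
        if (currentKey == null) {
            return;
        }
        List<Column> top = new ArrayList<>(heap);
        top.sort(BY_VALUE.reversed()); // largest value first
        for (Column c : top) {
            // A real mapper would write to its output context here.
            emitted.add(currentKey + ":" + c.name + "=" + c.value);
        }
        heap.clear();
        currentKey = null;
    }

    public List<String> emitted() {
        return emitted;
    }

    public static void main(String[] args) {
        WideRowTopN topN = new WideRowTopN(2);
        topN.accept("row1", new Column("a", 5));
        topN.accept("row1", new Column("b", 9));
        topN.accept("row1", new Column("c", 7));
        topN.accept("row2", new Column("x", 1));
        topN.flush();
        for (String line : topN.emitted()) {
            System.out.println(line);
        }
    }
}
```

The bounded heap keeps memory use at N columns per mapper regardless of row width, at the cost of relying on the row-contiguity assumption stated above.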

