[jira] [Created] (CASSANDRA-5679) Wide Row calls map method once per column in Hadoop MapReduce

2013-06-20 Thread Tony Zhao (JIRA)
Tony Zhao created CASSANDRA-5679:


 Summary: Wide Row calls map method once per column in Hadoop MapReduce
 Key: CASSANDRA-5679
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5679
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.4
 Environment: CentOS 6.3, Hadoop 1.1.2, Cassandra 1.2.4
Reporter: Tony Zhao


When a Hadoop job reads from Cassandra without wide row support, each call to 
the mapper's map method receives a batch of columns, limited by the count in 
the SlicePredicate. With wide row support enabled, the map method is instead 
called once for every column. It seems that the limit in the SlicePredicate is 
ignored when wide row is set to true. 

This prevents in-map reducing code from working (e.g. emitting the top ten 
from a mapper).
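
To illustrate why per-column calls defeat in-map reducing, here is a minimal sketch in plain Java. It does not use the Hadoop or Cassandra APIs; the class InMapTopTen and the method topN are hypothetical stand-ins for a mapper that emits the ten largest column values it sees in a single map call. With one call per batch the mapper emits 10 values; with one call per column it can only emit every value it is given.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-in for a mapper; not the Hadoop or Cassandra API.
class InMapTopTen {
    // Returns the n largest values seen in one map() call, largest first.
    static List<Long> topN(List<Long> columnValues, int n) {
        PriorityQueue<Long> heap = new PriorityQueue<>(); // min-heap
        for (long v : columnValues) {
            heap.add(v);
            if (heap.size() > n) {
                heap.poll(); // drop the current smallest
            }
        }
        List<Long> result = new ArrayList<>(heap);
        result.sort(Collections.reverseOrder());
        return result;
    }

    public static void main(String[] args) {
        List<Long> row = new ArrayList<>();
        for (long i = 1; i <= 1000; i++) {
            row.add(i);
        }

        // Without wide rows: one call sees the whole batch and emits 10 values.
        int emittedBatched = topN(row, 10).size();

        // With wide rows (as reported): one call per column, so each call
        // can only emit the single value it was given.
        int emittedPerColumn = 0;
        for (long v : row) {
            emittedPerColumn += topN(Collections.singletonList(v), 10).size();
        }

        System.out.println(emittedBatched + " vs " + emittedPerColumn); // prints "10 vs 1000"
    }
}
```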

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5679) Wide Row calls map method once per column in Hadoop MapReduce

2013-06-20 Thread Tony Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tony Zhao updated CASSANDRA-5679:
-

Labels: cassandra hadoop  (was: )



[jira] [Commented] (CASSANDRA-5679) Wide Row calls map method once per column in Hadoop MapReduce

2013-06-20 Thread Tony Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689678#comment-13689678
 ] 

Tony Zhao commented on CASSANDRA-5679:
--

They are not the same problem. CASSANDRA-4871 says that you can't bound the 
result returned from Cassandra with a SlicePredicate. I am saying that the 
result set I get back is bounded by my SlicePredicate, but it is not sliced 
correctly across map calls. Without wide row, each call to the map method 
receives a chunk of columns limited by the count specified in the 
SlicePredicate. With wide row, each call to the map method receives only one 
column. So if the SlicePredicate returns 1000 columns with wide row, the map 
method gets called 1000 times when it should ideally be called just once with 
1000 columns.
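
As a possible stopgap while the behavior stands, a mapper could keep state across map calls and emit only once at the end. The sketch below is plain Java, not the Hadoop Mapper API: the class BufferingMapperSketch and its map and cleanup methods are hypothetical stand-ins, and the assumption is that a real mapper would do the equivalent work in its own map and cleanup hooks. Note that it ranks values across everything the mapper sees, not per row.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch in plain Java; not the Hadoop Mapper API.
class BufferingMapperSketch {
    private final int limit;
    private final PriorityQueue<Long> heap = new PriorityQueue<>(); // min-heap
    int mapCalls = 0;

    BufferingMapperSketch(int limit) {
        this.limit = limit;
    }

    // Stand-in for map(): called once per column, emits nothing.
    void map(String rowKey, long columnValue) {
        mapCalls++;
        heap.add(columnValue);
        if (heap.size() > limit) {
            heap.poll(); // keep only the largest `limit` values
        }
    }

    // Stand-in for cleanup(): called once after the last map() call.
    List<Long> cleanup() {
        List<Long> out = new ArrayList<>(heap);
        out.sort(Collections.reverseOrder());
        return out;
    }

    public static void main(String[] args) {
        BufferingMapperSketch mapper = new BufferingMapperSketch(3);
        for (long v = 1; v <= 1000; v++) {
            mapper.map("row1", v); // one call per column, as with wide rows
        }
        // prints "1000 calls, top 3: [1000, 999, 998]"
        System.out.println(mapper.mapCalls + " calls, top 3: " + mapper.cleanup());
    }
}
```

This keeps the emitted output small, but it does not reduce the number of map calls, which is the behavior this issue is about.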
