[ 
https://issues.apache.org/jira/browse/CASSANDRA-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna reopened CASSANDRA-5488:
-------------------------------------


There turned out to be a second problem that was hidden by the first NPE.  It 
seems to be related to getting the AbstractType.  The NPE comes from this line: 
https://github.com/apache/cassandra/blob/cassandra-1.1/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java#L307
 which I decomposed to find out which call was throwing, and got this:
{code}
            List<AbstractType> atList = getDefaultMarshallers(cfDef);
            AbstractType at = atList.get(2);
            Object o = at.compose(key); //NPE from this line
            setTupleValue(tuple, 0, o);
            //setTupleValue(tuple, 0, getDefaultMarshallers(cfDef).get(2).compose(key));
{code}

So it seems unrelated to the original NPE, but still matches the description of 
this ticket.

To reproduce, here is my schema:
{code}
CREATE KEYSPACE circus
with placement_strategy = 'SimpleStrategy'
and strategy_options = {replication_factor:1};

use circus;

CREATE COLUMN FAMILY acrobats
WITH comparator = UTF8Type
AND key_validation_class=UTF8Type
AND default_validation_class = UTF8Type;
{code}

Here is a pycassa script to create the data:
{code}
from pycassa.pool import ConnectionPool
from pycassa.columnfamily import ColumnFamily

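# Connect to the 'circus' keyspace (default server: localhost:9160)
pool = ConnectionPool('circus')
# ColumnFamily is imported directly above, so no 'pycassa.' prefix is needed
col_fam = ColumnFamily(pool, 'acrobats')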

for i in range(1, 10):
    for j in range(1, 200000):
        col_fam.insert('row_key' + str(i), {str(j): 'val'})
{code}
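
As a sanity check on the data set, here is a minimal sketch that mirrors the loop bounds of the script above without touching Cassandra. It assumes those bounds are unchanged; expected_shape is a hypothetical helper of mine, not part of pycassa.

```python
# Mirrors the loop bounds of the pycassa script; no Cassandra connection needed.
ROW_RANGE = range(1, 10)         # same bounds as the outer loop
COLUMN_RANGE = range(1, 200000)  # same bounds as the inner loop


def expected_shape():
    """Return (row_keys, columns_per_row) implied by the loop bounds.

    Hypothetical helper for illustration only.
    """
    row_keys = ['row_key' + str(i) for i in ROW_RANGE]
    return row_keys, len(COLUMN_RANGE)


row_keys, columns_per_row = expected_shape()
print(len(row_keys), columns_per_row)  # 9 199999
```

Because range() excludes its upper bound, the script writes 9 rows (row_key1 through row_key9) with 199999 columns each, one fewer than the limit=200000 used in the LOAD statement.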

Here is the Pig (0.9.2) script that I'm running in local mode:
{code}
rows = LOAD 'cassandra://circus/acrobats?widerows=true&limit=200000' USING CassandraStorage();
filtered = filter rows by key == 'row_key1';
columns = foreach filtered generate flatten(columns);
counted = foreach (group columns all) generate COUNT($1);
dump counted;
{code}
                
> CassandraStorage throws NullPointerException (NPE) when widerows is set to 
> 'true'
> ---------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5488
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5488
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 1.1.9, 1.2.4
>         Environment: Ubuntu 12.04.1 x64, Cassandra 1.2.4
>            Reporter: Sheetal Gosrani
>            Assignee: Sheetal Gosrani
>            Priority: Minor
>              Labels: cassandra, hadoop, pig
>             Fix For: 1.1.12, 1.2.6
>
>         Attachments: 5488-2.txt, 5488.txt
>
>
> CassandraStorage throws NPE when widerows is set to 'true'. 
> 2 problems in getNextWide:
> 1. Creation of tuple without specifying size
> 2. Calling addKeyToTuple on lastKey instead of key
> java.lang.NullPointerException
>     at 
> org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:167)
>     at 
> org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:124)
>     at org.apache.cassandra.cql.jdbc.JdbcUTF8.getString(JdbcUTF8.java:73)
>     at org.apache.cassandra.cql.jdbc.JdbcUTF8.compose(JdbcUTF8.java:93)
>     at org.apache.cassandra.db.marshal.UTF8Type.compose(UTF8Type.java:34)
>     at org.apache.cassandra.db.marshal.UTF8Type.compose(UTF8Type.java:26)
>     at 
> org.apache.cassandra.hadoop.pig.CassandraStorage.addKeyToTuple(CassandraStorage.java:313)
>     at 
> org.apache.cassandra.hadoop.pig.CassandraStorage.getNextWide(CassandraStorage.java:196)
>     at 
> org.apache.cassandra.hadoop.pig.CassandraStorage.getNext(CassandraStorage.java:224)
>     at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:194)
>     at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
> 2013-04-16 12:28:03,671 INFO org.apache.hadoop.mapred.Task: Runnning cleanup 
> for the task
