[ 
https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116856#comment-13116856
 ] 

Hudson commented on CASSANDRA-2810:
-----------------------------------

Integrated in Cassandra-0.8 #348 (See 
[https://builds.apache.org/job/Cassandra-0.8/348/])
    Fix handling of integer types in pig.
Patch by brandonwilliams, reviewed by Jeremy Hanna for CASSANDRA-2810

brandonwilliams : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1177084
Files : 
* 
/cassandra/branches/cassandra-0.8/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
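A rough sketch of the kind of conversion such a fix implies (illustrative only, not the
actual patch): Cassandra's IntegerType is a variable-length big-endian integer, so a
value like integer(35) is the single byte 0x23 and has to be turned into a plain
java.lang.Integer before Pig can serialize it.
{code}
import java.math.BigInteger;
import java.nio.ByteBuffer;

public class IntegerTypeToPig
{
    // Hypothetical helper: decode an IntegerType (varint) value into an Integer,
    // a type Pig's serializer understands, instead of raw bytes or a BigInteger.
    static Integer toPigInt(ByteBuffer raw)
    {
        byte[] b = new byte[raw.remaining()];
        raw.duplicate().get(b);
        return new BigInteger(b).intValue();
    }

    public static void main(String[] args)
    {
        // integer(35) marshals to the single byte 0x23
        System.out.println(toPigInt(ByteBuffer.wrap(new byte[]{ 0x23 }))); // 35
    }
}
{code}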

                
> RuntimeException in Pig when using "dump" command on column name
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-2810
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2810
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>         Environment: Ubuntu 10.10, 32 bits
> java version "1.6.0_24"
> Brisk beta-2 installed from Debian packages
>            Reporter: Silvère Lestang
>            Assignee: Brandon Williams
>             Fix For: 0.8.7
>
>         Attachments: 2810-v2.txt, 2810-v3.txt, 2810.txt
>
>
> This bug was previously reported on the [Brisk bug tracker|https://datastax.jira.com/browse/BRISK-232].
> In cassandra-cli:
> {code}
> [default@unknown] create keyspace Test
>     with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
>     and strategy_options = [{replication_factor:1}];
> [default@unknown] use Test;
> Authenticated to keyspace: Test
> [default@Test] create column family test;
> [default@Test] set test[ascii('row1')][long(1)]=integer(35);
> set test[ascii('row1')][long(2)]=integer(36);
> set test[ascii('row1')][long(3)]=integer(38);
> set test[ascii('row2')][long(1)]=integer(45);
> set test[ascii('row2')][long(2)]=integer(42);
> set test[ascii('row2')][long(3)]=integer(33);
> [default@Test] list test;
> Using default limit of 100
> -------------------
> RowKey: 726f7731
> => (column=0000000000000001, value=35, timestamp=1308744931122000)
> => (column=0000000000000002, value=36, timestamp=1308744931124000)
> => (column=0000000000000003, value=38, timestamp=1308744931125000)
> -------------------
> RowKey: 726f7732
> => (column=0000000000000001, value=45, timestamp=1308744931127000)
> => (column=0000000000000002, value=42, timestamp=1308744931128000)
> => (column=0000000000000003, value=33, timestamp=1308744932722000)
> 2 Rows Returned.
> [default@Test] describe keyspace;
> Keyspace: Test:
>   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
>   Durable Writes: true
>     Options: [replication_factor:1]
>   Column Families:
>     ColumnFamily: test
>       Key Validation Class: org.apache.cassandra.db.marshal.BytesType
>       Default column value validator: 
> org.apache.cassandra.db.marshal.BytesType
>       Columns sorted by: org.apache.cassandra.db.marshal.BytesType
>       Row cache size / save period in seconds: 0.0/0
>       Key cache size / save period in seconds: 200000.0/14400
>       Memtable thresholds: 0.571875/122/1440 (millions of ops/MB/minutes)
>       GC grace seconds: 864000
>       Compaction min/max thresholds: 4/32
>       Read repair chance: 1.0
>       Replicate on write: false
>       Built indexes: []
> {code}
> In Pig command line:
> {code}
> grunt> test = LOAD 'cassandra://Test/test' USING CassandraStorage() AS 
> (rowkey:chararray, columns: bag {T: (name:long, value:int)});
> grunt> value_test = foreach test generate rowkey, columns.name, columns.value;
> grunt> dump value_test;
> {code}
> In /var/log/cassandra/system.log, I get this exception several times:
> {code}
> INFO [IPC Server handler 3 on 8012] 2011-06-22 15:03:28,533 
> TaskInProgress.java (line 551) Error from 
> attempt_201106210955_0051_m_000000_3: java.lang.RuntimeException: Unexpected 
> data type -1 found in stream.
>       at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:478)
>       at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
>       at org.apache.pig.data.BinInterSedes.writeBag(BinInterSedes.java:522)
>       at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:361)
>       at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
>       at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:357)
>       at 
> org.apache.pig.impl.io.InterRecordWriter.write(InterRecordWriter.java:73)
>       at org.apache.pig.impl.io.InterStorage.putNext(InterStorage.java:87)
>       at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
>       at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
>       at 
> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:638)
>       at 
> org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>       at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
>       at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:239)
>       at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232)
>       at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
>       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>       at org.apache.hadoop.mapred.Child.main(Child.java:253)
> {code}
> and the request failed.
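> The "-1" is Pig's DataType.ERROR: writeDatum apparently looks the value up via
> DataType.findType, and any object outside Pig's known set of classes falls through
> to ERROR, presumably because CassandraStorage hands Pig the column name in a form
> Pig cannot serialize. A minimal illustration of the mechanism (the ByteBuffer is
> just an example of an unrecognized class, not necessarily what CassandraStorage
> actually emits):
> {code}
> import java.nio.ByteBuffer;
> import org.apache.pig.data.DataType;
>
> public class UnknownTypeDemo
> {
>     public static void main(String[] args)
>     {
>         // An arbitrary class such as ByteBuffer maps to DataType.ERROR (-1)...
>         System.out.println(DataType.findType(ByteBuffer.wrap(new byte[8])));      // -1
>         // ...while a java.lang.Long maps to DataType.LONG and serializes fine.
>         System.out.println(DataType.findType(Long.valueOf(1L)) == DataType.LONG); // true
>     }
> }
> {code}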
> {code}
> grunt> test = LOAD 'cassandra://Test/test' USING CassandraStorage() AS 
> (rowkey:chararray, columns: bag {T: (name:long, value:int)});
> grunt> value_test = foreach test generate rowkey, columns.value;
> grunt> dump value_test;
> {code}
> This time, without the column name, it works, but the values are displayed as 
> characters instead of integers (see the note after the result). Result:
> {code}
> (row1,{(#),($),(&)})
> (row2,{(-),(*),(!)})
> {code}
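> The characters are just the raw value bytes: integer(35) marshals to the single
> byte 0x23, which printed as a character is '#'. A quick check in plain Java:
> {code}
> public class CharCheck
> {
>     public static void main(String[] args)
>     {
>         // 35 -> '#', 36 -> '$', 38 -> '&', 45 -> '-', 42 -> '*', 33 -> '!',
>         // matching the two bags dumped above.
>         for (int v : new int[]{ 35, 36, 38, 45, 42, 33 })
>             System.out.println(v + " -> " + (char) v);
>     }
> }
> {code}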
> Now we do the same test, but this time we set a comparator on the CF.
> {code}
> [default@Test] create column family test with comparator = 'LongType';
> [default@Test] set test[ascii('row1')][long(1)]=integer(35);
> set test[ascii('row1')][long(2)]=integer(36);
> set test[ascii('row1')][long(3)]=integer(38);
> set test[ascii('row2')][long(1)]=integer(45);
> set test[ascii('row2')][long(2)]=integer(42);
> set test[ascii('row2')][long(3)]=integer(33);
> [default@Test] list test;
> Using default limit of 100
> -------------------
> RowKey: 726f7731
> => (column=1, value=35, timestamp=1308748643506000)
> => (column=2, value=36, timestamp=1308748643508000)
> => (column=3, value=38, timestamp=1308748643509000)
> -------------------
> RowKey: 726f7732
> => (column=1, value=45, timestamp=1308748643510000)
> => (column=2, value=42, timestamp=1308748643512000)
> => (column=3, value=33, timestamp=1308748645138000)
> 2 Rows Returned.
> [default@Test] describe keyspace;
> Keyspace: Test:
>   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
>   Durable Writes: true
>     Options: [replication_factor:1]
>   Column Families:
>     ColumnFamily: test
>       Key Validation Class: org.apache.cassandra.db.marshal.BytesType
>       Default column value validator: 
> org.apache.cassandra.db.marshal.BytesType
>       Columns sorted by: org.apache.cassandra.db.marshal.LongType
>       Row cache size / save period in seconds: 0.0/0
>       Key cache size / save period in seconds: 200000.0/14400
>       Memtable thresholds: 0.571875/122/1440 (millions of ops/MB/minutes)
>       GC grace seconds: 864000
>       Compaction min/max thresholds: 4/32
>       Read repair chance: 1.0
>       Replicate on write: false
>       Built indexes: []
> {code}
> {code}
> grunt> test = LOAD 'cassandra://Test/test' USING CassandraStorage() AS 
> (rowkey:chararray, columns: bag {T: (name:long, value:int)});
> grunt> value_test = foreach test generate rowkey, columns.name, columns.value;
> grunt> dump value_test;
> {code}
> This time it works as expected, apart from the values still being displayed as 
> characters (see the note after the result). Result:
> {code}
> (row1,{(1),(2),(3)},{(#),($),(&)})
> (row2,{(1),(2),(3)},{(-),(*),(!)})
> {code}
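> With the LongType comparator the column names carry type information, so
> CassandraStorage can presumably hand Pig a java.lang.Long, which Pig serializes
> without complaint; the values are still validated as BytesType and therefore still
> arrive as raw bytes. For illustration, decoding one of the 8-byte names from the
> earlier BytesType listing:
> {code}
> import java.nio.ByteBuffer;
>
> public class LongNameDecode
> {
>     public static void main(String[] args)
>     {
>         // A LongType name such as 0x0000000000000001 is an 8-byte big-endian long.
>         long name = ByteBuffer.wrap(new byte[]{ 0, 0, 0, 0, 0, 0, 0, 1 }).getLong();
>         System.out.println(name); // 1
>     }
> }
> {code}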
