[ https://issues.apache.org/jira/browse/CASSANDRA-8577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268524#comment-14268524 ]
Oksana Danylyshyn commented on CASSANDRA-8577: ---------------------------------------------- [~philipthompson], updated description with reproduction code. > Values of set types not loading correctly into Pig > -------------------------------------------------- > > Key: CASSANDRA-8577 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8577 > Project: Cassandra > Issue Type: Bug > Reporter: Oksana Danylyshyn > Assignee: Brandon Williams > Fix For: 2.1.3 > > > Values of set types are not loading correctly from Cassandra (cql3 table, > Native protocol v3) into Pig using CqlNativeStorage. > When using Cassandra version 2.1.0 only empty values are loaded, and for > newer versions (2.1.1 and 2.1.2) the following error is received: > org.apache.cassandra.serializers.MarshalException: Unexpected extraneous > bytes after set value > at > org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:94) > Steps to reproduce: > {code}cqlsh:socialdata> CREATE TABLE test ( > key varchar PRIMARY KEY, > tags set<varchar> > ); > cqlsh:socialdata> insert into test (key, tags) values ('key', {'Running', > 'onestep4red', 'running'}); > cqlsh:socialdata> select * from test; > key | tags > -----+--------------------------------------- > key | {'Running', 'onestep4red', 'running'} > (1 rows){code} > With version 2.1.0: > {code}grunt> data = load 'cql://socialdata/test' using > org.apache.cassandra.hadoop.pig.CqlNativeStorage(); > grunt> dump data; > (key,()){code} > With version 2.1.2: > {code}grunt> data = load 'cql://socialdata/test' using > org.apache.cassandra.hadoop.pig.CqlNativeStorage(); > grunt> dump data; > org.apache.cassandra.serializers.MarshalException: Unexpected extraneous > bytes after set value > at > org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:94) > at > org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:27) > at > org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.cassandraToObj(AbstractCassandraStorage.java:796) > at > org.apache.cassandra.hadoop.pig.CqlStorage.cqlColumnToObj(CqlStorage.java:195) > at > org.apache.cassandra.hadoop.pig.CqlNativeStorage.getNext(CqlNativeStorage.java:106) > at > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211) > at > org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532) > at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) > at > org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212){code} > Expected result: > {code}(key,(Running,onestep4red,running)){code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)