[jira] [Commented] (CASSANDRA-7287) Pig CqlStorage test fails with IAE

2015-01-07 Thread Oksana Danylyshyn (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267720#comment-14267720
 ] 

Oksana Danylyshyn commented on CASSANDRA-7287:
--

[~slebresne], [~brandon.williams]
I am having issues loading set types into Pig. After reverting the changes made 
for this ticket, loading works as expected.

Values of set types are not loaded correctly from Cassandra (cql3 table, 
native protocol v3) into Pig using CqlNativeStorage.
With Cassandra 2.1.0, only empty values are loaded; with newer versions 
(2.1.1 and 2.1.2), the following error is raised:
org.apache.cassandra.serializers.MarshalException: Unexpected extraneous bytes after set value
at org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:94)

Note that it works correctly with version 2.1.0-rc4 and CqlStorage.
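
The two symptoms are consistent with a length-width mismatch when decoding 
collections, although nothing in this thread confirms that as the cause. In 
native protocol v3, the element count and each element length of a collection 
are 4-byte ints, while v1 and v2 use 2-byte unsigned shorts. The following is 
a minimal Python sketch, not Cassandra's code: encode_set, decode_set, and the 
strict flag are hypothetical names, and treating the trailing-bytes check as 
something only the newer releases perform is an assumption made to mirror the 
2.1.0 versus 2.1.1/2.1.2 difference.

```python
import struct


def _width_format(version):
    # Assumption for this sketch: v3+ uses 4-byte signed ints,
    # v1/v2 use 2-byte unsigned shorts for counts and lengths.
    return ">i" if version >= 3 else ">H"


def encode_set(elements, version):
    """Serialize a list of strings as a collection for the given protocol version."""
    fmt = _width_format(version)
    out = struct.pack(fmt, len(elements))
    for element in elements:
        raw = element.encode("utf-8")
        out += struct.pack(fmt, len(raw)) + raw
    return out


def decode_set(buf, version, strict):
    """Decode a collection; with strict=True, reject unread trailing bytes."""
    fmt = _width_format(version)
    width = struct.calcsize(fmt)
    (count,) = struct.unpack_from(fmt, buf, 0)
    pos = width
    items = []
    for _ in range(count):
        (length,) = struct.unpack_from(fmt, buf, pos)
        pos += width
        items.append(buf[pos:pos + length].decode("utf-8"))
        pos += length
    if strict and pos != len(buf):
        raise ValueError("Unexpected extraneous bytes after set value")
    return items


tags = ["Running", "onestep4red", "running"]
wire_v3 = encode_set(tags, 3)

# Matching versions round-trip cleanly.
print(decode_set(wire_v3, 3, strict=True))

# v3 bytes read with v2 widths: the first two bytes of the 4-byte count are zero,
# so the decoder sees an empty set and leaves the rest of the buffer unread.
print(decode_set(wire_v3, 2, strict=False))
```

With strict=True, the mismatched decode raises instead of returning an empty 
list, which is the shape of the error quoted above.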

> Pig CqlStorage test fails with IAE
> --
>
> Key: CASSANDRA-7287
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7287
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop, Tests
>Reporter: Brandon Williams
>Assignee: Sylvain Lebresne
> Fix For: 2.1 rc1
>
> Attachments: 7287.txt
>
>
> {noformat}
> [junit] java.lang.IllegalArgumentException
> [junit] at java.nio.Buffer.limit(Buffer.java:267)
> [junit] at org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:542)
> [junit] at org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:117)
> [junit] at org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:97)
> [junit] at org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:28)
> [junit] at org.apache.cassandra.serializers.CollectionSerializer.deserialize(CollectionSerializer.java:48)
> [junit] at org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:66)
> [junit] at org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.cassandraToObj(AbstractCassandraStorage.java:792)
> [junit] at org.apache.cassandra.hadoop.pig.CqlStorage.cqlColumnToObj(CqlStorage.java:195)
> [junit] at org.apache.cassandra.hadoop.pig.CqlStorage.getNext(CqlStorage.java:118)
> {noformat}
> I'm guessing this is caused by CqlStorage passing an empty ByteBuffer to 
> ByteBufferUtil, but I don't know whether Pig is broken or this is a deeper issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8577) Values of set types not loading correctly into Pig

2015-01-07 Thread Oksana Danylyshyn (JIRA)
Oksana Danylyshyn created CASSANDRA-8577:


 Summary: Values of set types not loading correctly into Pig
 Key: CASSANDRA-8577
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8577
 Project: Cassandra
  Issue Type: Bug
Reporter: Oksana Danylyshyn


Values of set types are not loaded correctly from Cassandra (cql3 table, 
native protocol v3) into Pig using CqlNativeStorage. 
With Cassandra 2.1.0, only empty values are loaded; with newer versions 
(2.1.1 and 2.1.2), the following error is raised: 
org.apache.cassandra.serializers.MarshalException: Unexpected extraneous bytes after set value
at org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:94)





[jira] [Commented] (CASSANDRA-8577) Values of set types not loading correctly into Pig

2015-01-07 Thread Oksana Danylyshyn (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268047#comment-14268047
 ] 

Oksana Danylyshyn commented on CASSANDRA-8577:
--

Note that this might be related to CASSANDRA-7287: after reverting the changes 
made for that ticket, loading sets into Pig works as expected, as it also does 
with version 2.1.0-rc4 and CqlStorage.

> Values of set types not loading correctly into Pig
> --
>
> Key: CASSANDRA-8577
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8577
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Oksana Danylyshyn
>
> Values of set types are not loading correctly from Cassandra (cql3 table, 
> Native protocol v3) into Pig using CqlNativeStorage. 
> When using Cassandra version 2.1.0 only empty values are loaded, and for 
> newer versions (2.1.1 and 2.1.2) the following error is received: 
> org.apache.cassandra.serializers.MarshalException: Unexpected extraneous 
> bytes after set value
> at 
> org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:94)





[jira] [Commented] (CASSANDRA-7287) Pig CqlStorage test fails with IAE

2015-01-07 Thread Oksana Danylyshyn (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268052#comment-14268052
 ] 

Oksana Danylyshyn commented on CASSANDRA-7287:
--

Created a new ticket: https://issues.apache.org/jira/browse/CASSANDRA-8577






[jira] [Updated] (CASSANDRA-8577) Values of set types not loading correctly into Pig

2015-01-07 Thread Oksana Danylyshyn (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oksana Danylyshyn updated CASSANDRA-8577:
-
Description: 
Values of set types are not loaded correctly from Cassandra (cql3 table, 
native protocol v3) into Pig using CqlNativeStorage. 
With Cassandra 2.1.0, only empty values are loaded; with newer versions 
(2.1.1 and 2.1.2), the following error is raised: 
org.apache.cassandra.serializers.MarshalException: Unexpected extraneous bytes after set value
at org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:94)

Steps to reproduce:

{code}cqlsh:socialdata> CREATE TABLE test (
 key varchar PRIMARY KEY,
 tags set<varchar>
   );
cqlsh:socialdata> insert into test (key, tags) values ('key', {'Running', 
'onestep4red', 'running'});
cqlsh:socialdata> select * from test;

 key | tags
-----+---------------------------------------
 key | {'Running', 'onestep4red', 'running'}

(1 rows){code}


With version 2.1.0:
{code}grunt> data = load 'cql://socialdata/test' using 
org.apache.cassandra.hadoop.pig.CqlNativeStorage();
grunt> dump data;

(key,()){code}

With version 2.1.2:
{code}grunt> data = load 'cql://socialdata/test' using 
org.apache.cassandra.hadoop.pig.CqlNativeStorage();
grunt> dump data;

org.apache.cassandra.serializers.MarshalException: Unexpected extraneous bytes after set value
  at org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:94)
  at org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:27)
  at org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.cassandraToObj(AbstractCassandraStorage.java:796)
  at org.apache.cassandra.hadoop.pig.CqlStorage.cqlColumnToObj(CqlStorage.java:195)
  at org.apache.cassandra.hadoop.pig.CqlNativeStorage.getNext(CqlNativeStorage.java:106)
  at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
  at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
  at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
  at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
  at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
  at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212){code}

Expected result:
{code}(key,(Running,onestep4red,running)){code}

  was:
Values of set types are not loading correctly from Cassandra (cql3 table, 
Native protocol v3) into Pig using CqlNativeStorage. 
When using Cassandra version 2.1.0 only empty values are loaded, and for newer 
versions (2.1.1 and 2.1.2) the following error is received: 
org.apache.cassandra.serializers.MarshalException: Unexpected extraneous bytes 
after set value
at 
org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:94)


> Values of set types not loading correctly into Pig
> --
>
> Key: CASSANDRA-8577
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8577
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Oksana Danylyshyn
>Assignee: Brandon Williams
> Fix For: 2.1.3
>
>
> Values of set types are not loading correctly from Cassandra (cql3 table, 
> Native protocol v3) into Pig using CqlNativeStorage. 
> When using Cassandra version 2.1.0 only empty values are loaded, and for 
> newer versions (2.1.1 and 2.1.2) the following error is received: 
> org.apache.cassandra.serializers.MarshalException: Unexpected extraneous 
> bytes after set value
> at 
> org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:94)
> Steps to reproduce:
> {code}cqlsh:socialdata> CREATE TABLE test (
>  key varchar PRIMARY KEY,
>  tags set<varchar>
>);
> cqlsh:socialdata> insert into test (key, tags) values ('key', {'Running', 
> 'onestep4red', 'running'});
> cqlsh:socialdata> select * from test;
>  key | tags
> -+---
>  key | {'Running', 'onestep4red', 'running'}
> (1 rows){code}
> With version 2.1.0:
> {code}grunt> data = load 'cql://socialdata/test' using 
> org.apache.cassandra.hadoop.pig.CqlNativeStorage();
> grunt> dump data;
> (key,()){code}
> With version 2.1.2:
> {code}grunt> data = load 'cql://socialdata/test' using 
> org.apache.cassandra.hadoop.pig.CqlNativeStorage();
> grunt> dump data;
> org.apache.cassandra.serializers.MarshalException: Unexpected extraneous 
> bytes after set value
>   at 
> org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:94)
>   at 
> org.ap

[jira] [Commented] (CASSANDRA-8577) Values of set types not loading correctly into Pig

2015-01-07 Thread Oksana Danylyshyn (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14268524#comment-14268524
 ] 

Oksana Danylyshyn commented on CASSANDRA-8577:
--

[~philipthompson], I have updated the description with reproduction code.

> Values of set types not loading correctly into Pig
> --
>
> Key: CASSANDRA-8577
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8577
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Oksana Danylyshyn
>Assignee: Brandon Williams
> Fix For: 2.1.3
>
>
> Values of set types are not loading correctly from Cassandra (cql3 table, 
> Native protocol v3) into Pig using CqlNativeStorage. 
> When using Cassandra version 2.1.0 only empty values are loaded, and for 
> newer versions (2.1.1 and 2.1.2) the following error is received: 
> org.apache.cassandra.serializers.MarshalException: Unexpected extraneous 
> bytes after set value
> at 
> org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:94)
> Steps to reproduce:
> {code}cqlsh:socialdata> CREATE TABLE test (
>  key varchar PRIMARY KEY,
>  tags set<varchar>
>);
> cqlsh:socialdata> insert into test (key, tags) values ('key', {'Running', 
> 'onestep4red', 'running'});
> cqlsh:socialdata> select * from test;
>  key | tags
> -+---
>  key | {'Running', 'onestep4red', 'running'}
> (1 rows){code}
> With version 2.1.0:
> {code}grunt> data = load 'cql://socialdata/test' using 
> org.apache.cassandra.hadoop.pig.CqlNativeStorage();
> grunt> dump data;
> (key,()){code}
> With version 2.1.2:
> {code}grunt> data = load 'cql://socialdata/test' using 
> org.apache.cassandra.hadoop.pig.CqlNativeStorage();
> grunt> dump data;
> org.apache.cassandra.serializers.MarshalException: Unexpected extraneous 
> bytes after set value
>   at 
> org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:94)
>   at 
> org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:27)
>   at 
> org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.cassandraToObj(AbstractCassandraStorage.java:796)
>   at 
> org.apache.cassandra.hadoop.pig.CqlStorage.cqlColumnToObj(CqlStorage.java:195)
>   at 
> org.apache.cassandra.hadoop.pig.CqlNativeStorage.getNext(CqlNativeStorage.java:106)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
>   at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212){code}
> Expected result:
> {code}(key,(Running,onestep4red,running)){code}





[jira] [Commented] (CASSANDRA-8577) Values of set types not loading correctly into Pig

2015-01-12 Thread Oksana Danylyshyn (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14274266#comment-14274266
 ] 

Oksana Danylyshyn commented on CASSANDRA-8577:
--

With the patch applied, values of set types are loaded as expected.
Thanks.

> Values of set types not loading correctly into Pig
> --
>
> Key: CASSANDRA-8577
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8577
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Oksana Danylyshyn
>Assignee: Brandon Williams
> Fix For: 2.1.3
>
> Attachments: cassandra-2.1-8577.txt
>
>
> Values of set types are not loading correctly from Cassandra (cql3 table, 
> Native protocol v3) into Pig using CqlNativeStorage. 
> When using Cassandra version 2.1.0 only empty values are loaded, and for 
> newer versions (2.1.1 and 2.1.2) the following error is received: 
> org.apache.cassandra.serializers.MarshalException: Unexpected extraneous 
> bytes after set value
> at 
> org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:94)
> Steps to reproduce:
> {code}cqlsh:socialdata> CREATE TABLE test (
>  key varchar PRIMARY KEY,
>  tags set<varchar>
>);
> cqlsh:socialdata> insert into test (key, tags) values ('key', {'Running', 
> 'onestep4red', 'running'});
> cqlsh:socialdata> select * from test;
>  key | tags
> -+---
>  key | {'Running', 'onestep4red', 'running'}
> (1 rows){code}
> With version 2.1.0:
> {code}grunt> data = load 'cql://socialdata/test' using 
> org.apache.cassandra.hadoop.pig.CqlNativeStorage();
> grunt> dump data;
> (key,()){code}
> With version 2.1.2:
> {code}grunt> data = load 'cql://socialdata/test' using 
> org.apache.cassandra.hadoop.pig.CqlNativeStorage();
> grunt> dump data;
> org.apache.cassandra.serializers.MarshalException: Unexpected extraneous 
> bytes after set value
>   at 
> org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:94)
>   at 
> org.apache.cassandra.serializers.SetSerializer.deserializeForNativeProtocol(SetSerializer.java:27)
>   at 
> org.apache.cassandra.hadoop.pig.AbstractCassandraStorage.cassandraToObj(AbstractCassandraStorage.java:796)
>   at 
> org.apache.cassandra.hadoop.pig.CqlStorage.cqlColumnToObj(CqlStorage.java:195)
>   at 
> org.apache.cassandra.hadoop.pig.CqlNativeStorage.getNext(CqlNativeStorage.java:106)
>   at 
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
>   at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
>   at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>   at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212){code}
> Expected result:
> {code}(key,(Running,onestep4red,running)){code}





[jira] [Created] (CASSANDRA-8166) Not all data is loaded to Pig using CqlNativeStorage

2014-10-22 Thread Oksana Danylyshyn (JIRA)
Oksana Danylyshyn created CASSANDRA-8166:


 Summary: Not all data is loaded to Pig using CqlNativeStorage
 Key: CASSANDRA-8166
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8166
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Reporter: Oksana Danylyshyn
 Attachments: sorted.zip

Not all of the data in a Cassandra table is loaded into Pig when using the 
CqlNativeStorage function.

Steps to reproduce:

cql3 create table statement:

CREATE TABLE time_bucket_step (
  key varchar,
  object_id varchar,
  value varchar,
  PRIMARY KEY (key, object_id)
);

Loading and saving data to Cassandra ("sorted" file is in the attachment):

time_bucket_step = load 'sorted' using PigStorage('\t','-schema');

records = foreach time_bucket_step
  generate
    TOTUPLE(TOTUPLE('key', key),TOTUPLE('object_id', object_id)),
    TOTUPLE(value);

store records into 
'cql://socialdata/time_bucket_step?output_query=UPDATE+socialdata.time_bucket_step+set+value+%3D+%3F'
 using org.apache.cassandra.hadoop.pig.CqlNativeStorage();

Results:

Input(s):
Successfully read 139026 records (5817 bytes) from: "hdfs://.../sorted"
Output(s):
Successfully stored 139026 records in: 
"cql://socialdata/time_bucket_step?output_query=UPDATE+socialdata.time_bucket_step+set+value+%3D+%3F"
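
The output_query parameter in the store URL above is the CQL UPDATE statement 
in URL-encoded form (spaces as "+", "=" as %3D, "?" as %3F). A small Python 
sketch of that encoding, using only the standard library; build_store_url is a 
hypothetical helper written for illustration and is not part of Cassandra or Pig:

```python
from urllib.parse import quote_plus, unquote_plus


def build_store_url(keyspace, table, cql):
    """Build a cql:// store URL with a form-encoded output_query parameter."""
    return "cql://{}/{}?output_query={}".format(keyspace, table, quote_plus(cql))


statement = "UPDATE socialdata.time_bucket_step set value = ?"
url = build_store_url("socialdata", "time_bucket_step", statement)
print(url)

# Decoding the parameter recovers the original statement.
encoded = url.split("output_query=", 1)[1]
print(unquote_plus(encoded))
```

Decoding the parameter is a quick way to check that the statement Pig hands to 
the storage function is the one intended.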


Loading data from Cassandra: (note that not all data are read)

time_bucket_step_cass = load 'cql://socialdata/time_bucket_step' using 
org.apache.cassandra.hadoop.pig.CqlNativeStorage();
store time_bucket_step_cass into 'time_bucket_step_cass' using 
PigStorage('\t','-schema');

Results:

Input(s):
Successfully read 80727 records (20068 bytes) from: 
"cql://socialdata/time_bucket_step"
Output(s):
Successfully stored 80727 records (2098178 bytes) in: 
"hdfs:///time_bucket_step_cass"

Actual: only 80727 of 139026 records were loaded
Expected: All data should be loaded
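
The gap is 139026 - 80727 = 58299 records, roughly 42% of the table. One way 
to see which rows are missing is to compare the primary-key pairs in the file 
that was stored against the file written back from Cassandra. The sketch below 
is only an illustration: missing_keys is a hypothetical helper, and it assumes 
both files are tab-separated with key and object_id as the first two fields.

```python
def missing_keys(written_lines, read_lines, sep="\t"):
    """Return the (key, object_id) pairs present in written_lines but not in read_lines."""

    def key_pairs(lines):
        pairs = set()
        for line in lines:
            fields = line.rstrip("\n").split(sep)
            if len(fields) >= 2:
                pairs.add((fields[0], fields[1]))
        return pairs

    return key_pairs(written_lines) - key_pairs(read_lines)


# Tiny in-memory example; real use would pass open file handles instead.
written = ["a\t1\tx\n", "a\t2\ty\n", "b\t1\tz\n"]
read_back = ["a\t1\tx\n"]
print(sorted(missing_keys(written, read_back)))
```

Because the helper only iterates over lines, the two inputs can be open file 
objects for the local copies of the stored and re-exported data.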





[jira] [Commented] (CASSANDRA-8166) Not all data is loaded to Pig using CqlNativeStorage

2014-10-22 Thread Oksana Danylyshyn (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180224#comment-14180224
 ] 

Oksana Danylyshyn commented on CASSANDRA-8166:
--

Please also note that this was tried on version 2.1.0. 
All the data loads fine using CqlStorage with versions prior to 2.1.0-rc6. 
Since 2.1.0-rc6, CqlStorage fails with the error: Unable to find 
inputformat class 'org.apache.cassandra.hadoop.cql3.CqlPagingInputFormat'. 
CqlNativeStorage works without errors, but does not return all the data.
The issue also reproduces for tables with non-compound keys.

> Not all data is loaded to Pig using CqlNativeStorage
> 
>
> Key: CASSANDRA-8166
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8166
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Reporter: Oksana Danylyshyn
> Attachments: sorted.zip
>
>
> Not all the data from Cassandra table is loaded into Pig using 
> CqlNativeStorage function.
> Steps to reproduce:
> cql3 create table statement:
> CREATE TABLE time_bucket_step (
>   key varchar,
>   object_id varchar,
>   value varchar,
>   PRIMARY KEY (key, object_id)
> );
> Loading and saving data to Cassandra ("sorted" file is in the attachment):
> time_bucket_step = load 'sorted' using PigStorage('\t','-schema');
> records = foreach time_bucket_step
>   generate
> TOTUPLE(TOTUPLE('key', key),TOTUPLE('object_id', object_id)),
> TOTUPLE(value);
> store records into 
> 'cql://socialdata/time_bucket_step?output_query=UPDATE+socialdata.time_bucket_step+set+value+%3D+%3F'
>  using org.apache.cassandra.hadoop.pig.CqlNativeStorage();
> Results:
> Input(s):
> Successfully read 139026 records (5817 bytes) from: "hdfs://.../sorted"
> Output(s):
> Successfully stored 139026 records in: 
> "cql://socialdata/time_bucket_step?output_query=UPDATE+socialdata.time_bucket_step+set+value+%3D+%3F"
> Loading data from Cassandra: (note that not all data are read)
> time_bucket_step_cass = load 'cql://socialdata/time_bucket_step' using 
> org.apache.cassandra.hadoop.pig.CqlNativeStorage();
> store time_bucket_step_cass into 'time_bucket_step_cass' using 
> PigStorage('\t','-schema');
> Results:
> Input(s):
> Successfully read 80727 records (20068 bytes) from: 
> "cql://socialdata/time_bucket_step"
> Output(s):
> Successfully stored 80727 records (2098178 bytes) in: 
> "hdfs:///time_bucket_step_cass"
> Actual: only 80727 of 139026 records were loaded
> Expected: All data should be loaded





[jira] [Commented] (CASSANDRA-8166) Not all data is loaded to Pig using CqlNativeStorage

2014-10-22 Thread Oksana Danylyshyn (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180478#comment-14180478
 ] 

Oksana Danylyshyn commented on CASSANDRA-8166:
--

yes, its 139026
{code}cqlsh:socialdata> select count(*) from time_bucket_step limit 15;

 count

 139026{code}

> Not all data is loaded to Pig using CqlNativeStorage
> 
>
> Key: CASSANDRA-8166
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8166
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Reporter: Oksana Danylyshyn
>Assignee: Alex Liu
> Attachments: sorted.zip
>
>
> Not all the data from Cassandra table is loaded into Pig using 
> CqlNativeStorage function.
> Steps to reproduce:
> cql3 create table statement:
> CREATE TABLE time_bucket_step (
>   key varchar,
>   object_id varchar,
>   value varchar,
>   PRIMARY KEY (key, object_id)
> );
> Loading and saving data to Cassandra ("sorted" file is in the attachment):
> time_bucket_step = load 'sorted' using PigStorage('\t','-schema');
> records = foreach time_bucket_step
>   generate
> TOTUPLE(TOTUPLE('key', key),TOTUPLE('object_id', object_id)),
> TOTUPLE(value);
> store records into 
> 'cql://socialdata/time_bucket_step?output_query=UPDATE+socialdata.time_bucket_step+set+value+%3D+%3F'
>  using org.apache.cassandra.hadoop.pig.CqlNativeStorage();
> Results:
> Input(s):
> Successfully read 139026 records (5817 bytes) from: "hdfs://.../sorted"
> Output(s):
> Successfully stored 139026 records in: 
> "cql://socialdata/time_bucket_step?output_query=UPDATE+socialdata.time_bucket_step+set+value+%3D+%3F"
> Loading data from Cassandra: (note that not all data are read)
> time_bucket_step_cass = load 'cql://socialdata/time_bucket_step' using 
> org.apache.cassandra.hadoop.pig.CqlNativeStorage();
> store time_bucket_step_cass into 'time_bucket_step_cass' using 
> PigStorage('\t','-schema');
> Results:
> Input(s):
> Successfully read 80727 records (20068 bytes) from: 
> "cql://socialdata/time_bucket_step"
> Output(s):
> Successfully stored 80727 records (2098178 bytes) in: 
> "hdfs:///time_bucket_step_cass"
> Actual: only 80727 of 139026 records were loaded
> Expected: All data should be loaded





[jira] [Comment Edited] (CASSANDRA-8166) Not all data is loaded to Pig using CqlNativeStorage

2014-10-22 Thread Oksana Danylyshyn (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180478#comment-14180478
 ] 

Oksana Danylyshyn edited comment on CASSANDRA-8166 at 10/22/14 8:26 PM:


yes, it's 139026
{code}cqlsh:socialdata> select count(*) from time_bucket_step limit 15;

 count

 139026{code}


was (Author: oksana danylyshyn):
yes, its 139026
{code}cqlsh:socialdata> select count(*) from time_bucket_step limit 15;

 count

 139026{code}






[jira] [Commented] (CASSANDRA-8166) Not all data is loaded to Pig using CqlNativeStorage

2014-10-22 Thread Oksana Danylyshyn (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180527#comment-14180527
 ] 

Oksana Danylyshyn commented on CASSANDRA-8166:
--

Just added the schema file to the attachments.

> Not all data is loaded to Pig using CqlNativeStorage
> 
>
> Key: CASSANDRA-8166
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8166
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Reporter: Oksana Danylyshyn
>Assignee: Alex Liu
> Attachments: pig_header, pig_schema, sorted.zip
>
>
> Not all the data from Cassandra table is loaded into Pig using 
> CqlNativeStorage function.
> Steps to reproduce:
> cql3 create table statement:
> CREATE TABLE time_bucket_step (
>   key varchar,
>   object_id varchar,
>   value varchar,
>   PRIMARY KEY (key, object_id)
> );
> Loading and saving data to Cassandra ("sorted" file is in the attachment):
> time_bucket_step = load 'sorted' using PigStorage('\t','-schema');
> records = foreach time_bucket_step
>   generate
> TOTUPLE(TOTUPLE('key', key),TOTUPLE('object_id', object_id)),
> TOTUPLE(value);
> store records into 
> 'cql://socialdata/time_bucket_step?output_query=UPDATE+socialdata.time_bucket_step+set+value+%3D+%3F'
>  using org.apache.cassandra.hadoop.pig.CqlNativeStorage();
> Results:
> Input(s):
> Successfully read 139026 records (5817 bytes) from: "hdfs://.../sorted"
> Output(s):
> Successfully stored 139026 records in: 
> "cql://socialdata/time_bucket_step?output_query=UPDATE+socialdata.time_bucket_step+set+value+%3D+%3F"
> Loading data from Cassandra: (note that not all data are read)
> time_bucket_step_cass = load 'cql://socialdata/time_bucket_step' using 
> org.apache.cassandra.hadoop.pig.CqlNativeStorage();
> store time_bucket_step_cass into 'time_bucket_step_cass' using 
> PigStorage('\t','-schema');
> Results:
> Input(s):
> Successfully read 80727 records (20068 bytes) from: 
> "cql://socialdata/time_bucket_step"
> Output(s):
> Successfully stored 80727 records (2098178 bytes) in: 
> "hdfs:///time_bucket_step_cass"
> Actual: only 80727 of 139026 records were loaded
> Expected: All data should be loaded





[jira] [Updated] (CASSANDRA-8166) Not all data is loaded to Pig using CqlNativeStorage

2014-10-22 Thread Oksana Danylyshyn (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oksana Danylyshyn updated CASSANDRA-8166:
-
Attachment: pig_header
pig_schema






[jira] [Commented] (CASSANDRA-8166) Not all data is loaded to Pig using CqlNativeStorage

2014-10-23 Thread Oksana Danylyshyn (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181266#comment-14181266
 ] 

Oksana Danylyshyn commented on CASSANDRA-8166:
--

With the patch applied, the issue is fixed. Here is what I get when trying to 
load data from Cassandra now:
{code}Input(s):
Successfully read 139026 records (20068 bytes) from: 
"cql://socialdata/time_bucket_step"{code}

Thank you for the prompt fix.

> Not all data is loaded to Pig using CqlNativeStorage
> 
>
> Key: CASSANDRA-8166
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8166
> Project: Cassandra
>  Issue Type: Bug
>  Components: Hadoop
>Reporter: Oksana Danylyshyn
>Assignee: Alex Liu
> Attachments: 8166_2.1_branch.txt, pig_header, pig_schema, sorted.zip
>
>
> Not all the data from Cassandra table is loaded into Pig using 
> CqlNativeStorage function.
> Steps to reproduce:
> cql3 create table statement:
> CREATE TABLE time_bucket_step (
>   key varchar,
>   object_id varchar,
>   value varchar,
>   PRIMARY KEY (key, object_id)
> );
> Loading and saving data to Cassandra ("sorted" file is in the attachment):
>  time_bucket_step = load 'sorted' using PigStorage('\t') as (key:chararray, 
> object_id:chararray, value:chararray);
> records = foreach time_bucket_step
>   generate
> TOTUPLE(TOTUPLE('key', key),TOTUPLE('object_id', object_id)),
> TOTUPLE(value);
> store records into 
> 'cql://socialdata/time_bucket_step?output_query=UPDATE+socialdata.time_bucket_step+set+value+%3D+%3F'
>  using org.apache.cassandra.hadoop.pig.CqlNativeStorage();
> Results:
> Input(s):
> Successfully read 139026 records (5817 bytes) from: "hdfs://.../sorted"
> Output(s):
> Successfully stored 139026 records in: 
> "cql://socialdata/time_bucket_step?output_query=UPDATE+socialdata.time_bucket_step+set+value+%3D+%3F"
> Loading data from Cassandra: (note that not all data are read)
> time_bucket_step_cass = load 'cql://socialdata/time_bucket_step' using 
> org.apache.cassandra.hadoop.pig.CqlNativeStorage();
> store time_bucket_step_cass into 'time_bucket_step_cass' using 
> PigStorage('\t','-schema');
> Results:
> Input(s):
> Successfully read 80727 records (20068 bytes) from: 
> "cql://socialdata/time_bucket_step"
> Output(s):
> Successfully stored 80727 records (2098178 bytes) in: 
> "hdfs:///time_bucket_step_cass"
> Actual: only 80727 of 139026 records were loaded
> Expected: All data should be loaded


