In case anyone finds this thread because of the same issue, please vote
for this bug:
https://issues.apache.org/jira/browse/PIG-1271
On Tue, Nov 6, 2012 at 4:50 PM, William Oberman wrote:
Wow, ok. That is completely unexpected. Thanks for the heads up!
In my case, because part of my data is binary (UUIDs from Cassandra), all
possible characters can appear in the data, making PigStorage unhelpful
;-)
I just tried AvroStorage in piggybank, and it is able to store/load my
data
>> This is a dumb question, but PigStorage escapes the delimiter, right?
No it doesn't.
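A minimal sketch of why this matters, written in Python rather than Pig. It is only a toy model of the behavior described here (fields written verbatim, then split on every delimiter when read back), not PigStorage's actual code; the helper names `store` and `load` and the sample values are made up for illustration.

```python
# Toy model of a delimited text format that does NOT escape its delimiter.
# This is not PigStorage's code; it only imitates the behavior described above.

DELIM = "\t"  # PigStorage's default field delimiter is a tab


def store(fields):
    """Join fields with the delimiter, writing each value verbatim."""
    return DELIM.join(fields)


def load(line):
    """Split a line back into fields on every delimiter occurrence."""
    return line.split(DELIM)


clean = ["key1", "value"]
dirty = ["key2", "va\tlue"]  # hypothetical value that happens to contain a tab

round_trip_clean = load(store(clean))
round_trip_dirty = load(store(dirty))

print(round_trip_clean)  # two fields, same as what was written
print(round_trip_dirty)  # three fields: the embedded tab split the value
```

With arbitrary binary values, no single-character delimiter is safe, which is why a format that carries its own field boundaries is the more robust choice.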
On Tue, Nov 6, 2012 at 1:29 PM, William Oberman wrote:
This is a dumb question, but PigStorage escapes the delimiter, right? I
was assuming I didn't have to select a delimiter such that it doesn't
appear in the data as it would get escaped by the export process, and
unescaped in the import process
On Tue, Nov 6, 2012 at 4:01 PM, Cheolsoo Park wrote:
Hi Will,
>> data = LOAD 'hdfs://ZZZ/tmp/test' USING PigStorage() AS
(key:chararray,columns:bag {column:tuple (name, value)});
Could you please provide a sample of the data in this file
(hdfs://ZZZ/tmp/test) so that we can reproduce your problem? One or two
rows would be sufficient.
Thanks,
Cheolsoo
I'm trying to play around with Amazon EMR, and I currently have self-hosted
Cassandra as the source of data. I was going to try: Cassandra -> S3
-> EMR. I've traced my problems to PigStorage. At this point I can
recreate my problem "locally" without involving S3 or Amazon.
In my local tes