Re: Having troubles with PigStorage

2012-11-06 Thread William Oberman
In case someone finds this thread because they have the same issue, please vote for this bug: https://issues.apache.org/jira/browse/PIG-1271 On Tue, Nov 6, 2012 at 4:50 PM, William Oberman wrote: > Wow, ok. That is completely unexpected. Thanks for the heads up! > > In my case, because part of my

Re: Having troubles with PigStorage

2012-11-06 Thread William Oberman
Wow, ok. That is completely unexpected. Thanks for the heads up! In my case, because part of my data is binary (UUIDs from Cassandra), any character can appear in the data, making PigStorage unhelpful ;-) I just tried AvroStorage in piggybank and that is able to store/load my data

Re: Having troubles with PigStorage

2012-11-06 Thread Cheolsoo Park
>> This is a dumb question, but PigStorage escapes the delimiter, right? No, it doesn't. On Tue, Nov 6, 2012 at 1:29 PM, William Oberman wrote: > This is a dumb question, but PigStorage escapes the delimiter, right? I > was assuming I didn't have to select a delimiter such that it doesn't > appe
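
To illustrate the point above, here is a minimal Python sketch of what happens when a delimited format is written and read back without escaping. It is not Pig's actual code: `store_row` and `load_row` are hypothetical stand-ins, and the tab delimiter is only assumed for the example. A field that contains the delimiter comes back split into extra fields.

```python
# Hypothetical stand-ins for an unescaped delimited writer/reader.
# This is NOT PigStorage's implementation, only a sketch of the failure mode.
DELIM = "\t"  # assumed delimiter for this example


def store_row(fields, delim=DELIM):
    """Join fields with the delimiter, applying no escaping."""
    return delim.join(fields)


def load_row(line, delim=DELIM):
    """Split a line on every occurrence of the delimiter."""
    return line.split(delim)


clean = ["key1", "value"]
dirty = ["key1", "val\tue"]  # second field happens to contain the delimiter

clean_back = load_row(store_row(clean))
dirty_back = load_row(store_row(dirty))

print("clean round-trips:", clean_back == clean)
print("dirty fields written/read:", len(dirty), len(dirty_back))
```

With arbitrary binary data, no single-character delimiter is guaranteed to be absent from the fields, which is why a format that records field boundaries explicitly tends to be the safer choice.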

Re: Having troubles with PigStorage

2012-11-06 Thread William Oberman
This is a dumb question, but PigStorage escapes the delimiter, right? I was assuming I didn't have to select a delimiter that doesn't appear in the data, as it would get escaped by the export process and unescaped in the import process. On Tue, Nov 6, 2012 at 4:01 PM, Cheolsoo Park w

Re: Having troubles with PigStorage

2012-11-06 Thread Cheolsoo Park
Hi Will, >> data = LOAD 'hdfs://ZZZ/tmp/test' USING PigStorage() AS (key:chararray,columns:bag {column:tuple (name, value)}); Can you please provide some of the data from this file (hdfs://ZZZ/tmp/test) so that we can reproduce your problem? One or two rows would be sufficient. Thanks, Cheolsoo

Having troubles with PigStorage

2012-11-06 Thread William Oberman
I'm trying to play around with Amazon EMR, and I currently have self-hosted Cassandra as the source of data. I was going to try to do: Cassandra -> S3 -> EMR. I've traced my problems to PigStorage. At this point I can recreate my problem "locally" without involving S3 or Amazon. In my local tes