Or is it because I'm using Pig 0.6, where the gz format is not supported?
I'll be running this on AWS EMR, where only Pig 0.6 is supported. Do I have
to use a later version of Pig?
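
One way to check whether the loader is being handed raw gzip bytes rather
than decompressed text: gzip streams always begin with the two bytes 0x1f
0x8b, and no JSON parser would accept those at position 0. Below is a minimal
Python sketch for checking a local copy of the file. It assumes one JSON
object per line, and the helper names are hypothetical, not part of
elephant-bird or Pig.

```python
import gzip
import json

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def looks_gzipped(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def read_json_lines(path):
    """Yield one parsed JSON object per non-empty line.

    Assumption: the file holds one JSON object per line. The file is
    decompressed on the fly when it looks gzipped, otherwise read as text.
    """
    opener = gzip.open if looks_gzipped(path) else open
    with opener(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)
```

If looks_gzipped('cc.json.gz') is true and read_json_lines('cc.json.gz')
parses cleanly, the data itself is fine and the problem is most likely that
the file is not being decompressed before it reaches the JSON parser.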

On Wed, May 18, 2011 at 11:12 AM, Dexin Wang <wangde...@gmail.com> wrote:

> Hi,
>
> Anyone using Twitter's elephantbird library? I was using its JsonLoader and
> got this error:
>
> WARN  com.twitter.elephantbird.pig.load.JsonLoader - Could not json-decode
> string
> Unexpected character () at position 0.
> at org.json.simple.parser.Yylex.yylex(Unknown Source)
> at org.json.simple.parser.JSONParser.nextToken(Unknown Source)
> at org.json.simple.parser.JSONParser.parse(Unknown Source)
> at org.json.simple.parser.JSONParser.parse(Unknown Source)
>
> But if I manually gunzip the file to a clear text json file, JsonLoader
> works fine.
>
> Again this fails:
>
> raw_json = LOAD 'cc.json.gz' USING
> com.twitter.elephantbird.pig.load.JsonLoader();
>
> this works:
>
> $ gunzip cc.json.gz
> raw_json = LOAD 'cc.json' USING
> com.twitter.elephantbird.pig.load.JsonLoader();
>
> Any suggestions for this? Or is there any other JSON loader library out
> there? I can write my own, but would rather use one if it already exists.
>
> Thanks,
>
> Dexin
>
