I forgot that UDFContext provides the schema, and the Pig docs are out of date. It is no problem now.

As for default behavior for JSON, that would seem to be: tuples -> objects, bags -> arrays, integers -> long, decimals -> double, plus config options for choosing lower precision and substituting int/float. Maps are loaded separately anyway, and can use their own loadfunc. Easily loading JSON would be a huge boon to Pig's accessibility. I don't see a reason to postpone accessibility.

Russell Jurney twitter.com/rjurney russell.jur...@gmail.com datasyndrome.com

On Apr 10, 2012, at 7:53 AM, Dmitriy Ryaboy <dvrya...@gmail.com> wrote:

> first question: you can do this when outputSchema() is called, as it's
> passed the input schema. IIRC, in trunk you have hooks to pass that
> info to the backend in a udf.
>
> second question: see discussion on JsonLoader jira.. short answer:
> non-trivial, no clear decision on what the most sensible thing to do
> is (other than "map" which is unlikely to be what you want). Rather
> than do something bad and then be stuck with a poor decision, allowing
> people to provide their own schema instead for now.
>
> D
>
> On Tue, Apr 10, 2012 at 1:48 AM, Russell Jurney
> <russell.jur...@gmail.com> wrote:
>> Followup question: would it be nice if JsonLoader inferred schemas when
>> none is present, according to some defaults?
>>
>> On Tue, Apr 10, 2012 at 12:48 AM, Russell Jurney
>> <russell.jur...@gmail.com> wrote:
>>
>>> Is there a way to get the field names in an EvalFunc? I am close to done
>>> but... no cigar :) I need these to finish.
>>>
>>> On Mon, Apr 9, 2012 at 11:03 PM, Russell Jurney
>>> <russell.jur...@gmail.com> wrote:
>>>
>>>> So far this is not easy.
>>>>
>>>> On Mon, Apr 9, 2012 at 5:42 PM, Russell Jurney
>>>> <russell.jur...@gmail.com> wrote:
>>>>
>>>>> I see Jackson being used in the Mozilla stuff. It looks pretty
>>>>> straightforward.
>>>>>
>>>>> On Mon, Apr 9, 2012 at 5:38 PM, Dmitriy Ryaboy <dvrya...@gmail.com> wrote:
>>>>>
>>>>>> Jackson is your friend.
>>>>>>
>>>>>> On Mon, Apr 9, 2012 at 5:14 PM, Russell Jurney
>>>>>> <russell.jur...@gmail.com> wrote:
>>>>>>> I need to be able to JSONize and return json:chararray's of any pig
>>>>>>> datatypes, to be able to index complex types in ElasticSearch via
>>>>>>> Wonderdog. See: https://issues.apache.org/jira/browse/PIG-2641
>>>>>>>
>>>>>>> Does anyone have existing code they can contribute to a toJSON UDF
>>>>>>> that handles all these types?
>>>>>>>
>>>>>>> For instance, Mozilla has this Map to JSON UDF:
>>>>>>>
>>>>>>> https://github.com/mozilla-metrics/akela/blob/master/src/main/java/com/mozilla/pig/eval/json/MapToJson.java
>>>>>>>
>>>>>>> It is apache licensed, so I think I can paste it into a general
>>>>>>> toJSON UDF?
>>>>>>>
>>>>>>> Elephant-bird has this code, which turns JSON to Maps:
>>>>>>>
>>>>>>> https://github.com/kevinweil/elephant-bird/blob/master/src/java/com/twitter/elephantbird/pig/piggybank/JsonStringToMap.java
>>>>>>>
>>>>>>> ehh... thinking out loud... I'm just gonna do this in JRuby. If that
>>>>>>> has issues, Python.
>>>>>>>
>>>>>>> Solved! :)
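
For illustration, the default mapping proposed at the top of this message (JSON objects as tuples, arrays as bags, integers as long, decimals as double) could be sketched roughly as below. This is a Python illustration of the idea only, not how JsonLoader behaves. The function name infer_type, the "value" field name used to wrap scalar array elements, the choice to look only at the first array element, and the chararray/boolean/bytearray cases are all assumptions that do not come from the thread.

```python
import json


def infer_type(value):
    """Map a parsed JSON value to a Pig-style type string (illustrative only)."""
    if isinstance(value, bool):       # check before int: bool subclasses int in Python
        return "boolean"              # assumption: not discussed in the thread
    if isinstance(value, int):
        return "long"                 # integers -> long
    if isinstance(value, float):
        return "double"               # decimals -> double
    if isinstance(value, str):
        return "chararray"            # assumption: strings load as chararray
    if isinstance(value, dict):
        # JSON object -> tuple whose fields are named after the object's keys
        fields = ", ".join(f"{key}:{infer_type(val)}" for key, val in value.items())
        return f"tuple({fields})"
    if isinstance(value, list):
        # JSON array -> bag
        if not value:
            return "bag{}"
        # Assumption: every element has the same shape as the first one.
        inner = infer_type(value[0])
        if not inner.startswith("tuple("):
            # Assumption: wrap non-object elements in a one-field tuple,
            # since bags hold tuples.
            inner = f"tuple(value:{inner})"
        return f"bag{{{inner}}}"
    return "bytearray"                # assumption: null or unknown values


if __name__ == "__main__":
    record = json.loads('{"id": 1, "score": 2.5, "tags": ["a", "b"]}')
    print(infer_type(record))
```

Mixed-type arrays, nulls, and records whose keys differ from line to line are the cases where a single default stops being obvious, which is presumably part of why the JsonLoader discussion called this non-trivial.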