I forgot about UDFContext providing the schema, and the Pig docs are
out of date. It is no longer a problem.

As for default behavior for JSON, that would seem to be: tuples ->
objects, bags -> arrays, integers -> long, decimals -> double, with
config options to substitute the lower-precision int/float. Maps are
loaded separately anyway, and can use their own LoadFunc.
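
A minimal sketch of what such inference defaults might look like, read in
the loading direction (JSON objects become tuples, arrays become bags). It
is written in Python for brevity and is not JsonLoader's actual behavior.
The function name infer_pig_type, the wide flag, the bytearray fallback for
nulls and empty arrays, and the informal tuple(...)/bag{...} notation are
all assumptions made for illustration.

```python
import json


def infer_pig_type(value, wide=True):
    """Guess a Pig-style type name for a parsed JSON value (hypothetical helper).

    wide=True uses long/double; wide=False substitutes int/float, standing in
    for the low-precision config option mentioned above.
    """
    # Check bool before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long" if wide else "int"
    if isinstance(value, float):
        return "double" if wide else "float"
    if isinstance(value, str):
        return "chararray"
    if isinstance(value, dict):
        # JSON object -> tuple with one named field per key.
        fields = ", ".join(
            "{}: {}".format(key, infer_pig_type(val, wide))
            for key, val in value.items()
        )
        return "tuple({})".format(fields)
    if isinstance(value, list):
        # JSON array -> bag; the element type is guessed from the first item only.
        inner = infer_pig_type(value[0], wide) if value else "bytearray"
        if not inner.startswith("tuple("):
            inner = "tuple({})".format(inner)  # bags hold tuples
        return "bag{{{}}}".format(inner)
    # null or anything unrecognized: assumed fallback, not a settled default
    return "bytearray"


record = json.loads('{"id": 1, "score": 2.5, "tags": ["a"], "user": {"name": "x"}}')
print(infer_pig_type(record))
```

Guessing a bag's element type from the first item alone is the weakest part
of this sketch; mixed-type arrays are one reason the defaults are hard to
settle.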

Easily loading JSON would be a huge boon to Pig's accessibility. I
don't see a reason to postpone accessibility.

Russell Jurney
twitter.com/rjurney
russell.jur...@gmail.com
datasyndrome.com

On Apr 10, 2012, at 7:53 AM, Dmitriy Ryaboy <dvrya...@gmail.com> wrote:

> first question: you can do this when outputSchema() is called, as it's
> passed the input schema. IIRC, in trunk you have hooks to pass that
> info to the backend in a udf.
>
> second question: see the discussion on the JsonLoader JIRA. Short answer:
> it's non-trivial, and there is no clear decision on what the most sensible
> thing to do is (other than "map", which is unlikely to be what you want).
> Rather than do something bad and then be stuck with a poor decision, the
> idea is to let people provide their own schema instead for now.
>
> D
>
> On Tue, Apr 10, 2012 at 1:48 AM, Russell Jurney
> <russell.jur...@gmail.com> wrote:
>> Followup question: would it be nice if JsonLoader inferred schemas when
>> none is present, according to some defaults?
>>
>> On Tue, Apr 10, 2012 at 12:48 AM, Russell Jurney
>> <russell.jur...@gmail.com> wrote:
>>
>>> Is there a way to get the field names in an EvalFunc? I am close to done
>>> but... no cigar :)  I need these to finish.
>>>
>>>
>>> On Mon, Apr 9, 2012 at 11:03 PM, Russell Jurney 
>>> <russell.jur...@gmail.com> wrote:
>>>
>>>> So far this is not easy.
>>>>
>>>>
>>>> On Mon, Apr 9, 2012 at 5:42 PM, Russell Jurney 
>>>> <russell.jur...@gmail.com> wrote:
>>>>
>>>>> I see Jackson being used in the Mozilla stuff.  It looks pretty
>>>>> straightforward.
>>>>>
>>>>>
>>>>> On Mon, Apr 9, 2012 at 5:38 PM, Dmitriy Ryaboy <dvrya...@gmail.com> wrote:
>>>>>
>>>>>> Jackson is your friend.
>>>>>>
>>>>>> On Mon, Apr 9, 2012 at 5:14 PM, Russell Jurney <
>>>>>> russell.jur...@gmail.com> wrote:
>>>>>>> I need to be able to JSONize and return json:chararray's of any pig
>>>>>>> datatypes, to be able to index complex types in ElasticSearch via
>>>>>>> Wonderdog.  See: https://issues.apache.org/jira/browse/PIG-2641
>>>>>>>
>>>>>>> Does anyone have existing code they can contribute to a toJSON UDF that
>>>>>>> handles all these types?
>>>>>>>
>>>>>>> For instance, Mozilla has this Map to JSON UDF:
>>>>>>> https://github.com/mozilla-metrics/akela/blob/master/src/main/java/com/mozilla/pig/eval/json/MapToJson.java
>>>>>>>
>>>>>>> It is Apache licensed, so I think I can paste it into a general toJSON UDF?
>>>>>>>
>>>>>>> Elephant-bird has this code, which turns JSON to Maps:
>>>>>>> https://github.com/kevinweil/elephant-bird/blob/master/src/java/com/twitter/elephantbird/pig/piggybank/JsonStringToMap.java
>>>>>>>
>>>>>>> ehh... thinking out loud... I'm just gonna do this in JRuby. If that has
>>>>>>> issues, Python.
>>>>>>>
>>>>>>> Solved! :)