[ 
https://issues.apache.org/jira/browse/PIG-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Packer updated PIG-1914:
---------------------------------

                 Tags: JSON LoadFunc StoreFunc  (was: JSON LoadFunc)
    Affects Version/s:     (was: 0.9.0)
                           (was: 0.8.0)
                       0.11
         Release Note: Adds Piggybank functions for loading/storing JSON 
without relying on storing metadata alongside it.  (was: Adds support for 
loading JSON data in Pig)
               Status: Patch Available  (was: Open)

Hi, I submitted a patch implementing JSON load and store functions that do not 
rely on metadata being stored alongside the data. Each function has Javadoc 
documentation, but here is a summary of the features.

The JsonLoader can either be passed a schema as a string argument, or it can 
infer a schema if none is provided. If passed a schema, it will load fields in 
the JSON which match the field names in the schema, ignoring extra fields, 
writing nulls for missing fields, and handling out-of-order fields properly.
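
To make the schema-driven behavior concrete, here is a rough Python model of 
it. This is not the patch's Java code: the SCHEMA encoding and the names 
project and load_line are hypothetical, type casting (e.g. to int) is not 
modeled, and returning a single null for a missing nested tuple is an 
assumption.

```python
import json

# Hypothetical encoding of 'a: int, t: (i: int, j: int)':
# each entry is (field name, nested schema or None for a scalar field).
SCHEMA = [("a", None), ("t", [("i", None), ("j", None)])]

def project(obj, schema):
    """Pick the schema's fields out of a parsed JSON object, in schema order."""
    if not isinstance(obj, dict):
        # Missing or non-object value: modeled here as a single null.
        return None
    out = []
    for name, nested in schema:
        value = obj.get(name)  # None stands in for a Pig null
        if nested is not None:
            value = project(value, nested)
        out.append(value)
    # Fields not named in the schema are never looked at, so they are dropped.
    return tuple(out)

def load_line(line, schema=SCHEMA):
    """Parse one JSON document and project it onto the schema."""
    return project(json.loads(line), schema)

full = load_line('{"t": {"j": 2, "i": 1}, "extra": "x", "a": 7}')
partial = load_line('{"a": 7}')
```

In this model, field order in the input does not matter because values are 
looked up by name, and the "extra" field is ignored because it is not in the 
schema.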

If not passed a schema, it will load the entire document as a map. The values 
of the map will either be bytearrays (for scalar values) or further maps/bags 
(for nested objects and arrays).
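
The schema-less case can be modeled the same way. Again this is only a Python 
sketch under assumptions, not the patch's implementation: to_pig_like and 
load_document are hypothetical names, a dict stands in for a Pig map, a list 
stands in for a bag, bytes stands in for a bytearray, and the exact byte 
encoding of scalars is a guess.

```python
import json

def to_pig_like(value):
    """Convert parsed JSON into map / bag / bytearray stand-ins."""
    if isinstance(value, dict):
        # Nested object -> map
        return {key: to_pig_like(val) for key, val in value.items()}
    if isinstance(value, list):
        # Array -> bag (modeled as a plain list)
        return [to_pig_like(item) for item in value]
    if value is None:
        return None
    # Scalar -> bytearray; strings are encoded as-is, other scalars
    # via their JSON text (assumption).
    text = value if isinstance(value, str) else json.dumps(value)
    return text.encode("utf-8")

def load_document(line):
    """Load a whole JSON document without a schema."""
    return to_pig_like(json.loads(line))

doc = load_document('{"a": 7, "t": {"i": 1}, "xs": ["p", true]}')
```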

Example usage:

json = LOAD '$INPUT_PATH' USING org.apache.pig.piggybank.storage.JsonLoader('a: int, t: (i: int, j: int)');

STORE json INTO '$OUTPUT_PATH' USING org.apache.pig.piggybank.storage.JsonStorage();

Jonathan Packer (Mortar Data)
                
> Support load/store JSON data in Pig
> -----------------------------------
>
>                 Key: PIG-1914
>                 URL: https://issues.apache.org/jira/browse/PIG-1914
>             Project: Pig
>          Issue Type: New Feature
>    Affects Versions: 0.11
>            Reporter: Chao Tian
>         Attachments: json.patch, PIG-1914.patch
>
>
> JSON is a commonly used data storage format. It is popular for storing 
> structured data, especially for JavaScript data exchange. 
> Pig should be able to load and store JSON data. I plan to write a load/store 
> function for Piggybank.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
