[ 
https://issues.apache.org/jira/browse/HAWQ-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15219495#comment-15219495
 ] 

Ian Hellstrom commented on HAWQ-178:
------------------------------------

I understand that this is fine for most 'social network' purposes; most of the 
examples I see are based on Twitter, at least. For many manufacturing and 
healthcare companies, though, it won't do. They have highly nested data 
structures, many of which are well-structured. I am working on many use cases 
(with Spark) that involve arrays of structs, with several further layers of 
arrays of structs nested within. Unnesting these is a must for some purposes, 
for instance when feeding the data to a BI tool.
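
To illustrate the kind of unnesting meant here, the following is a minimal 
sketch in plain Python (not Spark, Hive or HAWQ code). The record layout, the 
field names and the helper `unnest` are all made up for illustration. It turns 
arrays of structs, nested to any depth, into flat rows of the kind a BI tool 
expects.

```python
from typing import Any, Dict, Iterator


def unnest(record: Dict[str, Any], prefix: str = "") -> Iterator[Dict[str, Any]]:
    """Yield flat rows from a dict holding nested dicts and lists of dicts."""
    scalars: Dict[str, Any] = {}
    nested = []  # pairs of (column prefix, list of child structs)

    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # A single struct is handled like a one-element array of structs.
            nested.append((name + ".", [value]))
        elif isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
            nested.append((name + ".", value))
        else:
            # Scalars, empty lists and lists of scalars are kept as they are.
            scalars[name] = value

    # Start from the scalar columns and multiply out each nested array.
    rows = [scalars]
    for child_prefix, children in nested:
        expanded = []
        for row in rows:
            for child in children:
                for child_row in unnest(child, child_prefix):
                    expanded.append({**row, **child_row})
        rows = expanded
    yield from rows


# Hypothetical manufacturing record: a plant with lines, each with machines.
record = {
    "plant": "A",
    "lines": [
        {"line_id": 1, "machines": [{"machine_id": "m1", "temp": 70.5},
                                    {"machine_id": "m2", "temp": 68.0}]},
        {"line_id": 2, "machines": [{"machine_id": "m3", "temp": 71.2}]},
    ],
}

rows = list(unnest(record))
for row in rows:
    print(row)
```

Each output row carries the parent columns next to the innermost struct's 
columns, with dotted names such as `lines.machines.temp`. Note that two 
sibling arrays at the same level are combined as a cross product in this 
sketch, which may or may not be what a given use case needs.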

Support for unnesting very complex JSON is also hit-and-miss in Hive. Plus, 
if you already do the bulk of the work in Hive, you may not like the idea of 
using HAWQ (or something else) on top of that.

Having these structures as TEXT requires lots of messy regex. I'm just saying 
this so you know where I'm coming from.

> Add JSON plugin support in code base
> ------------------------------------
>
>                 Key: HAWQ-178
>                 URL: https://issues.apache.org/jira/browse/HAWQ-178
>             Project: Apache HAWQ
>          Issue Type: New Feature
>          Components: PXF
>            Reporter: Goden Yao
>            Assignee: Christian Tzolov
>             Fix For: backlog
>
>         Attachments: PXFJSONPluginforHAWQ2.0andPXF3.0.0.pdf, 
> PXFJSONPluginforHAWQ2.0andPXF3.0.0v.2.pdf, 
> PXFJSONPluginforHAWQ2.0andPXF3.0.0v.3.pdf
>
>
> JSON has been a popular format used in HDFS as well as in the community. 
> There have been a few JSON PXF plugins developed by the community, and we'd 
> like to see it incorporated into the code base as an optional package.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
