I see where you are going with this now....

Not sure if you might be bumping into this bug:
https://issues.apache.org/jira/browse/HIVE-1575

since this line
  LATERAL VIEW json_tuple(tr2.extended_entities, 'media') tr3 as media
pulls the JSON array as a "top-level" object...

does this not work?
  LATERAL VIEW json_tuple(tr1.text_col, 'id', 'extended_entities', 
'possibly_sensitive', 'extended_entities.media') tr2 as id, extended_entities, 
possibly_sensitive, media


From: Sam Joe [mailto:games2013....@gmail.com]
Sent: Tuesday, October 27, 2015 2:25 PM
To: user@hive.apache.org
Subject: Re: Using json_tuple for Nested json Arrays

Hi Ryan,

The simple query is running fine as shown below:

hive>   SELECT tr2.id<http://tr2.id>, tr2.possibly_sensitive
    >   FROM tweets_raw tr1
    >   LATERAL VIEW json_tuple(tr1.text_col, 'id', 'extended_entities', 
'possibly_sensitive') tr2 as id, extended_entities, possibly_sensitive
    >   where tr2.id<http://tr2.id>='654395184428515332'
    >   LIMIT 1;
OK
654395184428515332      false
Time taken: 1.813 seconds, Fetched: 1 row(s)
hive>


However, if I try to get any data from the json array it's failing.

Thanks,
Joel

On Tue, Oct 27, 2015 at 4:21 PM, Ryan Harris 
<ryan.har...@zionsbancorp.com<mailto:ryan.har...@zionsbancorp.com>> wrote:
looking at your sample data, you shouldn't need to use lateral view explode 
unless you are trying to get 1 entry per row for  your media sizes (thumb, 
small, large, medium, etc) ...

Try starting with something simple like :

SELECT get_json_object(text_col, '$.id') as id FROM tweets_raw limit 10;

You should also be able to use json_tuple(), but start simple....

From: Sam Joe [mailto:games2013....@gmail.com<mailto:games2013....@gmail.com>]
Sent: Tuesday, October 27, 2015 1:43 PM

To: user@hive.apache.org<mailto:user@hive.apache.org>
Subject: Re: Using json_tuple for Nested json Arrays

Hi Ryan,

Thanks for your reply! I didn't try using  get_json_object() UDF. I will try 
using that and let you know the results.

I tried using the following script which failed :

  SELECT tr2.id<http://tr2.id>, tr2.possibly_sensitive, tr3.media, 
media_object.source_user_id
  FROM tweets_raw tr1
  LATERAL VIEW json_tuple(tr1.text_col, 'id', 'extended_entities', 
'possibly_sensitive') tr2 as id, extended_entities, possibly_sensitive
  LATERAL VIEW json_tuple(tr2.extended_entities, 'media') tr3 as media
  LATERAL VIEW EXPLODE(tr3.media) media_exploded as media_object
  where tr2.id<http://tr2.id>='654395184428515332'
  LIMIT 1;

FAILED: UDFArgumentException explode() takes an array or a map as a parameter


Thanks,
Joel

On Tue, Oct 27, 2015 at 3:37 PM, Ryan Harris 
<ryan.har...@zionsbancorp.com<mailto:ryan.har...@zionsbancorp.com>> wrote:
Do you have an example of the query that you tried (which failed).
In short, you probably want to use the get_json_object() UDF:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-get_json_object

if you need the JSON array broken into individual records, you might require 
lateral view explode as in these examples:
http://mechanics.flite.com/blog/2014/04/16/using-explode-and-lateral-view-in-hive/
http://stackoverflow.com/questions/28716165/how-to-query-struct-array-with-hive-get-json-object


From: Sam Joe [mailto:games2013....@gmail.com<mailto:games2013....@gmail.com>]
Sent: Tuesday, October 27, 2015 1:29 PM
To: user@hive.apache.org<mailto:user@hive.apache.org>
Subject: Re: Using json_tuple for Nested json Arrays

I tried using EXPLODE function on the nested json array but it doesn't work and 
throws following error:

FAILED: UDFArgumentException explode() takes an array or a map as a parameter


Thanks,
Joel

On Tue, Oct 27, 2015 at 3:20 PM, Sam Joe 
<games2013....@gmail.com<mailto:games2013....@gmail.com>> wrote:
Hi,

Is it possible to use json_tuple function to extract data from json arrays 
(nested too). I am trying to process json data as string and avoid using serdes 
since user data may be malformed.

Please see a sample json data given below:


{
"filter_level": "low",
"retweeted": false,
"in_reply_to_screen_name": null,
"possibly_sensitive": false,
"truncated": false,
"lang": "en",
"in_reply_to_status_id_str": null,
"id": 654395184428515332,
"extended_entities": {
"media": [{
"sizes": {
"thumb": {
"w": 150,
"resize": "crop",
"h": 150
},
"small": {
"w": 340,
"resize": "fit",
"h": 255
},
"large": {
"w": 1024,
"resize": "fit",
"h": 768
},
"medium": {
"w": 600,
"resize": "fit",
"h": 450
}
},
"source_user_id": 16864598,
"media_url": "http://pbs.twimg.com/media/CRSL2MPWsAAOnZo.jpg";,
"display_url": "pic.twitter.com/i3004WyF4g<http://pic.twitter.com/i3004WyF4g>",
"type": "photo",
"url": "http://t.co/i3004WyF4g";,
"id": 654301608990388224,
"media_url_https": "https://pbs.twimg.com/media/CRSL2MPWsAAOnZo.jpg";,
"expanded_url": 
"http://twitter.com/lordlancaster/status/654301626665189376/photo/1";,
"source_user_id_str": "16864598",
"indices": [143,
144],
"source_status_id_str": "654301626665189376",
"source_status_id": 654301626665189376,
"id_str": "654301608990388224"
},
{
"sizes": {
"thumb": {
"w": 150,
"resize": "crop",
"h": 150
},
"small": {
"w": 340,
"resize": "fit",
"h": 255
},
"large": {
"w": 1024,
"resize": "fit",
"h": 768
},
"medium": {
"w": 600,
"resize": "fit",
"h": 450
}
},
"source_user_id": 16864598,
"media_url": "http://pbs.twimg.com/media/CRSL2MRWgAAGOcj.jpg";,
"display_url": "pic.twitter.com/i3004WyF4g<http://pic.twitter.com/i3004WyF4g>",
"type": "photo",
"url": "http://t.co/i3004WyF4g";,
"id": 654301608998764544,
"media_url_https": "https://pbs.twimg.com/media/CRSL2MRWgAAGOcj.jpg";,
"expanded_url": 
"http://twitter.com/lordlancaster/status/654301626665189376/photo/1";,
"source_user_id_str": "16864598",
"indices": [143,
144],
"source_status_id_str": "654301626665189376",
"source_status_id": 654301626665189376,
"id_str": "654301608998764544"
},
{
"sizes": {
"thumb": {
"w": 150,
"resize": "crop",
"h": 150
},
"small": {
"w": 340,
"resize": "fit",
"h": 255
},
"large": {
"w": 1024,
"resize": "fit",
"h": 768
},
"medium": {
"w": 600,
"resize": "fit",
"h": 450
}
},
"source_user_id": 16864598,
"media_url": "http://pbs.twimg.com/media/CRSL2MQWwAAP4Qo.jpg";,
"display_url": "pic.twitter.com/i3004WyF4g<http://pic.twitter.com/i3004WyF4g>",
"type": "photo",
"url": "http://t.co/i3004WyF4g";,
"id": 654301608994586624,
"media_url_https": "https://pbs.twimg.com/media/CRSL2MQWwAAP4Qo.jpg";,
"expanded_url": 
"http://twitter.com/lordlancaster/status/654301626665189376/photo/1";,
"source_user_id_str": "16864598",
"indices": [143,
144],
"source_status_id_str": "654301626665189376",
"source_status_id": 654301626665189376,
"id_str": "654301608994586624"
},
{
"sizes": {
"thumb": {
"w": 150,
"resize": "crop",
"h": 150
},
"small": {
"w": 340,
"resize": "fit",
"h": 255
},
"large": {
"w": 1024,
"resize": "fit",
"h": 768
},
"medium": {
"w": 600,
"resize": "fit",
"h": 450
}
},
"source_user_id": 16864598,
"media_url": "http://pbs.twimg.com/media/CRSL2M8WcAEXowZ.jpg";,
"display_url": "pic.twitter.com/i3004WyF4g<http://pic.twitter.com/i3004WyF4g>",
"type": "photo",
"url": "http://t.co/i3004WyF4g";,
"id": 654301609179115521,
"media_url_https": "https://pbs.twimg.com/media/CRSL2M8WcAEXowZ.jpg";,
"expanded_url": 
"http://twitter.com/lordlancaster/status/654301626665189376/photo/1";,
"source_user_id_str": "16864598",
"indices": [143,
144],
"source_status_id_str": "654301626665189376",
"source_status_id": 654301626665189376,
"id_str": "654301609179115521"
}]
}
}

Appreciate any help!

Thanks,
Joel



________________________________
THIS ELECTRONIC MESSAGE, INCLUDING ANY ACCOMPANYING DOCUMENTS, IS CONFIDENTIAL 
and may contain information that is privileged and exempt from disclosure under 
applicable law. If you are neither the intended recipient nor responsible for 
delivering the message to the intended recipient, please note that any 
dissemination, distribution, copying or the taking of any action in reliance 
upon the message is strictly prohibited. If you have received this 
communication in error, please notify the sender immediately. Thank you.

________________________________
THIS ELECTRONIC MESSAGE, INCLUDING ANY ACCOMPANYING DOCUMENTS, IS CONFIDENTIAL 
and may contain information that is privileged and exempt from disclosure under 
applicable law. If you are neither the intended recipient nor responsible for 
delivering the message to the intended recipient, please note that any 
dissemination, distribution, copying or the taking of any action in reliance 
upon the message is strictly prohibited. If you have received this 
communication in error, please notify the sender immediately. Thank you.


======================================================================
THIS ELECTRONIC MESSAGE, INCLUDING ANY ACCOMPANYING DOCUMENTS, IS CONFIDENTIAL 
and may contain information that is privileged and exempt from disclosure under 
applicable law. If you are neither the intended recipient nor responsible for 
delivering the message to the intended recipient, please note that any 
dissemination, distribution, copying or the taking of any action in reliance 
upon the message is strictly prohibited. If you have received this 
communication in error, please notify the sender immediately.  Thank you.

Reply via email to