Hi Ryan, The simple query is running fine as shown below:
hive> SELECT tr2.id, tr2.possibly_sensitive > FROM tweets_raw tr1 > LATERAL VIEW json_tuple(tr1.text_col, 'id', 'extended_entities', 'possibly_sensitive') tr2 as id, extended_entities, possibly_sensitive > where tr2.id='654395184428515332' > LIMIT 1; OK 654395184428515332 false Time taken: 1.813 seconds, Fetched: 1 row(s) hive> However, if I try to get any data from the json array it's failing. Thanks, Joel On Tue, Oct 27, 2015 at 4:21 PM, Ryan Harris <ryan.har...@zionsbancorp.com> wrote: > looking at your sample data, you shouldn't need to use lateral view > explode unless you are trying to get 1 entry per row for your media sizes > (thumb, small, large, medium, etc) ... > > > > Try starting with something simple like : > > > > SELECT get_json_object(text_col, '$.id') as id FROM tweets_raw limit 10; > > > > You should also be able to use json_tuple(), but start simple.... > > > > *From:* Sam Joe [mailto:games2013....@gmail.com] > *Sent:* Tuesday, October 27, 2015 1:43 PM > > *To:* user@hive.apache.org > *Subject:* Re: Using json_tuple for Nested json Arrays > > > > Hi Ryan, > > > > Thanks for your reply! I didn't try using get_json_object() UDF. I will > try using that and let you know the results. > > > > I tried using the following script which failed : > > > > SELECT tr2.id, tr2.possibly_sensitive, tr3.media, > media_object.source_user_id > > FROM tweets_raw tr1 > > LATERAL VIEW json_tuple(tr1.text_col, 'id', 'extended_entities', > 'possibly_sensitive') tr2 as id, extended_entities, possibly_sensitive > > LATERAL VIEW json_tuple(tr2.extended_entities, 'media') tr3 as media > > LATERAL VIEW EXPLODE(tr3.media) media_exploded as media_object > > where tr2.id='654395184428515332' > > LIMIT 1; > > > > FAILED: UDFArgumentException explode() takes an array or a map as a > parameter > > > > > > Thanks, > > Joel > > > > On Tue, Oct 27, 2015 at 3:37 PM, Ryan Harris <ryan.har...@zionsbancorp.com> > wrote: > > Do you have an example of the query that you tried (which failed). > > In short, you probably want to use the get_json_object() UDF: > > > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-get_json_object > > > > if you need the JSON array broken into individual records, you might > require lateral view explode as in these examples: > > > http://mechanics.flite.com/blog/2014/04/16/using-explode-and-lateral-view-in-hive/ > > > http://stackoverflow.com/questions/28716165/how-to-query-struct-array-with-hive-get-json-object > > > > > > *From:* Sam Joe [mailto:games2013....@gmail.com] > *Sent:* Tuesday, October 27, 2015 1:29 PM > *To:* user@hive.apache.org > *Subject:* Re: Using json_tuple for Nested json Arrays > > > > I tried using EXPLODE function on the nested json array but it doesn't > work and throws following error: > > > > FAILED: UDFArgumentException explode() takes an array or a map as a > parameter > > > > > > Thanks, > > Joel > > > > On Tue, Oct 27, 2015 at 3:20 PM, Sam Joe <games2013....@gmail.com> wrote: > > Hi, > > > > Is it possible to use json_tuple function to extract data from json arrays > (nested too). I am trying to process json data as string and avoid using > serdes since user data may be malformed. > > > > Please see a sample json data given below: > > > > > > { > > "filter_level": "low", > > "retweeted": false, > > "in_reply_to_screen_name": null, > > "possibly_sensitive": false, > > "truncated": false, > > "lang": "en", > > "in_reply_to_status_id_str": null, > > "id": 654395184428515332, > > "extended_entities": { > > "media": [{ > > "sizes": { > > "thumb": { > > "w": 150, > > "resize": "crop", > > "h": 150 > > }, > > "small": { > > "w": 340, > > "resize": "fit", > > "h": 255 > > }, > > "large": { > > "w": 1024, > > "resize": "fit", > > "h": 768 > > }, > > "medium": { > > "w": 600, > > "resize": "fit", > > "h": 450 > > } > > }, > > "source_user_id": 16864598, > > "media_url": "http://pbs.twimg.com/media/CRSL2MPWsAAOnZo.jpg", > > "display_url": "pic.twitter.com/i3004WyF4g", > > "type": "photo", > > "url": "http://t.co/i3004WyF4g", > > "id": 654301608990388224, > > "media_url_https": "https://pbs.twimg.com/media/CRSL2MPWsAAOnZo.jpg", > > "expanded_url": " > http://twitter.com/lordlancaster/status/654301626665189376/photo/1", > > "source_user_id_str": "16864598", > > "indices": [143, > > 144], > > "source_status_id_str": "654301626665189376", > > "source_status_id": 654301626665189376, > > "id_str": "654301608990388224" > > }, > > { > > "sizes": { > > "thumb": { > > "w": 150, > > "resize": "crop", > > "h": 150 > > }, > > "small": { > > "w": 340, > > "resize": "fit", > > "h": 255 > > }, > > "large": { > > "w": 1024, > > "resize": "fit", > > "h": 768 > > }, > > "medium": { > > "w": 600, > > "resize": "fit", > > "h": 450 > > } > > }, > > "source_user_id": 16864598, > > "media_url": "http://pbs.twimg.com/media/CRSL2MRWgAAGOcj.jpg", > > "display_url": "pic.twitter.com/i3004WyF4g", > > "type": "photo", > > "url": "http://t.co/i3004WyF4g", > > "id": 654301608998764544, > > "media_url_https": "https://pbs.twimg.com/media/CRSL2MRWgAAGOcj.jpg", > > "expanded_url": " > http://twitter.com/lordlancaster/status/654301626665189376/photo/1", > > "source_user_id_str": "16864598", > > "indices": [143, > > 144], > > "source_status_id_str": "654301626665189376", > > "source_status_id": 654301626665189376, > > "id_str": "654301608998764544" > > }, > > { > > "sizes": { > > "thumb": { > > "w": 150, > > "resize": "crop", > > "h": 150 > > }, > > "small": { > > "w": 340, > > "resize": "fit", > > "h": 255 > > }, > > "large": { > > "w": 1024, > > "resize": "fit", > > "h": 768 > > }, > > "medium": { > > "w": 600, > > "resize": "fit", > > "h": 450 > > } > > }, > > "source_user_id": 16864598, > > "media_url": "http://pbs.twimg.com/media/CRSL2MQWwAAP4Qo.jpg", > > "display_url": "pic.twitter.com/i3004WyF4g", > > "type": "photo", > > "url": "http://t.co/i3004WyF4g", > > "id": 654301608994586624, > > "media_url_https": "https://pbs.twimg.com/media/CRSL2MQWwAAP4Qo.jpg", > > "expanded_url": " > http://twitter.com/lordlancaster/status/654301626665189376/photo/1", > > "source_user_id_str": "16864598", > > "indices": [143, > > 144], > > "source_status_id_str": "654301626665189376", > > "source_status_id": 654301626665189376, > > "id_str": "654301608994586624" > > }, > > { > > "sizes": { > > "thumb": { > > "w": 150, > > "resize": "crop", > > "h": 150 > > }, > > "small": { > > "w": 340, > > "resize": "fit", > > "h": 255 > > }, > > "large": { > > "w": 1024, > > "resize": "fit", > > "h": 768 > > }, > > "medium": { > > "w": 600, > > "resize": "fit", > > "h": 450 > > } > > }, > > "source_user_id": 16864598, > > "media_url": "http://pbs.twimg.com/media/CRSL2M8WcAEXowZ.jpg", > > "display_url": "pic.twitter.com/i3004WyF4g", > > "type": "photo", > > "url": "http://t.co/i3004WyF4g", > > "id": 654301609179115521, > > "media_url_https": "https://pbs.twimg.com/media/CRSL2M8WcAEXowZ.jpg", > > "expanded_url": " > http://twitter.com/lordlancaster/status/654301626665189376/photo/1", > > "source_user_id_str": "16864598", > > "indices": [143, > > 144], > > "source_status_id_str": "654301626665189376", > > "source_status_id": 654301626665189376, > > "id_str": "654301609179115521" > > }] > > } > > } > > > > Appreciate any help! > > > > *Thanks,* > > *Joel* > > > > > > > ------------------------------ > > THIS ELECTRONIC MESSAGE, INCLUDING ANY ACCOMPANYING DOCUMENTS, IS > CONFIDENTIAL and may contain information that is privileged and exempt from > disclosure under applicable law. If you are neither the intended recipient > nor responsible for delivering the message to the intended recipient, > please note that any dissemination, distribution, copying or the taking of > any action in reliance upon the message is strictly prohibited. If you have > received this communication in error, please notify the sender immediately. > Thank you. > > > ------------------------------ > THIS ELECTRONIC MESSAGE, INCLUDING ANY ACCOMPANYING DOCUMENTS, IS > CONFIDENTIAL and may contain information that is privileged and exempt from > disclosure under applicable law. If you are neither the intended recipient > nor responsible for delivering the message to the intended recipient, > please note that any dissemination, distribution, copying or the taking of > any action in reliance upon the message is strictly prohibited. If you have > received this communication in error, please notify the sender immediately. > Thank you. >