Re: Hive select shows null after successful data load

Sunita Arvind Fri, 21 Jun 2013 01:43:45 -0700

Yes values is the outer most array. Probably array < struct < struct is the
max level of nesting possible. Any number of structs can be nested, but
internal arrays seem to be an issue. The ones that failed had, array<
struct < struct < array < struct. This broke the serde.


Regarding pretty printing, I tried something like this:

SELECT
   records.company.name as company,
   records.position.title as jobTitle,
   records.locationDescription as Place
FROM
   linkedin_jobsearch
   LATERAL VIEW explode(jobs.values) exploded_table as records;

Its somewhat closer to what I want , but needs improvisation. Will explore
beeline. Thanks for the info.

regards
Sunita


On Thu, Jun 20, 2013 at 10:19 AM, Stephen Sprague <sprag...@gmail.com>wrote:

> hooray!   over one hurdle and onto the next one.   So something about that
> one nested array caused the problem.  very strange. I wonder if there is a
> smaller test case to look at as it seems not all arrays break it since i
> see one for the attribute "values".
>
> As to the formatting issue i don't believe the native hive client has much
> to offer there. its bare bones and record oriented.   beeline seems to
> another opensource hive client which looks to have more options you might
> have a gander at that though i don't think it has anything special for
> pretty printing arrays, maps or structs but i could be wrong.
>
> And then of course nothing stopping you though from exploring piping that
> gnarly stuff into python (or whatever) and have it come out the other end
> all nice and pretty -- and then posting that here. :)
>
>
> On Wed, Jun 19, 2013 at 7:54 PM, Sunita Arvind <sunitarv...@gmail.com>wrote:
>
>> Finally I could get it work. The issue resolves once I remove the arrays
>> within position structure. So that is the limitation of the serde. I
>> changed 'industries' to string and 'jobfunctions' to Map<string,string> I
>> can query the table just fine now. Here is the complete DDL for reference:
>>
>> create external table linkedin_Jobsearch (
>>
>> jobs STRUCT<
>> values : ARRAY<STRUCT<
>> company : STRUCT<
>> id : STRING,
>> name : STRING>,
>> postingDate : STRUCT<
>> year : STRING,
>> day : STRING,
>> month : STRING>,
>> descriptionSnippet : STRING,
>> expirationDate : STRUCT<
>> year : STRING,
>> day : STRING,
>> month : STRING>,
>> position : STRUCT<
>> jobFunctions : MAP<STRING,STRING>,    ------these were arrays of
>> structure in my previous attempts
>> industries : STRING,
>> title : STRING,
>>
>> jobType : STRUCT<
>> code : STRING,
>> name : STRING>,
>> experienceLevel : STRUCT<
>> code : STRING,
>> name : STRING>>,
>> id : STRING,
>> customerJobCode : STRING,
>> skillsAndExperience : STRING,
>> salary : STRING,
>> jobPoster : STRUCT<
>> id : STRING,
>> firstName : STRING,
>> lastName : STRING,
>> headline : STRING>,
>> referralBonus : STRING,
>> locationDescription : STRING>>>
>> )
>> ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
>> LOCATION '/user/sunita/tables/jobs';
>>
>> Thanks Stephen for sharing your thoughts. It helped.
>>
>> Also if someone /Stephen could help me display this information in a
>> useful manner, that would be great. Right now all the values show up as
>> arrays. Here is what I mean:
>> For a query like this:
>> hive> select jobs.values.company.name, jobs.values.position.title,
>> jobs.values.locationdescription from linkedin_jobsearch;
>>
>> This is the output:
>>
>> ["CyberCoders","CyberCoders","CyberCoders","Management Science
>> Associates","Google","Google","CyberCoders","CyberCoders","HP","Sigmaways","Global
>> Data Consultancy","Global Data
>> Consultancy","CyberCoders","CyberCoders","CyberCoders","VMware","CD IT
>> Recruitment","CD IT Recruitment","Digital Reasoning Systems","AOL"]
>> ["Software Engineer-Hadoop, HDFS, HBase, Pig- Vertica Analytics","Software
>> Engineer-Hadoop, HDFS, HBase, Pig- Vertica Analytics","Software
>> Engineer-Hadoop, HDFS, HBase, Pig- Vertica Analytics","Data
>> Architect","Systems Engineer, Site Reliability Engineering","Systems
>> Engineer, Site Reliability Engineering","NoSQL Engineer - MongoDB for big
>> data, web crawling - RELO OFFER","NoSQL Engineer - MongoDB for big data,
>> web crawling - RELO OFFER","Hadoop Database Administrator Medicare","Hadoop
>> / Big Data Consultant","Lead Hadoop developer","Head of Big Data -
>> Hadoop","Hadoop Engineer - Hadoop, Operations, Linux Admin, Java,
>> Storage","Sr. Hadoop Administrator - Hadoop, MapReduce, HDFS","Sr. Hadoop
>> Administrator - Hadoop, MapReduce, HDFS","Software Engineer - Big
>> Data","Hadoop Team Lead Consultant - Global Leader in Big Data
>> solutions","Hadoop Administrator Consultant - Global Leader in Big Data
>> solutions","Java Developer","Sr.Software Engineer-Big Data-Hadoop"]
>> ["Pittsburgh, PA","Pittsburgh, PA","Harrisburg, PA","Pittsburgh, PA
>> (Shadyside area near Bakery Square)","Pittsburgh, PA, USA","Pittsburgh,
>> PA","Cleveland, OH","Akron, OH","Herndon, VA","Cupertino, CA","London,
>> United Kingdom","London, United Kingdom","Mountain View, CA","san jose,
>> CA","Santa Clara, CA","Palo Alto, CA","Home based - Live anywhere in the UK
>> or Benelux","Home based - Live anywhere in the UK or Benelux","Herndon,
>> VA","Dulles, VA"]
>> Time taken: 8.518 seconds
>>
>> All company names come into an array, all position titles into another
>> array and all locationdescription into yet another array. I cannot map 1
>> value to the other.
>>
>> The below query gives a decent output where individual columns can be
>> somewhat mapped:
>>
>> hive> select jobs.values[0].company.name, jobs.values[0].position.title,
>> jobs.values[0].locationdescription from linkedin_jobsearch;
>>
>> CyberCoders     Software Engineer-Hadoop, HDFS, HBase, Pig- Vertica
>> Analytics  Pittsburgh, PA
>> Time taken: 8.543 seconds
>>
>> But if I want to get the whole list this does not work. I have tried
>> setting Input and output formats and setting serde properties to map to
>> columns, but the output is the same. I haven't tried LATERAL VIEW
>> json_tuple as yet, I found it cryptic and I hope there is something simpler.
>>
>> I can think of writing a UDF which loops for the length of the array and
>> I can access values array incrementally. But the catch here is, what if one
>> of the records has any of their fields missing.
>>
>> Is there a better solution?
>>
>> regards
>> Sunita
>>
>>
>>
>>
>>
>> On Wed, Jun 19, 2013 at 3:29 PM, Sunita Arvind <sunitarv...@gmail.com>wrote:
>>
>>> Thanks Stephen,
>>> Let me explore options. I will let you all know once I am successful.
>>>
>>> regards
>>> Sunita
>>>
>>>
>>> On Wed, Jun 19, 2013 at 3:08 PM, Stephen Sprague <sprag...@gmail.com>wrote:
>>>
>>>> try_parsed_json is not trivial imho :)
>>>>
>>>> start with the very, very basic, for example,  { "jobs" : "foo" }.  Get
>>>> that to work first. :)   When that works add a level of nesting and see
>>>> what happens.  Keep building on it until you either break it (and then you
>>>> know that last thing you added broke it and can concentrate on that) or
>>>> you'll have worked out all the bugs and your final example will work.
>>>> Nothing fancy here except old school trial and error.
>>>>
>>>> An alternative I keep bringing up when native semantics don't go one's
>>>> way is the transform() function.  use python, perl,  ruby or whatever to
>>>> parse the json and go nuts with the rich features of said language.  just
>>>> write your output to stdout as a delimited serialization of what you want
>>>> to store and that's it.  That would be another way to get your scalars,
>>>> arrays and structs to work.
>>>>
>>>> Don't give up yet though on the JsonSerde! :)  Its probably something
>>>> very easy that we just can't see.
>>>>
>>>>
>>>>
>>>> On Wed, Jun 19, 2013 at 10:00 AM, Sunita Arvind 
>>>> <sunitarv...@gmail.com>wrote:
>>>>
>>>>> Thanks for looking into it Ramki.
>>>>> Yes I had tried these options. Here is what I get (renamed the table
>>>>> to have a meaningful name):
>>>>>
>>>>> hive> select jobs.values[1].id from linkedinjobsearch;
>>>>> ......mapreduce task details....
>>>>> OK
>>>>> NULL
>>>>> Time taken: 9.586 seconds
>>>>>
>>>>>
>>>>> hive> select jobs.values[0].position.title from linkedinjobsearch;
>>>>>  Total MapReduce jobs = 1
>>>>> Launching Job 1 out of 1
>>>>>
>>>>> OK
>>>>> NULL
>>>>> Time taken: 9.617 seconds
>>>>>
>>>>>
>>>>> I am trying to connect btrace to the process to be able to trace the
>>>>> code but cant get it to respond. Here is what I tried:
>>>>>
>>>>> [sunita@node01 ~]$ hive --debug, recursive=y,
>>>>> port=7000,mainSuspend=y, childSuspend=y
>>>>> ERROR: Cannot load this JVM TI agent twice, check your java command
>>>>> line for duplicate jdwp options.
>>>>> Error occurred during initialization of VM
>>>>> agent library failed to init: jdwp
>>>>>
>>>>> Tried changing the port also. Any idea regarding the debuggers that
>>>>> can be used. I also tried explain query and that does not show any issues
>>>>> either.
>>>>>
>>>>> regards
>>>>> Sunita
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Jun 19, 2013 at 12:11 PM, Ramki Palle 
>>>>> <ramki.pa...@gmail.com>wrote:
>>>>>
>>>>>> Can you run some other queries from job1 table and see if any query
>>>>>> returns some data?
>>>>>>
>>>>>> I am guessing your query "select jobs.values.position.title from
>>>>>> jobs1;"  may have some issue. May be it should be as
>>>>>>
>>>>>> select jobs.values[0].position.title from jobs1;
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Ramki.
>>>>>>
>>>>>>
>>>>>> On Wed, Jun 19, 2013 at 8:24 AM, Sunita Arvind <sunitarv...@gmail.com
>>>>>> > wrote:
>>>>>>
>>>>>>> Thanks Stephen,
>>>>>>>
>>>>>>> That's just what I tried with the try_parsed table. It is exactly
>>>>>>> same data with lesser nesting in the structure and lesser number of 
>>>>>>> entries.
>>>>>>> Do you mean to say that highly nested jsons can lead to issues? What
>>>>>>> are typical solution to such issues? Write UDFs in hive or parse the 
>>>>>>> JSON
>>>>>>> into a delimited file?
>>>>>>> I have heard of custom serdes also. Not sure if UDFs and custom
>>>>>>> serdes are one and the same.
>>>>>>>
>>>>>>> regards
>>>>>>> Sunita
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Jun 19, 2013 at 10:38 AM, Stephen Sprague <
>>>>>>> sprag...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I think you might have to start small here instead of going for the
>>>>>>>> home run on the first swing.  when all else fails start with a trivial 
>>>>>>>> json
>>>>>>>> object and then build up from there and see what additional step breaks
>>>>>>>> it.   that way you know if the trivial example fails is something
>>>>>>>> fundamental and not the complexity of your json object that's throwing
>>>>>>>> things off.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Jun 19, 2013 at 4:34 AM, Sunita Arvind <
>>>>>>>> sunitarv...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Thanks for sharing your experience Richa.
>>>>>>>>> I do have timestamps but in the format of year : INT, day : INT,
>>>>>>>>> month : INT.
>>>>>>>>> As per your suggestion, I changed them all to string, but still
>>>>>>>>> get null as the output.
>>>>>>>>>
>>>>>>>>> regards
>>>>>>>>> Sunita
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jun 19, 2013 at 2:17 AM, Richa Sharma <
>>>>>>>>> mailtorichasha...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Do you have any timestamp fields in the table that might contain
>>>>>>>>>> null value ?
>>>>>>>>>>
>>>>>>>>>> I faced a similar situation sometime back - changing the data
>>>>>>>>>> type to string made it work.
>>>>>>>>>>
>>>>>>>>>> But I was working on delimited text files.
>>>>>>>>>> Not sure if it applies to JSON .. but its still worth giving a
>>>>>>>>>> try !!
>>>>>>>>>>
>>>>>>>>>> Richa
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Jun 19, 2013 at 7:28 AM, Sunita Arvind <
>>>>>>>>>> sunitarv...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Having the a column name same as the table name, is a problem
>>>>>>>>>>> due to which I was not able to reference jobs.values.id from
>>>>>>>>>>> jobs. Changing the table name to jobs1 resolved the semantic error.
>>>>>>>>>>> However, the query still returns null
>>>>>>>>>>>
>>>>>>>>>>> hive> select jobs.values.position.title from jobs1;
>>>>>>>>>>> Total MapReduce jobs = 1
>>>>>>>>>>> Launching Job 1 out of 1
>>>>>>>>>>> Number of reduce tasks is set to 0 since there's no reduce
>>>>>>>>>>> operator
>>>>>>>>>>> Starting Job = job_201306080116_0036, Tracking URL =
>>>>>>>>>>> http://node01.expressanalytics.net:50030/jobdetails.jsp?jobid=job_201306080116_0036
>>>>>>>>>>> Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill
>>>>>>>>>>> job_201306080116_0036
>>>>>>>>>>> Hadoop job information for Stage-1: number of mappers: 1; number
>>>>>>>>>>> of reducers: 0
>>>>>>>>>>> 2013-06-18 18:55:52,381 Stage-1 map = 0%,  reduce = 0%
>>>>>>>>>>> 2013-06-18 18:55:56,394 Stage-1 map = 100%,  reduce = 0%,
>>>>>>>>>>> Cumulative CPU 0.88 sec
>>>>>>>>>>> 2013-06-18 18:55:57,400 Stage-1 map = 100%,  reduce = 0%,
>>>>>>>>>>> Cumulative CPU 0.88 sec
>>>>>>>>>>> 2013-06-18 18:55:58,407 Stage-1 map = 100%,  reduce = 100%,
>>>>>>>>>>> Cumulative CPU 0.88 sec
>>>>>>>>>>> MapReduce Total cumulative CPU time: 880 msec
>>>>>>>>>>> Ended Job = job_201306080116_0036
>>>>>>>>>>> MapReduce Jobs Launched:
>>>>>>>>>>> Job 0: Map: 1   Cumulative CPU: 0.88 sec   HDFS Read: 35374 HDFS
>>>>>>>>>>> Write: 3 SUCCESS
>>>>>>>>>>> Total MapReduce CPU Time Spent: 880 msec
>>>>>>>>>>> OK
>>>>>>>>>>> null
>>>>>>>>>>> Time taken: 9.591 seconds
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> regards
>>>>>>>>>>> Sunita
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jun 18, 2013 at 9:35 PM, Sunita Arvind <
>>>>>>>>>>> sunitarv...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Ok.
>>>>>>>>>>>> The data files are quite small. Around 35 KB and 1 KB each.
>>>>>>>>>>>>
>>>>>>>>>>>> [sunita@node01 tables]$  hadoop fs -ls /user/sunita/tables/jobs
>>>>>>>>>>>> Found 1 items
>>>>>>>>>>>> -rw-r--r--   3 sunita hdfs      35172 2013-06-18 18:31
>>>>>>>>>>>> /user/sunita/tables/jobs/jobs_noSite_parsed.json
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> [sunita@node01 tables]$ hadoop fs -text
>>>>>>>>>>>> /user/sunita/tables/jobs/jobs_noSite_parsed.json |more
>>>>>>>>>>>> {"jobs": {"_total": 1812, "_count": 20, "_start": 0, "values":
>>>>>>>>>>>> [{"company": {"i
>>>>>>>>>>>> d": 21836, "name": "CyberCoders"}, "postingDate": {"year":
>>>>>>>>>>>> 2013, "day": 10, "mo
>>>>>>>>>>>> nth": 6}, "descriptionSnippet": "Software Engineer-Hadoop,
>>>>>>>>>>>> HDFS, HBase, Pig- Ve
>>>>>>>>>>>> rtica Analytics Senior Hadoop Engineer - Skills Required -
>>>>>>>>>>>> Hadoop, HDFS, HBase,
>>>>>>>>>>>>  Pig, SQL, Industrial Software Development, System Integration,
>>>>>>>>>>>> Java, high perf
>>>>>>>>>>>> ormance, multi-threading, VerticaWe are a well known consumer
>>>>>>>>>>>> product developme
>>>>>>>>>>>> nt company and we are looking to add a Hadoop Engineer to our
>>>>>>>>>>>> Engineering team.
>>>>>>>>>>>>   You will be working with the latest ", "expirationDate":
>>>>>>>>>>>> {"year": 2013, "day"
>>>>>>>>>>>>
>>>>>>>>>>>> Its a single line, so used 'more' rather than 'head'. But
>>>>>>>>>>>> effectively, the file exists and has the data.
>>>>>>>>>>>>
>>>>>>>>>>>> regards
>>>>>>>>>>>> Sunita
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Jun 18, 2013 at 8:38 PM, Stephen Sprague <
>>>>>>>>>>>> sprag...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> As Nitin alluded to its best to confirm the data is definitely
>>>>>>>>>>>>> in hdfs using hdfs semantics rather than hive for the first step.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. how big is it?   hadoop fs -ls <your hdfs dir>
>>>>>>>>>>>>> 2. cat a bit of it and see if anything is there.   hadoop fs
>>>>>>>>>>>>> -text <your hdfs dir>/<filename> | head -10
>>>>>>>>>>>>>
>>>>>>>>>>>>> do you see any data from step #2?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Jun 18, 2013 at 3:58 PM, Sunita Arvind <
>>>>>>>>>>>>> sunitarv...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I ran some complex queries. Something to the extent of
>>>>>>>>>>>>>>                     select jobs from jobs;
>>>>>>>>>>>>>>  which triggers map reduce jobs but does not show errors and
>>>>>>>>>>>>>> produces the same output "null". If I try referencing the struct 
>>>>>>>>>>>>>> elements,
>>>>>>>>>>>>>> I get error which seems to be the root cause.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Attached are the select statement outputs with the
>>>>>>>>>>>>>> corresponding hive logs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have also attached my usage details of another table -
>>>>>>>>>>>>>> try_parsed which has a subset of the same data which seems to 
>>>>>>>>>>>>>> work fine.
>>>>>>>>>>>>>> Also attached is the input file for this table - try_parsed.json
>>>>>>>>>>>>>> Thanks for your help
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sunita
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Jun 18, 2013 at 4:35 PM, Nitin Pawar <
>>>>>>>>>>>>>> nitinpawar...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> can you run a little more complex query
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> select uniq across columns or do some maths. so we know when
>>>>>>>>>>>>>>> it fires up a mapreduce
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Jun 19, 2013 at 1:59 AM, Sunita Arvind <
>>>>>>>>>>>>>>> sunitarv...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for responding Nitin. Yes I am sure that serde is
>>>>>>>>>>>>>>>> working fine and json file is being picked based on all the 
>>>>>>>>>>>>>>>> errors that
>>>>>>>>>>>>>>>> showed up till this stage. What sort of error are you 
>>>>>>>>>>>>>>>> suspecting. File not
>>>>>>>>>>>>>>>> present or serde not parsing it ?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tuesday, June 18, 2013, Nitin Pawar wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> select * from table is as good as hdfs -cat
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> are you sure there is any data in the table?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Jun 18, 2013 at 11:54 PM, Sunita Arvind <
>>>>>>>>>>>>>>>>> sunitarv...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I am able to parse the input JSON file and load it into
>>>>>>>>>>>>>>>>>> hive. I do not see any errors with create table, so I am 
>>>>>>>>>>>>>>>>>> assuming that. But
>>>>>>>>>>>>>>>>>> when I try to read the data, I get null
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> hive> select * from jobs;
>>>>>>>>>>>>>>>>>> OK
>>>>>>>>>>>>>>>>>> null
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I have validated the JSON with JSONLint and Notepad++
>>>>>>>>>>>>>>>>>> JSON plugin and it is a valid JSON. Here is my create
>>>>>>>>>>>>>>>>>> table statement and attached is the json input file.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> create external table jobs (
>>>>>>>>>>>>>>>>>> jobs STRUCT<
>>>>>>>>>>>>>>>>>> values : ARRAY<STRUCT<
>>>>>>>>>>>>>>>>>> company : STRUCT<
>>>>>>>>>>>>>>>>>> id : STRING,
>>>>>>>>>>>>>>>>>> name : STRING>,
>>>>>>>>>>>>>>>>>> postingDate : STRUCT<
>>>>>>>>>>>>>>>>>> year : INT,
>>>>>>>>>>>>>>>>>> day : INT,
>>>>>>>>>>>>>>>>>> month : INT>,
>>>>>>>>>>>>>>>>>> descriptionSnippet : STRING,
>>>>>>>>>>>>>>>>>> expirationDate : STRUCT<
>>>>>>>>>>>>>>>>>> year : INT,
>>>>>>>>>>>>>>>>>> day : INT,
>>>>>>>>>>>>>>>>>> month : INT>,
>>>>>>>>>>>>>>>>>> position : STRUCT<
>>>>>>>>>>>>>>>>>> title : STRING,
>>>>>>>>>>>>>>>>>> jobFunctions : ARRAY<STRUCT<
>>>>>>>>>>>>>>>>>> code : STRING,
>>>>>>>>>>>>>>>>>> name : STRING>>,
>>>>>>>>>>>>>>>>>> industries : ARRAY<STRUCT<
>>>>>>>>>>>>>>>>>> code : STRING,
>>>>>>>>>>>>>>>>>> id : STRING,
>>>>>>>>>>>>>>>>>> name : STRING>>,
>>>>>>>>>>>>>>>>>> jobType : STRUCT<
>>>>>>>>>>>>>>>>>> code : STRING,
>>>>>>>>>>>>>>>>>> name : STRING>,
>>>>>>>>>>>>>>>>>> experienceLevel : STRUCT<
>>>>>>>>>>>>>>>>>> code : STRING,
>>>>>>>>>>>>>>>>>> name : STRING>>,
>>>>>>>>>>>>>>>>>> id : STRING,
>>>>>>>>>>>>>>>>>> customerJobCode : STRING,
>>>>>>>>>>>>>>>>>> skillsAndExperience : STRING,
>>>>>>>>>>>>>>>>>> salary : STRING,
>>>>>>>>>>>>>>>>>> jobPoster : STRUCT<
>>>>>>>>>>>>>>>>>> id : STRING,
>>>>>>>>>>>>>>>>>> firstName : STRING,
>>>>>>>>>>>>>>>>>> lastName : STRING,
>>>>>>>>>>>>>>>>>> headline : STRING>,
>>>>>>>>>>>>>>>>>> referralBonus : STRING,
>>>>>>>>>>>>>>>>>> locationDescription : STRING>>>
>>>>>>>>>>>>>>>>>>  )
>>>>>>>>>>>>>>>>>> ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
>>>>>>>>>>>>>>>>>> LOCATION '/user/sunita/tables/jobs';
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The table creation works fine, but when I attempt to
>>>>>>>>>>>>>>>>>> query, I get null as the result.
>>>>>>>>>>>>>>>>>> I tried adding Input/Output formats, Serde Properties,
>>>>>>>>>>>>>>>>>> nothing seems to impact.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I am of the opinion that the libraries cannot handle this
>>>>>>>>>>>>>>>>>> level of nesting and I probably will have to write a custom 
>>>>>>>>>>>>>>>>>> serde or a
>>>>>>>>>>>>>>>>>> parser myself. Just wanted to seek guidance before I get 
>>>>>>>>>>>>>>>>>> into that.
>>>>>>>>>>>>>>>>>> Appreciate your help and guidance.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> regards
>>>>>>>>>>>>>>>>>> Sunita
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>> Nitin Pawar
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Nitin Pawar
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Hive select shows null after successful data load

Reply via email to