Re: Hive queries returning all NULL values.

2014-08-26 Thread Tor Ivry
Raymond - you were the closest.
The Parquet field names contained '::', e.g. bag1::user_name
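
For anyone checking their own files, here is a minimal Python sketch of how such names could be spotted and given a plain alias. The helper name flatten_field_name, the choice of '_' as the replacement, and the sample field list are assumptions for illustration only; the thread does not say how the names were actually fixed.

```python
def flatten_field_name(field_name, replacement="_"):
    """Return the field name with every '::' replaced (hypothetical helper)."""
    return field_name.replace("::", replacement)


# Illustrative field names, as they might appear in a Parquet schema.
parquet_fields = ["bag1::user_name", "dt"]

# Collect only the names that contain '::' and would need an alias.
renames = {
    name: flatten_field_name(name)
    for name in parquet_fields
    if "::" in name
}
print(renames)  # {'bag1::user_name': 'bag1_user_name'}
```

Whether renaming happens when the files are written or through some other mapping depends on the pipeline that produces them.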

Hope this helps someone in the future.

Thanks for all your help

Tor





Re: Hive queries returning all NULL values.

2014-08-17 Thread Raymond Lau
Do the field names in your Parquet files contain upper-case letters by any
chance, e.g. userName?  Hive will not read the data of external tables unless
the field names are completely lower case; it doesn't convert them properly
in the case of external tables.
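
A quick way to test for this, sketched in Python under the assumption that the field names have already been pulled out of the file schema into a list (the function name and the sample names are made up for illustration):

```python
def fields_with_upper_case(field_names):
    """Return the field names that are not entirely lower case."""
    return [name for name in field_names if name != name.lower()]


# Illustrative names only; substitute the names from your own schema.
fields = ["userName", "user_id", "EventType"]
print(fields_with_upper_case(fields))  # ['userName', 'EventType']
```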


Re: Hive queries returning all NULL values.

2014-08-17 Thread hadoop hive
Take a small set of data, say 2-5 lines, and insert it.

After that you can try inserting the first 10 columns, then the next 10, and
so on until you find your problematic column.
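
The batching idea can be sketched in Python roughly as follows. This is only an outline under stated assumptions: reads_ok is a hypothetical callback you would have to write yourself (for example, one that loads a test table with just those columns and checks that the values are not NULL), and the column names below are invented.

```python
def find_bad_columns(columns, reads_ok, batch_size=10):
    """Return the columns for which reads_ok([column]) is False.

    reads_ok is a caller-supplied function (an assumption of this sketch)
    that takes a list of column names and returns True when a test query
    over just those columns returns proper values.
    """
    bad = []
    for start in range(0, len(columns), batch_size):
        batch = columns[start:start + batch_size]
        if reads_ok(batch):
            continue  # the whole batch reads fine, move to the next one
        # The batch fails: check its columns one at a time.
        bad.extend(col for col in batch if not reads_ok([col]))
    return bad


# Stand-in for a real check: pretend two columns are unreadable.
columns = [f"col{i}" for i in range(25)]
broken = {"col3", "col17"}


def fake_reads_ok(cols):
    return not (set(cols) & broken)


print(find_bad_columns(columns, fake_reads_ok))  # ['col3', 'col17']
```

Checking in batches first keeps the number of test runs low when most columns are fine.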


Re: Hive queries returning all NULL values.

2014-08-17 Thread Tor Ivry
Is there any way to debug this?

We are talking about many fields here.
How can I see which field has the mismatch?





Re: Hive queries returning all NULL values.

2014-08-17 Thread hadoop hive
Hi,

Check the data types you provided when creating the external table; they
should match the data in the files.
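
As a rough illustration of that comparison, here is a small Python sketch. It assumes you already have two mappings of column name to type name, one from the table definition and one from the file schema (for example from a schema dump), both written with the same type names; the function name and the sample columns are hypothetical.

```python
def find_type_mismatches(declared, actual):
    """Return {column: (declared_type, actual_type)} for every difference.

    actual_type is None when the file has no column of that name.
    """
    mismatches = {}
    for column, declared_type in declared.items():
        actual_type = actual.get(column)
        if actual_type != declared_type:
            mismatches[column] = (declared_type, actual_type)
    return mismatches


# Illustrative schemas only.
declared = {"user_id": "bigint", "user_name": "string", "score": "int"}
actual = {"user_id": "bigint", "user_name": "string", "score": "double"}
print(find_type_mismatches(declared, actual))  # {'score': ('int', 'double')}
```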

Thanks
Vikas Srivastava
On Aug 17, 2014 7:07 PM, "Tor Ivry"  wrote:

> Hi
>
> I have a hive (0.11) table with the following create syntax:
>
> CREATE EXTERNAL TABLE events(
>   …
> )
> PARTITIONED BY(dt string)
> ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
> STORED AS
>   INPUTFORMAT "parquet.hive.DeprecatedParquetInputFormat"
>   OUTPUTFORMAT "parquet.hive.DeprecatedParquetOutputFormat"
> LOCATION '/data-events/success';
>
> Query runs fine.
>
> I add hdfs partitions (containing snappy.parquet files).
>
> When I run
>
> hive> select count(*) from events where dt="20140815"
>
> I get the correct result
>
> *Problem:*
>
> When I run
>
> hive> select * from events where dt="20140815" limit 1;
>
> I get
>
> OK
> NULL NULL NULL NULL NULL NULL NULL 20140815
>
> *The same query in Impala returns the correct values.*
>
> Any idea what could be the issue?
>
> Thanks
> Tor