Take a small set of data, say 2-5 lines, and insert it. After that you can try
inserting the first 10 columns, then the next 10, until you find your
problematic column.
On Aug 17, 2014 8:37 PM, "Tor Ivry" <tork...@gmail.com> wrote:
> Is there any way to debug this?
>
> We are talking about many fields here.
> How can I see which field has the mismatch?
>
> On Sun, Aug 17, 2014 at 4:30 PM, hadoop hive <hadooph...@gmail.com> wrote:
>
>> Hi,
>>
>> Check the data types you provided while creating the external table;
>> they should match the data in the files.
>>
>> Thanks
>> Vikas Srivastava
>> On Aug 17, 2014 7:07 PM, "Tor Ivry" <tork...@gmail.com> wrote:
>>
>>> Hi
>>>
>>> I have a Hive (0.11) table with the following create syntax:
>>>
>>> CREATE EXTERNAL TABLE events(
>>>   …
>>> )
>>> PARTITIONED BY(dt string)
>>> ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
>>> STORED AS
>>>   INPUTFORMAT "parquet.hive.DeprecatedParquetInputFormat"
>>>   OUTPUTFORMAT "parquet.hive.DeprecatedParquetOutputFormat"
>>> LOCATION '/data-events/success';
>>>
>>> Queries run fine.
>>>
>>> I add HDFS partitions (containing snappy.parquet files).
>>>
>>> When I run
>>>   hive> select count(*) from events where dt='20140815';
>>> I get the correct result.
>>>
>>> *Problem:*
>>> When I run
>>>   hive> select * from events where dt='20140815' limit 1;
>>> I get
>>>   OK
>>>   NULL NULL NULL NULL NULL NULL NULL 20140815
>>>
>>> *The same query in Impala returns the correct values.*
>>>
>>> Any idea what could be the issue?
>>>
>>> Thanks
>>> Tor
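
The column-bisection approach suggested above can be sketched in HiveQL.
This is only an illustration: the real column names are elided in the
CREATE TABLE above, so col1 ... col10 here are hypothetical placeholders.

```sql
-- Step 1: compare the declared Hive column types against what is
-- actually stored in the Parquet files (a type mismatch on any column
-- can cause the SerDe to return NULLs for the whole row).
DESCRIBE events;

-- Step 2: select only the first batch of columns. If these come back
-- non-NULL, move on to the next batch of 10, and so on, until the
-- batch containing the problematic column is isolated.
SELECT col1, col2, col3, col4, col5,
       col6, col7, col8, col9, col10
FROM events
WHERE dt = '20140815'
LIMIT 5;
```

Once a bad batch is found, the same bisection can be repeated within it
column by column to pin down the exact field whose declared type does not
match the Parquet data.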