Re: NPE when reading Parquet using Hive on Tez

Adam Hunt Tue, 05 Jan 2016 09:11:08 -0800

Hi Gopal,

Spark does offer dynamic allocation, but it doesn't always work as
advertised. My experience with Tez has been more in line with my
expectations. I'll bring up my issues with Spark on that list.


I tried your example and got the same NPE. It might be a mapr-hive issue.
Thanks for your help.

Adam

On Mon, Jan 4, 2016 at 12:58 PM, Gopal Vijayaraghavan <gop...@apache.org>
wrote:

>
> > select count(*) from alexa_parquet;
>
> > Caused by: java.lang.NullPointerException
> >    at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.tokenize(TypeInfoUtils.java:274)
> >    at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.<init>(TypeInfoUtils.java:293)
> >    at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils.getTypeInfosFromTypeString(TypeInfoUtils.java:764)
> >    at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.getColumnTypes(DataWritableReadSupport.java:76)
> >    at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:220)
> >    at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:256)
>
> This might be an NPE triggered by a specific case in the type parser.
>
> I tested it on my current build with simple types and could not reproduce
> it, so a repro needs more detail on the column types.
>
> hive> create temporary table x (x int) stored as parquet;
> hive> insert into x values(1),(2);
> hive> select count(*) from x where x.x > 1;
> Status: DAG finished successfully in 0.18 seconds
> OK
> 1
> Time taken: 0.792 seconds, Fetched: 1 row(s)
> hive>
>
> Do you have INT96 in the schema?
>
> > I'm currently evaluating Hive on Tez as an alternative to keeping the
> > SparkSQL thrift server running all the time, locking up resources.
>
> Tez has a tunable value in tez.am.session.min.held-containers (e.g.
> something small like 10).
>
> And HiveServer2 can be made to work similarly, because Spark's
> HiveThriftServer2.scala is a wrapper around Hive's ThriftBinaryCLIService.
>
> Cheers,
> Gopal
>
>
>
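
For reference, the held-containers setting mentioned in the quoted message could be sketched as a tez-site.xml entry. This is only an illustration: the value 10 is the "something small" suggested there, and placing it in tez-site.xml (rather than setting it per session) is an assumption about the deployment.

```xml
<!-- Sketch only: minimum number of containers an idle Tez session AM keeps held. -->
<property>
  <name>tez.am.session.min.held-containers</name>
  <value>10</value>
</property>
```

A smaller value hands more resources back to the cluster while the session is idle; a larger one keeps more containers ready for the next query.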
