Re: hbase-1.1.1 & hive-1.0.1

2016-03-19 Thread Adam Hunt
Version information

Hive 1.x will remain compatible with HBase 0.98.x and lower versions. Hive
2.x will be compatible with HBase 1.x and higher. (See HIVE-10990 for
details.) Consumers
wanting to work with HBase 1.x using Hive 1.x will need to compile Hive 1.x
stream code themselves.


https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration
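
The NoSuchMethodError below lines up with that: Hive 1.0.1's hbase-handler
was compiled against the HBase 0.98 client, where Put.setDurability()
returns void, while in HBase 1.x the mutation setters return the mutation
itself for chaining, so the exact method descriptor the compiled code looks
for no longer exists. A minimal sketch of the mismatch (illustration only,
based on the descriptor in the trace rather than on Hive source):

import org.apache.hadoop.hbase.client.Durability;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

// Illustration (assumption): the same source line compiles against both HBase 0.98
// and 1.x, but bytecode built against 0.98 references
//   void Put.setDurability(Durability)   -> descriptor (...Durability;)V
// whereas HBase 1.x only provides
//   Put  Put.setDurability(Durability)   -> returns the Put for chaining
// so classes compiled against 0.98 fail on 1.x with NoSuchMethodError at runtime.
public class DurabilityCompatSketch {
    public static void main(String[] args) {
        Put put = new Put(Bytes.toBytes("row1"));
        put.setDurability(Durability.SYNC_WAL); // source-compatible, not binary-compatible
    }
}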

On Wed, Mar 16, 2016 at 6:23 AM, songj songj wrote:

> hi all:
> I use hbase-1.1.1 & hive-1.0.1, but I cannot access HBase from Hive.
>
> Are these two versions incompatible?
>
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: java.lang.NoSuchMethodError:
> org.apache.hadoop.hbase.client.Put.setDurability(Lorg/apache/hadoop/hbase/client/Durability;)V
> at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:185)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.NoSuchMethodError:
> org.apache.hadoop.hbase.client.Put.setDurability(Lorg/apache/hadoop/hbase/client/Durability;)V
> at
> org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat$MyRecordWriter.write(HiveHBaseTableOutputFormat.java:142)
> at
> org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat$MyRecordWriter.write(HiveHBaseTableOutputFormat.java:117)
> at
> org.apache.hadoop.hive.ql.io.HivePassThroughRecordWriter.write(HivePassThroughRecordWriter.java:40)
>


Re: NPE when reading Parquet using Hive on Tez

2016-02-02 Thread Adam Hunt
Hi Gopal,

With the release of Tez 0.8.2, I thought I would give Tez another shot.
Unfortunately, I got the same NPE. I dug a little deeper and it appears
that the configuration property "columns.types", which is read by
org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(),
is not being set. When I manually set that property in Hive, your example
works fine.
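
As far as I can tell, the failure reduces to something like the following
(a hypothetical reconstruction pieced together from the stack trace, not
the actual Hive code; DataWritableReadSupport does considerably more):

import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;
import org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils;

class ColumnTypesSketch {
    static List<TypeInfo> readColumnTypes(Configuration conf) {
        // Under Tez the "columns.types" property apparently never gets populated,
        // so this lookup returns null ...
        String columnTypes = conf.get("columns.types");
        // ... and TypeInfoParser's constructor then calls tokenize() on the null
        // string, which is the NullPointerException in the trace.
        return TypeInfoUtils.getTypeInfosFromTypeString(columnTypes);
    }
}

Here's the session with the property set by hand: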

hive> create temporary table x (x int) stored as parquet;
hive> insert into x values(1),(2);
hive> set columns.types=int;
hive> select count(*) from x where x.x > 1;
OK
1

I also noticed that the configuration parameter parquet.column.index.access
is checked in that same function. Setting that property to "true" fixes
my issue.

hive> create temporary table x (x int) stored as parquet;
hive> insert into x values(1),(2);
hive> set parquet.column.index.access=true;
hive> select count(*) from x where x.x > 1;
OK
1
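
If that holds up, I'll probably persist the workaround rather than set it
per session, e.g. via ~/.hiverc (just a sketch; note that
parquet.column.index.access switches Parquet column resolution to
by-position, so enabling it globally may not be safe for every table):

-- ~/.hiverc (hypothetical location; any standard Hive config mechanism would do)
set parquet.column.index.access=true;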

Thanks for your help.

Best,
Adam



On Tue, Jan 5, 2016 at 9:10 AM, Adam Hunt <adamph...@gmail.com> wrote:

> Hi Gopal,
>
> Spark does offer dynamic allocation, but it doesn't always work as
> advertised. My experience with Tez has been more in line with my
> expectations. I'll bring up my issues with Spark on that list.
>
> I tried your example and got the same NPE. It might be a mapr-hive issue.
> Thanks for your help.
>
> Adam
>
> On Mon, Jan 4, 2016 at 12:58 PM, Gopal Vijayaraghavan <gop...@apache.org>
> wrote:
>
>>
>> > select count(*) from alexa_parquet;
>>
>> > Caused by: java.lang.NullPointerException
>> > at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.tokenize(TypeInfoUtils.java:274)
>> > at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.<init>(TypeInfoUtils.java:293)
>> > at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils.getTypeInfosFromTypeString(TypeInfoUtils.java:764)
>> > at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.getColumnTypes(DataWritableReadSupport.java:76)
>> > at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:220)
>> > at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:256)
>>
>> This might be an NPE triggered off by a specific case of the type parser.
>>
>> I tested it out on my current build with simple types and it looks like
>> the issue needs more detail on the column types for a repro.
>>
>> hive> create temporary table x (x int) stored as parquet;
>> hive> insert into x values(1),(2);
>> hive> select count(*) from x where x.x > 1;
>> Status: DAG finished successfully in 0.18 seconds
>> OK
>> 1
>> Time taken: 0.792 seconds, Fetched: 1 row(s)
>> hive>
>>
>> Do you have INT96 in the schema?
>>
>> > I'm currently evaluating Hive on Tez as an alternative to keeping the
>> > SparkSQL thrift server running all the time, locking up resources.
>>
>> Tez has a tunable value in tez.am.session.min.held-containers (i.e.
>> something small like 10).
>>
>> And HiveServer2 can be made to work similarly because Spark's
>> HiveThriftServer2.scala is a wrapper around Hive's ThriftBinaryCLIService.
>>
>>
>>
>>
>>
>>
>> Cheers,
>> Gopal
>>
>>
>>
>


Re: NPE when reading Parquet using Hive on Tez

2016-01-05 Thread Adam Hunt
Hi Gopal,

Spark does offer dynamic allocation, but it doesn't always work as
advertised. My experience with Tez has been more in line with my
expectations. I'll bring up my issues with Spark on that list.

I tried your example and got the same NPE. It might be a mapr-hive issue.
Thanks for your help.

Adam

On Mon, Jan 4, 2016 at 12:58 PM, Gopal Vijayaraghavan wrote:

>
> > select count(*) from alexa_parquet;
>
> > Caused by: java.lang.NullPointerException
> > at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.tokenize(TypeInfoUtils.java:274)
> > at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.<init>(TypeInfoUtils.java:293)
> > at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils.getTypeInfosFromTypeString(TypeInfoUtils.java:764)
> > at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.getColumnTypes(DataWritableReadSupport.java:76)
> > at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:220)
> > at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:256)
>
> This might be an NPE triggered off by a specific case of the type parser.
>
> I tested it out on my current build with simple types and it looks like
> the issue needs more detail on the column types for a repro.
>
> hive> create temporary table x (x int) stored as parquet;
> hive> insert into x values(1),(2);
> hive> select count(*) from x where x.x > 1;
> Status: DAG finished successfully in 0.18 seconds
> OK
> 1
> Time taken: 0.792 seconds, Fetched: 1 row(s)
> hive>
>
> Do you have INT96 in the schema?
>
> > I'm currently evaluating Hive on Tez as an alternative to keeping the
> > SparkSQL thrift server running all the time, locking up resources.
>
> Tez has a tunable value in tez.am.session.min.held-containers (i.e.
> something small like 10).
>
> And HiveServer2 can be made to work similarly because Spark's
> HiveThriftServer2.scala is a wrapper around Hive's ThriftBinaryCLIService.
>
>
>
>
>
>
> Cheers,
> Gopal
>
>
>


NPE when reading Parquet using Hive on Tez

2016-01-04 Thread Adam Hunt
Hi,

When I perform any operation on a data set stored in Parquet format using
Hive on Tez, I get an NPE (see bottom for stack trace). The same operation
works fine on tables stored as text, Avro, ORC and SequenceFile. The same
query on the Parquet tables also works fine if I use Hive on MR.

I'm running MapR 5.0.0 with Hive 1.2.0-mapr-1510, Hadoop 2.7.0-mapr-1506
and Tez 0.7.0 compiled from source.

I'm currently evaluating Hive on Tez as an alternative to keeping the
SparkSQL thrift server running all the time, locking up resources.
Unfortunately, this is a blocker since most of our data is stored in
Parquet files.

Thanks,
Adam

select count(*) from alexa_parquet;
or
create table kmeans_results_100_orc stored as orc as select * from
kmeans_results_100;
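
For reference, switching engines for the comparison is just the standard
hive.execution.engine setting (nothing MapR-specific, as far as I can tell):

-- fails with the NPE below on the Parquet table
set hive.execution.engine=tez;
select count(*) from alexa_parquet;

-- the same query completes
set hive.execution.engine=mr;
select count(*) from alexa_parquet;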

], TaskAttempt 3 failed, info=[Error: Failure while running
task:java.lang.RuntimeException: java.lang.RuntimeException:
java.io.IOException: java.lang.NullPointerException
at
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
at
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
at
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
at
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
at
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.IOException:
java.lang.NullPointerException
at
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:192)
at
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.<init>(TezGroupedSplitsInputFormat.java:131)
at
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:97)
at
org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:149)
at
org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:80)
at
org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:614)
at
org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:593)
at
org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:141)
at
org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:109)
at
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:370)
at
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:127)
at
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
... 14 more
Caused by: java.io.IOException: java.lang.NullPointerException
at
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at
org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:252)
at
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:189)
... 25 more
Caused by: java.lang.NullPointerException
at
org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.tokenize(TypeInfoUtils.java:274)
at
org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.<init>(TypeInfoUtils.java:293)
at
org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils.getTypeInfosFromTypeString(TypeInfoUtils.java:764)
at
org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.getColumnTypes(DataWritableReadSupport.java:76)
at
org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:220)
at
org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:256)
at
org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:99)
at