[jira] [Created] (HIVE-13337) VectorHashKeyWrapperBatch methods assign*NullsRepeating seem to be missing isNull checks

2016-03-22 Thread Matt McCline (JIRA)
Matt McCline created HIVE-13337:
---

 Summary: VectorHashKeyWrapperBatch methods assign*NullsRepeating 
seem to be missing isNull checks
 Key: HIVE-13337
 URL: https://issues.apache.org/jira/browse/HIVE-13337
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical


Jason Dere spotted a probable problem with the assignLongNullsRepeating, etc., 
methods in the VectorHashKeyWrapperBatch class.
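
For context, a minimal, self-contained sketch of the suspected pattern. The 
class and field names below are illustrative stand-ins, not the actual Hive 
source:

{code}
// Simplified stand-ins for Hive's column vector and key-wrapper types,
// showing why skipping the isNull check loses NULLs for repeating columns.
class LongColumnVectorSketch {
  long[] vector = new long[1024];
  boolean[] isNull = new boolean[1024];
  boolean isRepeating; // when true, vector[0] / isNull[0] describe every row
}

class KeyAssignSketch {
  long[] longKeys = new long[1024];
  boolean[] keyIsNull = new boolean[1024];

  // Suspected buggy shape: trusts vector[0] even when the repeated value is
  // NULL, so a repeating NULL column yields a bogus key of vector[0].
  void assignLongNullsRepeatingBuggy(int index, LongColumnVectorSketch col) {
    longKeys[index] = col.vector[0];
  }

  // Same method with the missing isNull check added.
  void assignLongNullsRepeatingFixed(int index, LongColumnVectorSketch col) {
    if (col.isNull[0]) {
      keyIsNull[index] = true;
    } else {
      keyIsNull[index] = false;
      longKeys[index] = col.vector[0];
    }
  }
}
{code}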





[jira] [Created] (HIVE-13336) Transform unix_timestamp(args) into to_unix_timestamp(args)

2016-03-22 Thread Gopal V (JIRA)
Gopal V created HIVE-13336:
--

 Summary: Transform unix_timestamp(args) into 
to_unix_timestamp(args)
 Key: HIVE-13336
 URL: https://issues.apache.org/jira/browse/HIVE-13336
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Affects Versions: 2.1.0
Reporter: Gopal V
Assignee: Jason Dere








[jira] [Created] (HIVE-13335) get rid of TxnHandler.TIMED_OUT_TXN_ABORT_BATCH_SIZE

2016-03-22 Thread Eugene Koifman (JIRA)
Eugene Koifman created HIVE-13335:
-

 Summary: get rid of TxnHandler.TIMED_OUT_TXN_ABORT_BATCH_SIZE
 Key: HIVE-13335
 URL: https://issues.apache.org/jira/browse/HIVE-13335
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 2.0.0, 1.3.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman


Look for usages - it's no longer useful; in fact, it may be a performance hit.





[jira] [Created] (HIVE-13334) stats state is not captured correctly

2016-03-22 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-13334:
---

 Summary: stats state is not captured correctly
 Key: HIVE-13334
 URL: https://issues.apache.org/jira/browse/HIVE-13334
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer, Statistics
Affects Versions: 2.0.0
Reporter: Ashutosh Chauhan


As a result, StatsOptimizer gives incorrect results. This can be reproduced 
with the following queries:
{code}
mvn test -Dtest=TestCliDriver -Dtest.output.overwrite=true -Dqfile=insert_orig_table.q,insert_values_orig_table.q,orc_merge9.q,sample_islocalmode_hook.q -Dhive.compute.query.using.stats=true
{code}

[~pxiong] Can you take a look at this one?





[jira] [Created] (HIVE-13333) StatsOptimizer throws ClassCastException

2016-03-22 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-13333:
---

 Summary: StatsOptimizer throws ClassCastException
 Key: HIVE-13333
 URL: https://issues.apache.org/jira/browse/HIVE-13333
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 2.0.0
Reporter: Ashutosh Chauhan


{code}
mvn test -Dtest=TestCliDriver -Dtest.output.overwrite=true -Dqfile=cbo_rp_udf_udaf.q -Dhive.compute.query.using.stats=true
{code}
reproduces the issue.





[jira] [Created] (HIVE-13332) support dumping all row indexes in ORC FileDump

2016-03-22 Thread Sergey Shelukhin (JIRA)
Sergey Shelukhin created HIVE-13332:
---

 Summary: support dumping all row indexes in ORC FileDump
 Key: HIVE-13332
 URL: https://issues.apache.org/jira/browse/HIVE-13332
 Project: Hive
  Issue Type: Bug
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin








[jira] [Created] (HIVE-13331) Failures when concatenating ORC files using tez

2016-03-22 Thread Ashish Shenoy (JIRA)
Ashish Shenoy created HIVE-13331:


 Summary: Failures when concatenating ORC files using tez
 Key: HIVE-13331
 URL: https://issues.apache.org/jira/browse/HIVE-13331
 Project: Hive
  Issue Type: Bug
 Environment: HDP 2.2
Hive 0.14 with Tez as execution engine
Reporter: Ashish Shenoy


I hit this issue consistently when I try to concatenate the ORC files in a Hive 
partition using 'ALTER TABLE ... PARTITION(...) CONCATENATE'. In an email 
thread on the Hive users mailing list 
[http://mail-archives.apache.org/mod_mbox/hive-user/201504.mbox/%3c553a2a9e.70...@uib.no%3E],
 I read that Tez should be used as the execution engine for Hive, so I updated 
my Hive configs to use Tez as the exec engine.

Here's the stack trace when I use the Tez execution engine:

VERTICES    STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED

File Merge  FAILED     -1          0        0       -1       0       0

VERTICES: 00/01 [>>--] 0% ELAPSED TIME: 1458666880.00 s

Status: Failed
Vertex failed, vertexName=File Merge, vertexId=vertex_1455906569416_0009_1_00, 
diagnostics=[Vertex vertex_1455906569416_0009_1_00 [File Merge] killed/failed 
due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: [] 
initializer failed, vertex=vertex_1455906569416_0009_1_00 [File Merge], 
java.lang.NullPointerException
at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:265)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:452)
at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateOldSplits(MRInputHelpers.java:441)
at org.apache.tez.mapreduce.hadoop.MRInputHelpers.generateInputSplitsToMem(MRInputHelpers.java:295)
at org.apache.tez.mapreduce.common.MRInputAMSplitGenerator.initialize(MRInputAMSplitGenerator.java:124)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
]
DAG failed due to vertex failure. failedVertices:1 killedVertices:0
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.DDLTask

Please let me know if this has been fixed. This seems like a very basic thing 
for Hive to get wrong, so I am wondering if I am using the right configs.
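
For anyone trying to reproduce, the failing operation boils down to the 
following (a hedged sketch over JDBC; the connection details, table, and 
partition names are assumptions for illustration):

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ConcatenateRepro {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    try (Connection conn = DriverManager.getConnection(
             "jdbc:hive2://localhost:10000/default", "hive", "");
         Statement st = conn.createStatement()) {
      st.execute("SET hive.execution.engine=tez");
      // The operation from this report: merge small ORC files in one partition.
      st.execute("ALTER TABLE my_orc_table PARTITION (data_date='2016-03-22') CONCATENATE");
    }
  }
}
{code}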





[jira] [Created] (HIVE-13330) ORC vectorized string dictionary reader does not differentiate null vs empty string dictionary

2016-03-22 Thread Prasanth Jayachandran (JIRA)
Prasanth Jayachandran created HIVE-13330:


 Summary: ORC vectorized string dictionary reader does not 
differentiate null vs empty string dictionary
 Key: HIVE-13330
 URL: https://issues.apache.org/jira/browse/HIVE-13330
 Project: Hive
  Issue Type: Bug
Affects Versions: 2.0.0, 1.3.0, 2.1.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
Priority: Critical


The vectorized string dictionary reader cannot differentiate between the case 
where all dictionary entries are null and the case of a single entry with an 
empty string. This causes wrong results when reading data out of such files.

{code:title=Vectorization On}
SET hive.vectorized.execution.enabled=true;
SET hive.fetch.task.conversion=none;
select vcol from testnullorc3 limit 1;

OK
NULL
{code}

{code:title=Vectorization Off}
SET hive.vectorized.execution.enabled=false;
SET hive.fetch.task.conversion=none;
select vcol from testnullorc3 limit 1;

OK

{code}

The input table testnullorc3 contains a varchar column vcol with a few empty 
strings and a few nulls. For this table, the non-vectorized reader returns an 
empty string as the first row, but the vectorized reader returns NULL.
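
To make the ambiguity concrete, a minimal, self-contained sketch (this is a 
simplified model of a length-encoded string dictionary, not ORC's actual 
reader code):

{code}
import java.util.Arrays;

// A string dictionary modeled as one byte blob plus per-entry lengths. An
// all-NULL column and a dictionary with a single empty-string entry produce
// the same empty blob, so a reader that only inspects the blob cannot tell
// NULL apart from "".
public class DictionaryAmbiguitySketch {
  public static void main(String[] args) {
    byte[] blobAllNull = new byte[0];     // dictionary with no entries at all
    int[] lengthsAllNull = new int[0];

    byte[] blobEmptyString = new byte[0]; // dictionary with exactly one "" entry
    int[] lengthsEmptyString = {0};

    // Same bytes either way -- only the entry count differs.
    System.out.println(Arrays.equals(blobAllNull, blobEmptyString)); // true
    System.out.println(lengthsAllNull.length + " vs " + lengthsEmptyString.length); // 0 vs 1
  }
}
{code}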






Re: Review Request 45062: HIVE-13241 LLAP: Incremental Caching marks some small chunks as "incomplete CB"

2016-03-22 Thread Sergey Shelukhin


> On March 22, 2016, 5:09 a.m., Gopal V wrote:
> > llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java,
> >  line 844
> > 
> >
> > is Data ever non-null here?
> 
> Sergey Shelukhin wrote:
>     Yes, in EncodedReaderImpl:
>
>     // 2.5. Remember the bad estimates for future reference.
>     if (badEstimates != null && !badEstimates.isEmpty()) {
>       // Relies on the fact that cache does not actually store these.
>       DiskRange[] cacheKeys = badEstimates.toArray(new DiskRange[badEstimates.size()]);
>       long[] result = cacheWrapper.putFileData(fileKey, cacheKeys, null, baseOffset);
>       assert result == null; // We don't expect conflicts from bad estimates.
>     }

and for non-null:

    // 6. Finally, put uncompressed data to cache.
    if (fileKey != null) {
      long[] collisionMask = cacheWrapper.putFileData(fileKey, cacheKeys, targetBuffers, baseOffset);
      processCacheCollisions(collisionMask, toDecompress, targetBuffers, csd.getCacheBuffers());
    }


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45062/#review124732
---


On March 18, 2016, 11:18 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45062/
> ---
> 
> (Updated March 18, 2016, 11:18 p.m.)
> 
> 
> Review request for hive, Gopal V and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 98c6372 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/cache/EvictionDispatcher.java
>  bae571e 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapOptionsProcessor.java
>  c292b37 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java 
> dbee823 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java
>  eb251a8 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java
>  PRE-CREATION 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcMetadataCache.java
>  e970137 
>   
> llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestOrcMetadataCache.java
>  901e58a 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java 
> 29b51ec 
>   
> storage-api/src/java/org/apache/hadoop/hive/common/io/encoded/EncodedColumnBatch.java
>  ddba889 
> 
> Diff: https://reviews.apache.org/r/45062/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Re: Review Request 45062: HIVE-13241 LLAP: Incremental Caching marks some small chunks as "incomplete CB"

2016-03-22 Thread Sergey Shelukhin


> On March 22, 2016, 5:09 a.m., Gopal V wrote:
> > llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java,
> >  line 844
> > 
> >
> > is Data ever non-null here?

Yes, in EncodedReaderImpl:

    // 2.5. Remember the bad estimates for future reference.
    if (badEstimates != null && !badEstimates.isEmpty()) {
      // Relies on the fact that cache does not actually store these.
      DiskRange[] cacheKeys = badEstimates.toArray(new DiskRange[badEstimates.size()]);
      long[] result = cacheWrapper.putFileData(fileKey, cacheKeys, null, baseOffset);
      assert result == null; // We don't expect conflicts from bad estimates.
    }


- Sergey


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45062/#review124732
---


On March 18, 2016, 11:18 p.m., Sergey Shelukhin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45062/
> ---
> 
> (Updated March 18, 2016, 11:18 p.m.)
> 
> 
> Review request for hive, Gopal V and Prasanth_J.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> see jira
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 98c6372 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/cache/EvictionDispatcher.java
>  bae571e 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/cli/LlapOptionsProcessor.java
>  c292b37 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/io/api/impl/LlapIoImpl.java 
> dbee823 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/io/encoded/OrcEncodedDataReader.java
>  eb251a8 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java
>  PRE-CREATION 
>   
> llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcMetadataCache.java
>  e970137 
>   
> llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestOrcMetadataCache.java
>  901e58a 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/encoded/EncodedReaderImpl.java 
> 29b51ec 
>   
> storage-api/src/java/org/apache/hadoop/hive/common/io/encoded/EncodedColumnBatch.java
>  ddba889 
> 
> Diff: https://reviews.apache.org/r/45062/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Sergey Shelukhin
> 
>



Re: Review Request 45137: HiveServer2: Make ZK config publishing configurable

2016-03-22 Thread Vaibhav Gumashta


> On March 22, 2016, 6:24 a.m., Thejas Nair wrote:
> > service/src/java/org/apache/hive/service/server/HiveServer2.java, line 257
> > 
> >
> > better to avoid these cosmetic changes that don't seem to improve style 
> > compliance

I have the Hive-compliant style configured in my editor; I selected the method 
I changed and applied the formatting. I'll revert that and limit the diff to 
the specific code changes.


- Vaibhav


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45137/#review124741
---


On March 22, 2016, 12:36 a.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45137/
> ---
> 
> (Updated March 22, 2016, 12:36 a.m.)
> 
> 
> Review request for hive and Thejas Nair.
> 
> 
> Bugs: HIVE-13326
> https://issues.apache.org/jira/browse/HIVE-13326
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-13326
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 0f8d67f 
>   jdbc/src/java/org/apache/hive/jdbc/ZooKeeperHiveClientHelper.java 1ca77a1 
>   service/src/java/org/apache/hive/service/server/HiveServer2.java ab834b9 
> 
> Diff: https://reviews.apache.org/r/45137/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>



[jira] [Created] (HIVE-13329) Hive query id should not be allowed to be modified by users.

2016-03-22 Thread Vikram Dixit K (JIRA)
Vikram Dixit K created HIVE-13329:
-

 Summary: Hive query id should not be allowed to be modified by 
users.
 Key: HIVE-13329
 URL: https://issues.apache.org/jira/browse/HIVE-13329
 Project: Hive
  Issue Type: Bug
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K








[jira] [Created] (HIVE-13328) Cannot Query Hive External Table, Returning 0 Values

2016-03-22 Thread bharath kumar (JIRA)
bharath kumar created HIVE-13328:


 Summary: Cannot Query Hive External Table, Returning 0 Values
 Key: HIVE-13328
 URL: https://issues.apache.org/jira/browse/HIVE-13328
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 0.13.0
 Environment: MAPRFS
Reporter: bharath kumar
Priority: Blocker


Having issues involving the ORC format and an external table. Below is the 
sequence of steps followed.

1. Created an external table.

2. Using INSERT OVERWRITE, data is populated into the external table with a 
partition:

insert overwrite table external_table partition (data_date='2016-22-03')
select from (select * from db3.table1
where data_date = '2016-22-03') i
left join (select * from db3.table2 where data_date = '2016-22-03') th on
i.column1 = th.column1 and
i.column2 = th.column2;

But when I query the external table, the data is not present. I tried the 
procedures below:

ALTER TABLE NAME ADD PARTITION(DATA_DATE='2016-22-03');

MSCK REPAIR TABLE TABLENAME;
OK
Partitions not in metastore:   

What could be the issue?






Re: select count(*) from table;

2016-03-22 Thread Amey Barve
Thanks Nitin, Mich,

if it's just a plain vanilla text file format, it needs to run a job to get
the count, so that is the longest of all
--> So Hive must be translating some operator like fetch (for count) into a
map-reduce job and getting the result?
Can a custom storage handler get information about the operator(s) for
count(*) and then use it to retrieve the results?

I want to know whether a custom storage handler can get information about the
operators that Hive constructs for queries like count, max, min, etc., so that
the storage handler can map these to internal storage functions.
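
For reference, here is a minimal sketch of what the storage handler surface 
exposes in this area. It uses Hive's HiveStoragePredicateHandler interface, 
which hands the handler the scan's filter predicate for pushdown; as far as I 
know there is no comparable hook for aggregate operators like count/min/max in 
these versions (those are short-circuited by StatsOptimizer using table 
statistics instead). Treat the details as assumptions to verify:

import org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler;
import org.apache.hadoop.hive.ql.plan.ExprNodeDesc;
import org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc;
import org.apache.hadoop.hive.serde2.Deserializer;
import org.apache.hadoop.mapred.JobConf;

// Sketch: a handler sees the WHERE predicate, not the aggregates above the scan.
public class PredicateOnlyHandlerSketch implements HiveStoragePredicateHandler {
  @Override
  public DecomposedPredicate decomposePredicate(JobConf jobConf,
      Deserializer deserializer, ExprNodeDesc predicate) {
    DecomposedPredicate dp = new DecomposedPredicate();
    dp.pushedPredicate = null; // push nothing down to the storage layer
    if (predicate instanceof ExprNodeGenericFuncDesc) {
      dp.residualPredicate = (ExprNodeGenericFuncDesc) predicate; // Hive evaluates it
    }
    return dp;
  }
}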

Regards,
Amey

On Tue, Mar 22, 2016 at 1:32 PM, Mich Talebzadeh 
wrote:

> ORC file has the following stats levels for storage indexes
>
>
>1. ORC File itself
>2. Multiple stripes (chunks) within the ORC file
>3. Multiple row groups (row batches) within each stripe
>
> Assuming that the underlying table has stats updated, count will be stored
> for each column
>
> So when we do something like below:
>
> select count(1) from orctest
>
> you can see stats collected if you do
>
> show create table orctest;
>
>  TBLPROPERTIES (
>    'COLUMN_STATS_ACCURATE'='true',
>    'numFiles'='31',
>    'numRows'='25',
>
>
> File statistics, stripe statistics, and row group statistics are kept, so an
> ORC table will rely on those if needed.
>
>
> HTH
>
>
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 22 March 2016 at 07:14, Amey Barve  wrote:
>
>> select count(*) from table;
>>
>> How does hive evaluate count(*) on a table?
>>
>> Does it return count by actually querying table, or directly return count
>> by consulting some statistics locally.
>>
>> For Hive's Text format it takes few seconds while Hive's Orc format takes
>> fraction of seconds.
>>
>> Regards,
>> Amey
>>
>
>


Re: select count(*) from table;

2016-03-22 Thread Nitin Pawar
If you have enabled performance optimization by enabling statistics, it will
come from there.
If the underlying file format supports in-file statistics (like ORC), it will
come from there.
If it's just a plain vanilla text file format, it needs to run a job to get
the count, so that is the longest of all.
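
A rough JDBC illustration of the fast (stats) path versus the scan path -- a 
hedged sketch, where the host, credentials, and the orctest table are 
assumptions:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CountPathsDemo {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    try (Connection conn = DriverManager.getConnection(
             "jdbc:hive2://localhost:10000/default", "hive", "");
         Statement st = conn.createStatement()) {
      // Fast path: compute table statistics, then the optimizer can answer
      // count(*) from the metastore without launching a job.
      st.execute("ANALYZE TABLE orctest COMPUTE STATISTICS");
      st.execute("SET hive.compute.query.using.stats=true");
      try (ResultSet rs = st.executeQuery("SELECT count(*) FROM orctest")) {
        while (rs.next()) System.out.println("from stats: " + rs.getLong(1));
      }
      // Slow path: with the stats shortcut off, the same query runs a full
      // scan job -- the path a plain text table without stats always takes.
      st.execute("SET hive.compute.query.using.stats=false");
      try (ResultSet rs = st.executeQuery("SELECT count(*) FROM orctest")) {
        while (rs.next()) System.out.println("from scan: " + rs.getLong(1));
      }
    }
  }
}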

On Tue, Mar 22, 2016 at 12:44 PM, Amey Barve  wrote:

> select count(*) from table;
>
> How does hive evaluate count(*) on a table?
>
> Does it return count by actually querying table, or directly return count
> by consulting some statistics locally.
>
> For Hive's Text format it takes few seconds while Hive's Orc format takes
> fraction of seconds.
>
> Regards,
> Amey
>



-- 
Nitin Pawar


select count(*) from table;

2016-03-22 Thread Amey Barve
select count(*) from table;

How does hive evaluate count(*) on a table?

Does it return count by actually querying table, or directly return count
by consulting some statistics locally.

For Hive's Text format it takes few seconds while Hive's Orc format takes
fraction of seconds.

Regards,
Amey


Re: Error in Hive on Spark

2016-03-22 Thread Stana
Hi, Xuefu

You are right.
Maybe I should launch spark-submit via HS2 or the Hive CLI?
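
In the meantime, here is a minimal sketch of the spark.home workaround Xuefu 
suggested earlier; the installation path is an assumption, and the exact 
lookup order should be verified:

import org.apache.hadoop.hive.conf.HiveConf;

public class SparkHomeSetup {
  // Point the embedded Spark client at a local Spark installation so the
  // generated spark-submit command can be found (path is illustrative).
  public static HiveConf configure() {
    System.setProperty("spark.home", "/opt/spark-1.4.1-bin-hadoop2.6");
    HiveConf conf = new HiveConf();
    conf.set("hive.execution.engine", "spark");
    conf.set("spark.home", "/opt/spark-1.4.1-bin-hadoop2.6");
    return conf;
  }
}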

Thanks a lot,
Stana


2016-03-22 1:16 GMT+08:00 Xuefu Zhang :

> Stana,
>
> I'm not sure if I fully understand the problem. spark-submit is launched on
> the same host as your application, which should be able to access
> hive-exec.jar. The Yarn cluster needs the jar also, but HS2 or Hive CLI will
> take care of that. Since you are not using either of those, it's your
> application's responsibility to make that happen.
>
> Did I miss anything else?
>
> Thanks,
> Xuefu
>
> On Sun, Mar 20, 2016 at 11:18 PM, Stana  wrote:
>
> > Does anyone have suggestions on setting the hive-exec-2.0.0.jar path
> > property in the application?
> > Something like
> > 'hiveConf.set("hive.remote.driver.jar","hdfs://storm0:9000/tmp/spark-assembly-1.4.1-hadoop2.6.0.jar")'.
> >
> >
> >
> > 2016-03-11 10:53 GMT+08:00 Stana :
> >
> > > Thanks for reply
> > >
> > > I have set the property spark.home in my application. Otherwise the
> > > application threw 'SPARK_HOME not found exception'.
> > >
> > > I found hive source code in SparkClientImpl.java:
> > >
> > > private Thread startDriver(final RpcServer rpcServer, final String
> > > clientId, final String secret)
> > >   throws IOException {
> > > ...
> > >
> > > List<String> argv = Lists.newArrayList();
> > >
> > > ...
> > >
> > > argv.add("--class");
> > > argv.add(RemoteDriver.class.getName());
> > >
> > > String jar = "spark-internal";
> > > if (SparkContext.jarOfClass(this.getClass()).isDefined()) {
> > > jar = SparkContext.jarOfClass(this.getClass()).get();
> > > }
> > > argv.add(jar);
> > >
> > > ...
> > >
> > > }
> > >
> > > When Hive executes spark-submit, it generates the shell command with
> > > --class org.apache.hive.spark.client.RemoteDriver, and sets the jar path with
> > > SparkContext.jarOfClass(this.getClass()).get(), which returns the local path
> > > of hive-exec-2.0.0.jar.
> > >
> > > In my situation, the application and the yarn cluster are in different clusters.
> > > When the application executes spark-submit with the local path of
> > > hive-exec-2.0.0.jar against the yarn cluster, there is no hive-exec-2.0.0.jar in
> > > the yarn cluster, so the application throws the exception: "hive-exec-2.0.0.jar
> > > does not exist ...".
> > >
> > > Can the hive-exec-2.0.0.jar path be set as a property in the application?
> > > Something like 'hiveConf.set("hive.remote.driver.jar",
> > > "hdfs://storm0:9000/tmp/spark-assembly-1.4.1-hadoop2.6.0.jar")'.
> > > If not, is it possible to achieve this in a future version?
> > >
> > >
> > >
> > >
> > > 2016-03-10 23:51 GMT+08:00 Xuefu Zhang :
> > >
> > >> You can probably avoid the problem by set environment variable
> > SPARK_HOME
> > >> or JVM property spark.home that points to your spark installation.
> > >>
> > >> --Xuefu
> > >>
> > >> On Thu, Mar 10, 2016 at 3:11 AM, Stana  wrote:
> > >>
> > >> > I am trying out Hive on Spark with Hive 2.0.0 and Spark 1.4.1, executing
> > >> > org.apache.hadoop.hive.ql.Driver from a Java application.
> > >> >
> > >> > Following are my situations:
> > >> > 1.Building spark 1.4.1 assembly jar without Hive .
> > >> > 2.Uploading the spark assembly jar to the hadoop cluster.
> > >> > 3.Executing the java application with eclipse IDE in my client
> > computer.
> > >> >
> > >> > The application went well and submitted the MR job to the yarn cluster
> > >> > successfully when using "hiveConf.set("hive.execution.engine", "mr")",
> > >> > but it threw exceptions with the spark engine.
> > >> >
> > >> > Finally, I traced the Hive source code and came to this conclusion:
> > >> >
> > >> > In my situation, the SparkClientImpl class generates the spark-submit
> > >> > shell command and executes it. The command sets --class to
> > >> > RemoteDriver.class.getName() and the jar to
> > >> > SparkContext.jarOfClass(this.getClass()).get(), which is why my
> > >> > application threw the exception.
> > >> >
> > >> > Is that right? And what can I do to execute the application with the
> > >> > spark engine successfully from my client computer? Thanks a lot!
> > >> >
> > >> >
> > >> > Java application code:
> > >> >
> > >> > public class TestHiveDriver {
> > >> >
> > >> >     private static HiveConf hiveConf;
> > >> >     private static Driver driver;
> > >> >     private static CliSessionState ss;
> > >> >
> > >> >     public static void main(String[] args) {
> > >> >
> > >> >         String sql = "select * from hadoop0263_0 as a join hadoop0263_0 as b on (a.key = b.key)";
> > >> >         ss = new CliSessionState(new HiveConf(SessionState.class));
> > >> >         hiveConf = new HiveConf(Driver.class);
> > >> >         hiveConf.set("fs.default.name", "hdfs://storm0:9000");
> > >> >         hiveConf.set("yarn.resourcemanager.address", "storm0:8032");
> > >> >         hiveConf.set("yarn.resourcemanager.scheduler.address", "storm0:8030");
> > >> >
> > >> >         hiveConf.set(