Re: Error in Hive on Spark

2016-03-20 Thread Stana
Does anyone have suggestions for setting the hive-exec-2.0.0.jar path as a
property in the application?
Something like
'hiveConf.set("hive.remote.driver.jar","hdfs://storm0:9000/tmp/spark-assembly-1.4.1-hadoop2.6.0.jar")'.



2016-03-11 10:53 GMT+08:00 Stana :

> Thanks for the reply.
>
> I have set the property spark.home in my application; otherwise the
> application threw a 'SPARK_HOME not found' exception.
>
> I found the following Hive source code in SparkClientImpl.java:
>
> private Thread startDriver(final RpcServer rpcServer, final String clientId,
>     final String secret) throws IOException {
>   ...
>
>   List<String> argv = Lists.newArrayList();
>
>   ...
>
>   // The driver class handed to spark-submit via --class:
>   argv.add("--class");
>   argv.add(RemoteDriver.class.getName());
>
>   // The application jar handed to spark-submit, resolved from the local
>   // classpath, i.e. the client-side location of hive-exec-2.0.0.jar:
>   String jar = "spark-internal";
>   if (SparkContext.jarOfClass(this.getClass()).isDefined()) {
>     jar = SparkContext.jarOfClass(this.getClass()).get();
>   }
>   argv.add(jar);
>
>   ...
> }
>
> When Hive executes spark-submit, it generates the shell command with
> --class org.apache.hive.spark.client.RemoteDriver and sets the jar path from
> SparkContext.jarOfClass(this.getClass()).get(), which resolves to the local
> path of hive-exec-2.0.0.jar.
>
> In my situation, the application and the YARN cluster are on different clusters.
> When the application executed spark-submit against the YARN cluster with the
> local path of hive-exec-2.0.0.jar, there was no hive-exec-2.0.0.jar on the YARN
> cluster, so the application threw the exception: "hive-exec-2.0.0.jar
> does not exist ...".
>
> Can the hive-exec-2.0.0.jar path be set as a property in the application?
> Something like 'hiveConf.set("hive.remote.driver.jar",
> "hdfs://storm0:9000/tmp/spark-assembly-1.4.1-hadoop2.6.0.jar")'.
> If not, is it possible to support this in a future version?
>
>
>
>
> 2016-03-10 23:51 GMT+08:00 Xuefu Zhang :
>
>> You can probably avoid the problem by setting the environment variable SPARK_HOME
>> or the JVM property spark.home to point to your Spark installation.
>>
>> --Xuefu
>>
>> On Thu, Mar 10, 2016 at 3:11 AM, Stana  wrote:
>>
>> >  I am trying out Hive on Spark with Hive 2.0.0 and Spark 1.4.1, executing
>> > org.apache.hadoop.hive.ql.Driver from a Java application.
>> >
>> > My setup is as follows:
>> > 1. Build the Spark 1.4.1 assembly jar without Hive.
>> > 2. Upload the Spark assembly jar to the Hadoop cluster.
>> > 3. Execute the Java application from the Eclipse IDE on my client computer.
>> >
>> > The application worked and submitted the MR job to the YARN cluster
>> > successfully when using " hiveConf.set("hive.execution.engine", "mr") ",
>> > but it threw exceptions with the Spark engine.
>> >
>> > Finally, I traced the Hive source code and came to this conclusion:
>> >
>> > In my situation, the SparkClientImpl class generates the spark-submit
>> > shell command and executes it.
>> > The command sets --class to RemoteDriver.class.getName() and the jar to
>> > SparkContext.jarOfClass(this.getClass()).get(), which is why my
>> > application threw the exception.
>> >
>> > Is that right? And what can I do to execute the application successfully
>> > with the Spark engine from my client computer? Thanks a lot!
>> >
>> >
>> > Java application code:
>> >
>> > public class TestHiveDriver {
>> >
>> >     private static HiveConf hiveConf;
>> >     private static Driver driver;
>> >     private static CliSessionState ss;
>> >
>> >     public static void main(String[] args) {
>> >         String sql = "select * from hadoop0263_0 as a join hadoop0263_0 as b on (a.key = b.key)";
>> >         ss = new CliSessionState(new HiveConf(SessionState.class));
>> >         hiveConf = new HiveConf(Driver.class);
>> >         hiveConf.set("fs.default.name", "hdfs://storm0:9000");
>> >         hiveConf.set("yarn.resourcemanager.address", "storm0:8032");
>> >         hiveConf.set("yarn.resourcemanager.scheduler.address", "storm0:8030");
>> >         hiveConf.set("yarn.resourcemanager.resource-tracker.address", "storm0:8031");
>> >         hiveConf.set("yarn.resourcemanager.admin.address", "storm0:8033");
>> >         hiveConf.set("mapreduce.framework.name", "yarn");
>> >         hiveConf.set("mapreduce.jobhistory.address", "storm0:10020");
>> >         hiveConf.set("javax.jdo.option.ConnectionURL", "jdbc:mysql://storm0:3306/stana_metastore");
>> >         hiveConf.set("javax.jdo.option.ConnectionDriverName", "com.mysql.jdbc.Driver");
>> >         hiveConf.set("javax.jdo.option.ConnectionUserName", "root");
>> >         hiveConf.set("javax.jdo.option.ConnectionPassword", "123456");
>> >         hiveConf.setBoolean("hive.auto.convert.join", false);
>> >         hiveConf.set("spark.yarn.jar", "hdfs://storm0:9000/tmp/spark-assembly-1.4.1-hadoop2.6.0.jar");
>> >         hiveConf.set("spark.home", "target/spark");
>> >         hiveConf.set("hive.execution.engine", "spark");
>> >         hiveConf.set("hive.dbname", "default");
>> >
>> >
>> >   

Re: Review Request 45032: HIVE-13319: HIVE-4570/HIVE-13319: Propagate external handle in task display

2016-03-20 Thread Amareshwari Sriramadasu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45032/#review124495
---


Ship it!




Ship It!

- Amareshwari Sriramadasu


On March 20, 2016, 8:44 p.m., Rajat Khandelwal wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45032/
> ---
> 
> (Updated March 20, 2016, 8:44 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-13319
> https://issues.apache.org/jira/browse/HIVE-13319
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Currently in HiveServer2, while a query is still executing, only its status
> is reported, as STILL_EXECUTING.
> 
> This issue is to give more information to the user, such as progress and
> running job handles, if possible.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/QueryDisplay.java 
> 467dab66e454d895742e96d4ac5db452fea00551 
>   service/src/test/org/apache/hive/service/cli/CLIServiceTest.java 
> e145eb434159d43b90480bad6711f965a82072c5 
> 
> Diff: https://reviews.apache.org/r/45032/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Rajat Khandelwal
> 
>



Re: Review Request 45032: HIVE-13319: HIVE-4570/HIVE-13319: Propagate external handle in task display

2016-03-20 Thread Rajat Khandelwal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45032/
---

(Updated March 21, 2016, 2:14 a.m.)


Review request for hive.


Summary (updated)
-

HIVE-13319: HIVE-4570/HIVE-13319: Propagate external handle in task display


Bugs: HIVE-13319
https://issues.apache.org/jira/browse/HIVE-13319


Repository: hive-git


Description
---

Currently in HiveServer2, while a query is still executing, only its status is
reported, as STILL_EXECUTING.

This issue is to give more information to the user, such as progress and running
job handles, if possible.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/QueryDisplay.java 
467dab66e454d895742e96d4ac5db452fea00551 
  service/src/test/org/apache/hive/service/cli/CLIServiceTest.java 
e145eb434159d43b90480bad6711f965a82072c5 

Diff: https://reviews.apache.org/r/45032/diff/


Testing
---


Thanks,

Rajat Khandelwal



Re: Review Request 45032: HIVE-4570/HIVE-13319: Propagate external handle in task display

2016-03-20 Thread Rajat Khandelwal

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45032/
---

(Updated March 21, 2016, 2:12 a.m.)


Review request for hive.


Summary (updated)
-

HIVE-4570/HIVE-13319: Propagate external handle in task display


Bugs: HIVE-13319
https://issues.apache.org/jira/browse/HIVE-13319


Repository: hive-git


Description
---

Currently in HiveServer2, while a query is still executing, only its status is
reported, as STILL_EXECUTING.

This issue is to give more information to the user, such as progress and running
job handles, if possible.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/QueryDisplay.java 
467dab66e454d895742e96d4ac5db452fea00551 
  service/src/test/org/apache/hive/service/cli/CLIServiceTest.java 
e145eb434159d43b90480bad6711f965a82072c5 

Diff: https://reviews.apache.org/r/45032/diff/


Testing
---


Thanks,

Rajat Khandelwal



[jira] [Created] (HIVE-13319) Propagate external handles in task display

2016-03-20 Thread Rajat Khandelwal (JIRA)
Rajat Khandelwal created HIVE-13319:
---

 Summary: Propagate external handles in task display
 Key: HIVE-13319
 URL: https://issues.apache.org/jira/browse/HIVE-13319
 Project: Hive
  Issue Type: Improvement
Reporter: Rajat Khandelwal
Assignee: Rajat Khandelwal






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13318) Cache the result of getTable from metaStore

2016-03-20 Thread Pengcheng Xiong (JIRA)
Pengcheng Xiong created HIVE-13318:
--

 Summary: Cache the result of getTable from metaStore
 Key: HIVE-13318
 URL: https://issues.apache.org/jira/browse/HIVE-13318
 Project: Hive
  Issue Type: Sub-task
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong


getTable by name from the metastore is called many times. We plan to cache the
result to save calls.
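
As an illustration only (not the actual patch, which may hook into the planner
differently), a simple cache keyed by "db.table" could look like the sketch below,
assuming a Guava cache and an {{IMetaStoreClient}} handle; the class name
CachingTableFetcher is made up for this example:

{code}
import java.util.concurrent.ExecutionException;
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Table;

public class CachingTableFetcher {
  private final IMetaStoreClient msc;
  // Bounded cache so unused entries are eventually evicted.
  private final Cache<String, Table> tableCache =
      CacheBuilder.newBuilder().maximumSize(1000).build();

  public CachingTableFetcher(IMetaStoreClient msc) {
    this.msc = msc;
  }

  public Table getTable(final String db, final String tbl) throws ExecutionException {
    // Only the first call per key hits the metastore; later calls are served
    // from the cache.
    return tableCache.get(db + "." + tbl, () -> msc.getTable(db, tbl));
  }
}
{code}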



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13312) TABLESAMPLE with PERCENT throws FAILED: SemanticException 1:68 Percentage sampling is not supported in org.apache.hadoop.hive.ql.io.HiveInputFormat. Error encountered near token '20'

2016-03-20 Thread Artem Ervits (JIRA)
Artem Ervits created HIVE-13312:
---

 Summary: TABLESAMPLE with PERCENT throws FAILED: SemanticException 
1:68 Percentage sampling is not supported in 
org.apache.hadoop.hive.ql.io.HiveInputFormat. Error encountered near token '20'
 Key: HIVE-13312
 URL: https://issues.apache.org/jira/browse/HIVE-13312
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 1.2.1
Reporter: Artem Ervits
Priority: Minor


FAILED: SemanticException 1:68 Percentage sampling is not supported in 
org.apache.hadoop.hive.ql.io.HiveInputFormat. Error encountered near token '20'
when I execute

SELECT * FROM tablename TABLESAMPLE(20 percent);

Tried with both ORC and TEXT tables. Confirmed with Gopal; a temporary workaround is

set hive.tez.input.format=${hive.input.format};



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-13317) HCatalog unable to read changed column structure

2016-03-20 Thread Bala Divvela (JIRA)
Bala Divvela created HIVE-13317:
---

 Summary: HCatalog unable to read changed column structure
 Key: HIVE-13317
 URL: https://issues.apache.org/jira/browse/HIVE-13317
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Bala Divvela
Priority: Minor


I have a table t1 which has a single column of datatype array of struct, like
`a_details array<struct<...>>`. It has a few records in partition up=1, which can
be read properly from both Hive and Pig (via HCatalog).

Now I have a requirement to add one more sub-column to the a_details struct; the
new structure would be `a_details array<struct<...>>` with the additional field.
After the column change, a few more records have been appended in one more
partition, up=2.

Now I am able to read only the up=2 partition's data from Pig (HCatalog).
When I try to load up=1 data from the t1 table using Pig HCatalog, it throws an
exception: "ERROR converting read value to tuple".

FYI, I can read all the data from Hive properly; the exception is thrown only when
I load from Pig using HCatalog.

Please help me in resolving this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 44910: HIVE-13294: AvroSerde leaks the connection in a case when reading schema from a url

2016-03-20 Thread Chaoyu Tang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44910/
---

(Updated March 16, 2016, 5:14 p.m.)


Review request for hive, Aihua Xu, Szehon Ho, and Yongzhi Chen.


Changes
---

Uploaded a new patch with fixes for the typo and the empty space. Thanks.


Bugs: HIVE-13294
https://issues.apache.org/jira/browse/HIVE-13294


Repository: hive-git


Description
---

AvroSerde leaks the connection when reading the schema from a URL. In

public static Schema determineSchemaOrThrowException
{ ... return AvroSerdeUtils.getSchemaFor(new URL(schemaString).openStream()); ... }

the opened InputStream is never closed.

The patch is to close the InputStream (and thus the connection) in a finally block.
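
A minimal, method-level sketch of that kind of fix (not the committed patch; the
helper name readSchemaFromUrl is made up for illustration):

import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import org.apache.avro.Schema;

// Sketch only: keep the stream in a local variable so it can always be
// closed (releasing the underlying connection) in a finally block.
public static Schema readSchemaFromUrl(String schemaString) throws IOException {
  InputStream in = null;
  try {
    in = new URL(schemaString).openStream();
    return AvroSerdeUtils.getSchemaFor(in);
  } finally {
    if (in != null) {
      in.close();
    }
  }
}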


Diffs (updated)
-

  serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 08ae6ef 

Diff: https://reviews.apache.org/r/44910/diff/


Testing
---

precommit test


Thanks,

Chaoyu Tang



[jira] [Created] (HIVE-13311) MetaDataFormatUtils throws NPE when HiveDecimal.create is null

2016-03-20 Thread Reuben Kuhnert (JIRA)
Reuben Kuhnert created HIVE-13311:
-

 Summary: MetaDataFormatUtils throws NPE when HiveDecimal.create is 
null
 Key: HIVE-13311
 URL: https://issues.apache.org/jira/browse/HIVE-13311
 Project: Hive
  Issue Type: Bug
Reporter: Reuben Kuhnert
Assignee: Reuben Kuhnert


The {{MetadataFormatUtils.convertToString}} functions have guards that validate
when {{val}} is null; however, the following overload does not guard against
{{HiveDecimal.create}} returning null:

{code}
  private static String convertToString(Decimal val) {
    if (val == null) {
      return "";
    }

    return HiveDecimal.create(new BigInteger(val.getUnscaled()),
        val.getScale()).toString();
  }
{code}
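
A possible guard, sketched here only for illustration and not necessarily the
intended fix, would be to check the result of {{HiveDecimal.create}} before
calling {{toString()}}:

{code}
  private static String convertToString(Decimal val) {
    if (val == null) {
      return "";
    }
    // HiveDecimal.create can return null, so guard before calling toString().
    HiveDecimal dec = HiveDecimal.create(new BigInteger(val.getUnscaled()),
        val.getScale());
    return dec == null ? "" : dec.toString();
  }
{code}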



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)