Re: Error in Hive on Spark
Does anyone have suggestions for setting the hive-exec-2.0.0.jar path as a property in the application? Something like:

    hiveConf.set("hive.remote.driver.jar", "hdfs://storm0:9000/tmp/spark-assembly-1.4.1-hadoop2.6.0.jar")

2016-03-11 10:53 GMT+08:00 Stana:

> Thanks for the reply.
>
> I have set the property spark.home in my application; otherwise the
> application threw a 'SPARK_HOME not found' exception.
>
> I found this in the Hive source code, in SparkClientImpl.java:
>
>     private Thread startDriver(final RpcServer rpcServer, final String clientId,
>         final String secret) throws IOException {
>       ...
>       List<String> argv = Lists.newArrayList();
>       ...
>       argv.add("--class");
>       argv.add(RemoteDriver.class.getName());
>
>       String jar = "spark-internal";
>       if (SparkContext.jarOfClass(this.getClass()).isDefined()) {
>         jar = SparkContext.jarOfClass(this.getClass()).get();
>       }
>       argv.add(jar);
>       ...
>     }
>
> When Hive executes spark-submit, it generates the shell command with
> --class org.apache.hive.spark.client.RemoteDriver and sets the jar path
> from SparkContext.jarOfClass(this.getClass()).get(), which resolves to
> the local path of hive-exec-2.0.0.jar.
>
> In my situation, the application and the YARN cluster are in different
> clusters. When the application executed spark-submit against the YARN
> cluster with the local path of hive-exec-2.0.0.jar, there was no
> hive-exec-2.0.0.jar on the YARN cluster, so the application threw the
> exception: "hive-exec-2.0.0.jar does not exist ...".
>
> Can the hive-exec-2.0.0.jar path be set as a property in the application?
> Something like:
>
>     hiveConf.set("hive.remote.driver.jar",
>         "hdfs://storm0:9000/tmp/spark-assembly-1.4.1-hadoop2.6.0.jar")
>
> If not, is it possible to achieve this in a future version?
>
> 2016-03-10 23:51 GMT+08:00 Xuefu Zhang:
>
>> You can probably avoid the problem by setting the environment variable
>> SPARK_HOME or the JVM property spark.home to point to your Spark
>> installation.
>>
>> --Xuefu
>>
>> On Thu, Mar 10, 2016 at 3:11 AM, Stana wrote:
>>
>>> I am trying out Hive on Spark with Hive 2.0.0 and Spark 1.4.1,
>>> executing org.apache.hadoop.hive.ql.Driver from a Java application.
>>>
>>> My situation is as follows:
>>> 1. Building the Spark 1.4.1 assembly jar without Hive.
>>> 2. Uploading the Spark assembly jar to the Hadoop cluster.
>>> 3. Executing the Java application from the Eclipse IDE on my client
>>>    computer.
>>>
>>> The application went well and submitted the MR job to the YARN cluster
>>> successfully when using hiveConf.set("hive.execution.engine", "mr"),
>>> but it threw exceptions with the Spark engine.
>>>
>>> Finally, I traced the Hive source code and came to this conclusion:
>>>
>>> In my situation, the SparkClientImpl class generates the spark-submit
>>> shell command and executes it. The command sets --class to
>>> RemoteDriver.class.getName() and the jar to
>>> SparkContext.jarOfClass(this.getClass()).get(), which is why my
>>> application threw the exception.
>>>
>>> Is that right? And what can I do to run the application with the Spark
>>> engine successfully from my client computer? Thanks a lot!
>>>
>>> Java application code:
>>>
>>>     public class TestHiveDriver {
>>>
>>>         private static HiveConf hiveConf;
>>>         private static Driver driver;
>>>         private static CliSessionState ss;
>>>
>>>         public static void main(String[] args) {
>>>             String sql = "select * from hadoop0263_0 as a join hadoop0263_0 as b on (a.key = b.key)";
>>>             ss = new CliSessionState(new HiveConf(SessionState.class));
>>>             hiveConf = new HiveConf(Driver.class);
>>>             hiveConf.set("fs.default.name", "hdfs://storm0:9000");
>>>             hiveConf.set("yarn.resourcemanager.address", "storm0:8032");
>>>             hiveConf.set("yarn.resourcemanager.scheduler.address", "storm0:8030");
>>>             hiveConf.set("yarn.resourcemanager.resource-tracker.address", "storm0:8031");
>>>             hiveConf.set("yarn.resourcemanager.admin.address", "storm0:8033");
>>>             hiveConf.set("mapreduce.framework.name", "yarn");
>>>             hiveConf.set("mapreduce.jobhistory.address", "storm0:10020");
>>>             hiveConf.set("javax.jdo.option.ConnectionURL", "jdbc:mysql://storm0:3306/stana_metastore");
>>>             hiveConf.set("javax.jdo.option.ConnectionDriverName", "com.mysql.jdbc.Driver");
>>>             hiveConf.set("javax.jdo.option.ConnectionUserName", "root");
>>>             hiveConf.set("javax.jdo.option.ConnectionPassword", "123456");
>>>             hiveConf.setBoolean("hive.auto.convert.join", false);
>>>             hiveConf.set("spark.yarn.jar", "hdfs://storm0:9000/tmp/spark-assembly-1.4.1-hadoop2.6.0.jar");
>>>             hiveConf.set("spark.home", "target/spark");
>>>             hiveConf.set("hive.execution.engine", "spark");
>>>             hiveConf.set("hive.dbname", "default");
>>>             ...
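[Editor's note] The root cause described in this thread, spark-submit receiving a client-local jar path, comes from the JVM's ability to resolve the code source a class was loaded from. A minimal standalone sketch of that lookup (illustrative only; JarOfClassSketch and jarOf are hypothetical names, not Hive or Spark code):

```java
// Minimal sketch: resolve the local filesystem location a class was loaded
// from -- the same kind of lookup SparkContext.jarOfClass performs.
public class JarOfClassSketch {

    // Returns the local path of the jar (or classes directory) that loaded
    // clazz, or null when the class comes from the bootstrap class loader.
    static String jarOf(Class<?> clazz) {
        java.security.CodeSource src = clazz.getProtectionDomain().getCodeSource();
        return src == null ? null : src.getLocation().getPath();
    }

    public static void main(String[] args) {
        // An application class always resolves to a client-local path, which
        // is exactly why that path ends up in the spark-submit command and is
        // meaningless on a remote YARN cluster.
        System.out.println(jarOf(JarOfClassSketch.class));
    }
}
```

Because the lookup can only ever see the client's filesystem, uploading the jar to HDFS and pointing Hive at it (as the proposed hive.remote.driver.jar property suggests) is the natural workaround.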
Re: Review Request 45032: HIVE-13319: HIVE-4570/HIVE-13319: Propagate external handle in task display
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45032/#review124495
---

Ship it!

Ship It!

- Amareshwari Sriramadasu


On March 20, 2016, 8:44 p.m., Rajat Khandelwal wrote:

> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45032/
> ---
>
> (Updated March 20, 2016, 8:44 p.m.)
>
> Review request for hive.
>
> Bugs: HIVE-13319
>     https://issues.apache.org/jira/browse/HIVE-13319
>
> Repository: hive-git
>
> Description
> ---
>
> Currently in HiveServer2, while a query is still executing, only the
> status STILL_EXECUTING is reported. This issue is to give the user more
> information, such as progress and running job handles, where possible.
>
> Diffs
> ---
>
> ql/src/java/org/apache/hadoop/hive/ql/QueryDisplay.java 467dab66e454d895742e96d4ac5db452fea00551
> service/src/test/org/apache/hive/service/cli/CLIServiceTest.java e145eb434159d43b90480bad6711f965a82072c5
>
> Diff: https://reviews.apache.org/r/45032/diff/
>
> Testing
> ---
>
> Thanks,
> Rajat Khandelwal
Re: Review Request 45032: HIVE-13319: HIVE-4570/HIVE-13319: Propagate external handle in task display
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45032/
---

(Updated March 21, 2016, 2:14 a.m.)

Review request for hive.

Summary (updated)
---
HIVE-13319: HIVE-4570/HIVE-13319: Propagate external handle in task display

Bugs: HIVE-13319
    https://issues.apache.org/jira/browse/HIVE-13319

Repository: hive-git

Description
---

Currently in HiveServer2, while a query is still executing, only the status STILL_EXECUTING is reported. This issue is to give the user more information, such as progress and running job handles, where possible.

Diffs (updated)
---

ql/src/java/org/apache/hadoop/hive/ql/QueryDisplay.java 467dab66e454d895742e96d4ac5db452fea00551
service/src/test/org/apache/hive/service/cli/CLIServiceTest.java e145eb434159d43b90480bad6711f965a82072c5

Diff: https://reviews.apache.org/r/45032/diff/

Testing
---

Thanks,
Rajat Khandelwal
Re: Review Request 45032: HIVE-4570/HIVE-13319: Propagate external handle in task display
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45032/
---

(Updated March 21, 2016, 2:12 a.m.)

Review request for hive.

Summary (updated)
---
HIVE-4570/HIVE-13319: Propagate external handle in task display

Bugs: HIVE-13319
    https://issues.apache.org/jira/browse/HIVE-13319

Repository: hive-git

Description
---

Currently in HiveServer2, while a query is still executing, only the status STILL_EXECUTING is reported. This issue is to give the user more information, such as progress and running job handles, where possible.

Diffs
---

ql/src/java/org/apache/hadoop/hive/ql/QueryDisplay.java 467dab66e454d895742e96d4ac5db452fea00551
service/src/test/org/apache/hive/service/cli/CLIServiceTest.java e145eb434159d43b90480bad6711f965a82072c5

Diff: https://reviews.apache.org/r/45032/diff/

Testing
---

Thanks,
Rajat Khandelwal
[jira] [Created] (HIVE-13319) Propagate external handles in task display
Rajat Khandelwal created HIVE-13319:
---

Summary: Propagate external handles in task display
Key: HIVE-13319
URL: https://issues.apache.org/jira/browse/HIVE-13319
Project: Hive
Issue Type: Improvement
Reporter: Rajat Khandelwal
Assignee: Rajat Khandelwal

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13318) Cache the result of getTable from metaStore
Pengcheng Xiong created HIVE-13318:
--

Summary: Cache the result of getTable from metaStore
Key: HIVE-13318
URL: https://issues.apache.org/jira/browse/HIVE-13318
Project: Hive
Issue Type: Sub-task
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong

getTable by name from the metastore is called many times. We plan to cache the result to save calls.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
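[Editor's note] The caching idea above can be sketched as a memoizing wrapper around the metastore lookup. Everything below is a hypothetical illustration (TableCache and its loader are invented names), not the actual HIVE-13318 patch, which lives inside Hive's query compiler:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch of caching repeated metastore lookups by table name.
public class TableCache<T> {
    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> loader; // stands in for the metastore getTable call

    public TableCache(Function<String, T> loader) {
        this.loader = loader;
    }

    public T get(String tableName) {
        // computeIfAbsent invokes the loader only on the first lookup of a name
        return cache.computeIfAbsent(tableName, loader);
    }

    public static void main(String[] args) {
        AtomicInteger metastoreCalls = new AtomicInteger();
        TableCache<String> tables = new TableCache<>(name -> {
            metastoreCalls.incrementAndGet();
            return "descriptor-for-" + name;
        });
        tables.get("t1");
        tables.get("t1");
        tables.get("t1");
        System.out.println(metastoreCalls.get()); // prints 1: only the first get() hits the loader
    }
}
```

The trade-off, as with any metastore cache, is staleness: a cached entry must be invalidated if the table changes during compilation, which is why scoping such a cache to a single query is the conservative choice.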
[jira] [Created] (HIVE-13312) TABLESAMPLE with PERCENT throws FAILED: SemanticException 1:68 Percentage sampling is not supported in org.apache.hadoop.hive.ql.io.HiveInputFormat. Error encountered near token '20'
Artem Ervits created HIVE-13312:
---

Summary: TABLESAMPLE with PERCENT throws FAILED: SemanticException 1:68 Percentage sampling is not supported in org.apache.hadoop.hive.ql.io.HiveInputFormat. Error encountered near token '20'
Key: HIVE-13312
URL: https://issues.apache.org/jira/browse/HIVE-13312
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 1.2.1
Reporter: Artem Ervits
Priority: Minor

When I execute

    SELECT * FROM tablename TABLESAMPLE(20 PERCENT);

it fails with:

    FAILED: SemanticException 1:68 Percentage sampling is not supported in
    org.apache.hadoop.hive.ql.io.HiveInputFormat. Error encountered near token '20'

Tried with both ORC and TEXT tables. Confirmed with Gopal; a temporary workaround is:

    set hive.tez.input.format=${hive.input.format};

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-13317) HCatalog unable to read changed column structure
Bala Divvela created HIVE-13317:
---

Summary: HCatalog unable to read changed column structure
Key: HIVE-13317
URL: https://issues.apache.org/jira/browse/HIVE-13317
Project: Hive
Issue Type: Bug
Components: Hive
Reporter: Bala Divvela
Priority: Minor

I have a table t1 with a single column whose datatype is an array of structs, like `a_details array<struct<...>>`. It has a few records in partition up=1, which can be read properly from both Hive and Pig (via HCatalog).

I then had a requirement to add one more sub-column to the a_details struct, so the new structure is `a_details array<struct<...>>` with the extra field. After the column change, a few more records were appended in another partition, up=2.

Now I can read only the up=2 partition's data from Pig (HCatalog). When I try to load up=1 data from table t1 using Pig with HCatalog, it throws the exception "ERROR converting read value to tuple". For reference, I can read all the data from Hive properly; the exception occurs only when loading from Pig via HCatalog. Please help me resolve this.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 44910: HIVE-13294: AvroSerde leaks the connection in a case when reading schema from a url
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44910/
---

(Updated March 16, 2016, 5:14 p.m.)

Review request for hive, Aihua Xu, Szehon Ho, and Yongzhi Chen.

Changes
---
Uploaded a new patch with fixes for the typo and empty space. Thanks.

Bugs: HIVE-13294
    https://issues.apache.org/jira/browse/HIVE-13294

Repository: hive-git

Description
---

AvroSerde leaks the connection when reading the schema from a URL. In:

    public static Schema determineSchemaOrThrowException {
        ...
        return AvroSerdeUtils.getSchemaFor(new URL(schemaString).openStream());
        ...
    }

the opened InputStream is never closed. The patch closes the InputStream (and thus the connection) in a finally block.

Diffs (updated)
---

serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 08ae6ef

Diff: https://reviews.apache.org/r/44910/diff/

Testing
---

Precommit test.

Thanks,
Chaoyu Tang
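[Editor's note] The fix pattern described in this review can be sketched in isolation. TrackingStream and readAll below are hypothetical stand-ins (for the URL connection stream and AvroSerdeUtils.getSchemaFor, respectively); the point is only that the finally block releases the stream even when the consumer throws:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch of the leak-fix pattern: whoever opens the stream closes it in
// finally, so the underlying connection is released on every path.
public class AvroStreamFix {

    // Records whether close() was called, standing in for a URL connection stream.
    static class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;
        TrackingStream(byte[] buf) { super(buf); }
        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    // Stand-in for the schema-reading call; the finally block mirrors the patch.
    static String readAll(InputStream in) throws IOException {
        try {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = in.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } finally {
            in.close(); // runs even if read() threw
        }
    }

    public static void main(String[] args) throws IOException {
        TrackingStream in = new TrackingStream("{\"type\":\"string\"}".getBytes());
        String schema = readAll(in);
        System.out.println(schema.length() > 0 && in.closed); // prints true
    }
}
```

On Java 7+ the same guarantee can be had more tersely with try-with-resources, since InputStream is AutoCloseable.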
[jira] [Created] (HIVE-13311) MetaDataFormatUtils throws NPE when HiveDecimal.create is null
Reuben Kuhnert created HIVE-13311:
-

Summary: MetaDataFormatUtils throws NPE when HiveDecimal.create is null
Key: HIVE-13311
URL: https://issues.apache.org/jira/browse/HIVE-13311
Project: Hive
Issue Type: Bug
Reporter: Reuben Kuhnert
Assignee: Reuben Kuhnert

The {{MetadataFormatUtils.convertToString}} functions have guards for a null input value; however, in

{code}
private static String convertToString(Decimal val) {
  if (val == null) {
    return "";
  }
  return HiveDecimal.create(new BigInteger(val.getUnscaled()), val.getScale()).toString();
}
{code}

HiveDecimal.create itself can return null, and calling toString() on that result throws a NullPointerException.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
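[Editor's note] A minimal sketch of the null-safe pattern the report implies: guard the factory's result, not just the input. BigDecimal here is a stand-in for HiveDecimal (whose create() can return null), and DecimalFormatSketch with its methods is hypothetical:

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical illustration of the HIVE-13311 fix pattern.
public class DecimalFormatSketch {

    // Stand-in factory that, like HiveDecimal.create, may return null.
    static BigDecimal create(BigInteger unscaled, int scale) {
        if (unscaled == null) {
            return null;
        }
        return new BigDecimal(unscaled, scale);
    }

    static String convertToString(BigInteger unscaled, int scale) {
        BigDecimal d = create(unscaled, scale);
        // Checking only the input would still NPE when the factory returns null.
        return d == null ? "" : d.toString();
    }

    public static void main(String[] args) {
        System.out.println(convertToString(BigInteger.valueOf(12345), 2)); // prints 123.45
        System.out.println(convertToString(null, 2).isEmpty());            // prints true
    }
}
```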