[jira] [Commented] (HIVE-6226) It should be possible to get hadoop, hive, and pig version being used by WebHCat
[ https://issues.apache.org/jira/browse/HIVE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084101#comment-14084101 ] Eugene Koifman commented on HIVE-6226: -- Thanks [~leftylev] It should be possible to get hadoop, hive, and pig version being used by WebHCat Key: HIVE-6226 URL: https://issues.apache.org/jira/browse/HIVE-6226 Project: Hive Issue Type: New Feature Components: WebHCat Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.13.0 Attachments: HIVE-6226.2.patch, HIVE-6226.patch Calling /version on WebHCat tells the caller the protocol version, but there is no way to determine the versions of software being run by the applications that WebHCat spawns. I propose to add an end-point: /version/\{module\} where module could be pig, hive, or hadoop. The response will then be: {code} { module : _module_name_, version : _version_string_ } {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
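A minimal client-side sketch of the proposed end-point. The /templeton/v1 base path and both helper names are assumptions for illustration, and the JSON handling is deliberately naive string extraction rather than a real JSON library:

```java
// Hypothetical client helpers for the proposed /version/{module} end-point.
class WebHCatVersion {
    // Builds the request path for a module; the /templeton/v1 prefix is the
    // usual WebHCat base path, assumed here rather than taken from the issue.
    static String versionPath(String module) {
        return "/templeton/v1/version/" + module;
    }

    // Pulls the value of "version" out of a response shaped like
    // {"module":"pig","version":"0.12.1"} using plain string scanning.
    static String parseVersion(String json) {
        int i = json.indexOf("\"version\"");
        int start = json.indexOf('"', json.indexOf(':', i) + 1) + 1;
        return json.substring(start, json.indexOf('"', start));
    }
}
```

A caller would GET `versionPath("pig")` against the WebHCat server and feed the body to `parseVersion`.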
[jira] [Updated] (HIVE-7513) Add ROW__ID VirtualColumn
[ https://issues.apache.org/jira/browse/HIVE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7513: - Attachment: HIVE-7513.3.patch HIVE-7513.3.patch has 1st stab Add ROW__ID VirtualColumn - Key: HIVE-7513 URL: https://issues.apache.org/jira/browse/HIVE-7513 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7513.3.patch In order to support Update/Delete we need to read rowId from AcidInputFormat and pass that along through the operator pipeline (built from the WHERE clause of the SQL Statement) so that it can be written to the delta file by the update/delete (sink) operators. The parser will add this column to the projection list to make sure it's passed along. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7513) Add ROW__ID VirtualColumn
[ https://issues.apache.org/jira/browse/HIVE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7513: - Status: Patch Available (was: Open) Add ROW__ID VirtualColumn - Key: HIVE-7513 URL: https://issues.apache.org/jira/browse/HIVE-7513 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7513.3.patch In order to support Update/Delete we need to read rowId from AcidInputFormat and pass that along through the operator pipeline (built from the WHERE clause of the SQL Statement) so that it can be written to the delta file by the update/delete (sink) operators. The parser will add this column to the projection list to make sure it's passed along. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7517) RecordIdentifier overrides equals() but not hashCode()
Eugene Koifman created HIVE-7517: Summary: RecordIdentifier overrides equals() but not hashCode() Key: HIVE-7517 URL: https://issues.apache.org/jira/browse/HIVE-7517 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman -- This message was sent by Atlassian JIRA (v6.2#6252)
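The bug class behind this issue is the standard Java equals()/hashCode() contract: a type used as a hash key must override both, or equal instances can land in different hash buckets. A hypothetical simplified stand-in for RecordIdentifier (which carries a transaction id, bucket id, and row id) shows the fix:

```java
import java.util.Objects;

// Hypothetical stand-in for RecordIdentifier: (transactionId, bucketId, rowId).
class RowKey {
    final long txnId;
    final int bucketId;
    final long rowId;

    RowKey(long txnId, int bucketId, long rowId) {
        this.txnId = txnId;
        this.bucketId = bucketId;
        this.rowId = rowId;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof RowKey)) return false;
        RowKey r = (RowKey) o;
        return txnId == r.txnId && bucketId == r.bucketId && rowId == r.rowId;
    }

    // Without this override, two equal RowKeys would usually hash differently,
    // so HashSet.contains() / HashMap.get() would miss existing entries.
    @Override
    public int hashCode() {
        return Objects.hash(txnId, bucketId, rowId);
    }
}
```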
[jira] [Created] (HIVE-7513) Add ROW__ID VirtualColumn
Eugene Koifman created HIVE-7513: Summary: Add ROW__ID VirtualColumn Key: HIVE-7513 URL: https://issues.apache.org/jira/browse/HIVE-7513 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman In order to support Update/Delete we need to read rowId from AcidInputFormat and pass that along through the operator pipeline (built from the WHERE clause of the SQL Statement) so that it can be written to the delta file by the update/delete (sink) operators. The parser will add this column to the projection list to make sure it's passed along. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7483) hive insert overwrite table select from self dead lock
[ https://issues.apache.org/jira/browse/HIVE-7483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071344#comment-14071344 ] Eugene Koifman commented on HIVE-7483: -- DbTxnManager will use DbLockManager(), so I think hive.zookeeper.quorum is not relevant. hive insert overwrite table select from self dead lock -- Key: HIVE-7483 URL: https://issues.apache.org/jira/browse/HIVE-7483 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.13.1 Reporter: Xiaoyu Wang CREATE TABLE test( id int, msg string) PARTITIONED BY ( continent string, country string) CLUSTERED BY (id) INTO 10 BUCKETS STORED AS ORC; alter table test add partition(continent='Asia',country='India'); in hive-site.xml: hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; hive.support.concurrency=true; hive.zookeeper.quorum=zk1,zk2,zk3; in hive shell: set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat; First insert some records into the test table, then execute this SQL: insert overwrite table test partition(continent='Asia',country='India') select id,msg from test; the log stops at: INFO log.PerfLogger: PERFLOG method=acquireReadWriteLocks from=org.apache.hadoop.hive.ql.Driver I think there is a deadlock when doing an insert overwrite into a table from itself. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-4590) HCatalog documentation example is wrong
[ https://issues.apache.org/jira/browse/HIVE-4590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068228#comment-14068228 ] Eugene Koifman commented on HIVE-4590: -- [~leftylev] 1. The MR program does value.get(1) in reduce() which means its col1 is the 2nd column. Presumably the 1st (0th) column could have been UserName. 2. you are correct on both HCatalog documentation example is wrong --- Key: HIVE-4590 URL: https://issues.apache.org/jira/browse/HIVE-4590 Project: Hive Issue Type: Bug Components: Documentation, HCatalog Affects Versions: 0.10.0 Reporter: Eugene Koifman Assignee: Lefty Leverenz Priority: Minor http://hive.apache.org/docs/hcat_r0.5.0/inputoutput.html#Read+Example reads The following very simple MapReduce program reads data from one table which it assumes to have an integer in the second column, and counts how many different values it sees. That is, it does the equivalent of select col1, count(*) from $table group by col1;. The description of the query is wrong. It actually counts how many instances of each distinct value it finds. For example, if the values of col1 are {1,1,1,3,3,5} it will produce (1, 3), (3, 2), (5, 1) -- This message was sent by Atlassian JIRA (v6.2#6252)
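The corrected semantics (a count of each distinct value, not a count of distinct values) can be sketched in plain Java; the class name is made up for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// What the example MR program actually computes:
// SELECT col1, COUNT(*) FROM $table GROUP BY col1
class GroupByCount {
    static Map<Integer, Integer> count(int[] col1) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        for (int v : col1) {
            counts.merge(v, 1, Integer::sum);  // one more instance of this value
        }
        return counts;
    }
}
```

Running it on the example data {1,1,1,3,3,5} yields the pairs (1, 3), (3, 2), (5, 1).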
[jira] [Created] (HIVE-7423) produce hive-exec-core.jar from ql module
Eugene Koifman created HIVE-7423: Summary: produce hive-exec-core.jar from ql module Key: HIVE-7423 URL: https://issues.apache.org/jira/browse/HIVE-7423 Project: Hive Issue Type: Bug Reporter: Eugene Koifman Assignee: Eugene Koifman currently ql module produces hive-exec-$version.jar which is an uber jar. It's also useful to have a thin jar, let's call it hive-exec-$version-core.jar, that only has classes from ql. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7423) produce hive-exec-core.jar from ql module
[ https://issues.apache.org/jira/browse/HIVE-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7423: - Affects Version/s: 0.13.1 produce hive-exec-core.jar from ql module - Key: HIVE-7423 URL: https://issues.apache.org/jira/browse/HIVE-7423 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman currently ql module produces hive-exec-$version.jar which is an uber jar. It's also useful to have a thin jar, let's call it hive-exec-$version-core.jar, that only has classes from ql. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7423) produce hive-exec-core.jar from ql module
[ https://issues.apache.org/jira/browse/HIVE-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7423: - Component/s: Build Infrastructure produce hive-exec-core.jar from ql module - Key: HIVE-7423 URL: https://issues.apache.org/jira/browse/HIVE-7423 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman currently ql module produces hive-exec-$version.jar which is an uber jar. It's also useful to have a thin jar, let's call it hive-exec-$version-core.jar, that only has classes from ql. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7423) produce hive-exec-core.jar from ql module
[ https://issues.apache.org/jira/browse/HIVE-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7423: - Attachment: HIVE-7423.patch produce hive-exec-core.jar from ql module - Key: HIVE-7423 URL: https://issues.apache.org/jira/browse/HIVE-7423 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7423.patch currently ql module produces hive-exec-$version.jar which is an uber jar. It's also useful to have a thin jar, let's call it hive-exec-$version-core.jar, that only has classes from ql. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7423) produce hive-exec-core.jar from ql module
[ https://issues.apache.org/jira/browse/HIVE-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7423: - Status: Patch Available (was: Open) produce hive-exec-core.jar from ql module - Key: HIVE-7423 URL: https://issues.apache.org/jira/browse/HIVE-7423 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7423.patch currently ql module produces hive-exec-$version.jar which is an uber jar. It's also useful to have a thin jar, let's call it hive-exec-$version-core.jar, that only has classes from ql. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7423) produce hive-exec-core.jar from ql module
[ https://issues.apache.org/jira/browse/HIVE-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063102#comment-14063102 ] Eugene Koifman commented on HIVE-7423: -- I don't think Maven supports that. The classifier (core) goes at the end. For example, hive-exec-0.14.0-SNAPSHOT-tests.jar. produce hive-exec-core.jar from ql module - Key: HIVE-7423 URL: https://issues.apache.org/jira/browse/HIVE-7423 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7423.patch currently ql module produces hive-exec-$version.jar which is an uber jar. It's also useful to have a thin jar, let's call it hive-exec-$version-core.jar, that only has classes from ql. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7423) produce hive-exec-core.jar from ql module
[ https://issues.apache.org/jira/browse/HIVE-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063128#comment-14063128 ] Eugene Koifman commented on HIVE-7423: -- I'm far from a Maven expert, but using a classifier lets other projects refer to this jar as a dependency. If we name it using other means, can they still do that? produce hive-exec-core.jar from ql module - Key: HIVE-7423 URL: https://issues.apache.org/jira/browse/HIVE-7423 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7423.patch currently ql module produces hive-exec-$version.jar which is an uber jar. It's also useful to have a thin jar, let's call it hive-exec-$version-core.jar, that only has classes from ql. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7423) produce hive-exec-core.jar from ql module
[ https://issues.apache.org/jira/browse/HIVE-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063148#comment-14063148 ] Eugene Koifman commented on HIVE-7423: -- The test failures are not related: 2 have been failing for a while now and TestHCatLoader doesn't even use hive-exec. produce hive-exec-core.jar from ql module - Key: HIVE-7423 URL: https://issues.apache.org/jira/browse/HIVE-7423 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7423.patch currently ql module produces hive-exec-$version.jar which is an uber jar. It's also useful to have a thin jar, let's call it hive-exec-$version-core.jar, that only has classes from ql. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7376) add minimizeJar to jdbc/pom.xml
[ https://issues.apache.org/jira/browse/HIVE-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14059916#comment-14059916 ] Eugene Koifman commented on HIVE-7376: -- My understanding is that by default the uber jar will pull in whole jar dependencies even if only some of the classes in a jar are needed. This option makes it include only the classes from any given jar that are needed (transitively) by classes in the module. add minimizeJar to jdbc/pom.xml --- Key: HIVE-7376 URL: https://issues.apache.org/jira/browse/HIVE-7376 Project: Hive Issue Type: Bug Reporter: Eugene Koifman Attachments: HIVE-7376.1.patch.txt adding {code}<minimizeJar>true</minimizeJar>{code} to maven-shade-plugin reduces the uber jar (hive-jdbc-0.14.0-SNAPSHOT-standalone.jar) from 51MB to 27MB. Is there any reason not to add it? https://maven.apache.org/plugins/maven-shade-plugin/shade-mojo.html#minimizeJar -- This message was sent by Atlassian JIRA (v6.2#6252)
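For reference, the proposed change amounts to a one-line addition to the maven-shade-plugin configuration in jdbc/pom.xml. The surrounding plugin skeleton below is illustrative; only the minimizeJar element itself comes from the issue:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <!-- Drop classes of dependency jars that are not (transitively)
         referenced by classes in this module. -->
    <minimizeJar>true</minimizeJar>
  </configuration>
</plugin>
```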
[jira] [Commented] (HIVE-538) make hive_jdbc.jar self-containing
[ https://issues.apache.org/jira/browse/HIVE-538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058873#comment-14058873 ] Eugene Koifman commented on HIVE-538: - yes, where is it published to? It seems like one would have to build Hive to get it. make hive_jdbc.jar self-containing -- Key: HIVE-538 URL: https://issues.apache.org/jira/browse/HIVE-538 Project: Hive Issue Type: Improvement Components: JDBC Affects Versions: 0.3.0, 0.4.0, 0.6.0, 0.13.0 Reporter: Raghotham Murthy Assignee: Nick White Fix For: 0.14.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-538.D2553.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-538.D2553.2.patch, HIVE-538.patch Currently, most jars in hive/build/dist/lib and the hadoop-*-core.jar are required in the classpath to run jdbc applications on hive. We need to do at least the following to get rid of most unnecessary dependencies: 1. get rid of dynamic serde and use a standard serialization format, maybe tab separated, json or avro 2. don't use hadoop configuration parameters 3. repackage thrift and fb303 classes into hive_jdbc.jar -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-538) make hive_jdbc.jar self-containing
[ https://issues.apache.org/jira/browse/HIVE-538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058185#comment-14058185 ] Eugene Koifman commented on HIVE-538: - the current build system produces 2 jdbc jars: hive-jdbc-0.14.0-SNAPSHOT-standalone.jar - the 51MB uber jar hive-jdbc-0.14.0-SNAPSHOT.jar - the 135K jar The pom file hive-jdbc-0.14.0-SNAPSHOT.pom (which I will attach) does not mention the hive-jdbc-0.14.0-SNAPSHOT-standalone.jar at all. The standalone jar is not part of the Hive tar bundle either. How is the end user supposed to access this standalone jar? make hive_jdbc.jar self-containing -- Key: HIVE-538 URL: https://issues.apache.org/jira/browse/HIVE-538 Project: Hive Issue Type: Improvement Components: JDBC Affects Versions: 0.3.0, 0.4.0, 0.6.0, 0.13.0 Reporter: Raghotham Murthy Assignee: Nick White Fix For: 0.14.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-538.D2553.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-538.D2553.2.patch, HIVE-538.patch Currently, most jars in hive/build/dist/lib and the hadoop-*-core.jar are required in the classpath to run jdbc applications on hive. We need to do at least the following to get rid of most unnecessary dependencies: 1. get rid of dynamic serde and use a standard serialization format, maybe tab separated, json or avro 2. don't use hadoop configuration parameters 3. repackage thrift and fb303 classes into hive_jdbc.jar -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7288) Enable support for -libjars and -archives in WebHcat for Streaming MapReduce jobs
[ https://issues.apache.org/jira/browse/HIVE-7288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056487#comment-14056487 ] Eugene Koifman commented on HIVE-7288: -- [~shanyu] I left some comments on RB. Enable support for -libjars and -archives in WebHcat for Streaming MapReduce jobs - Key: HIVE-7288 URL: https://issues.apache.org/jira/browse/HIVE-7288 Project: Hive Issue Type: New Feature Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.13.1 Environment: HDInsight deploying HDP 2.1; Also HDP 2.1 on Windows Reporter: Azim Uddin Assignee: shanyu zhao Attachments: HIVE-7288.1.patch, hive-7288.patch Issue: == Due to the lack of parameters (or support) equivalent to '-libjars' and '-archives' in the WebHcat REST API, we cannot use external Java JARs or archive files with a Streaming MapReduce job when the job is submitted via WebHcat/templeton. I am citing a few use cases here, but there can be plenty of scenarios like this- #1 (for -archives): In order to use R with a hadoop distribution like HDInsight or HDP on Windows, we could package the R directory up in a zip file and rename it to r.jar and put it into HDFS or WASB. We can then do something like this from hadoop command line (ignore the wasb syntax, same command can be run with hdfs) - hadoop jar %HADOOP_HOME%\lib\hadoop-streaming.jar -archives wasb:///example/jars/r.jar -files wasb:///example/apps/mapper.r,wasb:///example/apps/reducer.r -mapper ./r.jar/bin/Rscript.exe mapper.r -reducer ./r.jar/bin/Rscript.exe reducer.r -input /example/data/gutenberg -output /probe/r/wordcount This works from hadoop command line, but due to lack of support for the '-archives' parameter in WebHcat, we can't submit the same Streaming MR job via WebHcat. #2 (for -libjars): Consider a scenario where a user would like to use a custom InputFormat with a Streaming MapReduce job and wrote his own custom InputFormat JAR. 
From a hadoop command line we can do something like this - hadoop jar /path/to/hadoop-streaming.jar \ -libjars /path/to/custom-formats.jar \ -D map.output.key.field.separator=, \ -D mapred.text.key.partitioner.options=-k1,1 \ -input my_data/ \ -output my_output/ \ -outputformat test.example.outputformat.DateFieldMultipleOutputFormat \ -mapper my_mapper.py \ -reducer my_reducer.py. But due to lack of support for the '-libjars' parameter for streaming MapReduce jobs in WebHcat, we can't submit the above streaming MR job (that uses a custom Java JAR) via WebHcat. Impact: We think being able to submit jobs remotely is a vital feature for hadoop to be enterprise-ready, and WebHcat plays an important role there. Streaming MapReduce jobs are also very important for interoperability. So, it would be very useful to keep WebHcat on par with the hadoop command line in terms of streaming MR job submission capability. Ask: Enable parameter support for 'libjars' and 'archives' in WebHcat for Hadoop streaming jobs. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7376) add minimizeJar to jdbc/pom.xml
Eugene Koifman created HIVE-7376: Summary: add minimizeJar to jdbc/pom.xml Key: HIVE-7376 URL: https://issues.apache.org/jira/browse/HIVE-7376 Project: Hive Issue Type: Bug Reporter: Eugene Koifman adding {code}<minimizeJar>true</minimizeJar>{code} to maven-shade-plugin reduces the uber jar from 51MB to 27MB. Is there any reason not to add it? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7376) add minimizeJar to jdbc/pom.xml
[ https://issues.apache.org/jira/browse/HIVE-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7376: - Description: adding {code}<minimizeJar>true</minimizeJar>{code} to maven-shade-plugin reduces the uber jar (hive-jdbc-0.14.0-SNAPSHOT-standalone.jar) from 51MB to 27MB. Is there any reason not to add it? (was: adding {code}<minimizeJar>true</minimizeJar>{code} to maven-shade-plugin reduces the uber jar from 51MB to 27MB. Is there any reason not to add it?) add minimizeJar to jdbc/pom.xml --- Key: HIVE-7376 URL: https://issues.apache.org/jira/browse/HIVE-7376 Project: Hive Issue Type: Bug Reporter: Eugene Koifman adding {code}<minimizeJar>true</minimizeJar>{code} to maven-shade-plugin reduces the uber jar (hive-jdbc-0.14.0-SNAPSHOT-standalone.jar) from 51MB to 27MB. Is there any reason not to add it? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7376) add minimizeJar to jdbc/pom.xml
[ https://issues.apache.org/jira/browse/HIVE-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7376: - Description: adding {code}<minimizeJar>true</minimizeJar>{code} to maven-shade-plugin reduces the uber jar (hive-jdbc-0.14.0-SNAPSHOT-standalone.jar) from 51MB to 27MB. Is there any reason not to add it? https://maven.apache.org/plugins/maven-shade-plugin/shade-mojo.html#minimizeJar was:adding {code}<minimizeJar>true</minimizeJar>{code} to maven-shade-plugin reduces the uber jar (hive-jdbc-0.14.0-SNAPSHOT-standalone.jar) from 51MB to 27MB. Is there any reason not to add it? add minimizeJar to jdbc/pom.xml --- Key: HIVE-7376 URL: https://issues.apache.org/jira/browse/HIVE-7376 Project: Hive Issue Type: Bug Reporter: Eugene Koifman adding {code}<minimizeJar>true</minimizeJar>{code} to maven-shade-plugin reduces the uber jar (hive-jdbc-0.14.0-SNAPSHOT-standalone.jar) from 51MB to 27MB. Is there any reason not to add it? https://maven.apache.org/plugins/maven-shade-plugin/shade-mojo.html#minimizeJar -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7342) support hiveserver2,metastore specific config files
[ https://issues.apache.org/jira/browse/HIVE-7342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14055535#comment-14055535 ] Eugene Koifman commented on HIVE-7342: -- does this have an effect on HCatCLI, HCat and WebHCat? support hiveserver2,metastore specific config files --- Key: HIVE-7342 URL: https://issues.apache.org/jira/browse/HIVE-7342 Project: Hive Issue Type: Bug Components: Configuration, HiveServer2, Metastore Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-7342.1.patch There is currently a single configuration file for all components in hive. i.e., components such as hive cli, hiveserver2 and metastore all read from the same hive-site.xml. It will be useful to have a server specific hive-site.xml, so that you can have some different configuration value set for a server. For example, you might want to enable authorization checks for hiveserver2, while disabling the checks for hive cli. The workaround today is to add any component specific configuration as a command-line (-hiveconf) argument. Using server specific config files (eg hiveserver2-site.xml, metastore-site.xml) that override the entries in hive-site.xml will make the configuration much easier to manage. -- This message was sent by Atlassian JIRA (v6.2#6252)
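The proposed override semantics can be illustrated with a toy sketch (the class and method names are hypothetical; real Hive configuration goes through HiveConf/Hadoop Configuration, not this class): the server-specific file is consulted first, falling back to the shared file.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the proposal: hiveserver2-site.xml entries override hive-site.xml.
class LayeredConf {
    private final Map<String, String> shared = new HashMap<>(); // hive-site.xml
    private final Map<String, String> server = new HashMap<>(); // hiveserver2-site.xml

    void putShared(String key, String value) { shared.put(key, value); }
    void putServer(String key, String value) { server.put(key, value); }

    // Server-specific value wins; otherwise fall back to the shared one.
    String get(String key) {
        String v = server.get(key);
        return v != null ? v : shared.get(key);
    }
}
```

This matches the authorization example above: the setting is enabled in the hiveserver2-specific layer without touching what the CLI sees in hive-site.xml.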
[jira] [Commented] (HIVE-5510) [WebHCat] GET job/queue return wrong job information
[ https://issues.apache.org/jira/browse/HIVE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053877#comment-14053877 ] Eugene Koifman commented on HIVE-5510: -- [~leftylev] the 1st example (under JSON Output (fields)) seems to be of the behavior before the bug fix - isn't it likely to confuse users? Should the example be of 'correct' output? [~daijy] Does that make sense to you? [WebHCat] GET job/queue return wrong job information Key: HIVE-5510 URL: https://issues.apache.org/jira/browse/HIVE-5510 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.13.0 Attachments: HIVE-5510-1.patch, HIVE-5510-2.patch, HIVE-5510-3.patch, HIVE-5510-4.patch, test_harnesss_1381798977 GET job/queue of a TempletonController job returns weird information. It is a mix of the child job and itself. It should only pull the information of the controller job itself. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7282) HCatLoader fail to load Orc map with null key
[ https://issues.apache.org/jira/browse/HIVE-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050878#comment-14050878 ] Eugene Koifman commented on HIVE-7282: -- I agree that a null key in a map is a bad idea. Since we still have to deal with data which has already been written with null keys, could we add some table property that lets the user say: if the data contains a map with a null key, replace null with 'my_value' on read. (Perhaps the same property can be used to change a null key to 'my_value' on write to support existing writers, but this of course won't work for all cases.) This way null keys can be disallowed. HCatLoader fail to load Orc map with null key - Key: HIVE-7282 URL: https://issues.apache.org/jira/browse/HIVE-7282 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.14.0 Attachments: HIVE-7282-1.patch, HIVE-7282-2.patch Here is the stack: Get exception: AttemptID:attempt_1403634189382_0011_m_00_0 Info:Error: org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error converting read value to tuple at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76) at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533) at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80) at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167) at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162) Caused by: java.lang.NullPointerException at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToPigMap(PigHCatUtil.java:469) at org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:404) at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:456) at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:374) at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64) ... 13 more -- This message was sent by Atlassian JIRA (v6.2#6252)
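The workaround suggested in the comment (substitute a configured value for a null map key on read) could look roughly like this; the class, method, and substitute value are hypothetical, not actual HCatalog API:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: rewrite a map read from storage so a null key is replaced by a
// user-configured substitute (e.g. a table-property value), avoiding the NPE
// seen in transformToPigMap when keys are assumed non-null.
class NullKeyReplacer {
    static <V> Map<String, V> replaceNullKey(Map<String, V> in, String substitute) {
        Map<String, V> out = new HashMap<>();
        for (Map.Entry<String, V> e : in.entrySet()) {
            String key = (e.getKey() == null) ? substitute : e.getKey();
            out.put(key, e.getValue());
        }
        return out;
    }
}
```

Note this silently merges entries if the substitute collides with a real key, which is part of why the on-write variant "won't work for all cases".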
[jira] [Commented] (HIVE-7249) HiveTxnManager.closeTxnManger() throws if called after commitTxn()
[ https://issues.apache.org/jira/browse/HIVE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048358#comment-14048358 ] Eugene Koifman commented on HIVE-7249: -- BTW, I did test with this patch. I don't see the issue any more. HiveTxnManager.closeTxnManger() throws if called after commitTxn() -- Key: HIVE-7249 URL: https://issues.apache.org/jira/browse/HIVE-7249 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Alan Gates Attachments: HIVE-7249.patch I openTxn() and acquireLocks() for a query that looks like INSERT INTO T PARTITION(p) SELECT * FROM T. Then I call commitTxn(). Then I call closeTxnManger(). I get an exception saying lock not found (the only lock in this txn). So it seems TxnMgr doesn't know that commit released the locks. Here is the stack trace and some log output which may be useful: {noformat} 2014-06-17 15:54:40,771 DEBUG mapreduce.TransactionContext (TransactionContext.java:onCommitJob(128)) - onCommitJob(job_local557130041_0001). 
this=46719652 2014-06-17 15:54:40,771 DEBUG lockmgr.DbTxnManager (DbTxnManager.java:commitTxn(205)) - Committing txn 1 2014-06-17 15:54:40,771 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) - Going to execute query values current_timestamp 2014-06-17 15:54:40,772 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatTxn(1423)) - Going to execute query select txn_state from TXNS where txn_id = 1 for update 2014-06-17 15:54:40,773 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatTxn(1438)) - Going to execute update update TXNS set txn_last_heartbeat = 1403045680772 where txn_id = 1 2014-06-17 15:54:40,778 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatTxn(1440)) - Going to commit 2014-06-17 15:54:40,779 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(344)) - Going to execute insert insert into COMPLETED_TXN_COMPONENTS select tc_txnid, tc_database, tc_table, tc_partition from TXN_COMPONENTS where tc_txnid = 1 2014-06-17 15:54:40,784 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(352)) - Going to execute update delete from TXN_COMPONENTS where tc_txnid = 1 2014-06-17 15:54:40,788 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(356)) - Going to execute update delete from HIVE_LOCKS where hl_txnid = 1 2014-06-17 15:54:40,791 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(359)) - Going to execute update delete from TXNS where txn_id = 1 2014-06-17 15:54:40,794 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(361)) - Going to commit 2014-06-17 15:54:40,795 WARN mapreduce.TransactionContext (TransactionContext.java:cleanup(317)) - cleanupJob(JobID=job_local557130041_0001)this=46719652 2014-06-17 15:54:40,795 DEBUG lockmgr.DbLockManager (DbLockManager.java:unlock(109)) - Unlocking id:1 2014-06-17 15:54:40,796 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) - Going to execute query values current_timestamp 2014-06-17 15:54:40,796 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatLock(1402)) - Going to execute update update HIVE_LOCKS set hl_last_heartbeat = 1403045680796 where hl_lock_ext_id = 1 2014-06-17 15:54:40,800 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatLock(1405)) - Going to rollback 2014-06-17 15:54:40,804 ERROR metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(143)) - NoSuchLockException(message:No such lock: 1) at org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeatLock(TxnHandler.java:1407) at org.apache.hadoop.hive.metastore.txn.TxnHandler.unlock(TxnHandler.java:477) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.unlock(HiveMetaStore.java:4817) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105) at com.sun.proxy.$Proxy14.unlock(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.unlock(HiveMetaStoreClient.java:1598) at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.unlock(DbLockManager.java:110) at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.close(DbLockManager.java:162) at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.destruct(DbTxnManager.java:300) at org.apache.hadoop.hive.ql.lockmgr.HiveTxnManagerImpl.closeTxnManager(HiveTxnManagerImpl.java:39) at
[jira] [Updated] (HIVE-7256) HiveTxnManager should be stateless
[ https://issues.apache.org/jira/browse/HIVE-7256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7256: - Attachment: HIVE-7256.addendum.patch HIVE-7256.addendum.patch is in addition to HIVE-7256.patch. Contains fix to TxnManagerFactory so that both types of managers can coexist. Contains a fix to HcatDbTxnManager.reconstructTxnInfo() to make it work for read-only transactions (as designed in HIVE-7256.patch). Adds some more detailed logging. HiveTxnManager should be stateless -- Key: HIVE-7256 URL: https://issues.apache.org/jira/browse/HIVE-7256 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Alan Gates Attachments: HIVE-7256.addendum.patch, HIVE-7256.patch In order to integrate HCat with Hive ACID, we should be able to create an instance of HiveTxnManager and use it to acquire locks, and release locks from a different instance of HiveTxnManager. One use case where this shows up is when a job using HCat is retried, since calls to TxnManager are made from the jobs OutputCommitter. Another, is HCatReader/Writer. For example, TestReaderWriter, calls setupJob() from one instance of OutputCommitterContainer and commitJob() from another instance. The 2nd case is perhaps better solved by ensuring there is only 1 instance of OutputCommitterContainer. -- This message was sent by Atlassian JIRA (v6.2#6252)
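The statelessness requirement described above can be sketched with a toy model (illustrative Python only, not Hive's actual API): if all lock state lives in a shared store, as the metastore database does for DbTxnManager, then any manager instance can release locks that a different instance acquired — exactly the retried-job and HCatReader/Writer scenarios.

```python
# Toy model of the HIVE-7256 requirement: lock state lives in a shared
# store (a dict standing in for the metastore DB), so a manager instance
# holds no state of its own and any instance can release any lock.
class LockStore:
    def __init__(self):
        self.locks = {}      # lock_id -> locked resource
        self.next_id = 1

class TxnManager:
    """Stateless: only a handle to the shared store, no per-instance state."""
    def __init__(self, store):
        self.store = store

    def acquire(self, resource):
        lock_id = self.store.next_id
        self.store.next_id += 1
        self.store.locks[lock_id] = resource
        return lock_id

    def release(self, lock_id):
        if lock_id not in self.store.locks:
            raise KeyError("No such lock: %d" % lock_id)
        del self.store.locks[lock_id]

store = LockStore()
m1 = TxnManager(store)
m2 = TxnManager(store)   # a different instance, e.g. after a job retry
lid = m1.acquire("db.table")
m2.release(lid)          # succeeds: no state was trapped inside m1
```

The point of the sketch is only the shape of the fix: once the manager keeps nothing per-instance, setupJob() and commitJob() running in two different OutputCommitterContainer instances stop being a problem.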
[jira] [Updated] (HIVE-6207) Integrate HCatalog with locking
[ https://issues.apache.org/jira/browse/HIVE-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6207: - Attachment: (was: HIVE-6207.4.patch) Integrate HCatalog with locking --- Key: HIVE-6207 URL: https://issues.apache.org/jira/browse/HIVE-6207 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 0.13.0 Reporter: Alan Gates Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: ACIDHCatalogDesign.pdf HCatalog currently ignores any locks created by Hive users. It should respect the locks Hive creates as well as create locks itself when locking is configured. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6207) Integrate HCatalog with locking
[ https://issues.apache.org/jira/browse/HIVE-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6207: - Attachment: HIVE-6207.patch HIVE-6207.patch - preliminary patch which includes https://issues.apache.org/jira/secure/attachment/12651359/HIVE-7249.patch https://issues.apache.org/jira/secure/attachment/12652273/HIVE-7256.patch https://issues.apache.org/jira/secure/attachment/12653274/HIVE-7256.addendum.patch Integrate HCatalog with locking --- Key: HIVE-6207 URL: https://issues.apache.org/jira/browse/HIVE-6207 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 0.13.0 Reporter: Alan Gates Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: ACIDHCatalogDesign.pdf, HIVE-6207.patch HCatalog currently ignores any locks created by Hive users. It should respect the locks Hive creates as well as create locks itself when locking is configured. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7282) HCatLoader fail to load Orc map with null key
[ https://issues.apache.org/jira/browse/HIVE-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046250#comment-14046250 ] Eugene Koifman commented on HIVE-7282: -- Would it not make more sense to add the new test to TestHCatLoaderComplexSchema, so that it's run with both ORC and RCFile? HCatLoader fail to load Orc map with null key - Key: HIVE-7282 URL: https://issues.apache.org/jira/browse/HIVE-7282 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.14.0 Attachments: HIVE-7282-1.patch, HIVE-7282-2.patch Here is the stack: Get exception: AttemptID:attempt_1403634189382_0011_m_00_0 Info:Error: org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error converting read value to tuple at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76) at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533) at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80) at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162) Caused by: java.lang.NullPointerException at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToPigMap(PigHCatUtil.java:469) at 
org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:404) at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:456) at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:374) at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64) ... 13 more -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7288) Enable support for -libjars and -archives in WebHcat for Streaming MapReduce jobs
[ https://issues.apache.org/jira/browse/HIVE-7288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046261#comment-14046261 ] Eugene Koifman commented on HIVE-7288: -- [~shanyu] Please add tests for this feature. Enable support for -libjars and -archives in WebHcat for Streaming MapReduce jobs - Key: HIVE-7288 URL: https://issues.apache.org/jira/browse/HIVE-7288 Project: Hive Issue Type: New Feature Components: WebHCat Affects Versions: 0.11.0, 0.12.0, 0.13.0, 0.13.1 Environment: HDInsight deploying HDP 2.1; Also HDP 2.1 on Windows Reporter: Azim Uddin Assignee: shanyu zhao Attachments: hive-7288.patch Issue: == Due to lack of parameters (or support for) equivalent of '-libjars' and '-archives' in WebHcat REST API, we cannot use an external Java Jars or Archive files with a Streaming MapReduce job, when the job is submitted via WebHcat/templeton. I am citing a few use cases here, but there can be plenty of scenarios like this- #1 (for -archives):In order to use R with a hadoop distribution like HDInsight or HDP on Windows, we could package the R directory up in a zip file and rename it to r.jar and put it into HDFS or WASB. We can then do something like this from hadoop command line (ignore the wasb syntax, same command can be run with hdfs) - hadoop jar %HADOOP_HOME%\lib\hadoop-streaming.jar -archives wasb:///example/jars/r.jar -files wasb:///example/apps/mapper.r,wasb:///example/apps/reducer.r -mapper ./r.jar/bin/Rscript.exe mapper.r -reducer ./r.jar/bin/Rscript.exe reducer.r -input /example/data/gutenberg -output /probe/r/wordcount This works from hadoop command line, but due to lack of support for '-archives' parameter in WebHcat, we can't submit the same Streaming MR job via WebHcat. #2 (for -libjars): Consider a scenario where a user would like to use a custom inputFormat with a Streaming MapReduce job and wrote his own custom InputFormat JAR. 
From a hadoop command line we can do something like this - hadoop jar /path/to/hadoop-streaming.jar \ -libjars /path/to/custom-formats.jar \ -D map.output.key.field.separator=, \ -D mapred.text.key.partitioner.options=-k1,1 \ -input my_data/ \ -output my_output/ \ -outputformat test.example.outputformat.DateFieldMultipleOutputFormat \ -mapper my_mapper.py \ -reducer my_reducer.py \ But due to lack of support for '-libjars' parameter for streaming MapReduce job in WebHcat, we can't submit the above streaming MR job (that uses a custom Java JAR) via WebHcat. Impact: We think, being able to submit jobs remotely is a vital feature for hadoop to be enterprise-ready and WebHcat plays an important role there. Streaming MapReduce job is also very important for interoperability. So, it would be very useful to keep WebHcat on par with hadoop command line in terms of streaming MR job submission capability. Ask: Enable parameter support for 'libjars' and 'archives' in WebHcat for Hadoop streaming jobs in WebHcat. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7282) HCatLoader fail to load Orc map with null key
[ https://issues.apache.org/jira/browse/HIVE-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14046273#comment-14046273 ] Eugene Koifman commented on HIVE-7282: -- Also, can HIVE-5020 now be closed as duplicate? HCatLoader fail to load Orc map with null key - Key: HIVE-7282 URL: https://issues.apache.org/jira/browse/HIVE-7282 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.14.0 Attachments: HIVE-7282-1.patch, HIVE-7282-2.patch Here is the stack: Get exception: AttemptID:attempt_1403634189382_0011_m_00_0 Info:Error: org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error converting read value to tuple at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76) at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533) at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80) at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162) Caused by: java.lang.NullPointerException at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToPigMap(PigHCatUtil.java:469) at org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:404) at 
org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:456) at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:374) at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64) ... 13 more -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7249) HiveTxnManager.closeTxnManger() throws if called after commitTxn()
[ https://issues.apache.org/jira/browse/HIVE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14041158#comment-14041158 ] Eugene Koifman commented on HIVE-7249: -- yes, i did turn on DbTxnManager, but since we are creating a HCat specific API, let me retest it once that is ready HiveTxnManager.closeTxnManger() throws if called after commitTxn() -- Key: HIVE-7249 URL: https://issues.apache.org/jira/browse/HIVE-7249 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Alan Gates Attachments: HIVE-7249.patch I openTxn() and acquireLocks() for a query that looks like INSERT INTO T PARTITION(p) SELECT * FROM T. Then I call commitTxn(). Then I call closeTxnManger() I get an exception saying lock not found (the only lock in this txn). So it seems TxnMgr doesn't know that commit released the locks. Here is the stack trace and some log output which maybe useful: {noformat} 2014-06-17 15:54:40,771 DEBUG mapreduce.TransactionContext (TransactionContext.java:onCommitJob(128)) - onCommitJob(job_local557130041_0001). 
this=46719652 2014-06-17 15:54:40,771 DEBUG lockmgr.DbTxnManager (DbTxnManager.java:commitTxn(205)) - Committing txn 1 2014-06-17 15:54:40,771 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) - Going to execute query values current_timestamp 2014-06-17 15:54:40,772 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatTxn(1423)) - Going to execute query select txn_state from TXNS where txn_id = 1 for\ update 2014-06-17 15:54:40,773 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatTxn(1438)) - Going to execute update update TXNS set txn_last_heartbeat = 140304568\ 0772 where txn_id = 1 2014-06-17 15:54:40,778 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatTxn(1440)) - Going to commit 2014-06-17 15:54:40,779 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(344)) - Going to execute insert insert into COMPLETED_TXN_COMPONENTS select tc_txn\ id, tc_database, tc_table, tc_partition from TXN_COMPONENTS where tc_txnid = 1 2014-06-17 15:54:40,784 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(352)) - Going to execute update delete from TXN_COMPONENTS where tc_txnid = 1 2014-06-17 15:54:40,788 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(356)) - Going to execute update delete from HIVE_LOCKS where hl_txnid = 1 2014-06-17 15:54:40,791 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(359)) - Going to execute update delete from TXNS where txn_id = 1 2014-06-17 15:54:40,794 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(361)) - Going to commit 2014-06-17 15:54:40,795 WARN mapreduce.TransactionContext (TransactionContext.java:cleanup(317)) - cleanupJob(JobID=job_local557130041_0001)this=46719652 2014-06-17 15:54:40,795 DEBUG lockmgr.DbLockManager (DbLockManager.java:unlock(109)) - Unlocking id:1 2014-06-17 15:54:40,796 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) - Going to execute query values current_timestamp 2014-06-17 15:54:40,796 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatLock(1402)) - Going to execute update update HIVE_LOCKS set hl_last_heartbeat = 140\ 
3045680796 where hl_lock_ext_id = 1 2014-06-17 15:54:40,800 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatLock(1405)) - Going to rollback 2014-06-17 15:54:40,804 ERROR metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(143)) - NoSuchLockException(message:No such lock: 1) at org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeatLock(TxnHandler.java:1407) at org.apache.hadoop.hive.metastore.txn.TxnHandler.unlock(TxnHandler.java:477) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.unlock(HiveMetaStore.java:4817) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105) at com.sun.proxy.$Proxy14.unlock(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.unlock(HiveMetaStoreClient.java:1598) at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.unlock(DbLockManager.java:110) at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.close(DbLockManager.java:162) at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.destruct(DbTxnManager.java:300) at org.apache.hadoop.hive.ql.lockmgr.HiveTxnManagerImpl.closeTxnManager(HiveTxnManagerImpl.java:39) at
[jira] [Commented] (HIVE-7090) Support session-level temporary tables in Hive
[ https://issues.apache.org/jira/browse/HIVE-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14041250#comment-14041250 ] Eugene Koifman commented on HIVE-7090: -- If the client fails, how does the temp table get cleaned up? Support session-level temporary tables in Hive -- Key: HIVE-7090 URL: https://issues.apache.org/jira/browse/HIVE-7090 Project: Hive Issue Type: Bug Components: SQL Reporter: Gunther Hagleitner Assignee: Harish Butani Attachments: HIVE-7090.1.patch, HIVE-7090.2.patch It's common to see sql scripts that create some temporary table as an intermediate result, run some additional queries against it and then clean up at the end. We should support temporary tables properly, meaning automatically manage the life cycle and make sure the visibility is restricted to the creating connection/session. Without these it's common to see left over tables in meta-store or weird errors with clashing tmp table names. Proposed syntax: CREATE TEMPORARY TABLE CTAS, CTL, INSERT INTO, should all be supported as usual. Knowing that a user wants a temp table can enable us to further optimize access to it. E.g.: temp tables should be kept in memory where possible, compactions and merging table files aren't required, ... -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6207) Integrate HCatalog with locking
[ https://issues.apache.org/jira/browse/HIVE-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6207: - Attachment: ACIDHCatalogDesign.pdf Integrate HCatalog with locking --- Key: HIVE-6207 URL: https://issues.apache.org/jira/browse/HIVE-6207 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 0.13.0 Reporter: Alan Gates Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: ACIDHCatalogDesign.pdf, HIVE-6207.4.patch HCatalog currently ignores any locks created by Hive users. It should respect the locks Hive creates as well as create locks itself when locking is configured. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7090) Support session-level temporary tables in Hive
[ https://issues.apache.org/jira/browse/HIVE-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14041378#comment-14041378 ] Eugene Koifman commented on HIVE-7090: -- In that case it may make sense to generate unique names for artifacts that may be left over. The initial description in this ticket mentions 3rd party tools that will use this feature - I imagine they will generate the same Temp table name each time which may cause weird failures after crash. Support session-level temporary tables in Hive -- Key: HIVE-7090 URL: https://issues.apache.org/jira/browse/HIVE-7090 Project: Hive Issue Type: Bug Components: SQL Reporter: Gunther Hagleitner Assignee: Harish Butani Attachments: HIVE-7090.1.patch, HIVE-7090.2.patch It's common to see sql scripts that create some temporary table as an intermediate result, run some additional queries against it and then clean up at the end. We should support temporary tables properly, meaning automatically manage the life cycle and make sure the visibility is restricted to the creating connection/session. Without these it's common to see left over tables in meta-store or weird errors with clashing tmp table names. Proposed syntax: CREATE TEMPORARY TABLE CTAS, CTL, INSERT INTO, should all be supported as usual. Knowing that a user wants a temp table can enable us to further optimize access to it. E.g.: temp tables should be kept in memory where possible, compactions and merging table files aren't required, ... -- This message was sent by Atlassian JIRA (v6.2#6252)
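The unique-name suggestion in the comment above can be sketched as follows (a hypothetical naming scheme, not Hive's implementation): deriving temp-table names from a session id means leftovers from a crashed client can never clash with the same script run again in a new session.

```python
import uuid

def temp_table_name(base, session_id):
    # Scope the name to the creating session so artifacts left over
    # after a client crash cannot collide with a fresh run of the
    # same script, which would otherwise reuse the same table name.
    return "%s_tmp_%s" % (base, session_id)

s1 = uuid.uuid4().hex   # session that crashed and left a table behind
s2 = uuid.uuid4().hex   # new session retrying the same script
assert temp_table_name("sales", s1) != temp_table_name("sales", s2)
```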
[jira] [Updated] (HIVE-7256) HiveTxnManager should be stateless
[ https://issues.apache.org/jira/browse/HIVE-7256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7256: - Assignee: Alan Gates (was: Eugene Koifman) HiveTxnManager should be stateless -- Key: HIVE-7256 URL: https://issues.apache.org/jira/browse/HIVE-7256 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Alan Gates In order to integrate HCat with Hive ACID, we should be able to create an instance of HiveTxnManager and use it to acquire locks, and release locks from a different instance of HiveTxnManager. One use case where this shows up is when a job using HCat is retried, since calls to TxnManager are made from the jobs OutputCommitter. Another, is HCatReader/Writer. For example, TestReaderWriter, calls setupJob() from one instance of OutputCommitterContainer and commitJob() from another instance. The 2nd case is perhaps better solved by ensuring there is only 1 instance of OutputCommitterContainer. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7249) HiveTxnManager.closeTxnManger() throws if called after commitTxn()
[ https://issues.apache.org/jira/browse/HIVE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038092#comment-14038092 ] Eugene Koifman commented on HIVE-7249: -- [~alangates] org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned gets wedged with this patch HiveTxnManager.closeTxnManger() throws if called after commitTxn() -- Key: HIVE-7249 URL: https://issues.apache.org/jira/browse/HIVE-7249 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Alan Gates Attachments: HIVE-7249.patch I openTxn() and acquireLocks() for a query that looks like INSERT INTO T PARTITION(p) SELECT * FROM T. Then I call commitTxn(). Then I call closeTxnManger() I get an exception saying lock not found (the only lock in this txn). So it seems TxnMgr doesn't know that commit released the locks. Here is the stack trace and some log output which maybe useful: {noformat} 2014-06-17 15:54:40,771 DEBUG mapreduce.TransactionContext (TransactionContext.java:onCommitJob(128)) - onCommitJob(job_local557130041_0001). 
this=46719652 2014-06-17 15:54:40,771 DEBUG lockmgr.DbTxnManager (DbTxnManager.java:commitTxn(205)) - Committing txn 1 2014-06-17 15:54:40,771 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) - Going to execute query values current_timestamp 2014-06-17 15:54:40,772 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatTxn(1423)) - Going to execute query select txn_state from TXNS where txn_id = 1 for\ update 2014-06-17 15:54:40,773 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatTxn(1438)) - Going to execute update update TXNS set txn_last_heartbeat = 140304568\ 0772 where txn_id = 1 2014-06-17 15:54:40,778 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatTxn(1440)) - Going to commit 2014-06-17 15:54:40,779 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(344)) - Going to execute insert insert into COMPLETED_TXN_COMPONENTS select tc_txn\ id, tc_database, tc_table, tc_partition from TXN_COMPONENTS where tc_txnid = 1 2014-06-17 15:54:40,784 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(352)) - Going to execute update delete from TXN_COMPONENTS where tc_txnid = 1 2014-06-17 15:54:40,788 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(356)) - Going to execute update delete from HIVE_LOCKS where hl_txnid = 1 2014-06-17 15:54:40,791 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(359)) - Going to execute update delete from TXNS where txn_id = 1 2014-06-17 15:54:40,794 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(361)) - Going to commit 2014-06-17 15:54:40,795 WARN mapreduce.TransactionContext (TransactionContext.java:cleanup(317)) - cleanupJob(JobID=job_local557130041_0001)this=46719652 2014-06-17 15:54:40,795 DEBUG lockmgr.DbLockManager (DbLockManager.java:unlock(109)) - Unlocking id:1 2014-06-17 15:54:40,796 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) - Going to execute query values current_timestamp 2014-06-17 15:54:40,796 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatLock(1402)) - Going to execute update update HIVE_LOCKS set hl_last_heartbeat = 140\ 
3045680796 where hl_lock_ext_id = 1 2014-06-17 15:54:40,800 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatLock(1405)) - Going to rollback 2014-06-17 15:54:40,804 ERROR metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(143)) - NoSuchLockException(message:No such lock: 1) at org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeatLock(TxnHandler.java:1407) at org.apache.hadoop.hive.metastore.txn.TxnHandler.unlock(TxnHandler.java:477) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.unlock(HiveMetaStore.java:4817) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105) at com.sun.proxy.$Proxy14.unlock(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.unlock(HiveMetaStoreClient.java:1598) at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.unlock(DbLockManager.java:110) at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.close(DbLockManager.java:162) at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.destruct(DbTxnManager.java:300) at org.apache.hadoop.hive.ql.lockmgr.HiveTxnManagerImpl.closeTxnManager(HiveTxnManagerImpl.java:39) at
[jira] [Commented] (HIVE-7249) HiveTxnManager.closeTxnManger() throws if called after commitTxn()
[ https://issues.apache.org/jira/browse/HIVE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038119#comment-14038119 ] Eugene Koifman commented on HIVE-7249: -- Here is the thread dump though there doesn't appear to be anything interesting in it {noformat} Picked up JAVA_TOOL_OPTIONS: -Djava.awt.headless=true -Dapple.awt.UIElement=true 57554 87066 /Users/ekoifman/dev/hive/hcatalog/core/target/surefire/surefirebooter3727332902234772866.jar 87243 sun.tools.jps.Jps 87056 org.codehaus.plexus.classworlds.launcher.Launcher ekoifman:hcatalog ekoifman$ jstack 87066 Picked up JAVA_TOOL_OPTIONS: -Djava.awt.headless=true -Dapple.awt.UIElement=true 2014-06-19 16:38:27 Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.51-b01-457 mixed mode): Attach Listener daemon prio=9 tid=7ffded8c7800 nid=0x10c84 waiting on condition [] java.lang.Thread.State: RUNNABLE BoneCP-pool-watch-thread daemon prio=5 tid=7ffde9e89000 nid=0x10defb000 waiting on condition [10defa000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for 7b8e93d10 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:322) at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:75) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) at java.lang.Thread.run(Thread.java:680) BoneCP-keep-alive-scheduler daemon prio=5 tid=7ffde9e88000 nid=0x10ddf8000 waiting on condition [10ddf7000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for 7b8fde4d8 (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025) at java.util.concurrent.DelayQueue.take(DelayQueue.java:164) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:609) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:602) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:957) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:917) at java.lang.Thread.run(Thread.java:680) com.google.common.base.internal.Finalizer daemon prio=5 tid=7ffde9e9a000 nid=0x10dcf5000 in Object.wait() [10dcf4000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 7b906a3a8 (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118) - locked 7b906a3a8 (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134) at com.google.common.base.internal.Finalizer.run(Finalizer.java:127) BoneCP-pool-watch-thread daemon prio=5 tid=7ffde91c6800 nid=0x10d068000 waiting on condition [10d067000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for 7b870b118 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:322) at com.jolbox.bonecp.PoolWatchThread.run(PoolWatchThread.java:75) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) at java.lang.Thread.run(Thread.java:680) BoneCP-keep-alive-scheduler daemon prio=5 tid=7ffdec031800 nid=0x10cf65000 waiting on condition [10cf64000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for 7b86fd7c0 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196) at
[jira] [Updated] (HIVE-6207) Integrate HCatalog with locking
[ https://issues.apache.org/jira/browse/HIVE-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6207: - Attachment: HIVE-6207.4.patch preliminary patch Integrate HCatalog with locking --- Key: HIVE-6207 URL: https://issues.apache.org/jira/browse/HIVE-6207 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 0.13.0 Reporter: Alan Gates Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-6207.4.patch HCatalog currently ignores any locks created by Hive users. It should respect the locks Hive creates as well as create locks itself when locking is configured. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7256) HiveTxnManager should be stateless
Eugene Koifman created HIVE-7256: Summary: HiveTxnManager should be stateless Key: HIVE-7256 URL: https://issues.apache.org/jira/browse/HIVE-7256 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman In order to integrate HCat with Hive ACID, we should be able to create an instance of HiveTxnManager and use it to acquire locks, and release locks from a different instance of HiveTxnManager. One use case where this shows up is when a job using HCat is retried, since calls to TxnManager are made from the jobs OutputCommitter. Another, is HCatReader/Writer. For example, TestReaderWriter, calls setupJob() from one instance of OutputCommitterContainer and commitJob() from another instance. The 2nd case is perhaps better solved by ensuring there is only 1 instance of OutputCommitterContainer. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7159) For inner joins push a 'is not null predicate' to the join sources for every non nullSafe join condition
[ https://issues.apache.org/jira/browse/HIVE-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14034620#comment-14034620 ] Eugene Koifman commented on HIVE-7159: -- FWIW, you can do the same with outer joins on inner side R left outer join S on R.r=S.s is the same as R LOJ (select * from S where s is not null) as S on R.r=S.s and symmetrically for ROJ. For inner joins push a 'is not null predicate' to the join sources for every non nullSafe join condition Key: HIVE-7159 URL: https://issues.apache.org/jira/browse/HIVE-7159 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-7159.1.patch, HIVE-7159.2.patch, HIVE-7159.3.patch, HIVE-7159.4.patch, HIVE-7159.5.patch, HIVE-7159.6.patch, HIVE-7159.7.patch, HIVE-7159.8.patch A join B on A.x = B.y can be transformed to (A where x is not null) join (B where y is not null) on A.x = B.y Apart from avoiding shuffling null keyed rows it also avoids issues with reduce-side skew when there are a lot of null values in the data. Thanks to [~gopalv] for the analysis and coming up with the solution. -- This message was sent by Atlassian JIRA (v6.2#6252)
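Both the HIVE-7159 rewrite and the outer-join variant in the comment above rest on one fact: SQL equality never matches NULL, so NULL-keyed rows contribute nothing to a non-null-safe join. A small Python model of inner-join semantics (illustrative only, not Hive code) shows that filtering NULL keys below the join leaves the result unchanged:

```python
def inner_join(A, B):
    # SQL '=' never matches NULL (modeled here as None), so
    # NULL-keyed rows contribute nothing to the join result.
    return [(a, b) for a in A for b in B
            if a is not None and b is not None and a == b]

def not_null(rows):
    # The 'is not null' predicate pushed to a join source.
    return [r for r in rows if r is not None]

A = [1, 2, 2, None, 3]
B = [2, 3, None, None]

# Pushing 'is not null' below the join changes nothing in the result,
# but the NULL-keyed rows are never shuffled, avoiding reduce-side skew.
assert inner_join(A, B) == inner_join(not_null(A), not_null(B))
```

The same argument gives the outer-join form from the comment: for a left outer join, only the inner (right) side can be pre-filtered, since NULL-keyed rows on the preserved side must still appear in the output.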
[jira] [Created] (HIVE-7249) HiveTxnManager.closeTxnManager() throws if called after commitTxn()
Eugene Koifman created HIVE-7249: Summary: HiveTxnManager.closeTxnManger() throws if called after commitTxn() Key: HIVE-7249 URL: https://issues.apache.org/jira/browse/HIVE-7249 Project: Hive Issue Type: Bug Components: Locking Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Alan Gates I openTxn() and acquireLocks() for a query that looks like INSERT INTO T PARTITION(p) SELECT * FROM T. Then I call commitTxn(). Then I call closeTxnManger() I get an exception saying lock not found (the only lock in this txn). So it seems TxnMgr doesn't know that commit released the locks. Here is the stack trace and some log output which maybe useful: 2014-06-17 15:54:40,771 DEBUG mapreduce.TransactionContext (TransactionContext.java:onCommitJob(128)) - onCommitJob(job_local557130041_0001). this=46719652 2014-06-17 15:54:40,771 DEBUG lockmgr.DbTxnManager (DbTxnManager.java:commitTxn(205)) - Committing txn 1 2014-06-17 15:54:40,771 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) - Going to execute query values current_timestamp 2014-06-17 15:54:40,772 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatTxn(1423)) - Going to execute query select txn_state from TXNS where txn_id = 1 for\ update 2014-06-17 15:54:40,773 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatTxn(1438)) - Going to execute update update TXNS set txn_last_heartbeat = 140304568\ 0772 where txn_id = 1 2014-06-17 15:54:40,778 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatTxn(1440)) - Going to commit 2014-06-17 15:54:40,779 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(344)) - Going to execute insert insert into COMPLETED_TXN_COMPONENTS select tc_txn\ id, tc_database, tc_table, tc_partition from TXN_COMPONENTS where tc_txnid = 1 2014-06-17 15:54:40,784 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(352)) - Going to execute update delete from TXN_COMPONENTS where tc_txnid = 1 2014-06-17 15:54:40,788 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(356)) - Going to execute update delete from HIVE_LOCKS 
where hl_txnid = 1 2014-06-17 15:54:40,791 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(359)) - Going to execute update delete from TXNS where txn_id = 1 2014-06-17 15:54:40,794 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(361)) - Going to commit 2014-06-17 15:54:40,795 WARN mapreduce.TransactionContext (TransactionContext.java:cleanup(317)) - cleanupJob(JobID=job_local557130041_0001)this=46719652 2014-06-17 15:54:40,795 DEBUG lockmgr.DbLockManager (DbLockManager.java:unlock(109)) - Unlocking id:1 2014-06-17 15:54:40,796 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) - Going to execute query values current_timestamp 2014-06-17 15:54:40,796 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatLock(1402)) - Going to execute update update HIVE_LOCKS set hl_last_heartbeat = 140\ 3045680796 where hl_lock_ext_id = 1 2014-06-17 15:54:40,800 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatLock(1405)) - Going to rollback 2014-06-17 15:54:40,804 ERROR metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(143)) - NoSuchLockException(message:No such lock: 1) at org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeatLock(TxnHandler.java:1407) at org.apache.hadoop.hive.metastore.txn.TxnHandler.unlock(TxnHandler.java:477) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.unlock(HiveMetaStore.java:4817) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105) at com.sun.proxy.$Proxy14.unlock(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.unlock(HiveMetaStoreClient.java:1598) at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.unlock(DbLockManager.java:110) at 
org.apache.hadoop.hive.ql.lockmgr.DbLockManager.close(DbLockManager.java:162) at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.destruct(DbTxnManager.java:300) at org.apache.hadoop.hive.ql.lockmgr.HiveTxnManagerImpl.closeTxnManager(HiveTxnManagerImpl.java:39) at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.closeTxnManager(DbTxnManager.java:43) at org.apache.hive.hcatalog.mapreduce.TransactionContext.cleanup(TransactionContext.java:327) at org.apache.hive.hcatalog.mapreduce.TransactionContext.onCommitJob(TransactionContext.java:142) at org.apache.hive.hcatalog.mapreduce.OutputCommitterContainer.commitJob(OutputCommitterContainer.java:61) at
[jira] [Updated] (HIVE-7249) HiveTxnManager.closeTxnManager() throws if called after commitTxn()
[ https://issues.apache.org/jira/browse/HIVE-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7249: - Description: I openTxn() and acquireLocks() for a query that looks like INSERT INTO T PARTITION(p) SELECT * FROM T. Then I call commitTxn(). Then I call closeTxnManger() I get an exception saying lock not found (the only lock in this txn). So it seems TxnMgr doesn't know that commit released the locks. Here is the stack trace and some log output which maybe useful: {noformat} 2014-06-17 15:54:40,771 DEBUG mapreduce.TransactionContext (TransactionContext.java:onCommitJob(128)) - onCommitJob(job_local557130041_0001). this=46719652 2014-06-17 15:54:40,771 DEBUG lockmgr.DbTxnManager (DbTxnManager.java:commitTxn(205)) - Committing txn 1 2014-06-17 15:54:40,771 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) - Going to execute query values current_timestamp 2014-06-17 15:54:40,772 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatTxn(1423)) - Going to execute query select txn_state from TXNS where txn_id = 1 for\ update 2014-06-17 15:54:40,773 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatTxn(1438)) - Going to execute update update TXNS set txn_last_heartbeat = 140304568\ 0772 where txn_id = 1 2014-06-17 15:54:40,778 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatTxn(1440)) - Going to commit 2014-06-17 15:54:40,779 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(344)) - Going to execute insert insert into COMPLETED_TXN_COMPONENTS select tc_txn\ id, tc_database, tc_table, tc_partition from TXN_COMPONENTS where tc_txnid = 1 2014-06-17 15:54:40,784 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(352)) - Going to execute update delete from TXN_COMPONENTS where tc_txnid = 1 2014-06-17 15:54:40,788 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(356)) - Going to execute update delete from HIVE_LOCKS where hl_txnid = 1 2014-06-17 15:54:40,791 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(359)) - Going to execute update 
delete from TXNS where txn_id = 1 2014-06-17 15:54:40,794 DEBUG txn.TxnHandler (TxnHandler.java:commitTxn(361)) - Going to commit 2014-06-17 15:54:40,795 WARN mapreduce.TransactionContext (TransactionContext.java:cleanup(317)) - cleanupJob(JobID=job_local557130041_0001)this=46719652 2014-06-17 15:54:40,795 DEBUG lockmgr.DbLockManager (DbLockManager.java:unlock(109)) - Unlocking id:1 2014-06-17 15:54:40,796 DEBUG txn.TxnHandler (TxnHandler.java:getDbTime(872)) - Going to execute query values current_timestamp 2014-06-17 15:54:40,796 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatLock(1402)) - Going to execute update update HIVE_LOCKS set hl_last_heartbeat = 140\ 3045680796 where hl_lock_ext_id = 1 2014-06-17 15:54:40,800 DEBUG txn.TxnHandler (TxnHandler.java:heartbeatLock(1405)) - Going to rollback 2014-06-17 15:54:40,804 ERROR metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(143)) - NoSuchLockException(message:No such lock: 1) at org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeatLock(TxnHandler.java:1407) at org.apache.hadoop.hive.metastore.txn.TxnHandler.unlock(TxnHandler.java:477) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.unlock(HiveMetaStore.java:4817) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105) at com.sun.proxy.$Proxy14.unlock(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.unlock(HiveMetaStoreClient.java:1598) at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.unlock(DbLockManager.java:110) at org.apache.hadoop.hive.ql.lockmgr.DbLockManager.close(DbLockManager.java:162) at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.destruct(DbTxnManager.java:300) at 
org.apache.hadoop.hive.ql.lockmgr.HiveTxnManagerImpl.closeTxnManager(HiveTxnManagerImpl.java:39) at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.closeTxnManager(DbTxnManager.java:43) at org.apache.hive.hcatalog.mapreduce.TransactionContext.cleanup(TransactionContext.java:327) at org.apache.hive.hcatalog.mapreduce.TransactionContext.onCommitJob(TransactionContext.java:142) at org.apache.hive.hcatalog.mapreduce.OutputCommitterContainer.commitJob(OutputCommitterContainer.java:61) at org.apache.hive.hcatalog.mapreduce.FileOutputCommitterContainer.commitJob(FileOutputCommitterContainer.java:251) at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:537) 2014-06-17 15:54:40,804 ERROR lockmgr.DbLockManager (DbLockManager.java:unlock(114)) -
[jira] [Commented] (HIVE-7190) WebHCat launcher task failure can cause two concurrent user jobs to run
[ https://issues.apache.org/jira/browse/HIVE-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14030933#comment-14030933 ] Eugene Koifman commented on HIVE-7190: -- +1 WebHCat launcher task failure can cause two concurrent user jobs to run -- Key: HIVE-7190 URL: https://issues.apache.org/jira/browse/HIVE-7190 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Ivan Mitic Attachments: HIVE-7190.2.patch, HIVE-7190.3.patch, HIVE-7190.patch Templeton uses launcher jobs to launch the actual user jobs. Launcher jobs are 1-map jobs (single-task jobs) which kick off the actual user job and monitor it until it finishes. Given that the launcher is a task, like any other MR task, it has a retry policy in case it fails (due to a task crash, tasktracker/nodemanager crash, machine-level outage, etc.). Further, when the launcher task is retried, it will again launch the same user job, *however* the previous attempt's user job is already running. What this means is that we can have two identical user jobs running in parallel. In case of MRv2, there will be an MRAppMaster and the launcher task, which are subject to failure. In case either of the two fails, another instance of the user job will be launched again in parallel. The above situation is already a bug. Now going further to RM HA, what the RM does on failover/restart is kill all containers and restart all applications. This means that if our customer had 10 jobs on the cluster (this is 10 launcher jobs and 10 user jobs), on RM failover, all 20 jobs will be restarted, and launcher jobs will queue user jobs again. There are two issues with this design: 1. There are *possible* chances for corruption of job outputs (it would be useful to analyze this scenario more and confirm this statement). 2. 
Cluster resources are spent on jobs redundantly To address the issue at least on Yarn (Hadoop 2.0) clusters, webhcat should do the same thing Oozie does in this scenario, and that is to tag all its child jobs with an id, and kill those jobs on task restart before they are kicked off again. -- This message was sent by Atlassian JIRA (v6.2#6252)
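The proposed Oozie-style fix can be sketched roughly as follows (a hypothetical class, not WebHCat's actual code): each child job is tagged with its launcher's id, and a retried launcher first kills any still-running job carrying its tag before submitting again, so at most one copy of the user job runs.

```java
import java.util.*;

// Hypothetical sketch of the tag-and-kill scheme described above.
// A real implementation would query the cluster for jobs by tag;
// here a map stands in for the cluster's job registry.
class TaggedJobTracker {
    final Map<String, String> runningByTag = new HashMap<>(); // tag -> jobId
    private int counter = 0;

    String submit(String tag) {
        killByTag(tag); // ensure no concurrent duplicate from a prior attempt
        String jobId = "job_" + (++counter);
        runningByTag.put(tag, jobId);
        return jobId;
    }

    void killByTag(String tag) {
        runningByTag.remove(tag);
    }
}
```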
[jira] [Commented] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14029851#comment-14029851 ] Eugene Koifman commented on HIVE-7065: -- none of the failed tests are related to WebHCat Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-7065.1.patch, HIVE-7065.2.patch, HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14028317#comment-14028317 ] Eugene Koifman commented on HIVE-7065: -- I'm looking at it now. Will make changes in this ticket Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-7065.1.patch, HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7065: - Status: Patch Available (was: Reopened) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-7065.1.patch, HIVE-7065.2.patch, HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7065: - Attachment: HIVE-7065.2.patch HIVE-7065.2.patch is an ADDITIONAL patch to fix the regression. Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-7065.1.patch, HIVE-7065.2.patch, HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7065: - Status: Open (was: Patch Available) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-7065.1.patch, HIVE-7065.2.patch, HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7065: - Attachment: (was: HIVE-7065.2.patch) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-7065.1.patch, HIVE-7065.2.patch, HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7065: - Status: Patch Available (was: Open) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-7065.1.patch, HIVE-7065.2.patch, HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7065: - Attachment: HIVE-7065.2.patch Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-7065.1.patch, HIVE-7065.2.patch, HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6564) WebHCat E2E tests that launch MR jobs fail on check job completion timeout
[ https://issues.apache.org/jira/browse/HIVE-6564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14026829#comment-14026829 ] Eugene Koifman commented on HIVE-6564: -- +1 WebHCat E2E tests that launch MR jobs fail on check job completion timeout -- Key: HIVE-6564 URL: https://issues.apache.org/jira/browse/HIVE-6564 Project: Hive Issue Type: Bug Components: Tests, WebHCat Affects Versions: 0.13.0 Reporter: Deepesh Khandelwal Assignee: Deepesh Khandelwal Attachments: HIVE-6564.2.patch, HIVE-6564.patch WebHCat E2E tests that fire off an MR job are not correctly being detected as complete, so those tests time out. The problem happens because the JSON module available through CPAN returns 1 or 0 instead of true or false. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7065: - Description: WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine was: WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine NO PRECOMMIT TESTS Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-7065.1.patch, HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025469#comment-14025469 ] Eugene Koifman commented on HIVE-7065: -- good point, this should have pre-commit tests Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-7065.1.patch, HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025480#comment-14025480 ] Eugene Koifman commented on HIVE-7065: -- [~leftylev] Is there a way to make this table in the wiki be autogenerated from the webhcat-default.xml? It would ensure there is a single source of truth. Tez was shipped in 0.13, so yes I think hive.execution.engine can be mentioned for 0.13. Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-7065.1.patch, HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6226) It should be possible to get hadoop, hive, and pig version being used by WebHCat
[ https://issues.apache.org/jira/browse/HIVE-6226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025556#comment-14025556 ] Eugene Koifman commented on HIVE-6226: -- [~leftylev] Here is an example: http://localhost:50111/templeton/v1/version/hive?user.name=ekoifman which returns: {module:hive,version:0.14.0-SNAPSHOT} http://localhost:50111/templeton/v1/version/hadoop?user.name=ekoifman returns: {module:hadoop,version:2.4.1-SNAPSHOT} http://localhost:50111/templeton/v1/version/pig?user.name=ekoifman and http://localhost:50111/templeton/v1/version/sqoop?user.name=ekoifman are both there as well, but will return {error:Pig version request not yet implemented} So the last 2 are not really implemented, so I'm not sure they should be documented. It should be possible to get hadoop, hive, and pig version being used by WebHCat Key: HIVE-6226 URL: https://issues.apache.org/jira/browse/HIVE-6226 Project: Hive Issue Type: New Feature Components: WebHCat Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.13.0 Attachments: HIVE-6226.2.patch, HIVE-6226.patch Calling /version on WebHCat tells the caller the protocol verison, but there is no way to determine the versions of software being run by the applications that WebHCat spawns. I propose to add an end-point: /version/\{module\} where module could be pig, hive, or hadoop. The response will then be: {code} { module : _module_name_, version : _version_string_ } {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
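A minimal sketch of building the documented response shape of /templeton/v1/version/{module} (a hypothetical helper, not the WebHCat implementation; note that as proper JSON the real responses quote both the module and version strings):

```java
// Hypothetical helper mirroring the response shape described in HIVE-6226:
// { "module": _module_name_, "version": _version_string_ }
class VersionResponse {
    static String toJson(String module, String version) {
        return String.format("{\"module\":\"%s\",\"version\":\"%s\"}", module, version);
    }
}
```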
[jira] [Commented] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025613#comment-14025613 ] Eugene Koifman commented on HIVE-7065: -- java.lang.IllegalArgumentException: Illegal escaped string hive.some.fake.path=C:\foo\bar.txt\ unescaped \ at 22 at org.apache.hadoop.util.StringUtils.unEscapeString(StringUtils.java:565) at org.apache.hadoop.util.StringUtils.unEscapeString(StringUtils.java:547) at org.apache.hadoop.util.StringUtils.unEscapeString(StringUtils.java:533) at org.apache.hive.hcatalog.templeton.tool.TestTempletonUtils.testPropertiesParsing(TestTempletonUtils.java:308) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-7065.1.patch, HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine -- This message was sent by Atlassian JIRA (v6.2#6252)
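The failure quoted above comes from unescaping a value that ends in a bare backslash (the Windows path C:\foo\bar.txt\). A minimal re-implementation of the rule involved (not Hadoop's actual StringUtils code) shows why such a value is rejected:

```java
// Minimal sketch of the unescaping rule that trips the test above:
// a '\' must be followed by another character, so a value ending in
// a bare backslash (common in Windows paths) raises an exception.
class Unescape {
    static String unescape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\') {
                if (i + 1 >= s.length())
                    throw new IllegalArgumentException("Illegal escaped string " + s);
                out.append(s.charAt(++i)); // drop the escape, keep the char
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```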
[jira] [Commented] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14025627#comment-14025627 ] Eugene Koifman commented on HIVE-7065: -- What's strange is that that is the test added for this ticket. Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-7065.1.patch, HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7202) DbTxnManager deadlocks in hcatalog.cli.TestSematicAnalysis.testAlterTblFFpart()
Eugene Koifman created HIVE-7202: Summary: DbTxnManager deadlocks in hcatalog.cli.TestSematicAnalysis.testAlterTblFFpart() Key: HIVE-7202 URL: https://issues.apache.org/jira/browse/HIVE-7202 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Alan Gates select * from HIVE_LOCKS produces {noformat} 6 |1 |0 |default |junit_sem_analysis |NULL |w|r|1402354627716 |NULL |unknown |ekoifman.local 6 |2 |0 |default |junit_sem_analysis |b=2010-10-10 |w|e|1402354627716 |NULL |unknown |ekoifman.local 2 rows selected {noformat} The easiest way to reproduce this is to add hiveConf.setBoolVar(HiveConf.ConfVars.HIVE_SUPPORT_CONCURRENCY, true); hiveConf.setVar(HiveConf.ConfVars.HIVE_TXN_MANAGER, "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager"); in HCatBaseTest.setUpHiveConf() -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7110) TestHCatPartitionPublish test failure: No FileSystem for scheme: pfile
[ https://issues.apache.org/jira/browse/HIVE-7110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020982#comment-14020982 ] Eugene Koifman commented on HIVE-7110: -- I see the same test failure on current trunk (OSX 10.8.5) TestHCatPartitionPublish test failure: No FileSystem for scheme: pfile - Key: HIVE-7110 URL: https://issues.apache.org/jira/browse/HIVE-7110 Project: Hive Issue Type: Bug Components: HCatalog Reporter: David Chen Assignee: David Chen Attachments: HIVE-7110.1.patch, HIVE-7110.2.patch, HIVE-7110.3.patch, HIVE-7110.4.patch I got the following TestHCatPartitionPublish test failure when running all unit tests against Hadoop 1. This also appears when testing against Hadoop 2. {code} Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 26.06 sec FAILURE! - in org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish testPartitionPublish(org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish) Time elapsed: 1.361 sec ERROR! org.apache.hive.hcatalog.common.HCatException: org.apache.hive.hcatalog.common.HCatException : 2001 : Error setting output information. 
Cause : java.io.IOException: No FileSystem for scheme: pfile at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1443) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263) at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187) at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:212) at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:70) at org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish.runMRCreateFail(TestHCatPartitionPublish.java:191) at org.apache.hive.hcatalog.mapreduce.TestHCatPartitionPublish.testPartitionPublish(TestHCatPartitionPublish.java:155) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
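For what it's worth, a "No FileSystem for scheme: pfile" error usually means the test configuration is missing a mapping from the pfile scheme to its FileSystem class. A sketch of the kind of entry involved, assuming Hive's ProxyLocalFileSystem, which backs the pfile scheme in Hive's test utilities:

```xml
<!-- Assumed configuration entry; the actual fix in HIVE-7110 may differ. -->
<property>
  <name>fs.pfile.impl</name>
  <value>org.apache.hadoop.fs.ProxyLocalFileSystem</value>
</property>
```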
[jira] [Commented] (HIVE-7187) Reconcile jetty versions in hive
[ https://issues.apache.org/jira/browse/HIVE-7187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020170#comment-14020170 ] Eugene Koifman commented on HIVE-7187: -- Also, the current release of Jetty is 9.x. Reconcile jetty versions in hive Key: HIVE-7187 URL: https://issues.apache.org/jira/browse/HIVE-7187 Project: Hive Issue Type: Bug Components: HiveServer2, Web UI, WebHCat Reporter: Vaibhav Gumashta Hive root pom has 3 parameters for specifying jetty dependency versions: {code} <jetty.version>6.1.26</jetty.version> <jetty.webhcat.version>7.6.0.v20120127</jetty.webhcat.version> <jetty.hive-service.version>7.6.0.v20120127</jetty.hive-service.version> {code} The 1st is used by HWI, the 2nd by WebHCat, and the 3rd by HiveServer2 (in http mode). We should probably use the same jetty version for all hive components. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7190) WebHCat launcher task failure can cause two concurrent user jobs to run
[ https://issues.apache.org/jira/browse/HIVE-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7190: - Affects Version/s: 0.13.0 WebHCat launcher task failure can cause two concurrent user jobs to run -- Key: HIVE-7190 URL: https://issues.apache.org/jira/browse/HIVE-7190 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Ivan Mitic Attachments: HIVE-7190.patch Templeton uses launcher jobs to launch the actual user jobs. Launcher jobs are 1-map jobs (single-task jobs) which kick off the actual user job and monitor it until it finishes. Given that the launcher is a task like any other MR task, it has a retry policy in case it fails (due to a task crash, tasktracker/nodemanager crash, machine-level outage, etc.). Further, when the launcher task is retried, it will again launch the same user job, *however* the previous attempt's user job is already running. What this means is that we can have two identical user jobs running in parallel. In the case of MRv2, there will be an MRAppMaster and the launcher task, both of which are subject to failure. In case either of the two fails, another instance of the user job will be launched again in parallel. The above situation is already a bug. Going further to RM HA, what the RM does on failover/restart is kill all containers and restart all applications. This means that if our customer had 10 jobs on the cluster (that is, 10 launcher jobs and 10 user jobs), on RM failover all 20 jobs will be restarted, and launcher jobs will queue user jobs again. There are two issues with this design: 1. There are *possible* chances of corruption of job outputs (it would be useful to analyze this scenario more and confirm this statement). 2. Cluster resources are spent on jobs redundantly. To address the issue at least on Yarn (Hadoop 2.0) clusters, webhcat should do the same thing Oozie does in this scenario, and that is to tag all its child jobs with an id, and kill those jobs on task restart before they are kicked off again. -- This message was sent by Atlassian JIRA (v6.2#6252)
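The tag-and-kill approach described above can be sketched in plain Java. Everything here (the class name, the in-memory map standing in for the cluster's view of running applications) is an illustrative stand-in, not WebHCat, Oozie, or YARN code:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: the launcher tags every child job it starts; a
// retried launcher attempt first kills any still-running job carrying its
// tag, so two copies of the same user job never run in parallel.
public class TagAndKill {
    // Stand-in for the cluster's registry of running jobs, keyed by job id.
    static Map<String, String> runningJobTags = new HashMap<>();

    static void launch(String jobId, String tag) {
        // Before (re)launching, kill any prior attempt carrying the same tag.
        runningJobTags.values().removeIf(t -> t.equals(tag));
        runningJobTags.put(jobId, tag);
    }

    public static void main(String[] args) {
        String tag = "templeton-launcher-attempt-xyz"; // hypothetical tag
        launch("job_1", tag); // first launcher attempt starts the user job
        launch("job_2", tag); // retried attempt kills job_1 first
        System.out.println(runningJobTags.keySet()); // only job_2 remains
    }
}
```

On a real YARN cluster the tag would be carried in the job configuration and the kill step would query the ResourceManager for applications with that tag, but the invariant is the same: at most one child job per launcher tag.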
[jira] [Commented] (HIVE-7155) WebHCat controller job exceeds container memory limit
[ https://issues.apache.org/jira/browse/HIVE-7155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020636#comment-14020636 ] Eugene Koifman commented on HIVE-7155: -- +1 WebHCat controller job exceeds container memory limit - Key: HIVE-7155 URL: https://issues.apache.org/jira/browse/HIVE-7155 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: shanyu zhao Assignee: shanyu zhao Attachments: HIVE-7155.1.patch, HIVE-7155.patch Submitting a Hive query on a large table via WebHCat results in failure because the WebHCat controller job is killed by Yarn since it exceeds the memory limit (set by mapreduce.map.memory.mb, which defaults to 1GB): {code} INSERT OVERWRITE TABLE Temp_InjusticeEvents_2014_03_01_00_00 SELECT * from Stage_InjusticeEvents where LogTimestamp > '2014-03-01 00:00:00' and LogTimestamp <= '2014-03-01 01:00:00'; {code} We could increase mapreduce.map.memory.mb to solve this problem, but that way we would be changing the setting system-wide. We need to provide a WebHCat configuration to override mapreduce.map.memory.mb when submitting the controller job. -- This message was sent by Atlassian JIRA (v6.2#6252)
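A WebHCat-level override along the lines this issue proposes would look roughly like the following webhcat-site.xml fragment. The property name templeton.mapper.memory.mb and the value shown are assumptions based on the issue's direction; verify the exact name against webhcat-default.xml in your release:

```xml
<property>
  <name>templeton.mapper.memory.mb</name>
  <value>2048</value>
  <description>Memory, in MB, for the map task of the WebHCat controller
    job; overrides mapreduce.map.memory.mb for controller jobs only,
    leaving the cluster-wide default untouched.</description>
</property>
```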
[jira] [Commented] (HIVE-7155) WebHCat controller job exceeds container memory limit
[ https://issues.apache.org/jira/browse/HIVE-7155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14018909#comment-14018909 ] Eugene Koifman commented on HIVE-7155: -- [~shanyu] I can't comment on RB. Did you perhaps not publish it? WebHCat controller job exceeds container memory limit - Key: HIVE-7155 URL: https://issues.apache.org/jira/browse/HIVE-7155 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: shanyu zhao Assignee: shanyu zhao Attachments: HIVE-7155.1.patch, HIVE-7155.patch Submitting a Hive query on a large table via WebHCat results in failure because the WebHCat controller job is killed by Yarn since it exceeds the memory limit (set by mapreduce.map.memory.mb, which defaults to 1GB): {code} INSERT OVERWRITE TABLE Temp_InjusticeEvents_2014_03_01_00_00 SELECT * from Stage_InjusticeEvents where LogTimestamp > '2014-03-01 00:00:00' and LogTimestamp <= '2014-03-01 01:00:00'; {code} We could increase mapreduce.map.memory.mb to solve this problem, but that way we would be changing the setting system-wide. We need to provide a WebHCat configuration to override mapreduce.map.memory.mb when submitting the controller job. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6316) Document support for new types in HCat
[ https://issues.apache.org/jira/browse/HIVE-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14018154#comment-14018154 ] Eugene Koifman commented on HIVE-6316: -- [~leftylev], w.r.t. HCatLoader/Storer - support for the tinyint and smallint Hive types was there prior to 0.13 (though perhaps that was not documented). Altogether, support for 5 new types was added (both HCatLoader/Storer and HCatInput/OutputFormat): date, timestamp, char, varchar, decimal. If you look at the 1st column of https://issues.apache.org/jira/secure/attachment/12626251/HCat-Pig%20Type%20Mapping%20Hive%200.13.pdf, it lists type name/Java class/primitive. The Java class/primitive is what the user can expect in an HCatRecord produced by using HCatInputFormat, and what they should use in an HCatRecord to write it with HCatOutputFormat. The only omission in the PDF doc is that DATE maps to java.sql.Date. Thus in https://cwiki.apache.org/confluence/display/Hive/HCatalog+InputOutput#HCatalogInputOutput-HCatRecord, these 5 types should be added to the table (DECIMAL is already there, but was not supported until 0.13, and it maps to the HiveDecimal Java class). The range of values for primitive types is dictated by Java, and for Object types, users can look at the JavaDoc for the corresponding Java classes. Document support for new types in HCat -- Key: HIVE-6316 URL: https://issues.apache.org/jira/browse/HIVE-6316 Project: Hive Issue Type: Sub-task Components: Documentation, HCatalog Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Lefty Leverenz HIVE-5814 added support for new types in HCat. The PDF file in that bug explains exactly how these map to Pig types. This should be added to the Wiki somewhere (probably here https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore). In particular it should be highlighted that when copying data from Hive TIMESTAMP to Pig DATETIME, any 'nanos' in the timestamp will be lost. 
Also, HCatStorer now takes a new parameter which is described in the PDF doc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7065: - Description: WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine NO PRECOMMIT TESTS was: WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7065: - Status: Patch Available (was: Open) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7065.1.patch, HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7065: - Attachment: HIVE-7065.1.patch update based on feedback from @Thejas Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7065.1.patch, HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7152) OutputJobInfo.setPosOfPartCols() Comparator bug
Eugene Koifman created HIVE-7152: Summary: OutputJobInfo.setPosOfPartCols() Comparator bug Key: HIVE-7152 URL: https://issues.apache.org/jira/browse/HIVE-7152 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman This method compares Integer objects using '=='. Because the JVM caches boxed Integers only for values in the range -128..127, identity comparison happens to work for small values but may break for wide tables that have more than 127 columns. http://stackoverflow.com/questions/2602636/why-cant-the-compiler-jvm-just-make-autoboxing-just-work -- This message was sent by Atlassian JIRA (v6.2#6252)
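The pitfall behind this bug is easy to demonstrate in isolation: '==' on boxed Integers compares object identity, and the JLS guarantees caching only for -128..127, so the comparison silently changes behavior past the cache boundary. A minimal demonstration (not Hive code):

```java
public class IntegerCachePitfall {
    public static void main(String[] args) {
        Integer a = 100, b = 100; // inside the Integer cache (-128..127)
        Integer c = 200, d = 200; // outside the cache: two distinct objects
        System.out.println(a == b);      // true  (same cached object)
        System.out.println(c == d);      // false (identity, not value!)
        System.out.println(c.equals(d)); // true  (correct value comparison)
    }
}
```

The fix for code like setPosOfPartCols() is to compare with equals() (or Integer.compare() in a Comparator) instead of '=='.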
[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7065: - Status: Open (was: Patch Available) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6316) Document support for new types in HCat
[ https://issues.apache.org/jira/browse/HIVE-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14010341#comment-14010341 ] Eugene Koifman commented on HIVE-6316: -- [~leftylev], Null and Throw are the only possible values. The description of HIVE-5814 has a usage example: {noformat} HCatStorer('','', '-onOutOfRangeValue Throw') {noformat} hcat.pig.store.onoutofrangevalue does NOT need to be documented, it's internal. This only applies when using HCat from Pig, where the user is expected to use -onOutOfRangeValue in HCatStorer. It is not really related to Data Promotion Behavior. The HCatInputFormat and HCatOutputFormat sections need the same update to the type mapping tables as HCatLoader/HCatStorer. I think it would be easier to just create a link from all 4 current tables to a single page that has the whole table in https://issues.apache.org/jira/secure/attachment/12626251/HCat-Pig%20Type%20Mapping%20Hive%200.13.pdf exactly. The headers in the table actually indicate a mapping of the Hive type/value system to the Pig type/value system. Logically speaking there is no such thing as an HCatalog type/value system. HCatalog connects Hive tables to Pig/Map Reduce. Pig has its own type/value system; MR does not as such and is expected to use (in HCatRecord) the same classes as used in Hive internally. So the data type mapping is really Hive-Pig (HCatLoader/Storer) and Hive-MR (HCatInput/OutputFormat), which is why it's all summarized in a single table in my document. Document support for new types in HCat -- Key: HIVE-6316 URL: https://issues.apache.org/jira/browse/HIVE-6316 Project: Hive Issue Type: Sub-task Components: Documentation, HCatalog Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Lefty Leverenz HIVE-5814 added support for new types in HCat. The PDF file in that bug explains exactly how these map to Pig types. 
This should be added to the Wiki somewhere (probably here https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore). In particular it should be highlighted that copying data from Hive TIMESTAMP to Pig DATETIME, any 'nanos' in the timestamp will be lost. Also, HCatStorer now takes new parameter which is described in the PDF doc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6316) Document support for new types in HCat
[ https://issues.apache.org/jira/browse/HIVE-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14007255#comment-14007255 ] Eugene Koifman commented on HIVE-6316: -- no, the PDF there should be sufficient Document support for new types in HCat -- Key: HIVE-6316 URL: https://issues.apache.org/jira/browse/HIVE-6316 Project: Hive Issue Type: Sub-task Components: Documentation, HCatalog Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Lefty Leverenz HIVE-5814 added support for new types in HCat. The PDF file in that bug explains exactly how these map to Pig types. This should be added to the Wiki somewhere (probably here https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore). In particular it should be highlighted that copying data from Hive TIMESTAMP to Pig DATETIME, any 'nanos' in the timestamp will be lost. Also, HCatStorer now takes new parameter which is described in the PDF doc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6316) Document support for new types in HCat
[ https://issues.apache.org/jira/browse/HIVE-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14006387#comment-14006387 ] Eugene Koifman commented on HIVE-6316: -- ping Does anyone have cycles to take a look at this? https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore and https://cwiki.apache.org/confluence/display/Hive/HCatalog+InputOutput are both pretty badly out of date at this point Document support for new types in HCat -- Key: HIVE-6316 URL: https://issues.apache.org/jira/browse/HIVE-6316 Project: Hive Issue Type: Sub-task Components: Documentation, HCatalog Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Lefty Leverenz HIVE-5814 added support for new types in HCat. The PDF file in that bug explains exactly how these map to Pig types. This should be added to the Wiki somewhere (probably here https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore). In particular it should be highlighted that copying data from Hive TIMESTAMP to Pig DATETIME, any 'nanos' in the timestamp will be lost. Also, HCatStorer now takes new parameter which is described in the PDF doc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6651) broken link in WebHCat doc: Job Information — GET queue/:jobid
[ https://issues.apache.org/jira/browse/HIVE-6651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14003721#comment-14003721 ] Eugene Koifman commented on HIVE-6651: -- +1 broken link in WebHCat doc: Job Information — GET queue/:jobid -- Key: HIVE-6651 URL: https://issues.apache.org/jira/browse/HIVE-6651 Project: Hive Issue Type: Bug Components: Documentation, WebHCat Reporter: Eugene Koifman Assignee: Lefty Leverenz Priority: Minor https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+JobInfo#WebHCatReferenceJobInfo-Results the link in the table to Class JobProfile is broken -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HIVE-7084) TestWebHCatE2e is failing on trunk
[ https://issues.apache.org/jira/browse/HIVE-7084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman reassigned HIVE-7084: Assignee: Eugene Koifman TestWebHCatE2e is failing on trunk -- Key: HIVE-7084 URL: https://issues.apache.org/jira/browse/HIVE-7084 Project: Hive Issue Type: Test Components: WebHCat Affects Versions: 0.14.0 Reporter: Ashutosh Chauhan Assignee: Eugene Koifman I am able to repro it consistently on fresh checkout. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7084) TestWebHCatE2e is failing on trunk
[ https://issues.apache.org/jira/browse/HIVE-7084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14003832#comment-14003832 ] Eugene Koifman commented on HIVE-7084: -- The root cause here is {noformat} Caused by: java.lang.RuntimeException: The resource 'resourcedoc.xml' does not exist. at com.sun.jersey.api.wadl.config.WadlGeneratorLoader.setProperty(WadlGeneratorLoader.java:203) at com.sun.jersey.api.wadl.config.WadlGeneratorLoader.loadWadlGenerator(WadlGeneratorLoader.java:139) at com.sun.jersey.api.wadl.config.WadlGeneratorLoader.loadWadlGeneratorDescriptions(WadlGeneratorLoader.java:114) at com.sun.jersey.api.wadl.config.WadlGeneratorConfig.createWadlGenerator(WadlGeneratorConfig.java:182) ... 49 more {noformat} hcatalog/webhcat/svr/pom.xml uses the maven-javadoc-plugin, which generates resourcedoc.xml; it ends up in hcatalog/webhcat/svr/target/classes/resourcedoc.xml in the 0.13.1 build but is missing from the trunk build tree. TestWebHCatE2e is failing on trunk -- Key: HIVE-7084 URL: https://issues.apache.org/jira/browse/HIVE-7084 Project: Hive Issue Type: Test Components: WebHCat Affects Versions: 0.14.0 Reporter: Ashutosh Chauhan Assignee: Eugene Koifman I am able to repro it consistently on fresh checkout. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7084) TestWebHCatE2e is failing on trunk
[ https://issues.apache.org/jira/browse/HIVE-7084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7084: - Assignee: Harish Butani (was: Eugene Koifman) TestWebHCatE2e is failing on trunk -- Key: HIVE-7084 URL: https://issues.apache.org/jira/browse/HIVE-7084 Project: Hive Issue Type: Test Components: WebHCat Affects Versions: 0.14.0 Reporter: Ashutosh Chauhan Assignee: Harish Butani I am able to repro it consistently on fresh checkout. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7084) TestWebHCatE2e is failing on trunk
[ https://issues.apache.org/jira/browse/HIVE-7084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14003856#comment-14003856 ] Eugene Koifman commented on HIVE-7084: -- this works before HIVE-7000 and breaks immediately after TestWebHCatE2e is failing on trunk -- Key: HIVE-7084 URL: https://issues.apache.org/jira/browse/HIVE-7084 Project: Hive Issue Type: Test Components: WebHCat Affects Versions: 0.14.0 Reporter: Ashutosh Chauhan Assignee: Eugene Koifman I am able to repro it consistently on fresh checkout. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HIVE-6207) Integrate HCatalog with locking
[ https://issues.apache.org/jira/browse/HIVE-6207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman reassigned HIVE-6207: Assignee: Eugene Koifman (was: Alan Gates) Integrate HCatalog with locking --- Key: HIVE-6207 URL: https://issues.apache.org/jira/browse/HIVE-6207 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 0.13.0 Reporter: Alan Gates Assignee: Eugene Koifman Fix For: 0.14.0 HCatalog currently ignores any locks created by Hive users. It should respect the locks Hive creates as well as create locks itself when locking is configured. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5814) Add DATE, TIMESTAMP, DECIMAL, CHAR, VARCHAR types support in HCat
[ https://issues.apache.org/jira/browse/HIVE-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14002163#comment-14002163 ] Eugene Koifman commented on HIVE-5814: -- org.apache.hcatalog.pig.HCatLoader has been deprecated in Hive 0.12. In fact every class in org.apache.hcatalog has been deprecated. All new features are added in org.apache.hive.hcatalog, which contains all the classes/methods from org.apache.hcatalog plus new APIs. Add DATE, TIMESTAMP, DECIMAL, CHAR, VARCHAR types support in HCat - Key: HIVE-5814 URL: https://issues.apache.org/jira/browse/HIVE-5814 Project: Hive Issue Type: New Feature Components: HCatalog Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.13.0 Attachments: HCat-Pig Type Mapping Hive 0.13.pdf, HIVE-5814.2.patch, HIVE-5814.3.patch, HIVE-5814.4.patch, HIVE-5814.5.patch Hive 0.12 added support for new data types. Pig 0.12 added some as well. HCat should handle these as well. Also note that CHAR was added recently. Also allow the user to specify a parameter in Pig like so HCatStorer('','', '-onOutOfRangeValue Throw') to control what happens when Pig's value is out of range for the target Hive column. Valid values for the option are Throw and Null. Throw makes the runtime raise an exception; Null, which is the default, means NULL is written to the target column and a message to that effect is emitted to the log. Only 1 message per column/data type is sent to the log. See the attached HCat-Pig Type Mapping Hive 0.13.pdf for exact mappings. -- This message was sent by Atlassian JIRA (v6.2#6252)
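The Throw/Null semantics described above can be sketched in plain Java. The method name narrowToByte and the exception message are illustrative stand-ins, not the actual HCatStorer implementation:

```java
public class OutOfRangeDemo {
    enum OnOutOfRange { THROW, NULL }

    // Mirrors the '-onOutOfRangeValue' options: a value outside the target
    // Hive type's range either raises an exception (Throw) or becomes a
    // NULL in the target column (Null, the default).
    static Byte narrowToByte(long v, OnOutOfRange mode) {
        if (v < Byte.MIN_VALUE || v > Byte.MAX_VALUE) {
            if (mode == OnOutOfRange.THROW) {
                throw new IllegalArgumentException(v + " is out of tinyint range");
            }
            return null; // default: write NULL (and log once per column/type)
        }
        return (byte) v;
    }

    public static void main(String[] args) {
        System.out.println(narrowToByte(42, OnOutOfRange.NULL));  // 42
        System.out.println(narrowToByte(300, OnOutOfRange.NULL)); // null
    }
}
```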
[jira] [Created] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
Eugene Koifman created HIVE-7065: Summary: Hive jobs in webhcat run in default mr mode even in Hive on Tez setup Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6768) remove hcatalog/webhcat/svr/src/main/config/override-container-log4j.properties
[ https://issues.apache.org/jira/browse/HIVE-6768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993793#comment-13993793 ] Eugene Koifman commented on HIVE-6768: -- [~thejas],[~hashutosh] the attached patch reverts all changes that were part of HIVE-5511 needed to handle the 'special' override-container-log4j, just like the bug description says. HIVE-5511 also included some refactoring, which should not be reverted. remove hcatalog/webhcat/svr/src/main/config/override-container-log4j.properties --- Key: HIVE-6768 URL: https://issues.apache.org/jira/browse/HIVE-6768 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-6768.patch now that MAPREDUCE-5806 is fixed we can remove override-container-log4j.properties and all the logic around it which was introduced in HIVE-5511 to work around MAPREDUCE-5806 NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7075) JsonSerde raises NullPointerException when object key is not lower case
[ https://issues.apache.org/jira/browse/HIVE-7075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7075: - Component/s: HCatalog JsonSerde raises NullPointerException when object key is not lower case --- Key: HIVE-7075 URL: https://issues.apache.org/jira/browse/HIVE-7075 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.12.0 Reporter: Yibing Shi We have noticed that the JsonSerde produces a NullPointerException if a JSON object has a key that is not lower case. For example, assume we have the file one.json: {"empId" : 123, "name" : "John"} {"empId" : 456, "name" : "Jane"} hive> CREATE TABLE emps (empId INT, name STRING) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'; hive> LOAD DATA LOCAL INPATH 'one.json' INTO TABLE emps; hive> SELECT * FROM emps; Failed with exception java.io.IOException:java.lang.NullPointerException Notice, it seems to work if the keys are lower case. Assume we have the file 'two.json': {"empid" : 123, "name" : "John"} {"empid" : 456, "name" : "Jane"} hive> DROP TABLE emps; hive> CREATE TABLE emps (empId INT, name STRING) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'; hive> LOAD DATA LOCAL INPATH 'two.json' INTO TABLE emps; hive> SELECT * FROM emps; OK 123 John 456 Jane -- This message was sent by Atlassian JIRA (v6.2#6252)
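One common remedy for this class of bug (shown here as a sketch, not the actual JsonSerDe patch) is to normalize incoming JSON keys to lower case before matching them against Hive's column names, which Hive stores lower-cased:

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class LowerCaseKeys {
    // Lower-casing JSON keys makes "empId" and "empid" resolve to the same
    // Hive column instead of a failed (null) lookup.
    static Map<String, Object> normalize(Map<String, Object> json) {
        Map<String, Object> out = new HashMap<>();
        for (Map.Entry<String, Object> e : json.entrySet()) {
            out.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> record = new HashMap<>();
        record.put("empId", 123); // key as it appears in one.json
        System.out.println(normalize(record).get("empid")); // 123
    }
}
```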
[jira] [Commented] (HIVE-6549) remove templeton.jar from webhcat-default.xml, remove hcatalog/bin/hive-config.sh
[ https://issues.apache.org/jira/browse/HIVE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14000446#comment-14000446 ] Eugene Koifman commented on HIVE-6549: -- no, this is definitely used. I guess the proper version number is being set in the release branch but not trunk remove templeton.jar from webhcat-default.xml, remove hcatalog/bin/hive-config.sh - Key: HIVE-6549 URL: https://issues.apache.org/jira/browse/HIVE-6549 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Minor Fix For: 0.14.0 Attachments: HIVE-6549.2.patch, HIVE-6549.patch this property is no longer used also removed corresponding AppConfig.TEMPLETON_JAR_NAME hcatalog/bin/hive-config.sh is not used NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7035) Templeton returns 500 for user errors - when job cannot be found
[ https://issues.apache.org/jira/browse/HIVE-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7035: - Attachment: HIVE-7035.patch Templeton returns 500 for user errors - when job cannot be found Key: HIVE-7035 URL: https://issues.apache.org/jira/browse/HIVE-7035 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7035.patch curl -i 'http://localhost:50111/templeton/v1/jobs/job_139949638_00011?user.name=ekoifman' should return HTTP Status code 4xx when no such job exists; it currently returns 500. {noformat} {error:org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_201304291205_0015' doesn't exist in RM.\r\n\tat org.apache.hadoop.yarn.server.resourcemanager .ClientRMService.getApplicationReport(ClientRMService.java:247)\r\n\tat org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocol PBServiceImpl.java:120)\r\n\tat org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:241)\r\n\tat org.apache.hado op.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)\r\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)\r\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Serve r.java:2053)\r\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)\r\n\tat java.security.AccessController.doPrivileged(Native Method)\r\n\tat javax.security.auth.Subject.doAs(Subject.ja va:415)\r\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)\r\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)\r\n} {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7065: - Status: Patch Available (was: Open) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to Hive client on node executing a job submitted through WebHCat (hive query, for example). this should include hive.execution.engine -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7056) TestPig_11 fails with Pig 12.1 and earlier
[ https://issues.apache.org/jira/browse/HIVE-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7056: - Description: on trunk, pig script (http://svn.apache.org/repos/asf/pig/trunk/bin/pig) is looking for \*hcatalog-core-\*.jar etc. In Pig 12.1 it's looking for hcatalog-core-\*.jar, which doesn't work with Hive 0.13. The TestPig_11 job fails with {noformat} 2014-05-13 17:47:10,760 [main] ERROR org.apache.pig.PigServer - exception during parsing: Error during parsing. Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] Failed to parse: Pig script failed to parse: file hcatloadstore.pig, line 19, column 34 pig script failed to validate: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:196) at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1678) at org.apache.pig.PigServer$Graph.access$000(PigServer.java:1411) at org.apache.pig.PigServer.parseAndBuild(PigServer.java:344) at org.apache.pig.PigServer.executeBatch(PigServer.java:369) at org.apache.pig.PigServer.executeBatch(PigServer.java:355) at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173) at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84) at org.apache.pig.Main.run(Main.java:478) at org.apache.pig.Main.main(Main.java:156) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 
at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by: file hcatloadstore.pig, line 19, column 34 pig script failed to validate: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] at org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1299) at org.apache.pig.parser.LogicalPlanBuilder.buildFuncSpec(LogicalPlanBuilder.java:1284) at org.apache.pig.parser.LogicalPlanGenerator.func_clause(LogicalPlanGenerator.java:5158) at org.apache.pig.parser.LogicalPlanGenerator.store_clause(LogicalPlanGenerator.java:7756) at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1669) at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:1102) at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:560) at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:421) at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:188) ... 16 more Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve org.apache.hive.hcatalog.pig.HCatStorer using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.] at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:653) at org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1296) ... 
24 more {noformat} the key to this is {noformat} ls: /private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/lib/slf4j-api-*.jar: No such file or directory ls: /private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-core-*.jar: No such file or directory ls: /private/tmp/hadoop-ekoifman/nm-local-dir/usercache/ekoifman/appcache/application_1400018007772_0045/container_1400018007772_0045_01_02/apache-hive-0.14.0-SNAPSHOT-bin.tar.gz/apache-hive-0.14.0-SNAPSHOT-bin/hcatalog/share/hcatalog/hcatalog-*.jar: No such file or directory ls:
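The crux of the failure is the leading wildcard: Hive 0.13 renamed the HCatalog jars with a hive- prefix, so Pig 12.1's hcatalog-core-*.jar glob no longer matches anything while trunk's \*hcatalog-core-\*.jar does. A minimal shell sketch (the jar name and demo directory are illustrative, patterned on Hive 0.13's naming):

```shell
# Illustrative jar name: Hive 0.13 ships hive-hcatalog-core-<version>.jar
# rather than hcatalog-core-<version>.jar (assumption based on the report).
mkdir -p /tmp/hcat-glob-demo
cd /tmp/hcat-glob-demo
rm -f *.jar
touch hive-hcatalog-core-0.13.0.jar

# Pig 12.1's pattern (no leading wildcard) finds nothing:
ls hcatalog-core-*.jar 2>/dev/null || echo "no match"

# The trunk pattern with a leading wildcard matches the renamed jar:
ls *hcatalog-core-*.jar
```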
[jira] [Updated] (HIVE-7065) Hive jobs in webhcat run in default mr mode even in Hive on Tez setup
[ https://issues.apache.org/jira/browse/HIVE-7065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7065: - Attachment: HIVE-7065.patch Hive jobs in webhcat run in default mr mode even in Hive on Tez setup - Key: HIVE-7065 URL: https://issues.apache.org/jira/browse/HIVE-7065 Project: Hive Issue Type: Bug Components: Tez, WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7065.patch WebHCat config has templeton.hive.properties to specify Hive config properties that need to be passed to the Hive client on the node executing a job submitted through WebHCat (a Hive query, for example). This should include hive.execution.engine -- This message was sent by Atlassian JIRA (v6.2#6252)
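A hedged sketch of the resulting configuration: templeton.hive.properties in webhcat-site.xml carries comma-separated key=value pairs that WebHCat hands to the Hive client, so a Hive-on-Tez setup would add hive.execution.engine to that list (host names and values below are illustrative, not defaults):

```xml
<!-- webhcat-site.xml: sketch only; host names and values are illustrative -->
<property>
  <name>templeton.hive.properties</name>
  <value>hive.metastore.uris=thrift://metastore-host:9083,hive.execution.engine=tez</value>
  <description>Comma-separated Hive configuration properties passed to the
    Hive client on the node that executes a WebHCat-submitted job; including
    hive.execution.engine keeps Hive-on-Tez clusters from falling back to mr.</description>
</property>
```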
[jira] [Updated] (HIVE-7056) TestPig_11 fails with Pig 12.1 and earlier
[ https://issues.apache.org/jira/browse/HIVE-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7056: - Description: on trunk, pig script (http://svn.apache.org/repos/asf/pig/trunk/bin/pig) is looking for \*hcatalog-core-\*.jar etc. In Pig 12.1 it's looking for hcatalog-core-\*.jar, which doesn't work with Hive 0.13. The TestPig_11 job fails with the same HCatStorer resolution error and missing-jar ls output quoted in full above.
[jira] [Updated] (HIVE-6316) Document support for new types in HCat
[ https://issues.apache.org/jira/browse/HIVE-6316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6316: - Assignee: Lefty Leverenz Document support for new types in HCat -- Key: HIVE-6316 URL: https://issues.apache.org/jira/browse/HIVE-6316 Project: Hive Issue Type: Sub-task Components: Documentation, HCatalog Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Lefty Leverenz HIVE-5814 added support for new types in HCat. The PDF file in that bug explains exactly how these map to Pig types. This should be added to the Wiki somewhere (probably here https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore). In particular it should be highlighted that when copying data from Hive TIMESTAMP to Pig DATETIME, any 'nanos' in the timestamp will be lost. Also, HCatStorer now takes a new parameter which is described in the PDF doc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7056) TestPig_11 fails with Pig 12.1 and earlier
[ https://issues.apache.org/jira/browse/HIVE-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7056: - Affects Version/s: 0.13.0 TestPig_11 fails with Pig 12.1 and earlier -- Key: HIVE-7056 URL: https://issues.apache.org/jira/browse/HIVE-7056 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman on trunk, pig script (http://svn.apache.org/repos/asf/pig/trunk/bin/pig) is looking for *hcatalog-core-*.jar etc. In Pig 12.1 it's looking for hcatalog-core-*.jar, which doesn't work with Hive 0.13. The TestPig_11 job fails with the same HCatStorer resolution error and missing-jar ls output quoted in full above.
[jira] [Updated] (HIVE-7057) webhcat e2e deployment scripts don't have x bit set
[ https://issues.apache.org/jira/browse/HIVE-7057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7057: - Status: Patch Available (was: Open) webhcat e2e deployment scripts don't have x bit set --- Key: HIVE-7057 URL: https://issues.apache.org/jira/browse/HIVE-7057 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7057.patch also, update env.sh to use latest Pig release NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
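The symptom is easy to reproduce with any script whose execute bit is missing; a minimal sketch (the path /tmp/demo-deploy.sh is hypothetical, not one of the actual e2e scripts):

```shell
# Create a fresh script without the execute bit (a default umask typically
# leaves a newly written file at mode 644).
rm -f /tmp/demo-deploy.sh
cat > /tmp/demo-deploy.sh <<'EOF'
#!/bin/sh
echo deployed
EOF

# Invoking it directly fails until the x bit is set:
/tmp/demo-deploy.sh 2>/dev/null || echo "cannot execute"

# The fix: set the execute bit, as the patch does for the e2e scripts.
chmod +x /tmp/demo-deploy.sh
/tmp/demo-deploy.sh
```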
[jira] [Updated] (HIVE-7057) webhcat e2e deployment scripts don't have x bit set
[ https://issues.apache.org/jira/browse/HIVE-7057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7057: - Description: also, update env.sh to use latest Pig release NO PRECOMMIT TESTS was:also, update env.sh to use latest Pig release webhcat e2e deployment scripts don't have x bit set --- Key: HIVE-7057 URL: https://issues.apache.org/jira/browse/HIVE-7057 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Eugene Koifman Assignee: Eugene Koifman also, update env.sh to use latest Pig release NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7057) webhcat e2e deployment scripts don't have x bit set
Eugene Koifman created HIVE-7057: Summary: webhcat e2e deployment scripts don't have x bit set Key: HIVE-7057 URL: https://issues.apache.org/jira/browse/HIVE-7057 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Eugene Koifman Assignee: Eugene Koifman also, update env.sh to use latest Pig release -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7056) TestPig_11 fails with Pig 12.1 and earlier
Eugene Koifman created HIVE-7056: Summary: TestPig_11 fails with Pig 12.1 and earlier Key: HIVE-7056 URL: https://issues.apache.org/jira/browse/HIVE-7056 Project: Hive Issue Type: Bug Components: WebHCat Reporter: Eugene Koifman on trunk, pig script (http://svn.apache.org/repos/asf/pig/trunk/bin/pig) is looking for *hcatalog-core-*.jar etc. In Pig 12.1 it's looking for hcatalog-core-*.jar, which doesn't work with Hive 0.13. The TestPig_11 job fails with the same HCatStorer resolution error and missing-jar ls output quoted in full above.
[jira] [Updated] (HIVE-6549) remove templeton.jar from webhcat-default.xml, remove hcatalog/bin/hive-config.sh
[ https://issues.apache.org/jira/browse/HIVE-6549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-6549: - Attachment: HIVE-6549.2.patch addressed [~thejas]'s comments remove templeton.jar from webhcat-default.xml, remove hcatalog/bin/hive-config.sh - Key: HIVE-6549 URL: https://issues.apache.org/jira/browse/HIVE-6549 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.12.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Minor Attachments: HIVE-6549.2.patch, HIVE-6549.patch This property is no longer used; the corresponding AppConfig.TEMPLETON_JAR_NAME was also removed. hcatalog/bin/hive-config.sh is not used. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7056) WebHCat TestPig_11 fails with Pig 12.1 and earlier on Hive 0.13
[ https://issues.apache.org/jira/browse/HIVE-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7056: - Summary: WebHCat TestPig_11 fails with Pig 12.1 and earlier on Hive 0.13 (was: TestPig_11 fails with Pig 12.1 and earlier) WebHCat TestPig_11 fails with Pig 12.1 and earlier on Hive 0.13 --- Key: HIVE-7056 URL: https://issues.apache.org/jira/browse/HIVE-7056 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman on trunk, pig script (http://svn.apache.org/repos/asf/pig/trunk/bin/pig) is looking for \*hcatalog-core-\*.jar etc. In Pig 12.1 it's looking for hcatalog-core-\*.jar, which doesn't work with Hive 0.13. The TestPig_11 job fails with the same HCatStorer resolution error and missing-jar ls output quoted in full above.
[jira] [Updated] (HIVE-7035) Templeton returns 500 for user errors - when job cannot be found
[ https://issues.apache.org/jira/browse/HIVE-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-7035: - Status: Patch Available (was: Open) Templeton returns 500 for user errors - when job cannot be found Key: HIVE-7035 URL: https://issues.apache.org/jira/browse/HIVE-7035 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-7035.patch curl -i 'http://localhost:50111/templeton/v1/jobs/job_139949638_00011?user.name=ekoifman' should return HTTP Status code 4xx when no such job exists; it currently returns 500. {noformat} {error:org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_201304291205_0015' doesn't exist in RM.\r\n\tat org.apache.hadoop.yarn.server.resourcemanager .ClientRMService.getApplicationReport(ClientRMService.java:247)\r\n\tat org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocol PBServiceImpl.java:120)\r\n\tat org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:241)\r\n\tat org.apache.hado op.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)\r\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)\r\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Serve r.java:2053)\r\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)\r\n\tat java.security.AccessController.doPrivileged(Native Method)\r\n\tat javax.security.auth.Subject.doAs(Subject.ja va:415)\r\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)\r\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)\r\n} {noformat} NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
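A hedged sketch of the intended behavior, not WebHCat's actual code: classify a "job/application not found" failure as a client error (404) rather than letting it surface as a blanket 500. The function name and mapping below are illustrative:

```shell
# Illustrative mapping from a server-side error message to the HTTP status
# WebHCat should return (assumption: this mirrors the intent of the patch).
http_status_for() {
  case "$1" in
    *ApplicationNotFoundException*) echo 404 ;;  # caller asked for a job that doesn't exist
    *)                              echo 500 ;;  # everything else stays a genuine server error
  esac
}

http_status_for "org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_201304291205_0015' doesn't exist in RM."
```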