[jira] [Commented] (HIVE-8881) Receiving json {error:Could not find job job_1415748506143_0002} when web client tries to fetch all jobs from webhcat where HDFS does not have the data.
[ https://issues.apache.org/jira/browse/HIVE-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215419#comment-14215419 ] Eugene Koifman commented on HIVE-8881: -- The change is WebHCat-only and the above test failure is not related. Committed to trunk. Thanks [~thejas] for the review. Receiving json {error:Could not find job job_1415748506143_0002} when web client tries to fetch all jobs from webhcat where HDFS does not have the data. -- Key: HIVE-8881 URL: https://issues.apache.org/jira/browse/HIVE-8881 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-8881.patch When a job is deleted from HDFS and this curl call is executed: 'http://localhost:50111/templeton/v1/jobs?user.name=hcat&showall=true&fields=*', we occasionally receive json like this {noformat} {error:Could not find job job_1415748506143_0002} {noformat} This is an intermittent issue that happens when Hadoop for some reason can't find the details for a given job id. This REST call should be more flexible since it is designed to return information about many jobs at once. It should just skip over bad IDs and produce as much output as it can. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
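The skip-over-bad-IDs behavior the report asks for can be sketched roughly as follows. This is a minimal illustration, not the actual WebHCat code: the class name and the `Map`-based lookup stand in for the real job-status call against the JobTracker/ResourceManager.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of the tolerant list-jobs behavior proposed above: instead of
// failing the whole /templeton/v1/jobs response when one job id cannot be
// resolved, skip the bad id and return as much data as possible.
public class TolerantJobList {
    // 'lookup' is a stand-in for the real per-job status call, which may
    // fail for an individual id ("Could not find job ...").
    public static List<String> listJobs(List<String> jobIds,
                                        Map<String, String> lookup) {
        List<String> out = new ArrayList<>();
        for (String id : jobIds) {
            String status = lookup.get(id);
            if (status == null) {
                // bad/missing job id: log and skip, don't abort the response
                continue;
            }
            out.add(id + ":" + status);
        }
        return out;
    }
}
```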
[jira] [Updated] (HIVE-8881) Receiving json {error:Could not find job job_1415748506143_0002} when web client tries to fetch all jobs from webhcat where HDFS does not have the data.
[ https://issues.apache.org/jira/browse/HIVE-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8881: - Fix Version/s: 0.15.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8881) Receiving json {error:Could not find job job_1415748506143_0002} when web client tries to fetch all jobs from webhcat where HDFS does not have the data.
[ https://issues.apache.org/jira/browse/HIVE-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8881: - Resolution: Fixed Status: Resolved (was: Patch Available) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8877) improve context logging during job submission via WebHCat
Eugene Koifman created HIVE-8877: Summary: improve context logging during job submission via WebHCat Key: HIVE-8877 URL: https://issues.apache.org/jira/browse/HIVE-8877 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Currently the logging includes env variables. It should also include the contents of the current dir (helpful in tar-shipping scenarios), the contents of sqoop/lib (for pre-installed Sqoop scenarios), and Java props (for general debugging). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
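The extra context logging described above could look roughly like this. The class and method names are illustrative (not the actual WebHCat launcher code); the point is simply to dump the launch directory contents alongside the env vars that are already logged.

```java
import java.io.File;

// Sketch of the proposed context logging: list the files in the current
// (launch) directory, which is what you need to debug tar-shipping
// scenarios where an exploded archive is missing expected jars.
public class LaunchContextLogger {
    public static String describeDir(File dir) {
        StringBuilder sb = new StringBuilder("contents of " + dir.getPath() + ":\n");
        File[] files = dir.listFiles();
        if (files != null) {
            for (File f : files) {
                sb.append("  ").append(f.getName())
                  .append(" (").append(f.length()).append(" bytes)\n");
            }
        }
        // real code would also log System.getenv() and System.getProperties()
        return sb.toString();
    }
}
```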
[jira] [Created] (HIVE-8881) Receiving json {error:Could not find job job_1415748506143_0002} when web client tries to fetch all jobs from webhcat where HDFS does not have the data.
Eugene Koifman created HIVE-8881: Summary: Receiving json {error:Could not find job job_1415748506143_0002} when web client tries to fetch all jobs from webhcat where HDFS does not have the data. Key: HIVE-8881 URL: https://issues.apache.org/jira/browse/HIVE-8881 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman When a job is deleted from HDFS and this curl call is executed: 'http://localhost:50111/templeton/v1/jobs?user.name=hcat&showall=true&fields=*', we occasionally receive json like this {noformat} {error:Could not find job job_1415748506143_0002} {noformat} This is an intermittent issue that happens when Hadoop for some reason can't find the details for a given job id. This REST call should be more flexible since it is designed to return information about many jobs at once. It should just skip over bad IDs and produce as much output as it can. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8881) Receiving json {error:Could not find job job_1415748506143_0002} when web client tries to fetch all jobs from webhcat where HDFS does not have the data.
[ https://issues.apache.org/jira/browse/HIVE-8881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8881: - Attachment: HIVE-8881.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8694) every WebHCat e2e test should specify statusdir parameter
[ https://issues.apache.org/jira/browse/HIVE-8694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14208490#comment-14208490 ] Eugene Koifman commented on HIVE-8694: -- Perhaps a better idea: if statusdir is not explicitly specified, log to some predefined dir (in which case we need some clean-up logic/retention policy). every WebHCat e2e test should specify statusdir parameter - Key: HIVE-8694 URL: https://issues.apache.org/jira/browse/HIVE-8694 Project: Hive Issue Type: Bug Components: Tests, WebHCat Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman e.g. 'statusdir=TestSqoop_:TNUM:' This captures stdout/stderr for job submission and helps diagnose failures. See if it's easy to add something to the test harness to collect all the info in these dirs to make it available after cluster shutdown. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8295) Add batch retrieve partition objects for metastore direct sql
[ https://issues.apache.org/jira/browse/HIVE-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207428#comment-14207428 ] Eugene Koifman commented on HIVE-8295: -- One thing that may be useful (for Oracle) is to make the query like ... a IN (1, ..., 1000) or a IN (1001, ..., 2000) ... it would still avoid the 1000-element limit and run only 1 query. Add batch retrieve partition objects for metastore direct sql -- Key: HIVE-8295 URL: https://issues.apache.org/jira/browse/HIVE-8295 Project: Hive Issue Type: Bug Reporter: Selina Zhang Assignee: Selina Zhang Attachments: HIVE-8295.02.patch, HIVE-8295.02.patch, HIVE-8295.03.patch, HIVE-8295.1.patch Currently in MetastoreDirectSql, partition objects are constructed by fetching partition ids first. However, if the number of partition ids that match the filter is larger than 1000, direct sql will fail with the following stack trace: {code} 2014-09-29 19:30:02,942 DEBUG [pool-1-thread-1] metastore.MetaStoreDirectSql (MetaStoreDirectSql.java:timingTrace(604)) - Direct SQL query in 122.085893ms + 13.048901ms, the query is [select PARTITIONS.PART_ID from PARTITIONS inner join TBLS on PARTITIONS.TBL_ID = TBLS.TBL_ID and TBLS.TBL_NAME = ? inner join DBS on TBLS.DB_ID = DBS.DB_ID and DBS.NAME = ?
inner join PARTITION_KEY_VALS FILTER2 on FILTER2.PART_ID = PARTITIONS.PART_ID and FILTER2.INTEGER_IDX = 2 where ((FILTER2.PART_KEY_VAL = ?))] 2014-09-29 19:30:02,949 ERROR [pool-1-thread-1] metastore.ObjectStore (ObjectStore.java:handleDirectSqlError(2248)) - Direct SQL failed, falling back to ORM javax.jdo.JDODataStoreException: Error executing SQL query select PARTITIONS.PART_ID, SDS.SD_ID, SDS.CD_ID, SERDES.SERDE_ID, PARTITIONS.CREATE_TIME, PARTITIONS.LAST_ACCESS_TIME, SDS.INPUT_FORMAT, SDS.IS_COMPRESSED, SDS.IS_STOREDASSUBDIRECTORIES, SDS.LOCATION, SDS.NUM_BUCKETS, SDS.OUTPUT_FORMAT, SERDES.NAME, SERDES.SLIB from PARTITIONS left outer join SDS on PARTITIONS.SD_ID = SDS.SD_ID left outer join SERDES on SDS.SERDE_ID = SERDES.SERDE_ID where PART_ID in (136,140,143,147,152,156,160,163,167,171,174,180,185,191,196,198,203,208,212,217... ) order by PART_NAME asc. at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:422) at org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:321) at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:331) at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:211) at org.apache.hadoop.hive.metastore.ObjectStore$3.getSqlResult(ObjectStore.java:1920) at org.apache.hadoop.hive.metastore.ObjectStore$3.getSqlResult(ObjectStore.java:1914) at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2213) at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExprInternal(ObjectStore.java:1914) at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByExpr(ObjectStore.java:1887) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at 
java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:98) at com.sun.proxy.$Proxy8.getPartitionsByExpr(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_expr(HiveMetaStore.java:3800) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions_by_expr.getResult(ThriftHiveMetastore.java:9366) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions_by_expr.getResult(ThriftHiveMetastore.java:9350) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:617) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:613) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637) at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:613) at
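The OR-of-IN-lists rewrite suggested in the comment above can be sketched as a small SQL-fragment builder. This is an illustration of the idea only (class name and chunk size are made up, not the MetaStoreDirectSql code): Oracle caps a single IN list at 1000 elements, but chunking into `a IN (...) or a IN (...)` stays under the limit while still issuing one query.

```java
import java.util.List;

// Build "(col in (id1,...,idN) or col in (...) or ...)" with at most
// 'chunk' ids per IN list, to stay under Oracle's 1000-element limit
// while keeping everything in a single SQL statement.
public class InClauseBuilder {
    public static String build(String col, List<Long> ids, int chunk) {
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < ids.size(); i += chunk) {
            if (i > 0) sb.append(" or ");
            sb.append(col).append(" in (");
            int end = Math.min(i + chunk, ids.size());
            for (int j = i; j < end; j++) {
                if (j > i) sb.append(',');
                sb.append(ids.get(j));
            }
            sb.append(')');
        }
        return sb.append(')').toString();
    }
}
```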
[jira] [Commented] (HIVE-8830) hcatalog process don't exit because of non daemon thread
[ https://issues.apache.org/jira/browse/HIVE-8830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14207520#comment-14207520 ] Eugene Koifman commented on HIVE-8830: -- t.setName("HiveClientCache cleaner"); It's a minor thing, but it would be good to add a counter in the ThreadFactory and add it to the name so that it's like "HiveClientCache cleaner-1". If the thread in the pool dies, a new one is created to replace it. With the counter, we'll know about that in thread dumps and logs. +1 pending tests hcatalog process don't exit because of non daemon thread Key: HIVE-8830 URL: https://issues.apache.org/jira/browse/HIVE-8830 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.15.0 Attachments: HIVE-8830.1.patch HiveClientCache has a cleanup thread which is not a daemon. It can cause the hcat client process to hang even after the work is complete. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
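The review suggestion above (a counter in the ThreadFactory) could be sketched like this. The class name is illustrative, not the HiveClientCache code; it combines the counter idea with the daemon flag that HIVE-8830 itself is about.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the suggested factory: threads are named with a per-factory
// counter ("HiveClientCache cleaner-1", "-2", ...) so replacement threads
// are visible in thread dumps, and marked daemon so a leftover cleaner
// thread cannot keep the client JVM from exiting.
public class NamedDaemonThreadFactory implements ThreadFactory {
    private final AtomicInteger count = new AtomicInteger(0);
    private final String prefix;

    public NamedDaemonThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        Thread t = new Thread(r);
        t.setName(prefix + "-" + count.incrementAndGet());
        t.setDaemon(true);  // don't block JVM exit
        return t;
    }
}
```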
[jira] [Commented] (HIVE-7513) Add ROW__ID VirtualColumn
[ https://issues.apache.org/jira/browse/HIVE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14205807#comment-14205807 ] Eugene Koifman commented on HIVE-7513: -- I don't think this needs documentation. At the moment this is an internal implementation detail to support ACID updates. Add ROW__ID VirtualColumn - Key: HIVE-7513 URL: https://issues.apache.org/jira/browse/HIVE-7513 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-7513.10.patch, HIVE-7513.11.patch, HIVE-7513.12.patch, HIVE-7513.13.patch, HIVE-7513.14.patch, HIVE-7513.3.patch, HIVE-7513.4.patch, HIVE-7513.5.patch, HIVE-7513.8.patch, HIVE-7513.9.patch, HIVE-7513.codeOnly.txt In order to support Update/Delete we need to read rowId from AcidInputFormat and pass that along through the operator pipeline (built from the WHERE clause of the SQL Statement) so that it can be written to the delta file by the update/delete (sink) operators. The parser will add this column to the projection list to make sure it's passed along. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8754) Sqoop job submission via WebHCat doesn't properly localize required jdbc jars in secure cluster
[ https://issues.apache.org/jira/browse/HIVE-8754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14200353#comment-14200353 ] Eugene Koifman commented on HIVE-8754: -- This is a WebHCat-only change, specifically around job submission. There are no unit tests that cover this. Sqoop job submission via WebHCat doesn't properly localize required jdbc jars in secure cluster --- Key: HIVE-8754 URL: https://issues.apache.org/jira/browse/HIVE-8754 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Fix For: 0.14.0, 0.15.0 Attachments: HIVE-8754.2.patch, HIVE-8754.patch HIVE-8588 added support for this by copying jdbc jars to lib/ of the localized/exploded Sqoop tar. Unfortunately, in a secure cluster, Dist Cache intentionally sets permissions on exploded tars such that they are not writable. This needs to be fixed; otherwise users would have to modify their sqoop tar to include the relevant jdbc jars, which is burdensome if different DBs are used and may create headaches around licensing issues. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8754) Sqoop job submission via WebHCat doesn't properly localize required jdbc jars in secure cluster
[ https://issues.apache.org/jira/browse/HIVE-8754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8754: - Resolution: Fixed Status: Resolved (was: Patch Available) Thanks [~thejas] for the review. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8754) Sqoop job submission via WebHCat doesn't properly localize required jdbc jars in secure cluster
Eugene Koifman created HIVE-8754: Summary: Sqoop job submission via WebHCat doesn't properly localize required jdbc jars in secure cluster Key: HIVE-8754 URL: https://issues.apache.org/jira/browse/HIVE-8754 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Fix For: 0.14.0, 0.15.0 HIVE-8588 added support for this by copying jdbc jars to lib/ of the localized/exploded Sqoop tar. Unfortunately, in a secure cluster, Dist Cache intentionally sets permissions on exploded tars such that they are not writable. This needs to be fixed; otherwise users would have to modify their sqoop tar to include the relevant jdbc jars, which is burdensome if different DBs are used and may create headaches around licensing issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8754) Sqoop job submission via WebHCat doesn't properly localize required jdbc jars in secure cluster
[ https://issues.apache.org/jira/browse/HIVE-8754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8754: - Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8754) Sqoop job submission via WebHCat doesn't properly localize required jdbc jars in secure cluster
[ https://issues.apache.org/jira/browse/HIVE-8754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8754: - Attachment: HIVE-8754.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8754) Sqoop job submission via WebHCat doesn't properly localize required jdbc jars in secure cluster
[ https://issues.apache.org/jira/browse/HIVE-8754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8754: - Description: HIVE-8588 added support for this by copying jdbc jars to lib/ of the localized/exploded Sqoop tar. Unfortunately, in a secure cluster, Dist Cache intentionally sets permissions on exploded tars such that they are not writable. This needs to be fixed; otherwise users would have to modify their sqoop tar to include the relevant jdbc jars, which is burdensome if different DBs are used and may create headaches around licensing issues. NO PRECOMMIT TESTS was: HIVE-8588 added support for this by copying jdbc jars to lib/ of the localized/exploded Sqoop tar. Unfortunately, in a secure cluster, Dist Cache intentionally sets permissions on exploded tars such that they are not writable. This needs to be fixed; otherwise users would have to modify their sqoop tar to include the relevant jdbc jars, which is burdensome if different DBs are used and may create headaches around licensing issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8711) DB deadlocks not handled in TxnHandler for Postgres, Oracle, and SQLServer
[ https://issues.apache.org/jira/browse/HIVE-8711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14199734#comment-14199734 ] Eugene Koifman commented on HIVE-8711: -- +1 DB deadlocks not handled in TxnHandler for Postgres, Oracle, and SQLServer -- Key: HIVE-8711 URL: https://issues.apache.org/jira/browse/HIVE-8711 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.14.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8711.2.patch, HIVE-8711.patch TxnHandler.detectDeadlock has code to catch deadlocks in MySQL and Derby. But it does not detect a deadlock for Postgres, Oracle, or SQLServer -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8754) Sqoop job submission via WebHCat doesn't properly localize required jdbc jars in secure cluster
[ https://issues.apache.org/jira/browse/HIVE-8754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14199839#comment-14199839 ] Eugene Koifman commented on HIVE-8754: -- [~hagleitn], could we get this into 0.14 please? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8754) Sqoop job submission via WebHCat doesn't properly localize required jdbc jars in secure cluster
[ https://issues.apache.org/jira/browse/HIVE-8754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8754: - Attachment: HIVE-8754.2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8754) Sqoop job submission via WebHCat doesn't properly localize required jdbc jars in secure cluster
[ https://issues.apache.org/jira/browse/HIVE-8754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14199899#comment-14199899 ] Eugene Koifman commented on HIVE-8754: -- done -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8711) DB deadlocks not handled in TxnHandler for Postgres, Oracle, and SQLServer
[ https://issues.apache.org/jira/browse/HIVE-8711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14197323#comment-14197323 ] Eugene Koifman commented on HIVE-8711: -- 1. TxnHandler.detectDeadlock() - I think it makes sense to implement it like getDbTime(), so that it checks only what is expected for the given DB type. For example, if Oracle throws 40001, the current code will assume it's a deadlock. 2. DeadLockCreator - it seems that it can easily guarantee a deadlock by calling updateTxns(conn1), updateLocks(conn2), updateLocks(conn1). It could then be enabled permanently. 3. Nit: why are constants in TxnHandler, such as LOCK_ACQUIRED, 'protected'? They are only used in test code, which is in the same package as TxnHandler. 4. This is not related to this bug, but MetaStoreThread.BooleanPointer has multiple threads reading/writing a boolean variable which is not volatile or AtomicBoolean... this looks like a recipe for trouble. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
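Point 1 of the review above (check only the codes expected for the current DB type) can be sketched as follows. The class, enum, and method names are illustrative, not the actual TxnHandler code; the per-backend deadlock codes are the commonly documented ones (MySQL error 1213, Derby SQLState 40001, Oracle ORA-00060, Postgres SQLState 40P01, SQL Server error 1205).

```java
import java.sql.SQLException;

// Sketch: classify an exception as a deadlock only by the codes the
// *current* backend is known to use, instead of treating, say, SQLState
// 40001 as a deadlock regardless of DB type.
public class DeadlockDetector {
    enum DbType { MYSQL, DERBY, ORACLE, POSTGRES, SQLSERVER }

    public static boolean isDeadlock(DbType db, SQLException e) {
        switch (db) {
            case MYSQL:     return e.getErrorCode() == 1213;      // ER_LOCK_DEADLOCK
            case DERBY:     return "40001".equals(e.getSQLState());
            case ORACLE:    return e.getErrorCode() == 60;        // ORA-00060
            case POSTGRES:  return "40P01".equals(e.getSQLState()); // deadlock_detected
            case SQLSERVER: return e.getErrorCode() == 1205;
            default:        return false;
        }
    }
}
```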
[jira] [Commented] (HIVE-8710) Add more tests for transactional inserts
[ https://issues.apache.org/jira/browse/HIVE-8710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14194892#comment-14194892 ] Eugene Koifman commented on HIVE-8710: -- Could you add a comment to the test to explain why it's needed or a bug fix number that it provides coverage for? Otherwise, +1. Add more tests for transactional inserts Key: HIVE-8710 URL: https://issues.apache.org/jira/browse/HIVE-8710 Project: Hive Issue Type: Improvement Components: Transactions Affects Versions: 0.14.0 Reporter: Alan Gates Assignee: Alan Gates Attachments: HIVE-8710.patch Test cases are needed for inserting the results of a join and reading from a transactional table and inserting into a non-transactional table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Description: This makes DDL commands fail. This was stupidly broken in HIVE-8643 was: This makes DDL commands fail. This was stupidly broken in HIVE-8643 NO PRECOMMIT TESTS DDL operations in WebHCat set proxy user to null in unsecure mode --- Key: HIVE-8685 URL: https://issues.apache.org/jira/browse/HIVE-8685 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Attachments: HIVE-8685.2.patch, HIVE-8685.patch This makes DDL commands fail. This was stupidly broken in HIVE-8643 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Attachment: HIVE-8685.3.patch patch 2 and 3 are the same - just trying to kick off build bot
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Status: Open (was: Patch Available)
[jira] [Created] (HIVE-8694) every WebHCat e2e test should specify statusdir parameter
Eugene Koifman created HIVE-8694: Summary: every WebHCat e2e test should specify statusdir parameter Key: HIVE-8694 URL: https://issues.apache.org/jira/browse/HIVE-8694 Project: Hive Issue Type: Bug Components: Tests, WebHCat Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman e.g. 'statusdir=TestSqoop_:TNUM:' This captures stdout/stderr for job submission and helps diagnose failures. See if it's easy to add something to the test harness to collect all the info in these dirs, to make it available after cluster shutdown. NO PRECOMMIT TESTS
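As a sketch of the suggestion, a toy helper that attaches a per-test statusdir to a submission form (the helper itself is invented for illustration; only the `statusdir` and `user.name` parameter names come from the Templeton REST API):

```python
from urllib.parse import urlencode

def submission_form(test_name, test_num, fields):
    """Build a WebHCat (Templeton) job-submission form body, adding a
    per-test statusdir (mirroring the 'statusdir=TestSqoop_:TNUM:'
    convention) so each run's stdout/stderr land in a predictable dir."""
    fields = dict(fields)  # don't mutate the caller's dict
    fields["statusdir"] = "%s_%d" % (test_name, test_num)
    return urlencode(fields)

# e.g. the 7th run of a hypothetical TestSqoop case:
form = submission_form("TestSqoop", 7, {"user.name": "hcat"})
```

With the statusdir fixed per test, the harness can later fetch `statusdir/stderr` and `statusdir/stdout` to diagnose a failed submission.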
[jira] [Commented] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14193409#comment-14193409 ] Eugene Koifman commented on HIVE-8685: -- The 2 test failures are not related. testNegativeTokenAuth has been failing for many builds now. org.apache.hive.hcatalog.streaming.TestStreaming is failing intermittently; for example, http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1594/testReport/junit/org.apache.hive.hcatalog.streaming/TestStreaming/testRemainingTransactions/ has exactly the same stack trace.
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Resolution: Fixed Fix Version/s: 0.15.0 0.14.0 Status: Resolved (was: Patch Available) Committed to 0.14 and 0.15. Thanks [~thejas] for the review.
[jira] [Created] (HIVE-8685) DDL operations in WebHCat do not set proxy user in unsecure mode
Eugene Koifman created HIVE-8685: Summary: DDL operations in WebHCat do not set proxy user in unsecure mode Key: HIVE-8685 URL: https://issues.apache.org/jira/browse/HIVE-8685 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical This was stupidly broken in HIVE-8643
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat do not set proxy user in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Description: This was stupidly broken in HIVE-8643 was: This was stupidly broken in HIVE-8643
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Description: This makes DDL commands fail This was stupidly broken in HIVE-8643 was: This was stupidly broken in HIVE-8643
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Summary: DDL operations in WebHCat set proxy user to null in unsecure mode (was: DDL operations in WebHCat do not set proxy user in unsecure mode)
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Description: This makes DDL commands fail This was stupidly broken in HIVE-8643 NO PRECOMMIT TESTS was: This makes DDL commands fail This was stupidly broken in HIVE-8643
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Attachment: HIVE-8685.patch [~thejas], could you review, please?
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Attachment: HIVE-8685.2.patch
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Attachment: HIVE-8685.2.patch
[jira] [Updated] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8685: - Attachment: (was: HIVE-8685.2.patch)
[jira] [Commented] (HIVE-8685) DDL operations in WebHCat set proxy user to null in unsecure mode
[ https://issues.apache.org/jira/browse/HIVE-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14192930#comment-14192930 ] Eugene Koifman commented on HIVE-8685: -- [~hagleitn], this is a follow-up to HIVE-8643, which is needed in 0.14 as well.
[jira] [Commented] (HIVE-8643) DDL operations via WebHCat with doAs parameter in secure cluster fail
[ https://issues.apache.org/jira/browse/HIVE-8643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14190653#comment-14190653 ] Eugene Koifman commented on HIVE-8643: -- committed to 0.14 and 0.15 DDL operations via WebHCat with doAs parameter in secure cluster fail - Key: HIVE-8643 URL: https://issues.apache.org/jira/browse/HIVE-8643 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8643.2.patch, HIVE-8643.3.patch, HIVE-8643.patch WebHCat handles DDL commands by forking to 'hcat', i.e. HCatCli. This starts a session. SessionState.start() creates the scratch dir based on the current user name via startSs.createSessionDirs(sessionUGI.getShortUserName()). This UGI is not aware of the doAs param, so the name of the dir always ends up as 'hcat'; but because a delegation token is generated in WebHCat for HDFS access, the owner of the scratch dir is the calling user. Thus, the next time a session is started (because of a new DDL call from a different user), it tries to use the same scratch dir but cannot, as it has 700 permissions set. We need to pass the doAs user into SessionState.
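The HIVE-8643 scratch-dir mismatch can be illustrated with a toy model (the function names and `/tmp/hive/...` paths below are invented for illustration; they are not Hive's actual API):

```python
from typing import Optional, Tuple

def scratch_dir_buggy(session_user: str, doas_user: Optional[str]) -> Tuple[str, str]:
    """(dir name, dir owner) as the pre-fix code effectively produced them:
    the session UGI's short name picks the directory, while the delegation
    token makes the calling (doAs) user the HDFS owner."""
    path = "/tmp/hive/" + session_user   # always 'hcat' under WebHCat
    owner = doas_user or session_user    # HDFS sees the doAs user
    return path, owner

def scratch_dir_fixed(session_user: str, doas_user: Optional[str]) -> Tuple[str, str]:
    """After the fix: the doAs user is passed into the session, so each
    caller gets a dir named for (and owned by) the same user."""
    effective = doas_user or session_user
    return "/tmp/hive/" + effective, effective

# With the bug, 'alice' and 'bob' both land on /tmp/hive/hcat, which the
# first caller owns with mode 700, so the second caller's session fails.
```

The point of the sketch: under the bug, the directory *name* is constant while the *owner* varies per call, which is exactly the collision the 700-permission check turns into a hard failure.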
[jira] [Updated] (HIVE-8643) DDL operations via WebHCat with doAs parameter in secure cluster fail
[ https://issues.apache.org/jira/browse/HIVE-8643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8643: - Resolution: Fixed Fix Version/s: 0.15.0 Status: Resolved (was: Patch Available) Thanks [~thejas] for the review
[jira] [Created] (HIVE-8643) DDL operations via WebHCat with doAs parameter in secure cluster fail
Eugene Koifman created HIVE-8643: Summary: DDL operations via WebHCat with doAs parameter in secure cluster fail Key: HIVE-8643 URL: https://issues.apache.org/jira/browse/HIVE-8643 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical
[jira] [Commented] (HIVE-8643) DDL operations via WebHCat with doAs parameter in secure cluster fail
[ https://issues.apache.org/jira/browse/HIVE-8643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14188098#comment-14188098 ] Eugene Koifman commented on HIVE-8643: -- [~thejas], could you review?
[jira] [Updated] (HIVE-8643) DDL operations via WebHCat with doAs parameter in secure cluster fail
[ https://issues.apache.org/jira/browse/HIVE-8643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8643: - Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-8643) DDL operations via WebHCat with doAs parameter in secure cluster fail
[ https://issues.apache.org/jira/browse/HIVE-8643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8643: - Attachment: HIVE-8643.patch
[jira] [Commented] (HIVE-8643) DDL operations via WebHCat with doAs parameter in secure cluster fail
[ https://issues.apache.org/jira/browse/HIVE-8643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14188557#comment-14188557 ] Eugene Koifman commented on HIVE-8643: -- testNegativeTokenAuth has been failing for the past 57 builds, so it's not related. [~hagleitn], it would be very useful to have this in 0.14. Without it, DDL ops via WebHCat with doAs (Hue, Knox, etc.) will suffer.
[jira] [Commented] (HIVE-8643) DDL operations via WebHCat with doAs parameter in secure cluster fail
[ https://issues.apache.org/jira/browse/HIVE-8643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14188702#comment-14188702 ] Eugene Koifman commented on HIVE-8643: -- This issue was surfaced by HIVE-6847, which ensured that scratch dir permissions are sensible.
[jira] [Commented] (HIVE-8643) DDL operations via WebHCat with doAs parameter in secure cluster fail
[ https://issues.apache.org/jira/browse/HIVE-8643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14188744#comment-14188744 ] Eugene Koifman commented on HIVE-8643: -- I did, but unsuccessfully.
[jira] [Updated] (HIVE-8643) DDL operations via WebHCat with doAs parameter in secure cluster fail
[ https://issues.apache.org/jira/browse/HIVE-8643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8643: - Attachment: HIVE-8643.2.patch HIVE-8643.2.patch addresses [~thejas]'s comments.
[jira] [Updated] (HIVE-8643) DDL operations via WebHCat with doAs parameter in secure cluster fail
[ https://issues.apache.org/jira/browse/HIVE-8643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8643: - Attachment: HIVE-8643.3.patch
[jira] [Commented] (HIVE-8643) DDL operations via WebHCat with doAs parameter in secure cluster fail
[ https://issues.apache.org/jira/browse/HIVE-8643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14189580#comment-14189580 ] Eugene Koifman commented on HIVE-8643: -- testCliDriver_optimize_nullscan fails in other bot runs (i.e. ones without this patch), so it's not related.
[jira] [Updated] (HIVE-8588) sqoop REST endpoint fails to send appropriate JDBC driver to the cluster
[ https://issues.apache.org/jira/browse/HIVE-8588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8588: - Resolution: Fixed Fix Version/s: 0.15.0 0.14.0 Status: Resolved (was: Patch Available) sqoop REST endpoint fails to send appropriate JDBC driver to the cluster Key: HIVE-8588 URL: https://issues.apache.org/jira/browse/HIVE-8588 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Fix For: 0.14.0, 0.15.0 Attachments: HIVE-8588.1.patch, HIVE-8588.2.patch This was originally discovered by [~deepesh]. When running a Sqoop integration test from WebHCat {noformat} curl --show-error -d 'command=export -libjars hdfs:///tmp/mysql-connector-java.jar --connect jdbc:mysql://deepesh-c6-1.cs1cloud.internal/sqooptest --username sqoop --password passwd --export-dir /tmp/templeton_test_data/sqoop --table person' -d statusdir=sqoop.output -X POST 'http://deepesh-c6-1.cs1cloud.internal:50111/templeton/v1/sqoop?user.name=hrt_qa' {noformat} the job is failing with the following error: {noformat} $ hadoop fs -cat /user/hrt_qa/sqoop.output/stderr 14/10/15 23:52:53 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5.2.2.0.0-897 14/10/15 23:52:53 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 14/10/15 23:52:54 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 
14/10/15 23:52:54 INFO tool.CodeGenTool: Beginning code generation 14/10/15 23:52:54 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: Could not load db driver class: com.mysql.jdbc.Driver java.lang.RuntimeException: Could not load db driver class: com.mysql.jdbc.Driver at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:848) at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52) at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:736) at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:759) at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:269) at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:240) at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:226) at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:295) at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1773) at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1578) at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96) at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64) at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100) at org.apache.sqoop.Sqoop.run(Sqoop.java:143) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227) at org.apache.sqoop.Sqoop.main(Sqoop.java:236) {noformat} Note that the Sqoop tar bundle does not contain the JDBC connector jar. I think the problem here may be that the mysql connector jar added to libjars isn't available to the Sqoop tool itself, which first connects to the database through the JDBC driver to collect some table information before running the MR job. libjars only ships the connector jar to the MR job, not to the local client. 
NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.3.4#6332)
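Given the analysis above (the local Sqoop client needs the driver before the MR job starts), one possible workaround sketch, with illustrative paths, is to expose the connector jar to the client-side classpath in addition to passing it via -libjars:

```shell
# Workaround sketch for the description above (paths are illustrative).
# -libjars only ships the connector jar to the MR tasks; the local Sqoop
# client also needs it on its own classpath for the pre-flight JDBC calls.
DRIVER_JAR=/tmp/mysql-connector-java.jar   # placeholder location

# Option 1: add the jar to the client-side Hadoop classpath.
export HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+$HADOOP_CLASSPATH:}$DRIVER_JAR"

# Option 2 (alternative): copy the jar into the unpacked Sqoop lib dir, e.g.
#   cp "$DRIVER_JAR" "$SQOOP_HOME/lib/"

echo "client classpath now includes: $HADOOP_CLASSPATH"
```

This is the manual equivalent of what the patch automates: making the same driver jar visible both to the local Sqoop tool and to the MR tasks.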
[jira] [Commented] (HIVE-8588) sqoop REST endpoint fails to send appropriate JDBC driver to the cluster
[ https://issues.apache.org/jira/browse/HIVE-8588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184229#comment-14184229 ] Eugene Koifman commented on HIVE-8588: -- Patch committed to trunk and 0.14. Thanks [~thejas] for the review. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8588) sqoop REST endpoint fails to send appropriate JDBC driver to the cluster
[ https://issues.apache.org/jira/browse/HIVE-8588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184361#comment-14184361 ] Eugene Koifman commented on HIVE-8588: -- The Sqoop endpoint should have its own section, like https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+Pig for example. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8588) sqoop REST endpoint fails to send appropriate JDBC driver to the cluster
[ https://issues.apache.org/jira/browse/HIVE-8588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8588: - Priority: Critical (was: Major) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8588) sqoop REST endpoint fails to send appropriate JDBC driver to the cluster
[ https://issues.apache.org/jira/browse/HIVE-8588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8588: - Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8588) sqoop REST endpoint fails to send appropriate JDBC driver to the cluster
[ https://issues.apache.org/jira/browse/HIVE-8588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8588: - Attachment: HIVE-8588.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8588) sqoop REST endpoint fails to send appropriate JDBC driver to the cluster
[ https://issues.apache.org/jira/browse/HIVE-8588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14182970#comment-14182970 ] Eugene Koifman commented on HIVE-8588: -- I meant to say in my previous comment that it would be good to get this into 0.14. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8588) sqoop REST endpoint fails to send appropriate JDBC driver to the cluster
[ https://issues.apache.org/jira/browse/HIVE-8588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14182969#comment-14182969 ] Eugene Koifman commented on HIVE-8588: -- [~vikram.dixit] Without this change, submitting Sqoop jobs via WebHCat requires users to modify the Sqoop tar file to include the additional JDBC jars, which is a major usability issue, especially when working with multiple DBs and upgrades. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8588) sqoop REST endpoint fails to send appropriate JDBC driver to the cluster
[ https://issues.apache.org/jira/browse/HIVE-8588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8588: - Attachment: HIVE-8588.2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8588) sqoop REST endpoint fails to send appropriate JDBC driver to the cluster
Eugene Koifman created HIVE-8588: Summary: sqoop REST endpoint fails to send appropriate JDBC driver to the cluster Key: HIVE-8588 URL: https://issues.apache.org/jira/browse/HIVE-8588 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman This is originally discovered by [~deepesh] When running a Sqoop integration test from WebHCat {noformat} curl --show-error -d command=export -libjars hdfs:///tmp/mysql-connector-java.jar --connect jdbc:mysql://deepesh-c6-1.cs1cloud.internal/sqooptest --username sqoop --password passwd --export-dir /tmp/templeton_test_data/sqoop --table person -d statusdir=sqoop.output -X POST http://deepesh-c6-1.cs1cloud.internal:50111/templeton/v1/sqoop?user.name=hrt_qa; {noformat} the job is failing with the following error: {noformat} $ hadoop fs -cat /user/hrt_qa/sqoop.output/stderr 14/10/15 23:52:53 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5.2.2.0.0-897 14/10/15 23:52:53 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 14/10/15 23:52:54 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 
14/10/15 23:52:54 INFO tool.CodeGenTool: Beginning code generation 14/10/15 23:52:54 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: Could not load db driver class: com.mysql.jdbc.Driver java.lang.RuntimeException: Could not load db driver class: com.mysql.jdbc.Driver at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:848) at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52) at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:736) at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:759) at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:269) at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:240) at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:226) at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:295) at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1773) at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1578) at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96) at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64) at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100) at org.apache.sqoop.Sqoop.run(Sqoop.java:143) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227) at org.apache.sqoop.Sqoop.main(Sqoop.java:236) {noformat} Note that the Sqoop tar bundle does not contain the JDBC connector jar. I think the problem here may be that the MySQL connector jar added to libjars isn't available to the Sqoop tool, which first connects to the database through the JDBC driver to collect some table information before running the MR job. libjars will only add the connector jar for the MR job, not the local one. 
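The -libjars behavior described above can be sketched in a few lines. This is an illustration only, with a hypothetical jar path: -libjars distributes the jar to the MR tasks' classpath, while the Sqoop client's local JDBC connection resolves the driver from the client JVM's own classpath, which is commonly extended via HADOOP_CLASSPATH.

```shell
# Hypothetical path, for illustration only: -libjars ships the jar to the
# MR tasks, but the local client-side JDBC connection needs the driver on
# the client classpath as well, e.g. via HADOOP_CLASSPATH.
DRIVER_JAR=/usr/share/java/mysql-connector-java.jar
export HADOOP_CLASSPATH="$DRIVER_JAR"
echo "local client classpath addition: $HADOOP_CLASSPATH"
```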
[jira] [Commented] (HIVE-8588) sqoop REST endpoint fails to send appropriate JDBC driver to the cluster
[ https://issues.apache.org/jira/browse/HIVE-8588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14182419#comment-14182419 ] Eugene Koifman commented on HIVE-8588: -- [~thejas] https://reviews.apache.org/r/27131/ sqoop REST endpoint fails to send appropriate JDBC driver to the cluster Key: HIVE-8588 URL: https://issues.apache.org/jira/browse/HIVE-8588 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman This was originally discovered by [~deepesh]. When running a Sqoop integration test from WebHCat {noformat} curl --show-error -d command=export -libjars hdfs:///tmp/mysql-connector-java.jar --connect jdbc:mysql://deepesh-c6-1.cs1cloud.internal/sqooptest --username sqoop --password passwd --export-dir /tmp/templeton_test_data/sqoop --table person -d statusdir=sqoop.output -X POST http://deepesh-c6-1.cs1cloud.internal:50111/templeton/v1/sqoop?user.name=hrt_qa; {noformat} the job fails with the following error: {noformat} $ hadoop fs -cat /user/hrt_qa/sqoop.output/stderr 14/10/15 23:52:53 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5.2.2.0.0-897 14/10/15 23:52:53 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 14/10/15 23:52:54 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 
14/10/15 23:52:54 INFO tool.CodeGenTool: Beginning code generation 14/10/15 23:52:54 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: Could not load db driver class: com.mysql.jdbc.Driver java.lang.RuntimeException: Could not load db driver class: com.mysql.jdbc.Driver at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:848) at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52) at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:736) at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:759) at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:269) at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:240) at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:226) at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:295) at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1773) at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1578) at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96) at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64) at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100) at org.apache.sqoop.Sqoop.run(Sqoop.java:143) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227) at org.apache.sqoop.Sqoop.main(Sqoop.java:236) {noformat} Note that the Sqoop tar bundle does not contain the JDBC connector jar. I think the problem here may be that the MySQL connector jar added to libjars isn't available to the Sqoop tool, which first connects to the database through the JDBC driver to collect some table information before running the MR job. libjars will only add the connector jar for the MR job, not the local one. 
[jira] [Updated] (HIVE-8588) sqoop REST endpoint fails to send appropriate JDBC driver to the cluster
[ https://issues.apache.org/jira/browse/HIVE-8588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8588: - Description: This was originally discovered by [~deepesh]. When running a Sqoop integration test from WebHCat {noformat} curl --show-error -d command=export -libjars hdfs:///tmp/mysql-connector-java.jar --connect jdbc:mysql://deepesh-c6-1.cs1cloud.internal/sqooptest --username sqoop --password passwd --export-dir /tmp/templeton_test_data/sqoop --table person -d statusdir=sqoop.output -X POST http://deepesh-c6-1.cs1cloud.internal:50111/templeton/v1/sqoop?user.name=hrt_qa; {noformat} the job fails with the following error: {noformat} $ hadoop fs -cat /user/hrt_qa/sqoop.output/stderr 14/10/15 23:52:53 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5.2.2.0.0-897 14/10/15 23:52:53 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 14/10/15 23:52:54 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 
14/10/15 23:52:54 INFO tool.CodeGenTool: Beginning code generation 14/10/15 23:52:54 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: Could not load db driver class: com.mysql.jdbc.Driver java.lang.RuntimeException: Could not load db driver class: com.mysql.jdbc.Driver at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:848) at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52) at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:736) at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:759) at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:269) at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:240) at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:226) at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:295) at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1773) at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1578) at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:96) at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64) at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100) at org.apache.sqoop.Sqoop.run(Sqoop.java:143) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227) at org.apache.sqoop.Sqoop.main(Sqoop.java:236) {noformat} Note that the Sqoop tar bundle does not contain the JDBC connector jar. I think the problem here may be that the MySQL connector jar added to libjars isn't available to the Sqoop tool, which first connects to the database through the JDBC driver to collect some table information before running the MR job. libjars will only add the connector jar for the MR job, not the local one. 
NO PRECOMMIT TESTS was: This is originally discovered by [~deepesh] When running a Sqoop integration test from WebHCat {noformat} curl --show-error -d command=export -libjars hdfs:///tmp/mysql-connector-java.jar --connect jdbc:mysql://deepesh-c6-1.cs1cloud.internal/sqooptest --username sqoop --password passwd --export-dir /tmp/templeton_test_data/sqoop --table person -d statusdir=sqoop.output -X POST http://deepesh-c6-1.cs1cloud.internal:50111/templeton/v1/sqoop?user.name=hrt_qa; {noformat} the job is failing with the following error: {noformat} $ hadoop fs -cat /user/hrt_qa/sqoop.output/stderr 14/10/15 23:52:53 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5.2.2.0.0-897 14/10/15 23:52:53 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 14/10/15 23:52:54 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 14/10/15 23:52:54 INFO tool.CodeGenTool: Beginning code generation 14/10/15 23:52:54 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: Could not load db driver class: com.mysql.jdbc.Driver java.lang.RuntimeException: Could not load db driver class: com.mysql.jdbc.Driver at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:848) at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52) at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:736) at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:759) at org.apache.sqoop.manager.SqlManager.getColumnInfoForRawQuery(SqlManager.java:269) at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:240) at
[jira] [Commented] (HIVE-6940) [WebHCat]Update documentation for Templeton-Sqoop action
[ https://issues.apache.org/jira/browse/HIVE-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14182428#comment-14182428 ] Eugene Koifman commented on HIVE-6940: -- The comment about -libjars above is wrong. When WebHCat is configured to auto-ship the Sqoop tar file, the user/admin may place any necessary JDBC jars into an HDFS directory. Then use the libdir param when making the REST call to supply this directory path. WebHCat will then make sure that the jars from this dir are placed in lib/ of the exploded Sqoop tar on the remote node. [WebHCat]Update documentation for Templeton-Sqoop action Key: HIVE-6940 URL: https://issues.apache.org/jira/browse/HIVE-6940 Project: Hive Issue Type: Bug Components: Documentation, WebHCat Affects Versions: 0.14.0 Reporter: Shuaishuai Nie Labels: TODOC14 WebHCat documentation needs to be updated based on the new feature introduced in HIVE-5072. Here are some examples using the endpoint templeton/v1/sqoop example1: (passing the Sqoop command directly) curl -s -d command=import --connect jdbc:sqlserver://localhost:4033;databaseName=SqoopDB;user=hadoop;password=password --table mytable --target-dir user/hadoop/importtable -d statusdir=sqoop.output 'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop' example2: (passing a source file which contains the Sqoop command) curl -s -d optionsfile=/sqoopcommand/command0.txt -d statusdir=sqoop.output 'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop' example3: (using --options-file in the middle of the Sqoop command to enable reuse of part of the Sqoop command, like the connection string) curl -s -d files=/sqoopcommand/command1.txt,/sqoopcommand/command2.txt -d command=import --options-file command1.txt --options-file command2.txt -d statusdir=sqoop.output 'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop' Also, for users to pass their JDBC driver jar, they can use the -libjars generic option in the Sqoop command. This is functionality provided by Sqoop. 
A set of parameters can be passed to the endpoint: command (Sqoop command string to run) optionsfile (Options file containing the Sqoop command to run; each space-separated section of the Sqoop command should be a single line in the options file) files (Comma-separated files to be copied to the map reduce cluster) statusdir (A directory where WebHCat will write the status of the Sqoop job. If provided, it is the caller’s responsibility to remove this directory when done) callback (Define a URL to be called upon job completion. You may embed a specific job ID into the URL using $jobId. This tag will be replaced in the callback URL with the job’s job ID.) enablelog (When set to true, WebHCat will upload the job log to statusdir. statusdir must be defined when this is enabled) All the above parameters are optional, but users have to provide either command or optionsfile in the call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
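Putting the pieces together, a call combining several of the optional parameters above with the libdir approach from the comment might look like the following sketch. The host, database, and HDFS paths are made up, and the request is echoed rather than sent so the parameter shape is easy to inspect:

```shell
# All hosts, databases, and HDFS paths below are hypothetical.
WEBHCAT='http://localhost:50111/templeton/v1/sqoop'
SQOOP_CMD='export --connect jdbc:mysql://db.example.com/sqooptest --table person --export-dir /tmp/data'
# Build the request as a string and echo it instead of executing it:
REQUEST="curl -s -d command='$SQOOP_CMD' -d libdir=hdfs:///apps/sqoop/jdbc-jars -d statusdir=sqoop.output -d enablelog=true -X POST '$WEBHCAT?user.name=hadoop'"
echo "$REQUEST"
```

Per the comment above, the jars placed in the libdir directory end up in lib/ of the exploded Sqoop tar on the remote node, so the driver does not need to be passed via -libjars.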
[jira] [Commented] (HIVE-6940) [WebHCat]Update documentation for Templeton-Sqoop action
[ https://issues.apache.org/jira/browse/HIVE-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14182429#comment-14182429 ] Eugene Koifman commented on HIVE-6940: -- see the TestSqoop group in https://github.com/apache/hive/blob/trunk/hcatalog/src/test/e2e/templeton/tests/jobsubmission.conf for some examples [WebHCat]Update documentation for Templeton-Sqoop action Key: HIVE-6940 URL: https://issues.apache.org/jira/browse/HIVE-6940 Project: Hive Issue Type: Bug Components: Documentation, WebHCat Affects Versions: 0.14.0 Reporter: Shuaishuai Nie Labels: TODOC14 WebHCat documentation needs to be updated based on the new feature introduced in HIVE-5072. Here are some examples using the endpoint templeton/v1/sqoop example1: (passing the Sqoop command directly) curl -s -d command=import --connect jdbc:sqlserver://localhost:4033;databaseName=SqoopDB;user=hadoop;password=password --table mytable --target-dir user/hadoop/importtable -d statusdir=sqoop.output 'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop' example2: (passing a source file which contains the Sqoop command) curl -s -d optionsfile=/sqoopcommand/command0.txt -d statusdir=sqoop.output 'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop' example3: (using --options-file in the middle of the Sqoop command to enable reuse of part of the Sqoop command, like the connection string) curl -s -d files=/sqoopcommand/command1.txt,/sqoopcommand/command2.txt -d command=import --options-file command1.txt --options-file command2.txt -d statusdir=sqoop.output 'http://localhost:50111/templeton/v1/sqoop?user.name=hadoop' Also, for users to pass their JDBC driver jar, they can use the -libjars generic option in the Sqoop command. This is functionality provided by Sqoop. 
A set of parameters can be passed to the endpoint: command (Sqoop command string to run) optionsfile (Options file containing the Sqoop command to run; each space-separated section of the Sqoop command should be a single line in the options file) files (Comma-separated files to be copied to the map reduce cluster) statusdir (A directory where WebHCat will write the status of the Sqoop job. If provided, it is the caller’s responsibility to remove this directory when done) callback (Define a URL to be called upon job completion. You may embed a specific job ID into the URL using $jobId. This tag will be replaced in the callback URL with the job’s job ID.) enablelog (When set to true, WebHCat will upload the job log to statusdir. statusdir must be defined when this is enabled) All the above parameters are optional, but users have to provide either command or optionsfile in the call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8543) Compactions fail on metastore using postgres
[ https://issues.apache.org/jira/browse/HIVE-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14180161#comment-14180161 ] Eugene Koifman commented on HIVE-8543: -- +1 Compactions fail on metastore using postgres Key: HIVE-8543 URL: https://issues.apache.org/jira/browse/HIVE-8543 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8543.patch The worker fails to update the stats when the metastore is using Postgres as the RDBMS. {code} org.postgresql.util.PSQLException: ERROR: relation tab_col_stats does not exist {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8387) add retry logic to ZooKeeperStorage in WebHCat
[ https://issues.apache.org/jira/browse/HIVE-8387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8387: - Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks [~thejas] for the review. add retry logic to ZooKeeperStorage in WebHCat -- Key: HIVE-8387 URL: https://issues.apache.org/jira/browse/HIVE-8387 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.14.0 Attachments: HIVE-8387.2.patch, HIVE-8387.patch ZK interactions may run into transient errors that should be retried. Currently there is no retry logic in WebHCat for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8387) add retry logic to ZooKeeperStorage in WebHCat
[ https://issues.apache.org/jira/browse/HIVE-8387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8387: - Fix Version/s: (was: 0.14.0) 0.15.0 add retry logic to ZooKeeperStorage in WebHCat -- Key: HIVE-8387 URL: https://issues.apache.org/jira/browse/HIVE-8387 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman Fix For: 0.15.0 Attachments: HIVE-8387.2.patch, HIVE-8387.patch ZK interactions may run into transient errors that should be retried. Currently there is no retry logic in WebHCat for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8387) add retry logic to ZooKeeperStorage in WebHCat
[ https://issues.apache.org/jira/browse/HIVE-8387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14177482#comment-14177482 ] Eugene Koifman commented on HIVE-8387: -- [~vikram.dixit] Please consider this for inclusion into 0.14 branch add retry logic to ZooKeeperStorage in WebHCat -- Key: HIVE-8387 URL: https://issues.apache.org/jira/browse/HIVE-8387 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Fix For: 0.15.0 Attachments: HIVE-8387.2.patch, HIVE-8387.patch ZK interactions may run into transient errors that should be retried. Currently there is no retry logic in WebHCat for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8387) add retry logic to ZooKeeperStorage in WebHCat
[ https://issues.apache.org/jira/browse/HIVE-8387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8387: - Fix Version/s: 0.14.0 add retry logic to ZooKeeperStorage in WebHCat -- Key: HIVE-8387 URL: https://issues.apache.org/jira/browse/HIVE-8387 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.14.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Priority: Critical Fix For: 0.14.0, 0.15.0 Attachments: HIVE-8387.2.patch, HIVE-8387.patch ZK interactions may run into transient errors that should be retried. Currently there is no retry logic in WebHCat for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8387) add retry logic to ZooKeeperStorage in WebHCat
[ https://issues.apache.org/jira/browse/HIVE-8387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8387: - Attachment: HIVE-8387.2.patch HIVE-8387.2.patch contains Curator based implementation add retry logic to ZooKeeperStorage in WebHCat -- Key: HIVE-8387 URL: https://issues.apache.org/jira/browse/HIVE-8387 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-8387.2.patch, HIVE-8387.patch ZK interactions may run into transient errors that should be retried. Currently there is no retry logic in WebHCat for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
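The Curator-based patch itself is not reproduced here, but the shape of the retry idea it addresses (retry transient failures, give up after a bounded number of attempts) can be sketched. The ZooKeeper call is simulated by a stand-in function that fails twice before succeeding; the actual patch reportedly delegates this to Curator's built-in retry policies rather than a hand-rolled loop:

```shell
attempt=0
zk_op() {               # stand-in for a ZooKeeper call: fails twice, then succeeds
  attempt=$((attempt + 1))
  [ "$attempt" -ge 3 ]
}
tries=0
until zk_op; do
  tries=$((tries + 1))
  if [ "$tries" -ge 5 ]; then
    echo "giving up after $tries retries"
    break
  fi
  sleep 0               # a real implementation would back off, e.g. sleep $tries
done
echo "succeeded after $attempt attempts"
```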
[jira] [Commented] (HIVE-8387) add retry logic to ZooKeeperStorage in WebHCat
[ https://issues.apache.org/jira/browse/HIVE-8387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14176174#comment-14176174 ] Eugene Koifman commented on HIVE-8387: -- [~thejas] done add retry logic to ZooKeeperStorage in WebHCat -- Key: HIVE-8387 URL: https://issues.apache.org/jira/browse/HIVE-8387 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-8387.2.patch, HIVE-8387.patch ZK interactions may run into transient errors that should be retried. Currently there is no retry logic in WebHCat for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8387) add retry logic to ZooKeeperStorage in WebHCat
[ https://issues.apache.org/jira/browse/HIVE-8387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8387: - Status: Patch Available (was: Open) add retry logic to ZooKeeperStorage in WebHCat -- Key: HIVE-8387 URL: https://issues.apache.org/jira/browse/HIVE-8387 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-8387.patch ZK interactions may run into transient errors that should be retried. Currently there is no retry logic in WebHCat for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8387) add retry logic to ZooKeeperStorage in WebHCat
[ https://issues.apache.org/jira/browse/HIVE-8387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8387: - Attachment: HIVE-8387.patch [~thejas], [~sushanth] Could one of you review this please add retry logic to ZooKeeperStorage in WebHCat -- Key: HIVE-8387 URL: https://issues.apache.org/jira/browse/HIVE-8387 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-8387.patch ZK interactions may run into transient errors that should be retried. Currently there is no retry logic in WebHCat for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8387) add retry logic to ZooKeeperStorage in WebHCat
[ https://issues.apache.org/jira/browse/HIVE-8387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14172785#comment-14172785 ] Eugene Koifman commented on HIVE-8387: -- https://reviews.apache.org/r/26771/ add retry logic to ZooKeeperStorage in WebHCat -- Key: HIVE-8387 URL: https://issues.apache.org/jira/browse/HIVE-8387 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-8387.patch ZK interactions may run into transient errors that should be retried. Currently there is no retry logic in WebHCat for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8368) compactor is improperly writing delete records in base file
[ https://issues.apache.org/jira/browse/HIVE-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167174#comment-14167174 ] Eugene Koifman commented on HIVE-8368: -- +1 compactor is improperly writing delete records in base file --- Key: HIVE-8368 URL: https://issues.apache.org/jira/browse/HIVE-8368 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.14.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8368.2.patch, HIVE-8368.patch When the compactor reads records from the base and deltas, it is not properly dropping delete records. This leads to oversized base files, and possibly to wrong query results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8258) Compactor cleaners can be starved on a busy table or partition.
[ https://issues.apache.org/jira/browse/HIVE-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167348#comment-14167348 ] Eugene Koifman commented on HIVE-8258: -- getAcidState() has {noformat} if (bestBase != null) { // remove the entries so we don't get confused later and think we should // use them. original.clear(); } else { // Okay, we're going to need these originals. Recurse through them and figure out what we // really need. for (FileStatus origDir : originalDirectories) { findOriginals(fs, origDir, original); } } {noformat} The 'if' part doesn't do anything useful. Also, I think the logic for why this change addresses the race condition is very subtle, so a more detailed comment would be useful. Otherwise, +1. Compactor cleaners can be starved on a busy table or partition. --- Key: HIVE-8258 URL: https://issues.apache.org/jira/browse/HIVE-8258 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.13.1 Reporter: Alan Gates Assignee: Alan Gates Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8258.2.patch, HIVE-8258.3.patch, HIVE-8258.4.patch, HIVE-8258.5.patch, HIVE-8258.6.patch, HIVE-8258.patch Currently the cleaning thread in the compactor does not run on a table or partition while any locks are held on this partition. This leaves it open to starvation in the case of a busy table or partition. It only needs to wait until all locks on the table/partition at the time of the compaction have expired. Any jobs initiated after that (and thus any locks obtained) will be for the new versions of the files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8367) delete writes records in wrong order in some cases
[ https://issues.apache.org/jira/browse/HIVE-8367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165432#comment-14165432 ] Eugene Koifman commented on HIVE-8367: -- +1 pending tests delete writes records in wrong order in some cases -- Key: HIVE-8367 URL: https://issues.apache.org/jira/browse/HIVE-8367 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.14.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Blocker Fix For: 0.14.0 Attachments: HIVE-8367.2.patch, HIVE-8367.patch I have found one query with 10k records where you do: create table insert into table -- 10k records delete from table -- just some records The records in the delete delta are not ordered properly by rowid. I assume this applies to updates as well, but I haven't tested it yet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8368) compactor is improperly writing delete records in base file
[ https://issues.apache.org/jira/browse/HIVE-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165443#comment-14165443 ] Eugene Koifman commented on HIVE-8368: -- Adding for completeness. Before the patch: {noformat} hive> explain delete from concur_orc_tab where age = 20 and age 30; OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-2 depends on stages: Stage-1 Stage-0 depends on stages: Stage-2 Stage-3 depends on stages: Stage-0 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: concur_orc_tab Filter Operator predicate: ((age = 20) and (age 30)) (type: boolean) Select Operator expressions: ROW__ID (type: struct<transactionid:bigint,bucketid:int,rowid:bigint>) outputColumnNames: _col0 Reduce Output Operator key expressions: _col0 (type: struct<transactionid:bigint,bucketid:int,rowid:bigint>) sort order: - Reduce Operator Tree: Select Operator expressions: KEY.reducesinkkey0 (type: struct<transactionid:bigint,bucketid:int,rowid:bigint>) outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Stage: Stage-2 Map Reduce Map Operator Tree: TableScan Reduce Output Operator sort order: Map-reduce partition columns: UDFToInteger(_col0) (type: int) value expressions: _col0 (type: struct<transactionid:bigint,bucketid:int,rowid:bigint>) Reduce Operator Tree: Extract File Output Operator compressed: false table: input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde name: default.concur_orc_tab Stage: Stage-0 Move Operator tables: replace: false table: input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat serde: 
org.apache.hadoop.hive.ql.io.orc.OrcSerde name: default.concur_orc_tab Stage: Stage-3 Stats-Aggr Operator Time taken: 0.697 seconds, Fetched: 62 row(s) {noformat} After the patch: {noformat} hive> explain delete from concur_orc_tab where age = 20 and age 30; OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 Stage-2 depends on stages: Stage-0 STAGE PLANS: Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: concur_orc_tab Filter Operator predicate: ((age = 20) and (age 30)) (type: boolean) Select Operator expressions: ROW__ID (type: struct<transactionid:bigint,bucketid:int,rowid:bigint>) outputColumnNames: _col0 Reduce Output Operator key expressions: _col0 (type: struct<transactionid:bigint,bucketid:int,rowid:bigint>) sort order: + Map-reduce partition columns: UDFToInteger(_col0) (type: int) Reduce Operator Tree: Select Operator expressions: KEY.reducesinkkey0 (type: struct<transactionid:bigint,bucketid:int,rowid:bigint>) outputColumnNames: _col0 File Output Operator compressed: false table: input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde name: default.concur_orc_tab Stage: Stage-0 Move Operator tables: replace: false table: input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde name: default.concur_orc_tab Stage: Stage-2 Stats-Aggr Operator Time taken: 0.538 seconds, Fetched: 45 row(s) {noformat} compactor is improperly writing delete records in base file --- Key: HIVE-8368 URL: https://issues.apache.org/jira/browse/HIVE-8368 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.14.0
[jira] [Commented] (HIVE-8368) compactor is improperly writing delete records in base file
[ https://issues.apache.org/jira/browse/HIVE-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14166238#comment-14166238 ] Eugene Koifman commented on HIVE-8368: -- The 2 plans in previous comment were meant for HIVE-8367. compactor is improperly writing delete records in base file --- Key: HIVE-8368 URL: https://issues.apache.org/jira/browse/HIVE-8368 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.14.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8368.2.patch, HIVE-8368.patch When the compactor reads records from the base and deltas, it is not properly dropping delete records. This leads to oversized base files, and possibly to wrong query results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8367) delete writes records in wrong order in some cases
[ https://issues.apache.org/jira/browse/HIVE-8367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166240#comment-14166240 ] Eugene Koifman commented on HIVE-8367: -- https://issues.apache.org/jira/browse/HIVE-8368?focusedCommentId=14165443&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14165443 has query plans for the delete statement before and after this patch delete writes records in wrong order in some cases -- Key: HIVE-8367 URL: https://issues.apache.org/jira/browse/HIVE-8367 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.14.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Blocker Fix For: 0.14.0 Attachments: HIVE-8367.2.patch, HIVE-8367.patch I have found one query with 10k records where you do: create table insert into table -- 10k records delete from table -- just some records The records in the delete delta are not ordered properly by rowid. I assume this applies to updates as well, but I haven't tested it yet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8367) delete writes records in wrong order in some cases
[ https://issues.apache.org/jira/browse/HIVE-8367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14164298#comment-14164298 ] Eugene Koifman commented on HIVE-8367: -- I think this needs more info. What was the original query where the issue showed up? What precisely was the problem, and how does the RS deduplication change help? The explanation for the latter would be useful to add to the code where this setting is set. How are the changes to the sort order of ROW__ID related? delete writes records in wrong order in some cases -- Key: HIVE-8367 URL: https://issues.apache.org/jira/browse/HIVE-8367 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.14.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Blocker Fix For: 0.14.0 Attachments: HIVE-8367.patch I have found one query with 10k records where you do: create table insert into table -- 10k records delete from table -- just some records The records in the delete delta are not ordered properly by rowid. I assume this applies to updates as well, but I haven't tested it yet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8367) delete writes records in wrong order in some cases
[ https://issues.apache.org/jira/browse/HIVE-8367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14164303#comment-14164303 ] Eugene Koifman commented on HIVE-8367: -- also, the ReduceSinkDeDuplication.java change is not needed delete writes records in wrong order in some cases -- Key: HIVE-8367 URL: https://issues.apache.org/jira/browse/HIVE-8367 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.14.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Blocker Fix For: 0.14.0 Attachments: HIVE-8367.patch I have found one query with 10k records where you do: create table insert into table -- 10k records delete from table -- just some records The records in the delete delta are not ordered properly by rowid. I assume this applies to updates as well, but I haven't tested it yet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8387) add retry logic to ZooKeeperStorage in WebHCat
Eugene Koifman created HIVE-8387: Summary: add retry logic to ZooKeeperStorage in WebHCat Key: HIVE-8387 URL: https://issues.apache.org/jira/browse/HIVE-8387 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.13.1 Reporter: Eugene Koifman Assignee: Eugene Koifman ZK interactions may run into transient errors that should be retried. Currently there is no retry logic in WebHCat for this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
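The kind of wrapper the ticket asks for can be sketched as a generic retry-with-backoff helper. This is a minimal, self-contained illustration, not WebHCat's actual ZooKeeperStorage code: the class name, attempt count, and backoff values are all hypothetical, and a real implementation would retry only on ZooKeeper's transient exception types (e.g. connection loss) rather than on every Exception.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of retry logic for transient ZK errors (not WebHCat's API).
public class ZkRetry {
    public static <T> T withRetry(Callable<T> op, int maxAttempts, long baseSleepMs)
            throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();             // the ZK interaction being protected
            } catch (Exception e) {
                last = e;                     // remember the most recent failure
                if (attempt < maxAttempts) {
                    // Exponential backoff: base, 2*base, 4*base, ...
                    Thread.sleep(baseSleepMs << (attempt - 1));
                }
            }
        }
        throw last;                           // all attempts exhausted
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger calls = new AtomicInteger();
        // Simulated transient error: fails twice, then succeeds.
        String result = withRetry(() -> {
            if (calls.incrementAndGet() < 3) throw new RuntimeException("transient ZK error");
            return "ok";
        }, 5, 10);
        System.out.println(result + " after " + calls.get() + " attempts");
        // prints "ok after 3 attempts"
    }
}
```

In practice a Curator RetryPolicy (such as ExponentialBackoffRetry) would likely be preferable to hand-rolled loops, since Curator already distinguishes retryable ZK errors from fatal ones.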
[jira] [Created] (HIVE-8323) HIVE-8290 added transactional tbl property. Need to ensure it can only be set on tables using AcidOutputFormat
Eugene Koifman created HIVE-8323: Summary: HIVE-8290 added transactional tbl property. Need to ensure it can only be set on tables using AcidOutputFormat Key: HIVE-8323 URL: https://issues.apache.org/jira/browse/HIVE-8323 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0 Reporter: Eugene Koifman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8323) Ensure transactional tbl property can only be set on tables using AcidOutputFormat
[ https://issues.apache.org/jira/browse/HIVE-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8323: - Summary: Ensure transactional tbl property can only be set on tables using AcidOutputFormat (was: HIVE-8290 added transactional tbl property. Need to ensure it can only be set on tables using AcidOutputFormat) Ensure transactional tbl property can only be set on tables using AcidOutputFormat Key: HIVE-8323 URL: https://issues.apache.org/jira/browse/HIVE-8323 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0 Reporter: Eugene Koifman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8290) With DbTxnManager configured, all ORC tables forced to be transactional
[ https://issues.apache.org/jira/browse/HIVE-8290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153450#comment-14153450 ] Eugene Koifman commented on HIVE-8290: -- There is an unused import of hive_metastoreConstants. Also, could you add a comment on ACID_TABLE_PROPERTY, basically the equivalent of the Description of this Jira ticket? This is minor, but would it make sense to move the constant to AcidInputFormat or some other more directly ACID-related class? Otherwise, LGTM +1. With DbTxnManager configured, all ORC tables forced to be transactional --- Key: HIVE-8290 URL: https://issues.apache.org/jira/browse/HIVE-8290 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.14.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Blocker Fix For: 0.14.0 Attachments: HIVE-8290.2.patch, HIVE-8290.patch Currently, once a user configures DbTxnManager to the be transaction manager, all tables that use ORC are expected to be transactional. This means they all have to have buckets. This most likely won't be what users want. We need to add a specific mark to a table so that users can indicate it should be treated in a transactional way. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8258) Compactor cleaners can be starved on a busy table or partition.
[ https://issues.apache.org/jira/browse/HIVE-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153690#comment-14153690 ] Eugene Koifman commented on HIVE-8258: -- Review Comments: 1. TestCleaner has 3 unused imports 2. Cleaner: comment at lines 74-78: I think it would be good to elaborate on why this works. Something like: any readers that acquired new locks on the same partition will not read the files we are trying to delete, since those files will have been merged into other deltas/base by compaction, and AcidUtils.getAcidState() has the logic to do that. 3. Cleaner line 86: {noformat}if (!compactId2LockMap.containsKey(ci.id)) {{noformat} - I don't think this is the right map to use here 4. The cleaner may be in removeFiles() doing fs.delete() while some reader is calling AcidUtils.getAcidState() at exactly the same time. Is this a race condition that can cause problems? You get a list of files in getAcidState(), but by the time you query the metadata about one such file it has been deleted by the cleaner. Is the FS flexible enough for this? Compactor cleaners can be starved on a busy table or partition. --- Key: HIVE-8258 URL: https://issues.apache.org/jira/browse/HIVE-8258 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.13.1 Reporter: Alan Gates Assignee: Alan Gates Priority: Critical Attachments: HIVE-8258.patch Currently the cleaning thread in the compactor does not run on a table or partition while any locks are held on this partition. This leaves it open to starvation in the case of a busy table or partition. It only needs to wait until all locks on the table/partition at the time of the compaction have expired. Any jobs initiated after that (and thus any locks obtained) will be for the new versions of the files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
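Review point 4 describes a classic list-then-stat race. One way a reader can tolerate it is to treat "file vanished between listing and stat" as a skippable condition rather than an error. The sketch below illustrates the idea using java.nio rather than Hadoop's FileSystem API; the class and method names are hypothetical and this is not Hive's actual AcidUtils logic.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch (not Hive's AcidUtils): a reader listing delta files
// must tolerate the Cleaner deleting a file between the listing and the stat.
public class TolerantListing {
    public static List<String> liveFiles(Path dir) throws IOException {
        List<String> live = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path p : entries) {
                try {
                    Files.size(p);            // stat; may race with a concurrent delete
                    live.add(p.getFileName().toString());
                } catch (NoSuchFileException gone) {
                    // File vanished after the listing: skip it instead of failing the read.
                }
            }
        }
        return live;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("deltas");
        Files.createFile(dir.resolve("delta_0000001_0000002"));
        System.out.println(liveFiles(dir));
        // prints "[delta_0000001_0000002]"
    }
}
```

Whether HDFS raises the equivalent FileNotFoundException at list time or at open time is exactly the question the review comment poses; the sketch only shows the defensive pattern on the reader's side.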
[jira] [Commented] (HIVE-8311) Driver is encoding transaction information too late
[ https://issues.apache.org/jira/browse/HIVE-8311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153803#comment-14153803 ] Eugene Koifman commented on HIVE-8311: -- Previously, locks were acquired first, then the valid transaction list was computed. Now the order is reversed. So it's now possible that the txn list is computed, then a new txn (on a resource in this TX) runs, then locks are acquired. Strictly speaking, I think this is a race condition - for example, a compactor run may sneak in here. Does this seem like an issue? Does Hive cache query plans? If so, they will need to be invalidated when the valid txn list changes. Driver is encoding transaction information too late --- Key: HIVE-8311 URL: https://issues.apache.org/jira/browse/HIVE-8311 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.14.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Blocker Fix For: 0.14.0 Attachments: HIVE-8311.patch Currently Driver is obtaining the transaction information and encoding it in the conf in runInternal. But this is too late, as the query has already been planned. Either we need to change the plan when this info is obtained or we need to obtain it at compile time. This bug was introduced by HIVE-8203. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8311) Driver is encoding transaction information too late
[ https://issues.apache.org/jira/browse/HIVE-8311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154224#comment-14154224 ] Eugene Koifman commented on HIVE-8311: -- You are right. +1 Driver is encoding transaction information too late --- Key: HIVE-8311 URL: https://issues.apache.org/jira/browse/HIVE-8311 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.14.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Blocker Fix For: 0.14.0 Attachments: HIVE-8311.patch Currently Driver is obtaining the transaction information and encoding it in the conf in runInternal. But this is too late, as the query has already been planned. Either we need to change the plan when this info is obtained or we need to obtain it at compile time. This bug was introduced by HIVE-8203. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7817) distinct/group by don't work on partition columns
[ https://issues.apache.org/jira/browse/HIVE-7817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14152155#comment-14152155 ] Eugene Koifman commented on HIVE-7817: -- I don't know if this ever worked. I ran into this by accident while trying to test something else in 0.14. distinct/group by don't work on partition columns - Key: HIVE-7817 URL: https://issues.apache.org/jira/browse/HIVE-7817 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.14.0 Reporter: Eugene Koifman suppose you have a table like this: {code:sql} CREATE TABLE page_view( viewTime INT, userid BIGINT, page_url STRING, referrer_url STRING, ip STRING COMMENT 'IP Address of the User') COMMENT 'This is the page view table' PARTITIONED BY(dt STRING, country STRING) CLUSTERED BY(userid) INTO 4 BUCKETS {code} Then {code:sql} select distinct dt from page_view; select distinct dt, country from page_view; select dt, country from page_view group by dt, country; {code} all fail with {noformat} Query ID = ekoifman_20140820172626_b03ba819-c111-433f-a3fc-453c7d5a3e86 Total jobs = 1 Launching Job 1 out of 1 Number of reduce tasks not specified. Estimated from input data size: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapreduce.job.reduces=number Job running in-process (local Hadoop) Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0 2014-08-20 17:26:13,018 Stage-1 map = 0%, reduce = 0% Ended Job = job_local165359429_0013 with errors Error during job, obtaining debugging information... 
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask MapReduce Jobs Launched: Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 FAIL Total MapReduce CPU Time Spent: 0 msec {noformat} but {code:sql} select dt, country, count(*) from page_view group by dt, country; {code} works fine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8301) Update committer list
Eugene Koifman created HIVE-8301: Summary: Update committer list Key: HIVE-8301 URL: https://issues.apache.org/jira/browse/HIVE-8301 Project: Hive Issue Type: Improvement Components: Documentation Reporter: Eugene Koifman -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8301) Update committer list
[ https://issues.apache.org/jira/browse/HIVE-8301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8301: - Attachment: HIVE-8301.patch Update committer list - Key: HIVE-8301 URL: https://issues.apache.org/jira/browse/HIVE-8301 Project: Hive Issue Type: Improvement Components: Documentation Reporter: Eugene Koifman Attachments: HIVE-8301.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8301) Update committer list
[ https://issues.apache.org/jira/browse/HIVE-8301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-8301: - Description: NO PRECOMMIT TESTS add myself to committer list Update committer list - Key: HIVE-8301 URL: https://issues.apache.org/jira/browse/HIVE-8301 Project: Hive Issue Type: Improvement Components: Documentation Reporter: Eugene Koifman Attachments: HIVE-8301.patch NO PRECOMMIT TESTS add myself to committer list -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8203) ACID operations result in NPE when run through HS2
[ https://issues.apache.org/jira/browse/HIVE-8203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148504#comment-14148504 ] Eugene Koifman commented on HIVE-8203: -- +1 pending tests ACID operations result in NPE when run through HS2 -- Key: HIVE-8203 URL: https://issues.apache.org/jira/browse/HIVE-8203 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.14.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8203.2.patch, HIVE-8203.patch When accessing Hive via HS2, any operation requiring the DbTxnManager results in an NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8244) INSERT/UPDATE/DELETE should return count of rows affected
Eugene Koifman created HIVE-8244: Summary: INSERT/UPDATE/DELETE should return count of rows affected Key: HIVE-8244 URL: https://issues.apache.org/jira/browse/HIVE-8244 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.14.0 Reporter: Eugene Koifman It's common in SQL and the JDBC [API|http://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#executeUpdate(java.lang.String)] to return the count of affected rows. Hive should do the same (it does not as of 9/24/2014). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8247) Pig cursor written to Hive via HCat doesn't NULL-fill missing columns
Eugene Koifman created HIVE-8247: Summary: Pig cursor written to Hive via HCat doesn't NULL-fill missing columns Key: HIVE-8247 URL: https://issues.apache.org/jira/browse/HIVE-8247 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.13.1 Reporter: Eugene Koifman This started out as BUG-15650 but in BUG-15650 it's no longer clear what the real issue is so I'm filing a new ticket. Suppose a Hive table has columns (a,b,c,d) If a Pig script writing to this table produces schema (a,b,c) it works: 'd' will be NULL. If a Pig script writing to this table produces schema (a,b,d) it fails with error below. This is an old issue. There is nothing in HCatalog documentation that indicates whether this should work. {noformat} Running org.apache.hive.hcatalog.pig.TestOrcHCatStorer Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 30.113 sec FAILURE! - in org.apache.hive.hcatalog.pig.TestOrcHCatStorer partialSchemaSepcification(org.apache.hive.hcatalog.pig.TestOrcHCatStorer) Time elapsed: 29.886 sec ERROR! 
org.apache.pig.impl.logicalLayer.FrontendException: Unable to store alias ABD at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1635) at org.apache.pig.PigServer.registerQuery(PigServer.java:575) at org.apache.hive.hcatalog.mapreduce.HCatBaseTest.logAndRegister(HCatBaseTest.java:92) at org.apache.hive.hcatalog.pig.TestHCatStorer.partialSchemaSepcification(TestHCatStorer.java:1035) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:254) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:149) at 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103) Caused by: org.apache.pig.impl.plan.VisitorException: line 7, column 0 Output Location Validation Failed for: 'T More info to follow: org.apache.hive.hcatalog.common.HCatException : 2007 : Invalid column position in partition schema : Expected column c at position 3, found column d at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:75) at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:66) at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64) at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66) at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66) at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66) at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66) at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53) at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52) at