[jira] [Updated] (HIVE-4577) hive CLI can't handle hadoop dfs command with space and quotes.
[ https://issues.apache.org/jira/browse/HIVE-4577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Li updated HIVE-4577:
--------------------------
Affects Version/s: 0.10.0

hive CLI can't handle hadoop dfs command with space and quotes.

Key: HIVE-4577
URL: https://issues.apache.org/jira/browse/HIVE-4577
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.9.0, 0.10.0
Reporter: Bing Li
Assignee: Bing Li
Attachments: HIVE-4577.1.patch, HIVE-4577.2.patch

By design, Hive supports hadoop dfs commands in the hive shell, e.g.

hive> dfs -mkdir /user/biadmin/mydir;

but it behaves differently from hadoop when the path contains spaces or quotes:

hive> dfs -mkdir hello;
drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:40 /user/biadmin/hello

hive> dfs -mkdir 'world';
drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:43 /user/biadmin/'world'

hive> dfs -mkdir 'bei jing';
drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:44 /user/biadmin/bei
drwxr-xr-x - biadmin supergroup 0 2013-04-23 09:44 /user/biadmin/jing

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
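The attached patches are not reproduced in this digest, but the behavior above is what you get when a command line is split on whitespace without looking at quotes. As an illustration only (not the HIVE-4577 patch; the class and method names here are made up), a minimal quote-aware splitter would keep 'bei jing' as one argument and strip the surrounding quotes instead of passing them to HDFS:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative sketch, not the actual HIVE-4577 fix: a quote-aware
 * tokenizer for dfs command lines. 'bei jing' stays a single argument,
 * and the quote characters themselves are not forwarded to HDFS.
 */
public class DfsCommandSplitter {
    public static List<String> split(String cmd) {
        List<String> tokens = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        char quote = 0;            // 0 = not inside a quoted region
        boolean inToken = false;
        for (char c : cmd.toCharArray()) {
            if (quote != 0) {                    // inside quotes: copy until the matching quote
                if (c == quote) quote = 0;
                else cur.append(c);
            } else if (c == '\'' || c == '"') {  // open quote; do not copy it
                quote = c;
                inToken = true;
            } else if (Character.isWhitespace(c)) {
                if (inToken) { tokens.add(cur.toString()); cur.setLength(0); inToken = false; }
            } else {
                cur.append(c);
                inToken = true;
            }
        }
        if (inToken) tokens.add(cur.toString());
        return tokens;
    }
}
```

With this splitting, `dfs -mkdir 'bei jing';` would yield the two tokens `-mkdir` and `bei jing`, matching what `hadoop fs` does with the same quoting.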
[jira] [Created] (HIVE-4588) Support session level hooks for HiveServer2
Prasad Mujumdar created HIVE-4588:
-------------------------------------
Summary: Support session level hooks for HiveServer2
Key: HIVE-4588
URL: https://issues.apache.org/jira/browse/HIVE-4588
Project: Hive
Issue Type: Improvement
Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar
Fix For: 0.12.0

Support session level hooks for HiveServer2. The configured hooks will be executed at the beginning of each new session. This is useful for auditing connections, possibly tuning session level properties, etc.
Review Request: HIVE-4588: Support session level hooks for HiveServer2
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11326/
---

Review request for hive.

Description
-----------

Support session level hooks for HiveServer2
- New config parameter to define the hook
- New hook context interface to pass the session user and config to the hook implementation
- Session manager executes the configured hooks when a new session starts

This addresses bug HIVE-4588.
https://issues.apache.org/jira/browse/HIVE-4588

Diffs
-----

common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 335af45
service/src/java/org/apache/hive/service/cli/session/HiveSessionHook.java PRE-CREATION
service/src/java/org/apache/hive/service/cli/session/HiveSessionHookContext.java PRE-CREATION
service/src/java/org/apache/hive/service/cli/session/HiveSessionHookContextImpl.java PRE-CREATION
service/src/java/org/apache/hive/service/cli/session/SessionManager.java 3bb6807
service/src/test/org/apache/hive/service/cli/session/TestSessionHooks.java PRE-CREATION

Diff: https://reviews.apache.org/r/11326/diff/

Testing
-------

Added new test for session hooks

Thanks,
Prasad Mujumdar
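The review describes a hook interface plus a context carrying the session user and config. The real interfaces live in the patch (HiveSessionHook, HiveSessionHookContext); the self-contained sketch below only models that shape, so the signatures and the sample hook are assumptions, not code from the patch:

```java
import java.util.Map;

// Sketch of the hook shape described in the review request. The actual
// HiveSessionHook / HiveSessionHookContext interfaces are defined in the
// patch; these minimal stand-ins are assumptions for illustration.
interface SessionHookContext {
    String getSessionUser();
    Map<String, String> getSessionConf();
}

interface SessionHook {
    void run(SessionHookContext ctx) throws Exception;  // invoked once per new session
}

/** Example hook: audit the connecting user and tune a session property. */
class AuditAndTuneHook implements SessionHook {
    final StringBuilder auditLog = new StringBuilder();

    @Override
    public void run(SessionHookContext ctx) {
        // Auditing use case from the issue: record who opened the session.
        auditLog.append("session opened by ").append(ctx.getSessionUser()).append('\n');
        // Tuning use case: adjust a per-session property.
        ctx.getSessionConf().put("hive.exec.parallel", "true");
    }
}
```

The session manager would then iterate over the configured hook class names and invoke run() for each when a new session starts.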
[jira] [Updated] (HIVE-4588) Support session level hooks for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasad Mujumdar updated HIVE-4588:
----------------------------------
Attachment: HIVE-4588-1.patch
[jira] [Updated] (HIVE-4588) Support session level hooks for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-4588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasad Mujumdar updated HIVE-4588:
----------------------------------
Status: Patch Available (was: Open)

Review request on https://reviews.apache.org/r/11326/
[jira] [Created] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
Bing Li created HIVE-4589:
-----------------------------
Summary: Hive Load command failed when inpath contains space or any restricted characters
Key: HIVE-4589
URL: https://issues.apache.org/jira/browse/HIVE-4589
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.9.0, 0.10.0
Reporter: Bing Li
Assignee: Bing Li

0) Create a simple text file with some strings; see the attached uk.cities.
1) Create a directory in Hadoop whose name contains a space:
hadoop fs -mkdir '/testdir/bri tain/'
hadoop fs -copyFromLocal /tmp/uk.cities '/testdir/bri tain/uk.cities'
2) create table partspace (city string) partitioned by (country string) row format delimited fields terminated by '$' stored as textfile;
3) load data inpath '/testdir/bri tain/uk.cities' into table partspace partition (country='britain');

Then I got a message like:
Load failed with message Wrong file format. Please check the file's format.
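A plausible trigger for this class of failure (the actual root cause is in the patch, which is not shown here) is that a path containing a raw space misparses when treated as a URI. As an illustration only, Java's multi-argument URI constructor percent-encodes such characters so the full path survives:

```java
import java.net.URI;
import java.net.URISyntaxException;

/**
 * Illustration only, not the HIVE-4589 patch: a path with a space must be
 * percent-encoded before being handled as a URI, otherwise everything
 * after the space can be dropped or misparsed.
 */
public class PathEncodingDemo {
    public static String encodeHdfsPath(String scheme, String authority, String path) {
        try {
            // The component-wise URI constructor quotes characters that are
            // illegal in the path, such as the space in 'bri tain'.
            return new URI(scheme, authority, path, null, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

For example, `/testdir/bri tain/uk.cities` becomes `/testdir/bri%20tain/uk.cities` in the resulting URI string. (The authority `nn:8020` in the test is a made-up namenode address.)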
[jira] [Updated] (HIVE-4450) Extend Vector Aggregates to support GROUP BY
[ https://issues.apache.org/jira/browse/HIVE-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Remus Rusanu updated HIVE-4450:
-------------------------------
Attachment: HIVE-4450-p1.patch.txt

Added missing file.

Extend Vector Aggregates to support GROUP BY

Key: HIVE-4450
URL: https://issues.apache.org/jira/browse/HIVE-4450
Project: Hive
Issue Type: Sub-task
Components: Query Processor
Affects Versions: vectorization-branch
Reporter: Remus Rusanu
Assignee: Remus Rusanu
Labels: features
Fix For: vectorization-branch
Attachments: HIVE-4450-p1.patch.txt, HIVE-4450-p1.patch.txt, HIVE-4450-p1.patch.txt

Extend the VectorGroupByOperator and the VectorUDAF aggregates to support group by.
[jira] [Updated] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Li updated HIVE-4589:
--------------------------
Status: Patch Available (was: In Progress)
[jira] [Work started] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-4589 started by Bing Li.
[jira] [Updated] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Li updated HIVE-4589:
--------------------------
Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Li updated HIVE-4589:
--------------------------
Attachment: HIVE-4589.patch
[jira] [Commented] (HIVE-4570) More information to user on GetOperationStatus in Hive Server2 when query is still executing
[ https://issues.apache.org/jira/browse/HIVE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663978#comment-13663978 ]

Jaideep Dhok commented on HIVE-4570:
------------------------------------
Please give your feedback on the changes below.

- The current API, GetOperationState, is not enough since it returns only a state enum. Instead of changing it, we can add a new API, GetOperationProgress(), which will return both OperationState and OperationProgress.
- The Driver maintains a list of running and runnable tasks, although that info is not exposed outside; it's kept locally in the driver's execute method. We can add Driver.getTaskProgressList() to return task progress reports on all tasks (both running and runnable).

Proposed changes:

{noformat}
// new method in TCLIService.thrift
OperationProgress GetOperationProgress(OperationHandle)

// new types - OperationProgress, TaskProgress and MapRedTaskProgress

1. OperationProgress:
class OperationProgress {
  OperationState getOperationState();
  List<TaskProgress> getTaskProgress();
}

2. class TaskProgress {
  public float getProgress() { return 0; }
  public String getTaskID() { return "N/A"; }
}

3. class MapRedTaskProgress extends TaskProgress {
  public float mapProgress();
  public float reduceProgress();
  public String getTaskID() { return runningJob.getID().toString(); }
}

4. New method in Task:
public TaskProgress getTaskProgress() {
  return new TaskProgress(); // defaults to 0 progress
}

5. Override getTaskProgress in MapRedTask to return a MapRedTaskProgress
{noformat}

More information to user on GetOperationStatus in Hive Server2 when query is still executing

Key: HIVE-4570
URL: https://issues.apache.org/jira/browse/HIVE-4570
Project: Hive
Issue Type: Improvement
Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok

Currently in Hive Server2, when the query is still executing, only the status is set as STILL_EXECUTING. This issue is to give more information to the user, such as progress and running job handles, if possible.
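The proposal above can be modeled as plain Java to make the shapes concrete. This is a sketch of the proposed types only (the class names come from the comment; the bodies, the constructor arguments, and the map/reduce rollup are illustrative assumptions, not Hive source):

```java
import java.util.List;

// Sketch of the types proposed in the HIVE-4570 comment; bodies are
// illustrative assumptions, not code from Hive.
enum OperationState { INITIALIZED, RUNNING, FINISHED, ERROR }

class TaskProgress {
    public float getProgress() { return 0f; }      // default: no progress info
    public String getTaskID()  { return "N/A"; }
}

class MapRedTaskProgress extends TaskProgress {
    private final String jobId;
    private final float map, reduce;
    MapRedTaskProgress(String jobId, float map, float reduce) {
        this.jobId = jobId; this.map = map; this.reduce = reduce;
    }
    public float mapProgress()    { return map; }
    public float reduceProgress() { return reduce; }
    @Override public String getTaskID()   { return jobId; }
    @Override public float getProgress()  { return (map + reduce) / 2f; } // illustrative rollup
}

class OperationProgress {
    private final OperationState state;
    private final List<TaskProgress> tasks;
    OperationProgress(OperationState state, List<TaskProgress> tasks) {
        this.state = state; this.tasks = tasks;
    }
    public OperationState getOperationState() { return state; }
    public List<TaskProgress> getTaskProgress() { return tasks; }
}
```

A GetOperationProgress() call would then return one OperationProgress whose task list mixes plain TaskProgress entries (runnable tasks) and MapRedTaskProgress entries (tasks with running MapReduce jobs).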
[jira] [Updated] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Li updated HIVE-4589:
--------------------------
Attachment: (was: HIVE-4589.patch)
[jira] [Updated] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Li updated HIVE-4589:
--------------------------
Attachment: HIVE-4589.patch
[jira] [Updated] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Li updated HIVE-4589:
--------------------------
Attachment: (was: HIVE-4589.patch)
[jira] [Updated] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Li updated HIVE-4589:
--------------------------
Attachment: HIVE-4589.patch
[jira] [Commented] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664032#comment-13664032 ]

Bing Li commented on HIVE-4589:
-------------------------------
In order to run this test case (-Dtestcase=TestCliDriver -Dqfile=load_fs3.q), you should apply the patch for HIVE-4577 first.
[jira] [Updated] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bing Li updated HIVE-4589:
--------------------------
Status: Patch Available (was: Open)

I added a new test case for this defect. In order to run it (-Dtestcase=TestCliDriver -Dqfile=load_fs3.q), you should apply the patch for HIVE-4577 first.
[jira] [Updated] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jaideep Dhok updated HIVE-4569:
-------------------------------
Affects Version/s: (was: 0.11.0)
Status: Patch Available (was: In Progress)

Added a new thrift API, GetQueryPlan, to return the query plan of a SQL query.

GetQueryPlan api in Hive Server2

Key: HIVE-4569
URL: https://issues.apache.org/jira/browse/HIVE-4569
Project: Hive
Issue Type: Bug
Components: HiveServer2
Reporter: Amareshwari Sriramadasu
Assignee: Jaideep Dhok

It would be nice to have GetQueryPlan as a thrift api. I do not see a GetQueryPlan api available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API contains it; not sure why it was not added.
[jira] [Commented] (HIVE-4589) Hive Load command failed when inpath contains space or any restricted characters
[ https://issues.apache.org/jira/browse/HIVE-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664095#comment-13664095 ]

Xuefu Zhang commented on HIVE-4589:
-----------------------------------
It looks like a dupe of HIVE-4554.
[jira] [Updated] (HIVE-4472) OR, NOT Filter logic can lose an array, and always takes time O(VectorizedRowBatch.DEFAULT_SIZE)
[ https://issues.apache.org/jira/browse/HIVE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HIVE-4472:
---------------------------------------
Attachment: HIVE-4472.5.patch

Same patch as the previous one, except that the fix to TestConstantVectorExpression is removed because that is taken care of by HIVE-4553.

OR, NOT Filter logic can lose an array, and always takes time O(VectorizedRowBatch.DEFAULT_SIZE)

Key: HIVE-4472
URL: https://issues.apache.org/jira/browse/HIVE-4472
Project: Hive
Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Jitendra Nath Pandey
Attachments: HIVE-4472.1.patch, HIVE-4472.2.patch, HIVE-4472.3.patch, HIVE-4472.4.patch, HIVE-4472.5.patch

The issue is in the files FilterExprOrExpr.java and FilterNotExpr.java. I posted a review for you at https://reviews.apache.org/r/10752/. I think there is a bug related to sharing of an array of integers. Also, one algorithm step always takes O(DEFAULT_BATCH_SIZE) time. If n < DEFAULT_BATCH_SIZE then this is a performance issue.
[jira] [Commented] (HIVE-4579) Create a SARG interface for RecordReaders
[ https://issues.apache.org/jira/browse/HIVE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664293#comment-13664293 ]

Ashit Gosalia commented on HIVE-4579:
-------------------------------------
This is a broad enough interface. You may also want to consider supporting IN and BETWEEN clauses, because particular RecordReaders may implement these special forms efficiently. Looking at the TPC-H and TPC-DS queries, top-level ANDs also seem to be a common case.

Create a SARG interface for RecordReaders

Key: HIVE-4579
URL: https://issues.apache.org/jira/browse/HIVE-4579
Project: Hive
Issue Type: Improvement
Reporter: Owen O'Malley
Assignee: Owen O'Malley

I think we should create a SARG (http://en.wikipedia.org/wiki/Sargable) interface for RecordReaders. For a first pass, I'll create an API that uses the value stored in hive.io.filter.expr.serialized. The desire is to define a simpler interface than the direct AST expression provided by hive.io.filter.expr.serialized, so that the code to evaluate expressions can be generalized instead of being put inside a particular RecordReader.
[jira] [Updated] (HIVE-4450) Extend Vector Aggregates to support GROUP BY
[ https://issues.apache.org/jira/browse/HIVE-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Remus Rusanu updated HIVE-4450:
-------------------------------
Attachment: HIVE-4450-p1.patch.txt

I applied this one to a second enlistment and confirmed that it compiles.
[jira] [Commented] (HIVE-4579) Create a SARG interface for RecordReaders
[ https://issues.apache.org/jira/browse/HIVE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664321#comment-13664321 ]

Eric Hanson commented on HIVE-4579:
-----------------------------------
Consider adding Column IN (list-of-constants) as a SIMPLE_COND. This is really commonly used.
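The comments on this issue suggest leaves for IN and BETWEEN plus a top-level AND. As a sketch of what such a SARG could look like (the class names, enum values, and row-based evaluation below are illustrative assumptions, not the interface that was eventually committed):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Illustrative SARG sketch: comparison, IN and BETWEEN leaves combined
// with a top-level AND, evaluated here against in-memory rows. A real
// RecordReader would instead test leaves against min/max statistics.
class SargLeaf {
    enum Op { EQUALS, LESS_THAN, IN, BETWEEN }
    final String column; final Op op; final List<Object> literals;
    SargLeaf(String column, Op op, Object... literals) {
        this.column = column; this.op = op; this.literals = Arrays.asList(literals);
    }
    @SuppressWarnings("unchecked")
    boolean test(Map<String, ?> row) {
        Comparable<Object> v = (Comparable<Object>) row.get(column);
        switch (op) {
            case EQUALS:    return v.compareTo(literals.get(0)) == 0;
            case LESS_THAN: return v.compareTo(literals.get(0)) < 0;
            case IN:        return literals.contains(v);
            case BETWEEN:   return v.compareTo(literals.get(0)) >= 0
                                && v.compareTo(literals.get(1)) <= 0;
            default:        return true;
        }
    }
}

class Sarg {
    final List<SargLeaf> conjuncts;   // top-level AND: the common case in TPC-H/DS
    Sarg(SargLeaf... leaves) { this.conjuncts = Arrays.asList(leaves); }
    boolean test(Map<String, ?> row) {
        for (SargLeaf l : conjuncts) if (!l.test(row)) return false;
        return true;
    }
}
```

Keeping IN and BETWEEN as first-class leaves, rather than expanding them into nested ORs and ANDs, is exactly what lets a RecordReader recognize and implement them efficiently.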
[jira] [Commented] (HIVE-4548) Speed up vectorized LIKE filter for special cases abc%, %abc and %abc%
[ https://issues.apache.org/jira/browse/HIVE-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664336#comment-13664336 ] Eric Hanson commented on HIVE-4548: --- Can you confirm if there is a problem or not? E.g. is it possible for a % character to show up as the first or second character of a 2-character sequence in a String that represents a character beyond the standard set of 0x to 0x. If it is indeed a problem, then we should fix it here and open another JIRA to report a bug in the original UDFLike. Speed up vectorized LIKE filter for special cases abc%, %abc and %abc% -- Key: HIVE-4548 URL: https://issues.apache.org/jira/browse/HIVE-4548 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Teddy Choi Priority: Minor Fix For: vectorization-branch Attachments: HIVE-4548.1-with-benchmark.patch.txt, HIVE-4548.1-without-benchmark.patch.txt, HIVE-4548.2-with-benchmark.patch.txt, HIVE-4548.2-without-benchmark.patch.txt Speed up vectorized LIKE filter evaluation for abc%, %abc, and %abc% pattern special cases (here, abc is just a place holder for some fixed string). Problem: The current vectorized LIKE implementation always calls the standard LIKE function code in UDFLike.java. But this is pretty expensive. It calls multiple functions and allocates at least one new object per call. Probably 80% of uses of LIKE are for the simple patterns abc%, %abc, and %abc%. These can be implemented much more efficiently. Start by speeding up the case for Column LIKE abc% The goal would be to minimize expense in the inner loop. Don't use new() in the inner loop, and write a static function that checks the prefix of the string matches the like pattern as efficiently as possible, operating directly on the byte array holding UTF-8-encoded string data, and avoiding unnecessary additional function calls and if/else logic. Call that in the inner loop. 
If feasible, consider using a template-driven approach, with an instance of the template expanded for each of the three cases. Start doing the abc% (prefix match) by hand, then consider templatizing for the other two cases. The code is in the vectorization branch of the main hive repo. Start by checking in the constructor for FilterStringColLikeStringScalar.java if the pattern is one of the simple special cases. If so, record that, and have the evaluate() method call a special-case function for each case, i.e. the general case, and each of the 3 special cases. All the dynamic decision-making would be done once per vector, not once per element. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
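The inner-loop prefix check described above can be sketched as a small static byte comparison; this is a hypothetical illustration, not the actual FilterStringColLikeStringScalar code, and the class and method names here are invented for the example:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the abc% (prefix match) special case: compare the
// pattern's UTF-8 bytes directly against the start of the value's byte range,
// with no per-row object allocation.
public class LikePrefixMatch {
    // Returns true if value[start..start+length) begins with prefix.
    public static boolean startsWith(byte[] value, int start, int length,
                                     byte[] prefix) {
        if (length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (value[start + i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] data = "abcdef".getBytes(StandardCharsets.UTF_8);
        byte[] prefix = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(startsWith(data, 0, data.length, prefix)); // true
    }
}
```

Because UTF-8 is a byte-oriented encoding, this comparison works for multi-byte characters in the fixed prefix as well, which is why the description suggests operating directly on the encoded bytes.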
[jira] [Assigned] (HIVE-3159) Update AvroSerde to determine schema of new tables
[ https://issues.apache.org/jira/browse/HIVE-3159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan reassigned HIVE-3159: - Assignee: (was: Jakob Homan) Update AvroSerde to determine schema of new tables -- Key: HIVE-3159 URL: https://issues.apache.org/jira/browse/HIVE-3159 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Jakob Homan Currently when writing tables to Avro one must manually provide an Avro schema that matches what is being delivered by Hive. It'd be better to have the serde infer this schema by converting the table's TypeInfo into an appropriate AvroSchema. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2670) A cluster test utility for Hive
[ https://issues.apache.org/jira/browse/HIVE-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-2670: - Attachment: HIVE-2670_5.patch An up to date version of the patch. I did change this to work with the HCat e2e tests that are already (now) in Hive rather than create a whole new e2e directory directly off of hive/trunk. I've run all of the tests and confirmed that they pass. A cluster test utility for Hive --- Key: HIVE-2670 URL: https://issues.apache.org/jira/browse/HIVE-2670 Project: Hive Issue Type: New Feature Components: Testing Infrastructure Reporter: Alan Gates Assignee: Johnny Zhang Attachments: harness.tar, HIVE-2670_5.patch, hive_cluster_test_2.patch, hive_cluster_test_3.patch, hive_cluster_test_4.patch, hive_cluster_test.patch Hive has an extensive set of unit tests, but it does not have an infrastructure for testing in a cluster environment. Pig and HCatalog have been using a test harness for cluster testing for some time. We have written Hive drivers and tests to run in this harness. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4581) HCat e2e tests broken by changes to Hive's describe table formatting
[ https://issues.apache.org/jira/browse/HIVE-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664388#comment-13664388 ] Sushanth Sowmyan commented on HIVE-4581: Changes look good to me. +1. HCat e2e tests broken by changes to Hive's describe table formatting Key: HIVE-4581 URL: https://issues.apache.org/jira/browse/HIVE-4581 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.11.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.12.0 Attachments: HIVE-4581.patch In Hive 0.11 the default formatting for describe table changed. A number of the HCat e2e tests do describe table and apply regular expressions to the output to make sure the table looks correct. These formatting changes broke those tests. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2670) A cluster test utility for Hive
[ https://issues.apache.org/jira/browse/HIVE-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664390#comment-13664390 ] Johnny Zhang commented on HIVE-2670: Alan, Thanks for the update! A cluster test utility for Hive --- Key: HIVE-2670 URL: https://issues.apache.org/jira/browse/HIVE-2670 Project: Hive Issue Type: New Feature Components: Testing Infrastructure Reporter: Alan Gates Assignee: Johnny Zhang Attachments: harness.tar, HIVE-2670_5.patch, hive_cluster_test_2.patch, hive_cluster_test_3.patch, hive_cluster_test_4.patch, hive_cluster_test.patch Hive has an extensive set of unit tests, but it does not have an infrastructure for testing in a cluster environment. Pig and HCatalog have been using a test harness for cluster testing for some time. We have written Hive drivers and tests to run in this harness. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4590) HCatalog documentation example is wrong
Eugene Koifman created HIVE-4590: Summary: HCatalog documentation example is wrong Key: HIVE-4590 URL: https://issues.apache.org/jira/browse/HIVE-4590 Project: Hive Issue Type: Bug Components: Documentation Affects Versions: 0.10.0 Reporter: Eugene Koifman Priority: Minor http://hive.apache.org/docs/hcat_r0.5.0/inputoutput.html#Read+Example reads: "The following very simple MapReduce program reads data from one table which it assumes to have an integer in the second column, and counts how many different values it sees. That is, it does the equivalent of select col1, count(*) from $table group by col1;." The description of the query is wrong. It actually counts how many instances of each distinct value it finds. For example, if the values of col1 are {1,1,1,3,3,5} it will produce (1,3), (3,2), (5,1). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
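The distinction the reporter draws can be shown with a small standalone sketch (this is illustrative code, not the actual HCatalog example program):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// The documented example program counts occurrences of each distinct value
// (the equivalent of SELECT col1, COUNT(*) ... GROUP BY col1), rather than
// counting how many different values it sees.
public class GroupByCount {
    public static Map<Integer, Integer> countPerValue(int[] col1) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        for (int v : col1) {
            counts.merge(v, 1, Integer::sum);  // increment count for value v
        }
        return counts;
    }

    public static void main(String[] args) {
        // {1,1,1,3,3,5} -> {1=3, 3=2, 5=1}
        System.out.println(countPerValue(new int[]{1, 1, 1, 3, 3, 5}));
    }
}
```

"How many different values it sees" would instead be a single number (here, 3), which is what the documentation incorrectly describes.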
[jira] [Created] (HIVE-4591) Making changes to webhcat-site.xml have no effect
Eugene Koifman created HIVE-4591: Summary: Making changes to webhcat-site.xml have no effect Key: HIVE-4591 URL: https://issues.apache.org/jira/browse/HIVE-4591 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Looks like WebHCat configuration is read as follows: Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, jar:file:/Users/ekoifman/dev/hive/build/dist/hcatalog/share/webhcat/svr/webhcat-0.12.0-SNAPSHOT.jar!/webhcat-default.xml creating /Users/ekoifman/dev/hive/build/dist/hcatalog/etc/webhcat/webhcat-site.xml and setting templeton.exec.timeout has no effect as can be seen in ExecServiceImpl Probably the webhcat_server.sh script is missing something -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
webhcat component in hive jira
Can a hive jira admin please create a webhcat component in the hive project in jira? (webhcat - http://hive.apache.org/docs/hcat_r0.5.0/rest.html) Thanks, Thejas
[jira] [Commented] (HIVE-4578) Changes to Pig's test harness broke HCat e2e tests
[ https://issues.apache.org/jira/browse/HIVE-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664425#comment-13664425 ] Hudson commented on HIVE-4578: -- Integrated in Hive-trunk-h0.21 #2113 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2113/]) HIVE-4578 Changes to Pig's test harness broke HCat e2e tests (gates) (Revision 1484969) Result = FAILURE gates : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1484969 Files : * /hive/trunk/hcatalog/src/test/e2e/hcatalog/build.xml * /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource * /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource/default.res * /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource/windows.res Changes to Pig's test harness broke HCat e2e tests -- Key: HIVE-4578 URL: https://issues.apache.org/jira/browse/HIVE-4578 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.12.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.12.0 Attachments: HIVE-4578_2.patch, HIVE-4578.patch HCatalog externs the test harness from Pig. Pig recently made some changes to the test harness to work better across Unix and Windows. These changes require new OS specific files. HCatalog will also need these files in order to work with the test harness. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Hive-trunk-h0.21 - Build # 2113 - Still Failing
Changes for Build #2088 [gates] HIVE-4465 webhcat e2e tests succeed regardless of exitvalue Changes for Build #2089 [cws] HIVE-3957. Add pseudo-BNF grammar for RCFile to Javadoc (Mark Grover via cws) [cws] HIVE-4497. beeline module tests don't get run by default (Thejas Nair via cws) [gangtimliu] HIVE-4474: Column access not tracked properly for partitioned tables. Samuel Yuan via Gang Tim Liu [hashutosh] HIVE-4455 : HCatalog build directories get included in tar file produced by ant tar (Alan Gates via Ashutosh Chauhan) Changes for Build #2090 Changes for Build #2091 [hashutosh] HIVE-4392 : Illogical InvalidObjectException throwed when use mulit aggregate functions with star columns (Navis via Ashutosh Chauhan) [hashutosh] HIVE-4421 : Improve memory usage by ORC dictionaries (Owen Omalley via Ashutosh Chauhan) [mithun] HCATALOG-627 - Adding thread-safety to NotificationListener. (amalakar via mithun) Changes for Build #2092 [hashutosh] HIVE-4466 : Fix continue.on.failure in unit tests to -well- continue on failure in unit tests (Gunther Hagleitner via Ashutosh Chauhan) [hashutosh] HIVE-4471 : Build fails with hcatalog checkstyle error (Gunther Hagleitner via Ashutosh Chauhan) Changes for Build #2093 [omalley] HIVE-4494 ORC map columns get class cast exception in some contexts (omalley) [omalley] HIVE-4500 Ensure that HiveServer 2 closes log files. (Alan Gates via omalley) Changes for Build #2094 [navis] HIVE-4209 Cache evaluation result of deterministic expression and reuse it (Navis via namit) Changes for Build #2095 Changes for Build #2096 Changes for Build #2097 [cws] HIVE-4530. Enforce minmum ant version required in build script (Arup Malakar via cws) [omalley] Preparing RELEASE_NOTES for Hive 0.11.0rc2. Changes for Build #2098 [omalley] Update release notes for 0.11.0rc2 [omalley] HIVE-4527 Fix eclipse project template (Carl Steinbach via omalley) [omalley] HIVE-4505 Hive can't load transforms with remote scripts. 
(Prasad Majumdar and Gunther Hagleitner via omalley) [omalley] HIVE-4498 TestBeeLineWithArgs.testPositiveScriptFile fails (Thejas Nair via omalley) Changes for Build #2099 Changes for Build #2100 Changes for Build #2101 Changes for Build #2102 Changes for Build #2103 [daijy] PIG-2955: Fix bunch of Pig e2e tests on Windows Changes for Build #2104 [daijy] PIG-3069: Native Windows Compatibility for Pig E2E Tests and Harness Changes for Build #2105 [omalley] HIVE-4550 local_mapred_error_cache fails on some hadoop versions (Gunther Hagleitner via omalley) [omalley] HIVE-4440 SMB Operator spills to disk like it's 1999 (Gunther Hagleitner via omalley) Changes for Build #2106 Changes for Build #2107 [omalley] HIVE-4486 FetchOperator slows down SMB map joins by 50% when there are many partitions (Gopal V via omalley) Changes for Build #2108 Changes for Build #2109 Changes for Build #2110 Changes for Build #2111 [omalley] HIVE-4475 Switch RCFile default to LazyBinaryColumnarSerDe. (Guther Hagleitner via omalley) [omalley] HIVE-4521 Auto join conversion fails in certain cases (Gunther Hagleitner via omalley) Changes for Build #2112 Changes for Build #2113 [gates] HIVE-4578 Changes to Pig's test harness broke HCat e2e tests (gates) All tests passed The Apache Jenkins build system has built Hive-trunk-h0.21 (build #2113) Status: Still Failing Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/2113/ to view the results.
[jira] [Assigned] (HIVE-4590) HCatalog documentation example is wrong
[ https://issues.apache.org/jira/browse/HIVE-4590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz reassigned HIVE-4590: Assignee: Lefty Leverenz HCatalog documentation example is wrong --- Key: HIVE-4590 URL: https://issues.apache.org/jira/browse/HIVE-4590 Project: Hive Issue Type: Bug Components: Documentation Affects Versions: 0.10.0 Reporter: Eugene Koifman Assignee: Lefty Leverenz Priority: Minor http://hive.apache.org/docs/hcat_r0.5.0/inputoutput.html#Read+Example reads: "The following very simple MapReduce program reads data from one table which it assumes to have an integer in the second column, and counts how many different values it sees. That is, it does the equivalent of select col1, count(*) from $table group by col1;." The description of the query is wrong. It actually counts how many instances of each distinct value it finds. For example, if the values of col1 are {1,1,1,3,3,5} it will produce (1,3), (3,2), (5,1). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4578) Changes to Pig's test harness broke HCat e2e tests
[ https://issues.apache.org/jira/browse/HIVE-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664449#comment-13664449 ] Hudson commented on HIVE-4578: -- Integrated in Hive-trunk-hadoop2 #206 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/206/]) HIVE-4578 Changes to Pig's test harness broke HCat e2e tests (gates) (Revision 1484969) Result = ABORTED gates : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1484969 Files : * /hive/trunk/hcatalog/src/test/e2e/hcatalog/build.xml * /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource * /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource/default.res * /hive/trunk/hcatalog/src/test/e2e/hcatalog/resource/windows.res Changes to Pig's test harness broke HCat e2e tests -- Key: HIVE-4578 URL: https://issues.apache.org/jira/browse/HIVE-4578 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.12.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.12.0 Attachments: HIVE-4578_2.patch, HIVE-4578.patch HCatalog externs the test harness from Pig. Pig recently made some changes to the test harness to work better across Unix and Windows. These changes require new OS specific files. HCatalog will also need these files in order to work with the test harness. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4590) HCatalog documentation example is wrong
[ https://issues.apache.org/jira/browse/HIVE-4590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-4590: - Component/s: HCatalog HCatalog documentation example is wrong --- Key: HIVE-4590 URL: https://issues.apache.org/jira/browse/HIVE-4590 Project: Hive Issue Type: Bug Components: Documentation, HCatalog Affects Versions: 0.10.0 Reporter: Eugene Koifman Assignee: Lefty Leverenz Priority: Minor http://hive.apache.org/docs/hcat_r0.5.0/inputoutput.html#Read+Example reads: "The following very simple MapReduce program reads data from one table which it assumes to have an integer in the second column, and counts how many different values it sees. That is, it does the equivalent of select col1, count(*) from $table group by col1;." The description of the query is wrong. It actually counts how many instances of each distinct value it finds. For example, if the values of col1 are {1,1,1,3,3,5} it will produce (1,3), (3,2), (5,1). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4450) Extend Vector Aggregates to support GROUP BY
[ https://issues.apache.org/jira/browse/HIVE-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4450: Resolution: Fixed Status: Resolved (was: Patch Available) I just committed this to vectorization branch. Thanks, Remus! Extend Vector Aggregates to support GROUP BY Key: HIVE-4450 URL: https://issues.apache.org/jira/browse/HIVE-4450 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Remus Rusanu Assignee: Remus Rusanu Labels: features Fix For: vectorization-branch Attachments: HIVE-4450-p1.patch.txt, HIVE-4450-p1.patch.txt, HIVE-4450-p1.patch.txt, HIVE-4450-p1.patch.txt Extend the VectorGroupByOperator and the VectorUDAF aggregates to support group by. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4537) select * fails on orc table when vectorization is enabled
[ https://issues.apache.org/jira/browse/HIVE-4537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4537: Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) I just committed this to the vectorization branch. Thanks, Sarvesh! select * fails on orc table when vectorization is enabled -- Key: HIVE-4537 URL: https://issues.apache.org/jira/browse/HIVE-4537 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Tony Murphy Assignee: Sarvesh Sakalanaga Fix For: 0.12.0 Attachments: Hive-4537.0.patch, Hive-4537.1.patch hive select * from intdataorc; OK Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating cint0 Time taken: 0.213 seconds -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4537) select * fails on orc table when vectorization is enabled
[ https://issues.apache.org/jira/browse/HIVE-4537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4537: Description: hive select * from intdataorc; OK Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating cint0 Time taken: 0.213 seconds was: hive select * from intdataorc; OK Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating cint0 Time taken: 0.213 seconds Fix Version/s: (was: 0.12.0) vectorization-branch select * fails on orc table when vectorization is enabled -- Key: HIVE-4537 URL: https://issues.apache.org/jira/browse/HIVE-4537 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Tony Murphy Assignee: Sarvesh Sakalanaga Fix For: vectorization-branch Attachments: Hive-4537.0.patch, Hive-4537.1.patch hive select * from intdataorc; OK Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating cint0 Time taken: 0.213 seconds -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4553) Column Column, and Column Scalar vectorized execution tests
[ https://issues.apache.org/jira/browse/HIVE-4553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4553: Resolution: Fixed Status: Resolved (was: Patch Available) I just committed this to the vectorization branch. Thanks, Tony! Column Column, and Column Scalar vectorized execution tests --- Key: HIVE-4553 URL: https://issues.apache.org/jira/browse/HIVE-4553 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Tony Murphy Assignee: Tony Murphy Fix For: vectorization-branch Attachments: HIVE-4553 (2).patch, HIVE-4553 (3).patch, HIVE-4553.4.patch, HIVE-4553.5.patch, HIVE-4553.patch review board review: https://reviews.apache.org/r/11133/ This patch adds Column Column, and Column Scalar vectorized execution tests. These tests are generated in parallel with the vectorized expressions. The tests focus is on validating the column vector and the vectorized row batch metadata regarding nulls, repeating, and selection. Overview of Changes: CodeGen.java: + joinPath, getCamelCaseType, readFile and writeFile made static for use in TestCodeGen.java. + filter types now specify null as their output type rather than doesn't matter to make detection for test generation easier. + support for test generation added. TestCodeGen.java Templates: TestClass.txt TestColumnColumnFilterVectorExpressionEvaluation.txt, TestColumnColumnOperationVectorExpressionEvaluation.txt, TestColumnScalarFilterVectorExpressionEvaluation.txt, TestColumnScalarOperationVectorExpressionEvaluation.txt +This class is mutable and maintains a hashmap of TestSuiteClassName to test cases. The tests cases are added over the course of vectorized expressions class generation, with test classes being outputted at the end. For each column vector (inputs and/or outputs) a matrix of pairwise covering Booleans is used to generate test cases across nulls and repeating dimensions. 
Based on the input column vector(s)' nulls and repeating states, the state of the output column vector (if there is one) is validated, along with the null vector. For filter operations, the selection vector is validated against the generated data. Each template corresponds to a class representing a test suite. VectorizedRowGroupUtil.java + added methods generateLongColumnVector and generateDoubleColumnVector for generating the respective column vectors with optional nulls and/or repeating values. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4472) OR, NOT Filter logic can lose an array, and always takes time O(VectorizedRowBatch.DEFAULT_SIZE)
[ https://issues.apache.org/jira/browse/HIVE-4472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4472: Resolution: Fixed Fix Version/s: vectorization-branch Status: Resolved (was: Patch Available) I just committed this to the vectorization branch. Thanks, Jitendra! OR, NOT Filter logic can lose an array, and always takes time O(VectorizedRowBatch.DEFAULT_SIZE) Key: HIVE-4472 URL: https://issues.apache.org/jira/browse/HIVE-4472 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Jitendra Nath Pandey Fix For: vectorization-branch Attachments: HIVE-4472.1.patch, HIVE-4472.2.patch, HIVE-4472.3.patch, HIVE-4472.4.patch, HIVE-4472.5.patch The issue is in file FilterExprOrExpr.java and FilterNotExpr.java. I posted a review for you at https://reviews.apache.org/r/10752/ I think there is a bug related to sharing of an array of integers. Also, one algorithm step takes O(DEFAULT_BATCH_SIZE) time always. If nDEFAULT_BATCH_SIZE then this is a performance issue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4534) IsNotNull and NotCol incorrectly handle nulls.
[ https://issues.apache.org/jira/browse/HIVE-4534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4534: Resolution: Fixed Fix Version/s: vectorization-branch Status: Resolved (was: Patch Available) I just committed this to the vectorization branch. Thanks, Jitendra! IsNotNull and NotCol incorrectly handle nulls. -- Key: HIVE-4534 URL: https://issues.apache.org/jira/browse/HIVE-4534 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Jitendra Nath Pandey Fix For: vectorization-branch Attachments: HIVE-4534.1.patch, HIVE-4534.2.patch See file IsNotNull.java in package org.apache.hadoop.hive.ql.exec.vector.expressions It never looks at the noNulls flag on the input vector, but accesses the isNull[] array anyway. This can yield incorrect results. isRepeating and noNulls are not set in the output, which can also cause wrong results. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
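The bug described above is about honoring the noNulls and isRepeating flags before touching the isNull array. A simplified illustration, using an invented stand-in class (the field names follow the vectorization branch's column-vector convention, but this is not the actual Hive code):

```java
// Simplified stand-in for a vectorized column: when noNulls is true the
// isNull array must be ignored (it may contain stale values), and when
// isRepeating is true only entry 0 is meaningful.
public class IsNotNullSketch {
    public static class MockColumnVector {
        public boolean noNulls;
        public boolean isRepeating;
        public boolean[] isNull;
    }

    // Writes 1 into out[i] when input row i is non-null, 0 otherwise,
    // consulting isNull only when noNulls is false.
    public static void evaluate(MockColumnVector in, long[] out, int n) {
        if (in.noNulls) {
            for (int i = 0; i < n; i++) out[i] = 1;
        } else if (in.isRepeating) {
            long v = in.isNull[0] ? 0 : 1;
            for (int i = 0; i < n; i++) out[i] = v;
        } else {
            for (int i = 0; i < n; i++) out[i] = in.isNull[i] ? 0 : 1;
        }
    }
}
```

Reading isNull without first checking noNulls, as the original IsNotNull.java did, can return stale values and therefore wrong results.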
Review Request: HIVE-4568 Beeline needs to support resolving variables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11334/ --- Review request for hive. Description --- 1. Added command variable substition 2. Added test case This addresses bug HIVE-4568. https://issues.apache.org/jira/browse/HIVE-4568 Diffs - beeline/src/java/org/apache/hive/beeline/BeeLine.java aeb1e8b beeline/src/java/org/apache/hive/beeline/TestBeeLineVarSubstitution.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/parse/VariableSubstitution.java f292944 Diff: https://reviews.apache.org/r/11334/diff/ Testing --- Thanks, Xuefu Zhang
Review Request: Review Request for HIVE-4554 Failed to create a table from existing file if file path has spaces
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/11335/ --- Review request for hive. Description --- Patch includes fix and new test case. This addresses bug HIVE-4554. https://issues.apache.org/jira/browse/HIVE-4554 Diffs - ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java 3031d1c ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java bd8d252 ql/src/test/queries/clientpositive/load_file_with_space_in_the_name.q PRE-CREATION ql/src/test/results/clientpositive/load_file_with_space_in_the_name.q.out PRE-CREATION Diff: https://reviews.apache.org/r/11335/diff/ Testing --- Thanks, Xuefu Zhang
[jira] [Updated] (HIVE-4579) Create a SARG interface for RecordReaders
[ https://issues.apache.org/jira/browse/HIVE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4579: Attachment: pushdown.pdf Here's a quick write-up of the intent of the interface. Create a SARG interface for RecordReaders - Key: HIVE-4579 URL: https://issues.apache.org/jira/browse/HIVE-4579 Project: Hive Issue Type: Improvement Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: pushdown.pdf I think we should create a SARG (http://en.wikipedia.org/wiki/Sargable) interface for RecordReaders. For a first pass, I'll create an API that uses the value stored in hive.io.filter.expr.serialized. The desire is to define a simpler interface than the direct AST expression that is provided by hive.io.filter.expr.serialized, so that the code to evaluate expressions can be generalized instead of being put inside a particular RecordReader. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4579) Create a SARG interface for RecordReaders
[ https://issues.apache.org/jira/browse/HIVE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4579: Attachment: (was: pushdown.pdf) Create a SARG interface for RecordReaders - Key: HIVE-4579 URL: https://issues.apache.org/jira/browse/HIVE-4579 Project: Hive Issue Type: Improvement Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: pushdown.pdf I think we should create a SARG (http://en.wikipedia.org/wiki/Sargable) interface for RecordReaders. For a first pass, I'll create an API that uses the value stored in hive.io.filter.expr.serialized. The desire is to define a simpler interface than the direct AST expression that is provided by hive.io.filter.expr.serialized, so that the code to evaluate expressions can be generalized instead of being put inside a particular RecordReader. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4579) Create a SARG interface for RecordReaders
[ https://issues.apache.org/jira/browse/HIVE-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-4579: Attachment: pushdown.pdf fixed a typo Create a SARG interface for RecordReaders - Key: HIVE-4579 URL: https://issues.apache.org/jira/browse/HIVE-4579 Project: Hive Issue Type: Improvement Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: pushdown.pdf I think we should create a SARG (http://en.wikipedia.org/wiki/Sargable) interface for RecordReaders. For a first pass, I'll create an API that uses the value stored in hive.io.filter.expr.serialized. The desire is to define a simpler interface than the direct AST expression that is provided by hive.io.filter.expr.serialized, so that the code to evaluate expressions can be generalized instead of being put inside a particular RecordReader. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
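A SARG in the sense described above reduces a filter to simple (column, operator, literal) leaves that a RecordReader can evaluate against per-stripe statistics. The following is a hypothetical minimal sketch of the idea, with invented names; the actual proposed API is in the attached pushdown.pdf:

```java
// Hypothetical minimal SARG-style predicate: a simple column/operator/literal
// leaf that a RecordReader can use to skip row groups via min/max statistics.
// All names here are illustrative, not Hive's actual API.
public class SargSketch {
    public enum Op { EQUALS, LESS_THAN, GREATER_THAN }

    public static final class Leaf {
        public final String column;
        public final Op op;
        public final long literal;
        public Leaf(String column, Op op, long literal) {
            this.column = column;
            this.op = op;
            this.literal = literal;
        }
    }

    // A reader may skip a row group when the leaf provably matches no row,
    // given the group's min/max statistics for the leaf's column.
    public static boolean canSkip(Leaf leaf, long min, long max) {
        switch (leaf.op) {
            case LESS_THAN:    return min >= leaf.literal;
            case GREATER_THAN: return max <= leaf.literal;
            case EQUALS:       return leaf.literal < min || leaf.literal > max;
            default:           return false;
        }
    }
}
```

The point of such an interface is exactly what the description argues: evaluation logic like canSkip can live in shared code, with each RecordReader supplying only its column statistics.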
[jira] [Commented] (HIVE-4548) Speed up vectorized LIKE filter for special cases abc%, %abc and %abc%
[ https://issues.apache.org/jira/browse/HIVE-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664572#comment-13664572 ] Eric Hanson commented on HIVE-4548: --- It appears that all the specific characters you are checking for in parseSimplePattern (%, _, \) cannot be the first or last character of a surrogate pair. So I think the code is safe. Please think this through and add some unit tests that process multi-byte UTF-8 characters of 3 bytes or more (which will force encoding as surrogate pairs inside a String). See http://en.wikipedia.org/wiki/UTF-16/UCS-2#Code_points_U.2B1_to_U.2B10 for a discussion of surrogate pairs. See http://en.wikipedia.org/wiki/List_of_Unicode_characters for a list of Unicode characters. % is 0x0025, _ is 0x005F, and \ is 0x005C. Surrogate pairs all have lead surrogates in the range 0xD800..0xDBFF and trail surrogates in the range 0xDC00..0xDFFF. Speed up vectorized LIKE filter for special cases abc%, %abc and %abc% -- Key: HIVE-4548 URL: https://issues.apache.org/jira/browse/HIVE-4548 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Teddy Choi Priority: Minor Fix For: vectorization-branch Attachments: HIVE-4548.1-with-benchmark.patch.txt, HIVE-4548.1-without-benchmark.patch.txt, HIVE-4548.2-with-benchmark.patch.txt, HIVE-4548.2-without-benchmark.patch.txt Speed up vectorized LIKE filter evaluation for abc%, %abc, and %abc% pattern special cases (here, abc is just a placeholder for some fixed string). Problem: The current vectorized LIKE implementation always calls the standard LIKE function code in UDFLike.java. But this is pretty expensive: it calls multiple functions and allocates at least one new object per call. Probably 80% of uses of LIKE are for the simple patterns abc%, %abc, and %abc%. These can be implemented much more efficiently. 
Start by speeding up the case for Column LIKE abc%. The goal would be to minimize expense in the inner loop. Don't use new() in the inner loop, and write a static function that checks that the prefix of the string matches the LIKE pattern as efficiently as possible, operating directly on the byte array holding UTF-8-encoded string data, and avoiding unnecessary additional function calls and if/else logic. Call that in the inner loop. If feasible, consider using a template-driven approach, with an instance of the template expanded for each of the three cases. Start doing the abc% (prefix match) by hand, then consider templatizing for the other two cases. The code is in the vectorization branch of the main hive repo. Start by checking in the constructor for FilterStringColLikeStringScalar.java if the pattern is one of the simple special cases. If so, record that, and have the evaluate() method call a special-case function for each case, i.e. the general case and each of the 3 special cases. All the dynamic decision-making would be done once per vector, not once per element. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
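The allocation-free inner-loop prefix check described above can be sketched as a small static function. The class and method names below are hypothetical stand-ins, not the code in the attached patches:

```java
// Hypothetical sketch of the byte-level prefix check for Column LIKE 'abc%';
// LikePrefixMatch and prefixMatch are illustrative names, not from the patch.
class LikePrefixMatch {
    // True if the UTF-8 bytes value[start..start+len) begin with pattern.
    // No object allocation and no extra function calls, so it is cheap
    // enough to invoke once per row in the vectorized inner loop.
    static boolean prefixMatch(byte[] value, int start, int len, byte[] pattern) {
        if (len < pattern.length) {
            return false;
        }
        for (int i = 0; i < pattern.length; i++) {
            if (value[start + i] != pattern[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A byte-wise comparison is safe here even for multi-byte data, because UTF-8 never uses ASCII byte values inside a multi-byte sequence (continuation and lead bytes are all >= 0x80).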
[jira] [Updated] (HIVE-4581) HCat e2e tests broken by changes to Hive's describe table formatting
[ https://issues.apache.org/jira/browse/HIVE-4581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-4581: - Resolution: Fixed Status: Resolved (was: Patch Available) HCat e2e tests broken by changes to Hive's describe table formatting Key: HIVE-4581 URL: https://issues.apache.org/jira/browse/HIVE-4581 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.11.0 Reporter: Alan Gates Assignee: Alan Gates Fix For: 0.12.0 Attachments: HIVE-4581.patch In Hive 0.11 the default formatting for describe table changed. A number of the HCat e2e tests do describe table and apply regular expressions to the output to make sure the table looks correct. These formatting changes broke those tests. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664636#comment-13664636 ] Carl Steinbach commented on HIVE-4569: -- bq. I do not see GetQueryPlan api available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API mentions it; not sure why it was not added. It was not added because it became clear during implementation of HiveServer2 that it was a bad idea to extend (i.e. depend on) any of the existing legacy Hive Thrift APIs. We were also narrowly focused on supporting JDBC/ODBC, and neither of these APIs provides explicit support for retrieving the execution plan. @Jaideep: I think it would be a good idea to post some notes about how you plan to modify the HS2 Thrift API and get feedback before spending time doing the implementation work. GetQueryPlan api in Hive Server2 Key: HIVE-4569 URL: https://issues.apache.org/jira/browse/HIVE-4569 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Amareshwari Sriramadasu Assignee: Jaideep Dhok It would be nice to have GetQueryPlan as a thrift api. I do not see GetQueryPlan api available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API mentions it; not sure why it was not added. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21 #380
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/380/ -- [...truncated 35402 lines of junit PREHOOK/POSTHOOK output...]
[jira] [Created] (HIVE-4592) ColumnArithmeticColumn.txt template never sets output isNull to true; can give wrong results
Eric Hanson created HIVE-4592: - Summary: ColumnArithmeticColumn.txt template never sets output isNull to true; can give wrong results Key: HIVE-4592 URL: https://issues.apache.org/jira/browse/HIVE-4592 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch ColumnArithmeticColumn.txt should set the output column's noNulls flag to true if neither input column has nulls, but it does not do that. This can lead to wrong results if noNulls was set to false in a previous use of the batch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
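A minimal sketch of the flag handling the report asks for, using simplified stand-in classes (ColVec below is not the real vectorized column vector API): the output's noNulls flag must be recomputed from the inputs on every call, because batch objects are reused between invocations.

```java
// Simplified stand-in for a vectorized long column; illustrative only.
class ColVec {
    boolean noNulls = true;   // when true, isNull[] is ignored
    boolean[] isNull;
    long[] vector;
    ColVec(int n) { isNull = new boolean[n]; vector = new long[n]; }
}

class AddColumns {
    // Column + column addition that resets the output's noNulls flag rather
    // than inheriting a stale value from a previous use of the batch.
    static void add(ColVec a, ColVec b, ColVec out, int n) {
        out.noNulls = a.noNulls && b.noNulls; // the step HIVE-4592 says is missing
        for (int i = 0; i < n; i++) {
            out.isNull[i] = (!a.noNulls && a.isNull[i]) || (!b.noNulls && b.isNull[i]);
            out.vector[i] = a.vector[i] + b.vector[i];
        }
    }
}
```

If the first line of add() is omitted, a batch whose output vector carried noNulls=false from an earlier query fragment keeps reporting nulls even when both inputs are null-free, which is exactly the wrong-results scenario described.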
Jenkins build is back to normal : Hive-0.10.0-SNAPSHOT-h0.20.1 #153
See https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/153/
[jira] [Commented] (HIVE-4592) ColumnArithmeticColumn.txt template never sets output isNull to true; can give wrong results
[ https://issues.apache.org/jira/browse/HIVE-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664761#comment-13664761 ] Eric Hanson commented on HIVE-4592: --- Found some other issues in null propagation as well. ColumnArithmeticColumn.txt template never sets output isNull to true; can give wrong results Key: HIVE-4592 URL: https://issues.apache.org/jira/browse/HIVE-4592 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch ColumnArithmeticColumn.txt should set the output column's noNulls flag to true if neither input column has nulls, but it does not do that. This can lead to wrong results if noNulls was set to false in a previous use of the batch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Work started] (HIVE-4592) ColumnArithmeticColumn.txt template never sets output isNull to true; can give wrong results
[ https://issues.apache.org/jira/browse/HIVE-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-4592 started by Eric Hanson. ColumnArithmeticColumn.txt template never sets output isNull to true; can give wrong results Key: HIVE-4592 URL: https://issues.apache.org/jira/browse/HIVE-4592 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch ColumnArithmeticColumn.txt should set the output column's noNulls flag to true if neither input column has nulls, but it does not do that. This can lead to wrong results if noNulls was set to false in a previous use of the batch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4593) ErrorMsg has several messages that reuse the same error code
Eugene Koifman created HIVE-4593: Summary: ErrorMsg has several messages that reuse the same error code Key: HIVE-4593 URL: https://issues.apache.org/jira/browse/HIVE-4593 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Eugene Koifman All of these errorCode values are associated with more than one message. 10043 10227 10228 10229 10230 10231 This is not right. This affects ErrorMsg.getErrorMsg(int errorCode) as well as Templeton logic. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4593) ErrorMsg has several messages that reuse the same error code
[ https://issues.apache.org/jira/browse/HIVE-4593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-4593: - Description: All of these errorCode values are associated with more than one message. 10043 10227 10228 10229 10230 10231 This is not right. This affects ErrorMsg.getErrorMsg(int errorCode) as well as Templeton logic. Should probably add a JUnit test to check this. was: All of these errorCode values are associated with more than one message. 10043 10227 10228 10229 10230 10231 This is not right. This affects ErrorMsg.getErrorMsg(int errorCode) as well as Templeton logic. ErrorMsg has several messages that reuse the same error code Key: HIVE-4593 URL: https://issues.apache.org/jira/browse/HIVE-4593 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Eugene Koifman All of these errorCode values are associated with more than one message. 10043 10227 10228 10229 10230 10231 This is not right. This affects ErrorMsg.getErrorMsg(int errorCode) as well as Templeton logic. Should probably add a JUnit test to check this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
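The JUnit-style uniqueness check the description suggests can be sketched like this. The helper below runs over a stand-in array of codes rather than Hive's real ErrorMsg enum, so the names are hypothetical:

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative duplicate-code detector; ErrorCodeCheck is a hypothetical
// helper, not part of Hive. A real test would iterate ErrorMsg.values()
// and collect the code from each enum constant instead.
class ErrorCodeCheck {
    // Returns the first error code that appears more than once, or -1 if
    // every code is unique.
    static int firstDuplicate(int[] codes) {
        Set<Integer> seen = new HashSet<>();
        for (int c : codes) {
            if (!seen.add(c)) {
                return c;
            }
        }
        return -1;
    }
}
```

Such a test would have caught the reuse of 10043, 10227..10231 listed above before it reached ErrorMsg.getErrorMsg(int errorCode) and the Templeton logic.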
[jira] [Commented] (HIVE-4257) java.sql.SQLNonTransientConnectionException on JDBCStatsAggregator
[ https://issues.apache.org/jira/browse/HIVE-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664794#comment-13664794 ] Navis commented on HIVE-4257: - running test java.sql.SQLNonTransientConnectionException on JDBCStatsAggregator -- Key: HIVE-4257 URL: https://issues.apache.org/jira/browse/HIVE-4257 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.11.0 Reporter: Teddy Choi Assignee: Teddy Choi Priority: Minor Attachments: HIVE-4257.1.patch.txt java.sql.SQLNonTransientConnectionException occurs on JDBCStatsAggregator after executing dozens of Hive queries periodically, which inserts thousands of rows. It may have a relation with DERBY-5098. To avoid this error, Hive should use a more recent version of Derby(10.6.2.3, 10.7.1.4, 10.8.2.2, 10.9.1.0 or later). Hive 0.11.0-SNAPSHOT uses Derby 10.4.2.0. {noformat} 2013-03-24 15:54:30,487 ERROR jdbc.JDBCStatsAggregator (JDBCStatsAggregator.java:aggregateStats(168)) - Error during publishing aggregation. java.sql.SQLNonTransientConnectionException: No current connection. 2013-03-24 15:54:30,487 ERROR jdbc.JDBCStatsAggregator (JDBCStatsAggregator.java:aggregateStats(168)) - Error during publishing aggregation. java.sql.SQLNonTransientConnectionException: No current connection. 2013-03-24 15:54:30,487 ERROR jdbc.JDBCStatsAggregator (JDBCStatsAggregator.java:cleanUp(249)) - Error during publishing aggregation. java.sql.SQLNonTransientConnectionException: No current connection. {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4194) JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL
[ https://issues.apache.org/jira/browse/HIVE-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664795#comment-13664795 ] Navis commented on HIVE-4194: - running test JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL -- Key: HIVE-4194 URL: https://issues.apache.org/jira/browse/HIVE-4194 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.11.0 Reporter: Richard Ding Assignee: Richard Ding Attachments: HIVE-4194.patch As per the JDBC 3.0 Spec (section 9.2): If the Driver implementation understands the URL, it will return a Connection object; otherwise it returns null. Currently the HiveConnection constructor will throw IllegalArgumentException if the url string doesn't start with jdbc:hive2. This exception should be caught by HiveDriver.connect, which should return null. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
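The contract from the spec quote above can be sketched as follows. DriverSketch is a simplified stand-in, not the real HiveDriver, and a plain Object stands in for java.sql.Connection to keep the sketch self-contained:

```java
// Hypothetical sketch of the JDBC-spec behaviour the issue requests:
// an unrecognized URL yields null from connect(), never an exception.
class DriverSketch {
    static final String PREFIX = "jdbc:hive2://";

    static boolean acceptsURL(String url) {
        return url != null && url.startsWith(PREFIX);
    }

    static Object connect(String url) {
        if (!acceptsURL(url)) {
            return null; // JDBC 3.0 spec, section 9.2
        }
        return new Object(); // stand-in for building a real connection
    }
}
```

The point is that the URL check happens before any constructor that might throw, so a DriverManager iterating over registered drivers simply moves on to the next one.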
[jira] [Commented] (HIVE-4220) TimestampWritable.toString throws array index exception sometimes
[ https://issues.apache.org/jira/browse/HIVE-4220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664801#comment-13664801 ] Navis commented on HIVE-4220: - [~mikhail] The default value of max-worker of HiveServer is Integer.MAX and I've thought it could make too many formatters in some (erroneous) situation. But admittedly, it's safer and cleaner. +1 and running test. TimestampWritable.toString throws array index exception sometimes - Key: HIVE-4220 URL: https://issues.apache.org/jira/browse/HIVE-4220 Project: Hive Issue Type: Bug Reporter: Navis Assignee: Navis Attachments: HIVE-4220.D9669.1.patch {noformat} org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 45 at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:215) at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:170) at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:288) at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:348) at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553) at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 45 at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:194) at 
org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1449) at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:193) ... 11 more Caused by: java.lang.ArrayIndexOutOfBoundsException: 45 at sun.util.calendar.BaseCalendar.getCalendarDateFromFixedDate(BaseCalendar.java:436) at java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2081) at java.util.GregorianCalendar.computeFields(GregorianCalendar.java:1996) at java.util.Calendar.setTimeInMillis(Calendar.java:1110) at java.util.Calendar.setTime(Calendar.java:1076) at java.text.SimpleDateFormat.format(SimpleDateFormat.java:875) at java.text.SimpleDateFormat.format(SimpleDateFormat.java:868) at java.text.DateFormat.format(DateFormat.java:316) at org.apache.hadoop.hive.serde2.io.TimestampWritable.toString(TimestampWritable.java:327) at org.apache.hadoop.hive.serde2.lazy.LazyTimestamp.writeUTF8(LazyTimestamp.java:95) at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:234) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:427) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:381) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:365) at org.apache.hadoop.hive.ql.exec.ListSinkOperator.processOp(ListSinkOperator.java:96) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:487) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:821) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:487) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:821) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:487) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:474) at 
org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:468) at org.apache.hadoop.hive.ql.exec.FetchTask.fetchAndPush(FetchTask.java:222) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:188) ... 13 more {noformat} data formatter in TimestampWritable is declared static and shared but it's not thread-safe. -- This message is automatically generated by JIRA. If you think it was sent
[jira] [Commented] (HIVE-4516) Fix concurrency bug in serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java
[ https://issues.apache.org/jira/browse/HIVE-4516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664803#comment-13664803 ] Navis commented on HIVE-4516: - +1, running test Fix concurrency bug in serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java - Key: HIVE-4516 URL: https://issues.apache.org/jira/browse/HIVE-4516 Project: Hive Issue Type: Bug Reporter: Jon Hartlaub Attachments: TimestampWritable.java.patch A patch for concurrent use of TimestampWritable which occurs in a multithreaded scenario (as found in AmpLab Shark). A static SimpleDateFormat (not ThreadSafe) is used by TimestampWritable in CTAS DDL statements where it manifests as data corruption when used in a concurrent environment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
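One common fix for the shared static SimpleDateFormat problem described in HIVE-4220 and HIVE-4516 is a per-thread formatter. The sketch below is a general illustration of that pattern under hypothetical names, not the patch attached here:

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// SimpleDateFormat is not thread-safe, so sharing one static instance across
// worker threads corrupts output. ThreadLocal gives each thread its own
// formatter at the cost of one instance per live thread.
class SafeTimestampFormat {
    private static final ThreadLocal<SimpleDateFormat> FMT =
        ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

    static String format(Date d) {
        return FMT.get().format(d); // never shared between threads
    }
}
```

The alternatives are creating a formatter per call (the allocation the vectorization work tries to avoid) or synchronizing on a single instance (a contention point under the thread-pool sizes mentioned in the comment above).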
[jira] [Commented] (HIVE-4592) ColumnArithmeticColumn.txt template never sets output isNull to true; can give wrong results
[ https://issues.apache.org/jira/browse/HIVE-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664817#comment-13664817 ] Jitendra Nath Pandey commented on HIVE-4592: The same issue exists in many other templates. I think we should fix them too in the same jira. Also, most of these templates assume that noNulls=false and isRepeating=true means all values are null. ColumnArithmeticColumn.txt template never sets output isNull to true; can give wrong results Key: HIVE-4592 URL: https://issues.apache.org/jira/browse/HIVE-4592 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch ColumnArithmeticColumn.txt should set the output column's noNulls flag to true if neither input column has nulls, but it does not do that. This can lead to wrong results if noNulls was set to false in a previous use of the batch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4220) TimestampWritable.toString throws array index exception sometimes
[ https://issues.apache.org/jira/browse/HIVE-4220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-4220: Resolution: Duplicate Status: Resolved (was: Patch Available) TimestampWritable.toString throws array index exception sometimes - Key: HIVE-4220 URL: https://issues.apache.org/jira/browse/HIVE-4220 Project: Hive Issue Type: Bug Reporter: Navis Assignee: Navis Attachments: HIVE-4220.D9669.1.patch {noformat} org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 45 at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:215) at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:170) at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:288) at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:348) at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553) at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 45 at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:194) at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1449) at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:193) ... 
11 more Caused by: java.lang.ArrayIndexOutOfBoundsException: 45 at sun.util.calendar.BaseCalendar.getCalendarDateFromFixedDate(BaseCalendar.java:436) at java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2081) at java.util.GregorianCalendar.computeFields(GregorianCalendar.java:1996) at java.util.Calendar.setTimeInMillis(Calendar.java:1110) at java.util.Calendar.setTime(Calendar.java:1076) at java.text.SimpleDateFormat.format(SimpleDateFormat.java:875) at java.text.SimpleDateFormat.format(SimpleDateFormat.java:868) at java.text.DateFormat.format(DateFormat.java:316) at org.apache.hadoop.hive.serde2.io.TimestampWritable.toString(TimestampWritable.java:327) at org.apache.hadoop.hive.serde2.lazy.LazyTimestamp.writeUTF8(LazyTimestamp.java:95) at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:234) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:427) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:381) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:365) at org.apache.hadoop.hive.ql.exec.ListSinkOperator.processOp(ListSinkOperator.java:96) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:487) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:821) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:487) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:821) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:90) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:487) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:474) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:468) at org.apache.hadoop.hive.ql.exec.FetchTask.fetchAndPush(FetchTask.java:222) at 
org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:188) ... 13 more {noformat} data formatter in TimestampWritable is declared static and shared but it's not thread-safe. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jaideep Dhok updated HIVE-4569: --- Attachment: git-4569.patch Attaching patch; somehow it got skipped earlier. GetQueryPlan api in Hive Server2 Key: HIVE-4569 URL: https://issues.apache.org/jira/browse/HIVE-4569 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Amareshwari Sriramadasu Assignee: Jaideep Dhok Attachments: git-4569.patch It would be nice to have GetQueryPlan as a thrift api. I do not see GetQueryPlan api available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API mentions it; not sure why it was not added. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664836#comment-13664836 ] Jaideep Dhok commented on HIVE-4569: @Carl: This change will not affect JDBC/ODBC clients. Currently clients using Thrift have no way to get the query plan, which is why we wanted to add this. Here are the changes proposed: # Add GetQueryPlan with the same arguments as ExecuteStatement - {code}TGetQueryPlanResp GetQueryPlan(1:TExecuteStatementReq req);{code} # Run a SQLOperation for the request, calling Driver.compile with the statement, and return the plan object. Throw HiveSQLException with the return code of compile if it fails. # New response type for the above call - {code} struct TGetQueryPlanResp { 1: required TStatus status // Queryplan 2: required queryplan.Query plan } {code} We'll have to include queryplan.thrift in TCLIService.thrift for the return type. GetQueryPlan api in Hive Server2 Key: HIVE-4569 URL: https://issues.apache.org/jira/browse/HIVE-4569 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Amareshwari Sriramadasu Assignee: Jaideep Dhok Attachments: git-4569.patch It would be nice to have GetQueryPlan as a thrift api. I do not see GetQueryPlan api available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API mentions it; not sure why it was not added. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4594) UDF should use setLenient(false) when using SimpleDateFormat for parsing given datestring
Minwoo Kim created HIVE-4594: - Summary: UDF should use setLenient(false) when using SimpleDateFormat for parsing given datestring Key: HIVE-4594 URL: https://issues.apache.org/jira/browse/HIVE-4594 Project: Hive Issue Type: Bug Components: UDF Reporter: Minwoo Kim Priority: Minor If a UDF has a date format of MM/DD/ and the function is supplied the String 9/5/05, the date should *not* be allowed. In all cases, parsing must be non-lenient; the given string must strictly adhere to the parsing format. For example, {code} select hour('2013-05-111 10:10:1') from src limit 1; {code} returns 10. The result returned is not what is expected; it should be null. SimpleDateFormat is lenient by default, so UDFs should use setLenient(false) when using SimpleDateFormat for parsing a given date string, except for very specific or intended cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4594) UDF should use setLenient(false) when using SimpleDateFormat for parsing given datestring
[ https://issues.apache.org/jira/browse/HIVE-4594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Minwoo Kim updated HIVE-4594: - Description: If a UDF has a date format of MM/DD/ and the function is supplied the String 9/5/05, the date should *not* be allowed. In most cases, parsing must be non-lenient; the given string must strictly adhere to the parsing format. For example, {code} select hour('2013-05-111 10:10:1') from src limit 1; {code} returns 10. This is not the expected result; it should be null. SimpleDateFormat is lenient by default, so UDFs should use setLenient(false) when using SimpleDateFormat to parse a given date string, except in very specific or intended cases.
was: If a UDF has a date format of MM/DD/ and the function is supplied the String 9/5/05, the date should *not* be allowed. In all cases, parsing must be non-lenient; the given string must strictly adhere to the parsing format. For example, {code} select hour('2013-05-111 10:10:1') from src limit 1; {code} returns 10. This is not the expected result; it should be null. SimpleDateFormat is lenient by default, so UDFs should use setLenient(false) when using SimpleDateFormat to parse a given date string, except in very specific or intended cases.
UDF should use setLenient(false) when using SimpleDateFormat for parsing given datestring - Key: HIVE-4594 URL: https://issues.apache.org/jira/browse/HIVE-4594 Project: Hive Issue Type: Bug Components: UDF Reporter: Minwoo Kim Priority: Minor
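The lenient-vs-strict behavior described in the issue can be sketched as follows. This is a minimal standalone example, not Hive's actual UDF code; the class and method names are illustrative. It shows that with default (lenient) parsing, SimpleDateFormat silently rolls an invalid date like '2013-05-111' forward to a real date (which is why hour() returns 10), while setLenient(false) makes parse() throw, letting a UDF return null instead.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class StrictDateParse {
    // Parse with lenient disabled; return null on invalid input,
    // mirroring the null result the issue expects from hour().
    public static Date parseStrict(String pattern, String value) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern);
        fmt.setLenient(false); // reject impossible fields like day-of-month 111
        try {
            return fmt.parse(value);
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) throws ParseException {
        // Lenient (the default) rolls '2013-05-111' forward to a valid date
        // instead of rejecting it.
        SimpleDateFormat lenient = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        System.out.println("lenient: " + lenient.parse("2013-05-111 10:10:1"));

        // Strict parsing rejects the same string.
        System.out.println("strict: "
            + parseStrict("yyyy-MM-dd HH:mm:ss", "2013-05-111 10:10:1"));
    }
}
```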
[jira] [Updated] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-4569: - Status: Open (was: Patch Available) Please post a review request on Review Board or Phabricator. GetQueryPlan api in Hive Server2 Key: HIVE-4569 URL: https://issues.apache.org/jira/browse/HIVE-4569 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Amareshwari Sriramadasu Assignee: Jaideep Dhok Attachments: git-4569.patch It would be nice to have GetQueryPlan as a thrift api. I do not see a GetQueryPlan api available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API lists it; not sure why it was not added.
[jira] [Commented] (HIVE-4570) More information to user on GetOperationStatus in Hive Server2 when query is still executing
[ https://issues.apache.org/jira/browse/HIVE-4570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664858#comment-13664858 ] Amareshwari Sriramadasu commented on HIVE-4570: --- bq. The current API GetOperationState is not enough since it returns only a state enum. Instead of changing that, we can add a new API GetOperationProgress() which will return both OperationState and OperationProgress. Sounds good. +1. For the default implementation of getProgress(), you can return 1 if the task is successful and 0 otherwise. More information to user on GetOperationStatus in Hive Server2 when query is still executing Key: HIVE-4570 URL: https://issues.apache.org/jira/browse/HIVE-4570 Project: Hive Issue Type: Improvement Components: HiveServer2 Affects Versions: 0.11.0 Reporter: Amareshwari Sriramadasu Assignee: Jaideep Dhok Currently in Hive Server2, when the query is still executing, only the status is set as STILL_EXECUTING. This issue is to give more information to the user, such as progress and running job handles, if possible.
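The coarse default getProgress() suggested in the comment (1 when finished, 0 otherwise) can be sketched like this. This is a hypothetical standalone sketch, not HiveServer2 code; the enum here is a simplified stand-in for the real OperationState.

```java
public class OperationProgressSketch {
    // Simplified stand-in for HiveServer2's operation states (hypothetical).
    enum OperationState { INITIALIZED, RUNNING, FINISHED, ERROR }

    private final OperationState state;

    OperationProgressSketch(OperationState state) { this.state = state; }

    // Default progress: all-or-nothing until an operation can report
    // fine-grained progress (e.g. from its underlying MapReduce jobs).
    public double getProgress() {
        return state == OperationState.FINISHED ? 1.0 : 0.0;
    }

    public static void main(String[] args) {
        System.out.println(new OperationProgressSketch(OperationState.RUNNING).getProgress());  // 0.0
        System.out.println(new OperationProgressSketch(OperationState.FINISHED).getProgress()); // 1.0
    }
}
```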
[jira] [Updated] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4569: -- Attachment: HIVE-4569.D10887.1.patch
jaideepdhok requested code review of HIVE-4569 [jira] GetQueryPlan api in Hive Server2.
Reviewers: JIRA
HIVE-4569 It would be nice to have GetQueryPlan as a thrift api. I do not see a GetQueryPlan api available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API lists it; not sure why it was not added.
TEST PLAN Added unit test CLIServiceTest.testGetQueryPlan
REVISION DETAIL https://reviews.facebook.net/D10887
AFFECTED FILES
service/if/TCLIService.thrift
service/src/gen/thrift/gen-cpp/TCLIService.cpp
service/src/gen/thrift/gen-cpp/TCLIService.h
service/src/gen/thrift/gen-cpp/TCLIService_server.skeleton.cpp
service/src/gen/thrift/gen-cpp/TCLIService_types.cpp
service/src/gen/thrift/gen-cpp/TCLIService_types.h
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TCLIService.java
service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetQueryPlanResp.java
service/src/gen/thrift/gen-php/TCLIService.php
service/src/gen/thrift/gen-py/TCLIService/TCLIService-remote
service/src/gen/thrift/gen-py/TCLIService/TCLIService.py
service/src/gen/thrift/gen-py/TCLIService/ttypes.py
service/src/gen/thrift/gen-rb/t_c_l_i_service.rb
service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb
service/src/java/org/apache/hive/service/cli/CLIService.java
service/src/java/org/apache/hive/service/cli/CLIServiceClient.java
service/src/java/org/apache/hive/service/cli/EmbeddedCLIServiceClient.java
service/src/java/org/apache/hive/service/cli/ICLIService.java
service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java
service/src/java/org/apache/hive/service/cli/session/HiveSession.java
service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java
service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java
service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java
service/src/test/org/apache/hive/service/cli/CLIServiceTest.java
To: JIRA, jaideepdhok
GetQueryPlan api in Hive Server2 Key: HIVE-4569 URL: https://issues.apache.org/jira/browse/HIVE-4569 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Amareshwari Sriramadasu Assignee: Jaideep Dhok Attachments: git-4569.patch, HIVE-4569.D10887.1.patch It would be nice to have GetQueryPlan as a thrift api. I do not see a GetQueryPlan api available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API lists it; not sure why it was not added.
[jira] [Updated] (HIVE-4569) GetQueryPlan api in Hive Server2
[ https://issues.apache.org/jira/browse/HIVE-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jaideep Dhok updated HIVE-4569: --- Status: Patch Available (was: Open) GetQueryPlan api in Hive Server2 Key: HIVE-4569 URL: https://issues.apache.org/jira/browse/HIVE-4569 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Amareshwari Sriramadasu Assignee: Jaideep Dhok Attachments: git-4569.patch, HIVE-4569.D10887.1.patch It would be nice to have GetQueryPlan as a thrift api. I do not see a GetQueryPlan api available in HiveServer2, though the wiki https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API lists it; not sure why it was not added.
[jira] [Updated] (HIVE-4531) [WebHCat] Collecting task logs to hdfs
[ https://issues.apache.org/jira/browse/HIVE-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4531: - Attachment: samplestatusdirwithlist.tar.gz HIVE-4531-5.patch Adding a list file to the logs. Attached samplestatusdirwithlist.tar.gz for a sample status directory. Here is a sample list file (list.txt):
job: job_201305221327_0068(name=PigLatin:73.pig,status=SUCCEEDED)
attempt:attempt_201305221327_0068_m_00_0(type=map,status=completed,starttime=22-May-2013 17:10:26,endtime=22-May-2013 17:10:32)
attempt:attempt_201305221327_0068_m_02_0(type=setup,status=completed,starttime=22-May-2013 17:10:17,endtime=22-May-2013 17:10:26)
attempt:attempt_201305221327_0068_m_01_0(type=cleanup,status=completed,starttime=22-May-2013 17:10:32,endtime=22-May-2013 17:10:38)
job: job_201305221327_0069(name=PigLatin:73.pig,status=SUCCEEDED)
attempt:attempt_201305221327_0069_m_00_0(type=map,status=completed,starttime=22-May-2013 17:10:53,endtime=22-May-2013 17:10:59)
attempt:attempt_201305221327_0069_r_00_0(type=reduce,status=completed,starttime=22-May-2013 17:10:59,endtime=22-May-2013 17:11:11)
attempt:attempt_201305221327_0069_m_02_0(type=setup,status=completed,starttime=22-May-2013 17:10:44,endtime=22-May-2013 17:10:53)
attempt:attempt_201305221327_0069_m_01_0(type=cleanup,status=completed,starttime=22-May-2013 17:11:11,endtime=22-May-2013 17:11:17)
job: job_201305221327_0070(name=PigLatin:73.pig,status=SUCCEEDED)
attempt:attempt_201305221327_0070_m_00_0(type=map,status=completed,starttime=22-May-2013 17:11:32,endtime=22-May-2013 17:11:38)
attempt:attempt_201305221327_0070_r_00_0(type=reduce,status=completed,starttime=22-May-2013 17:11:38,endtime=22-May-2013 17:11:50)
attempt:attempt_201305221327_0070_m_02_0(type=setup,status=completed,starttime=22-May-2013 17:11:23,endtime=22-May-2013 17:11:32)
attempt:attempt_201305221327_0070_m_01_0(type=cleanup,status=completed,starttime=22-May-2013 17:11:50,endtime=22-May-2013 17:11:56)
job: job_201305221327_0071(name=PigLatin:73.pig,status=FAILED)
attempt:attempt_201305221327_0071_m_00_0(type=map,status=completed,starttime=22-May-2013 17:12:11,endtime=22-May-2013 17:12:17)
attempt:attempt_201305221327_0071_m_01_0(type=map,status=completed,starttime=22-May-2013 17:12:17,endtime=22-May-2013 17:12:23)
attempt:attempt_201305221327_0071_m_03_0(type=setup,status=completed,starttime=22-May-2013 17:12:02,endtime=22-May-2013 17:12:11)
attempt:attempt_201305221327_0071_m_02_0(type=cleanup,status=completed,starttime=22-May-2013 17:13:11,endtime=22-May-2013 17:13:17)
attempt:attempt_201305221327_0071_r_00_0(type=reduce,status=failed,starttime=22-May-2013 17:12:17,endtime=22-May-2013 17:12:29)
attempt:attempt_201305221327_0071_r_00_1(type=reduce,status=failed,starttime=22-May-2013 17:12:35,endtime=22-May-2013 17:12:33)
attempt:attempt_201305221327_0071_r_00_2(type=reduce,status=failed,starttime=22-May-2013 17:12:47,endtime=22-May-2013 17:12:43)
attempt:attempt_201305221327_0071_r_00_3(type=reduce,status=failed,starttime=22-May-2013 17:12:59,endtime=22-May-2013 17:12:46)
[WebHCat] Collecting task logs to hdfs -- Key: HIVE-4531 URL: https://issues.apache.org/jira/browse/HIVE-4531 Project: Hive Issue Type: New Feature Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4531-1.patch, HIVE-4531-2.patch, HIVE-4531-3.patch, HIVE-4531-4.patch, HIVE-4531-5.patch, samplestatusdirwithlist.tar.gz It would be nice to collect task logs after the job finishes. This is similar to what Amazon EMR does.
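A consumer of the list.txt format shown above could extract the attempt fields with a small parser like this. This is a hypothetical sketch based only on the sample lines in the attachment description; the field names (type, status, starttime, endtime) come from that sample, and the class and method names are illustrative, not part of the patch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AttemptLineParser {
    // One "attempt:" line of list.txt, e.g.
    // attempt:attempt_..._m_00_0(type=map,status=completed,starttime=...,endtime=...)
    private static final Pattern ATTEMPT = Pattern.compile(
        "attempt:(\\S+?)\\(type=([^,]+),status=([^,]+),starttime=([^,]+),endtime=([^)]+)\\)");

    // Returns {id, type, status, starttime, endtime}, or null if the line
    // is not an attempt line (e.g. a "job:" line).
    public static String[] parse(String line) {
        Matcher m = ATTEMPT.matcher(line.trim());
        if (!m.matches()) return null;
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4), m.group(5) };
    }

    public static void main(String[] args) {
        String line = "attempt:attempt_201305221327_0068_m_00_0(type=map,status=completed,"
            + "starttime=22-May-2013 17:10:26,endtime=22-May-2013 17:10:32)";
        String[] f = parse(line);
        System.out.println(f[1] + " " + f[2]); // map completed
    }
}
```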