[GitHub] hive pull request: Fix lock/unlock pairing
Github user pavel-sakun closed the pull request at: https://github.com/apache/hive/pull/17 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (HIVE-860) Persistent distributed cache
[ https://issues.apache.org/jira/browse/HIVE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14048595#comment-14048595 ] Hive QA commented on HIVE-860: -- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12653298/HIVE-860.patch {color:red}ERROR:{color} -1 due to 23 failed/errored test(s), 5610 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_partition_coltype org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join29 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_nulls org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_func1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_id2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_test_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_leadlag org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_skewjoinopt20 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_null org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_between_in org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing_streaming 
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/643/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/643/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-643/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 23 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12653298 Persistent distributed cache Key: HIVE-860 URL: https://issues.apache.org/jira/browse/HIVE-860 Project: Hive Issue Type: Improvement Affects Versions: 0.12.0 Reporter: Zheng Shao Assignee: Brock Noland Fix For: 0.14.0 Attachments: HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch DistributedCache is shared across multiple jobs if the HDFS file name is the same. We need to make sure Hive puts the same file into the same location every time and does not overwrite it if the file content is the same. We can achieve 2 different results: A1. Files added with the same name, timestamp, and md5 in the same session will have a single copy in the distributed cache. A2. Files added with the same name, timestamp, and md5 will have a single copy in the distributed cache. A2 has a bigger benefit in sharing but may raise a question of when Hive should clean it up in HDFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
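The A1/A2 deduplication described above hinges on deriving a stable HDFS location from a file's name, timestamp, and md5, so that a re-upload of identical content maps to the same path and can be skipped. A minimal sketch of such a content-addressed path (the method name `cachePathFor` and the path layout are illustrative, not taken from the patch):

```java
import java.security.MessageDigest;

public class CacheKey {
    // Derive a stable cache location from (name, timestamp, md5): two files
    // with identical values always map to the same path, so a second upload
    // can be detected and skipped instead of overwriting the cached copy.
    static String cachePathFor(String baseDir, String fileName,
                               long modTime, byte[] md5) {
        StringBuilder hex = new StringBuilder();
        for (byte b : md5) {
            hex.append(String.format("%02x", b));
        }
        return baseDir + "/" + hex + "_" + modTime + "/" + fileName;
    }

    public static void main(String[] args) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5")
                .digest("jar-bytes".getBytes("UTF-8"));
        String p1 = cachePathFor("/tmp/hive_cache", "udf.jar", 1404201600L, digest);
        String p2 = cachePathFor("/tmp/hive_cache", "udf.jar", 1404201600L, digest);
        // Same name/timestamp/content => same path => single cached copy.
        System.out.println(p1.equals(p2)); // prints true
    }
}
```

Under this scheme A1 vs. A2 is only a question of scoping `baseDir` per session or globally; the global variant shares more but, as the issue notes, needs a cleanup policy.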
Re: Review Request 23153: Fix some test output files.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23153/ --- (Updated July 1, 2014, 8:03 a.m.) Review request for hive. Summary (updated) - Fix some test output files. Bugs: HIVE-5976 https://issues.apache.org/jira/browse/HIVE-5976 Repository: hive-git Description (updated) --- Update expected output files based on patch changes. Diffs (updated) - hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/SemanticAnalysis/CreateTableHook.java ec24531117203a5c75c62d0e5b54d5a43d37fa79 itests/custom-serde/src/main/java/org/apache/hadoop/hive/serde2/CustomTextSerDe.java PRE-CREATION itests/custom-serde/src/main/java/org/apache/hadoop/hive/serde2/CustomTextStorageFormatDescriptor.java PRE-CREATION itests/custom-serde/src/main/resources/META-INF/services/org.apache.hadoop.hive.ql.io.StorageFormatDescriptor PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/AbstractStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/IOConstants.java 41310661ced0616f6bee27af2b1195127e5230e8 ql/src/java/org/apache/hadoop/hive/ql/io/ORCFileStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/ParquetFileStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/RCFileStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/SequenceFileStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/StorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/StorageFormatFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/TextFileStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 60d54b6a04e1a9601342b0159387114f7b666338 ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 640b6b319ce84a875cc78cb8b29fa6bbc1067fc5 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g 412a046488eaea42a6416c7cbd514715d37e249f 
ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g f934ac4e3b736eed1b3060fa516124c67f9a2f87 ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 9c001c1495b423c19f3fa710c74f1bb1e24a08f4 ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java 0af25360ee6f3088c764f0c4d812f30d1eeb91d6 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 83d09c079f3ce035c4d905280a40611b41516356 ql/src/java/org/apache/hadoop/hive/ql/parse/StorageFormat.java PRE-CREATION ql/src/main/resources/META-INF/services/org.apache.hadoop.hive.ql.io.StorageFormatDescriptor PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/io/TestStorageFormatDescriptor.java PRE-CREATION ql/src/test/queries/clientpositive/storage_format_descriptor.q PRE-CREATION ql/src/test/results/clientnegative/fileformat_bad_class.q.out ab1e9357c0a7d4e21816290fbf7ed99396932b92 ql/src/test/results/clientnegative/genericFileFormat.q.out 9613df95c8fc977c0ad1f717afa2db3870dfd904 ql/src/test/results/clientpositive/ctas.q.out 0040f3c9df690c44a1bc2f258cb075dbaaa585f3 ql/src/test/results/clientpositive/storage_format_descriptor.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/ctas.q.out a58e16639d725c851cedfc7bb81d65c25f3c56c3 Diff: https://reviews.apache.org/r/23153/diff/ Testing --- Thanks, David Chen
Re: Review Request 23153: Fix some test output files.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23153/ --- (Updated July 1, 2014, 8:03 a.m.) Review request for hive. Bugs: HIVE-5976 https://issues.apache.org/jira/browse/HIVE-5976 Repository: hive-git Description --- Update expected output files based on patch changes. Diffs - hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/SemanticAnalysis/CreateTableHook.java ec24531117203a5c75c62d0e5b54d5a43d37fa79 itests/custom-serde/src/main/java/org/apache/hadoop/hive/serde2/CustomTextSerDe.java PRE-CREATION itests/custom-serde/src/main/java/org/apache/hadoop/hive/serde2/CustomTextStorageFormatDescriptor.java PRE-CREATION itests/custom-serde/src/main/resources/META-INF/services/org.apache.hadoop.hive.ql.io.StorageFormatDescriptor PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/AbstractStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/IOConstants.java 41310661ced0616f6bee27af2b1195127e5230e8 ql/src/java/org/apache/hadoop/hive/ql/io/ORCFileStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/ParquetFileStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/RCFileStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/SequenceFileStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/StorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/StorageFormatFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/TextFileStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 60d54b6a04e1a9601342b0159387114f7b666338 ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 640b6b319ce84a875cc78cb8b29fa6bbc1067fc5 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g 412a046488eaea42a6416c7cbd514715d37e249f ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g 
f934ac4e3b736eed1b3060fa516124c67f9a2f87 ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 9c001c1495b423c19f3fa710c74f1bb1e24a08f4 ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java 0af25360ee6f3088c764f0c4d812f30d1eeb91d6 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 83d09c079f3ce035c4d905280a40611b41516356 ql/src/java/org/apache/hadoop/hive/ql/parse/StorageFormat.java PRE-CREATION ql/src/main/resources/META-INF/services/org.apache.hadoop.hive.ql.io.StorageFormatDescriptor PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/io/TestStorageFormatDescriptor.java PRE-CREATION ql/src/test/queries/clientpositive/storage_format_descriptor.q PRE-CREATION ql/src/test/results/clientnegative/fileformat_bad_class.q.out ab1e9357c0a7d4e21816290fbf7ed99396932b92 ql/src/test/results/clientnegative/genericFileFormat.q.out 9613df95c8fc977c0ad1f717afa2db3870dfd904 ql/src/test/results/clientpositive/ctas.q.out 0040f3c9df690c44a1bc2f258cb075dbaaa585f3 ql/src/test/results/clientpositive/storage_format_descriptor.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/ctas.q.out a58e16639d725c851cedfc7bb81d65c25f3c56c3 Diff: https://reviews.apache.org/r/23153/diff/ Testing --- Thanks, David Chen
[jira] [Updated] (HIVE-5976) Decouple input formats from STORED as keywords
[ https://issues.apache.org/jira/browse/HIVE-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Chen updated HIVE-5976: - Attachment: HIVE-5976.3.patch Decouple input formats from STORED as keywords -- Key: HIVE-5976 URL: https://issues.apache.org/jira/browse/HIVE-5976 Project: Hive Issue Type: Task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5976.2.patch, HIVE-5976.3.patch, HIVE-5976.3.patch, HIVE-5976.patch, HIVE-5976.patch, HIVE-5976.patch, HIVE-5976.patch As noted in HIVE-5783, we hard code the input formats mapped to keywords. It'd be nice if there was a registration system so we didn't need to do that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5976) Decouple input formats from STORED as keywords
[ https://issues.apache.org/jira/browse/HIVE-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14048632#comment-14048632 ] David Chen commented on HIVE-5976: -- I have posted a new patch that should fix some of the tests. Some of the test failures appear to be caused by slightly different output that Hive now prints due to this patch. The changes should fix the following tests: {{org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_ctas}} {{org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas}} Test failed because output files expected old parse tree dump. {{org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_fileformat_bad_class}} Hive now outputs {{FAILED: SemanticException Cannot find class 'ClassDoesNotExist'}} if SerDe class is not found. {{org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_genericFileFormat}} Hive now prints format name in all caps in quotes. I am still looking into the other test failures. Decouple input formats from STORED as keywords -- Key: HIVE-5976 URL: https://issues.apache.org/jira/browse/HIVE-5976 Project: Hive Issue Type: Task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5976.2.patch, HIVE-5976.3.patch, HIVE-5976.3.patch, HIVE-5976.patch, HIVE-5976.patch, HIVE-5976.patch, HIVE-5976.patch As noted in HIVE-5783, we hard code the input formats mapped to keywords. It'd be nice if there was a registration system so we didn't need to do that. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 23153: Fix some test output files.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23153/ --- (Updated July 1, 2014, 8:29 a.m.) Review request for hive. Bugs: HIVE-5976 https://issues.apache.org/jira/browse/HIVE-5976 Repository: hive-git Description (updated) --- Fix some test output files. Use JavaUtils.getClassLoader. Apply patch Diffs (updated) - hcatalog/core/src/main/java/org/apache/hive/hcatalog/cli/SemanticAnalysis/CreateTableHook.java ec24531117203a5c75c62d0e5b54d5a43d37fa79 itests/custom-serde/src/main/java/org/apache/hadoop/hive/serde2/CustomTextSerDe.java PRE-CREATION itests/custom-serde/src/main/java/org/apache/hadoop/hive/serde2/CustomTextStorageFormatDescriptor.java PRE-CREATION itests/custom-serde/src/main/resources/META-INF/services/org.apache.hadoop.hive.ql.io.StorageFormatDescriptor PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/AbstractStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/IOConstants.java 41310661ced0616f6bee27af2b1195127e5230e8 ql/src/java/org/apache/hadoop/hive/ql/io/ORCFileStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/ParquetFileStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/RCFileStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/SequenceFileStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/StorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/StorageFormatFactory.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/TextFileStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 60d54b6a04e1a9601342b0159387114f7b666338 ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 640b6b319ce84a875cc78cb8b29fa6bbc1067fc5 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g 412a046488eaea42a6416c7cbd514715d37e249f 
ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g f934ac4e3b736eed1b3060fa516124c67f9a2f87 ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 9c001c1495b423c19f3fa710c74f1bb1e24a08f4 ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java 0af25360ee6f3088c764f0c4d812f30d1eeb91d6 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 399f92a6b8006e52891d7f864393846276a6c2b3 ql/src/java/org/apache/hadoop/hive/ql/parse/StorageFormat.java PRE-CREATION ql/src/main/resources/META-INF/services/org.apache.hadoop.hive.ql.io.StorageFormatDescriptor PRE-CREATION ql/src/test/org/apache/hadoop/hive/ql/io/TestStorageFormatDescriptor.java PRE-CREATION ql/src/test/queries/clientpositive/storage_format_descriptor.q PRE-CREATION ql/src/test/results/clientnegative/fileformat_bad_class.q.out ab1e9357c0a7d4e21816290fbf7ed99396932b92 ql/src/test/results/clientnegative/genericFileFormat.q.out 9613df95c8fc977c0ad1f717afa2db3870dfd904 ql/src/test/results/clientpositive/ctas.q.out 0040f3c9df690c44a1bc2f258cb075dbaaa585f3 ql/src/test/results/clientpositive/storage_format_descriptor.q.out PRE-CREATION ql/src/test/results/clientpositive/tez/ctas.q.out a58e16639d725c851cedfc7bb81d65c25f3c56c3 Diff: https://reviews.apache.org/r/23153/diff/ Testing --- Thanks, David Chen
[jira] [Updated] (HIVE-5976) Decouple input formats from STORED as keywords
[ https://issues.apache.org/jira/browse/HIVE-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Chen updated HIVE-5976: - Attachment: HIVE-5976.4.patch Rebasing on trunk. Decouple input formats from STORED as keywords -- Key: HIVE-5976 URL: https://issues.apache.org/jira/browse/HIVE-5976 Project: Hive Issue Type: Task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5976.2.patch, HIVE-5976.3.patch, HIVE-5976.3.patch, HIVE-5976.4.patch, HIVE-5976.patch, HIVE-5976.patch, HIVE-5976.patch, HIVE-5976.patch As noted in HIVE-5783, we hard code the input formats mapped to keywords. It'd be nice if there was a registration system so we didn't need to do that. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6586) Add new parameters to HiveConf.java after commit HIVE-6037 (also fix typos)
[ https://issues.apache.org/jira/browse/HIVE-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14048645#comment-14048645 ] Lefty Leverenz commented on HIVE-6586: -- HIVE-6782 added hive.localize.resource.wait.interval and hive.localize.resource.num.wait.attempts in 0.13.0. They aren't in patch HIVE-6037-0.13.0. Add new parameters to HiveConf.java after commit HIVE-6037 (also fix typos) --- Key: HIVE-6586 URL: https://issues.apache.org/jira/browse/HIVE-6586 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Lefty Leverenz Labels: TODOC14 HIVE-6037 puts the definitions of configuration parameters into the HiveConf.java file, but several recent jiras for release 0.13.0 introduce new parameters that aren't in HiveConf.java yet, and some parameter definitions need to be altered for 0.13.0. This jira will patch HiveConf.java after HIVE-6037 gets committed. Also, four typos patched in HIVE-6582 need to be fixed in the new HiveConf.java. -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: Review Request 22926: Select columns by index instead of name
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22926/ --- (Updated July 1, 2014, 8:43 a.m.) Review request for hive. Changes --- Check ambiguous columns added negative tests Bugs: HIVE-494 https://issues.apache.org/jira/browse/HIVE-494 Repository: hive-git Description --- SELECT mytable[0], mytable[2] FROM some_table_name mytable; ...should return the first and third columns, respectively, from mytable regardless of their column names. The need for names specifically is kind of silly when they just get translated into numbers anyway. Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/exec/ColumnInfo.java feb8558 ql/src/java/org/apache/hadoop/hive/ql/parse/FromClauseParser.g f448b16 ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 9c001c1 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1d8d764 ql/src/java/org/apache/hadoop/hive/ql/parse/TypeCheckProcFactory.java e7da289 ql/src/test/queries/clientnegative/select_by_column_index_negative0.q PRE-CREATION ql/src/test/queries/clientnegative/select_by_column_index_negative1.q PRE-CREATION ql/src/test/queries/clientnegative/select_by_column_index_negative2.q PRE-CREATION ql/src/test/queries/clientpositive/select_by_column_index.q PRE-CREATION ql/src/test/results/clientnegative/select_by_column_index_negative0.q.out PRE-CREATION ql/src/test/results/clientnegative/select_by_column_index_negative1.q.out PRE-CREATION ql/src/test/results/clientnegative/select_by_column_index_negative2.q.out PRE-CREATION ql/src/test/results/clientpositive/select_by_column_index.q.out PRE-CREATION Diff: https://reviews.apache.org/r/22926/diff/ Testing --- Thanks, Navis Ryu
[jira] [Updated] (HIVE-494) Select columns by index instead of name
[ https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-494: --- Attachment: HIVE-494.3.patch.txt Check ambiguous columns; added negative tests Select columns by index instead of name --- Key: HIVE-494 URL: https://issues.apache.org/jira/browse/HIVE-494 Project: Hive Issue Type: Wish Components: Clients, Query Processor Reporter: Adam Kramer Assignee: Navis Priority: Minor Labels: SQL Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-494.D1641.1.patch, HIVE-494.2.patch.txt, HIVE-494.3.patch.txt, HIVE-494.D12153.1.patch SELECT mytable[0], mytable[2] FROM some_table_name mytable; ...should return the first and third columns, respectively, from mytable regardless of their column names. The need for names specifically is kind of silly when they just get translated into numbers anyway. -- This message was sent by Atlassian JIRA (v6.2#6252)
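At its core, the `mytable[0]` feature amounts to resolving a positional reference against the table's column list during semantic analysis. A hypothetical resolver (not the actual SemanticAnalyzer code; the class and method names are illustrative) might look like:

```java
import java.util.Arrays;
import java.util.List;

public class ColumnIndexResolver {
    // Resolve a positional reference like mytable[2] to a column name,
    // rejecting out-of-range indexes with an analysis-time error.
    static String resolve(List<String> columns, int index) {
        if (index < 0 || index >= columns.size()) {
            throw new IllegalArgumentException(
                "Invalid column index " + index + " for " + columns.size() + " columns");
        }
        return columns.get(index);
    }

    public static void main(String[] args) {
        List<String> cols = Arrays.asList("id", "name", "ds");
        // SELECT mytable[0], mytable[2] FROM some_table_name mytable
        System.out.println(resolve(cols, 0) + ", " + resolve(cols, 2)); // prints id, ds
    }
}
```

The "check ambiguous columns" part of the patch presumably covers cases where the reference could bind to more than one source (e.g. in joins), which is why the negative tests were added.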
[jira] [Commented] (HIVE-6782) HiveServer2Concurrency issue when running with tez intermittently, throwing org.apache.tez.dag.api.SessionNotRunning: Application not running error
[ https://issues.apache.org/jira/browse/HIVE-6782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14048650#comment-14048650 ] Lefty Leverenz commented on HIVE-6782: -- *hive.localize.resource.wait.interval* *hive.localize.resource.num.wait.attempts* are documented in the wiki here: * [Configuration Properties -- Tez -- hive.localize.resource.wait.interval | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.localize.resource.wait.interval] * [Configuration Properties -- Tez -- hive.localize.resource.num.wait.attempts | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.localize.resource.num.wait.attempts] I also added a comment to HIVE-6586 so they won't get lost in the shuffle when HIVE-6037 changes HiveConf.java. HiveServer2Concurrency issue when running with tez intermittently, throwing org.apache.tez.dag.api.SessionNotRunning: Application not running error - Key: HIVE-6782 URL: https://issues.apache.org/jira/browse/HIVE-6782 Project: Hive Issue Type: Bug Components: Tez Reporter: Vikram Dixit K Assignee: Vikram Dixit K Fix For: 0.13.0, 0.14.0 Attachments: HIVE-6782.1.patch, HIVE-6782.10.patch, HIVE-6782.11.patch, HIVE-6782.2.patch, HIVE-6782.3.patch, HIVE-6782.4.patch, HIVE-6782.5.patch, HIVE-6782.6.patch, HIVE-6782.7.patch, HIVE-6782.8.patch, HIVE-6782.9.patch HiveServer2 concurrency is failing intermittently when using tez, throwing org.apache.tez.dag.api.SessionNotRunning: Application not running error -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7323) Date type stats in ORC sometimes go stale
[ https://issues.apache.org/jira/browse/HIVE-7323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14048651#comment-14048651 ] Hive QA commented on HIVE-7323: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12653308/HIVE-7323.1.patch.txt {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5671 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketizedhiveinputformat org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/644/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/644/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-644/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12653308 Date type stats in ORC sometimes go stale - Key: HIVE-7323 URL: https://issues.apache.org/jira/browse/HIVE-7323 Project: Hive Issue Type: Bug Components: Statistics Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-7323.1.patch.txt I cannot make a proper test case, but sometimes the min/max value in date type stats changes at runtime. Stats for other types contain immutable values, but date type stats contain a DateWritable, whose inner value can be changed at any time. -- This message was sent by Atlassian JIRA (v6.2#6252)
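The staleness mechanism Navis describes, a stats object holding a live reference to a mutable value that the caller keeps reusing, can be reproduced with a plain mutable holder. The class names below are illustrative, not ORC's actual stats classes:

```java
// Illustrates the bug class: stats that keep a *reference* to a mutable
// value go stale when the caller mutates it; copying the primitive is safe.
class MutableDate {
    int daysSinceEpoch;
    MutableDate(int d) { daysSinceEpoch = d; }
}

class DateStats {
    MutableDate max;                     // buggy: aliases the caller's object
    int maxCopy = Integer.MIN_VALUE;     // safe: stores the value itself

    void update(MutableDate d) {
        if (max == null || d.daysSinceEpoch > max.daysSinceEpoch) {
            max = d;                     // keeps a live reference
        }
        if (d.daysSinceEpoch > maxCopy) {
            maxCopy = d.daysSinceEpoch;  // defensive copy of the value
        }
    }
}

public class StaleStatsDemo {
    public static void main(String[] args) {
        DateStats stats = new DateStats();
        MutableDate reused = new MutableDate(100);
        stats.update(reused);
        reused.daysSinceEpoch = 5;       // reader reuses the object for the next row
        stats.update(reused);
        // The aliased max silently became 5; the copied value is still 100.
        System.out.println(stats.max.daysSinceEpoch + " " + stats.maxCopy); // prints 5 100
    }
}
```

This is exactly the hazard of Writable reuse in Hadoop readers: any consumer that wants to retain a value across records must copy it rather than store the reference.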
[jira] [Updated] (HIVE-7294) sql std auth - authorize show grant statements
[ https://issues.apache.org/jira/browse/HIVE-7294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-7294: Attachment: HIVE-7294.2.patch HIVE-7294.2.patch - also authorizes 'show role grant' statements. sql std auth - authorize show grant statements -- Key: HIVE-7294 URL: https://issues.apache.org/jira/browse/HIVE-7294 Project: Hive Issue Type: Bug Components: Authorization, SQLStandardAuthorization Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-7294.1.patch, HIVE-7294.2.patch A non-admin user should be allowed to run show grant commands only for themselves or a role they belong to. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7040) TCP KeepAlive for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-7040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14048669#comment-14048669 ] Nicolas Thiébaud commented on HIVE-7040: Let me know which one to focus on; I'd like to see one or the other merged. I don't mind closing this one in favor of HIVE-6679. TCP KeepAlive for HiveServer2 - Key: HIVE-7040 URL: https://issues.apache.org/jira/browse/HIVE-7040 Project: Hive Issue Type: Improvement Components: HiveServer2, Server Infrastructure Affects Versions: 0.13.1 Reporter: Nicolas Thiébaud Attachments: HIVE-7040.3.patch, HIVE-7040.patch, HIVE-7040.patch.2 Implement TCP KeepAlive for HiveServer2 to avoid half-open connections. A new setting is added. This works for ThriftBinaryCLIService with and without SSL. {code}
<property>
  <name>hive.server2.tcp.keepalive</name>
  <value>true</value>
  <description>Whether to enable TCP keepalive for Hive Server 2</description>
</property>
{code} The default proposed value is true, in the same way this is the default for the metastore, see HIVE-1410. -- This message was sent by Atlassian JIRA (v6.2#6252)
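For reference, the half-open-connection fix boils down to setting SO_KEEPALIVE on each accepted server socket, so the kernel probes idle peers and eventually tears down dead connections. A minimal standalone sketch, not the actual ThriftBinaryCLIService code:

```java
import java.net.ServerSocket;
import java.net.Socket;

public class KeepAliveServer {
    // Apply the hive.server2.tcp.keepalive-style setting to an accepted
    // connection. With SO_KEEPALIVE on, the OS periodically probes idle
    // peers and closes half-open connections instead of leaking them.
    static void configure(Socket client, boolean keepAlive) throws Exception {
        client.setKeepAlive(keepAlive);
    }

    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(0);  // ephemeral port
             Socket client = new Socket("localhost", server.getLocalPort());
             Socket accepted = server.accept()) {
            configure(accepted, true);
            System.out.println(accepted.getKeepAlive()); // prints true
        }
    }
}
```

Note that keepalive probe timing (interval, count) is an OS-level setting; the socket option only opts the connection in.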
Re: Review Request 23139: HIVE-7294 : sql std auth - authorize show grant statements
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23139/ --- (Updated July 1, 2014, 9:16 a.m.) Review request for hive. Changes --- HIVE-7294.2.patch - also authorizes 'show role grant' statements. Bugs: HIVE-7294 https://issues.apache.org/jira/browse/HIVE-7294 Repository: hive-git Description --- See jira Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java d8d900b ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java e4f5aac ql/src/test/queries/clientnegative/authorization_insertoverwrite_nodel.q 90fe6e1 ql/src/test/queries/clientnegative/authorization_priv_current_role_neg.q bbf3b66 ql/src/test/queries/clientnegative/authorization_role_grant_otherrole.q PRE-CREATION ql/src/test/queries/clientnegative/authorization_role_grant_otheruser.q PRE-CREATION ql/src/test/queries/clientnegative/authorization_show_grant_otherrole.q PRE-CREATION ql/src/test/queries/clientnegative/authorization_show_grant_otheruser_all.q PRE-CREATION ql/src/test/queries/clientnegative/authorization_show_grant_otheruser_alltabs.q PRE-CREATION ql/src/test/queries/clientnegative/authorization_show_grant_otheruser_wtab.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_grant_public_role.q 8473178 ql/src/test/queries/clientpositive/authorization_grant_table_priv.q 02d364e ql/src/test/queries/clientpositive/authorization_insert.q 5de6f50 ql/src/test/queries/clientpositive/authorization_revoke_table_priv.q ccda3b5 ql/src/test/queries/clientpositive/authorization_role_grant2.q fd6aa38 ql/src/test/queries/clientpositive/authorization_show_grant.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_view_sqlstd.q bd7bbfe ql/src/test/results/clientnegative/authorization_insertoverwrite_nodel.q.out de1d230 ql/src/test/results/clientnegative/authorization_role_grant_otherrole.q.out PRE-CREATION ql/src/test/results/clientnegative/authorization_role_grant_otheruser.q.out 
PRE-CREATION ql/src/test/results/clientnegative/authorization_show_grant_otherrole.q.out PRE-CREATION ql/src/test/results/clientnegative/authorization_show_grant_otheruser_all.q.out PRE-CREATION ql/src/test/results/clientnegative/authorization_show_grant_otheruser_alltabs.q.out PRE-CREATION ql/src/test/results/clientnegative/authorization_show_grant_otheruser_wtab.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_grant_public_role.q.out a0a45f7 ql/src/test/results/clientpositive/authorization_grant_table_priv.q.out 9a6ec17 ql/src/test/results/clientpositive/authorization_insert.q.out f94d9a9 ql/src/test/results/clientpositive/authorization_role_grant2.q.out 2e94af3 ql/src/test/results/clientpositive/authorization_show_grant.q.out PRE-CREATION ql/src/test/results/clientpositive/authorization_view_sqlstd.q.out 50c0247 Diff: https://reviews.apache.org/r/23139/diff/ Testing --- test cases included. Thanks, Thejas Nair
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14048713#comment-14048713 ] Hive QA commented on HIVE-7231: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12653314/HIVE-7231.6.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5671 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hive.minikdc.TestJdbcWithMiniKdcSQLAuthBinary.org.apache.hive.minikdc.TestJdbcWithMiniKdcSQLAuthBinary {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/645/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/645/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-645/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12653314 Improve ORC padding --- Key: HIVE-7231 URL: https://issues.apache.org/jira/browse/HIVE-7231 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, HIVE-7231.4.patch, HIVE-7231.5.patch, HIVE-7231.6.patch Current ORC padding is not optimal because of fixed stripe sizes within a block. The padding overhead will be significant in some cases. Also, the padding percentage relative to stripe size is not configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5275) HiveServer2 should respect hive.aux.jars.path property and add aux jars to distributed cache
[ https://issues.apache.org/jira/browse/HIVE-5275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14048786#comment-14048786 ] Hari Sekhon commented on HIVE-5275: --- I've observed this too, but on IBM BigInsights 2.1, which has many integration bugs, so I don't know if that's just IBM having done something funny or if this is a widespread problem. HiveServer2 should respect hive.aux.jars.path property and add aux jars to distributed cache Key: HIVE-5275 URL: https://issues.apache.org/jira/browse/HIVE-5275 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Alex Favaro HiveServer2 currently ignores the hive.aux.jars.path property in hive-site.xml. That means that the only way to use a custom SerDe is to add it to AUX_CLASSPATH on the server and manually distribute the jar to the cluster nodes. Hive CLI does this automatically when hive.aux.jars.path is set. It would be nice if HiveServer2 did the same. -- This message was sent by Atlassian JIRA (v6.2#6252)
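For reference, the property the CLI honors is typically set with a hive-site.xml entry like the following (the jar path below is an example only; per this report, HiveServer2 ignores the setting):

```xml
<!-- Hypothetical hive-site.xml fragment; the jar path is an example only. -->
<property>
  <name>hive.aux.jars.path</name>
  <value>file:///opt/hive/auxlib/custom-serde.jar</value>
</property>
```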
[jira] [Updated] (HIVE-7324) CBO: provide a mechanism to test CBO features based on table stats only (w/o table data)
[ https://issues.apache.org/jira/browse/HIVE-7324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-7324: --- Description: Since a lot of the CBO work is focused on planning, it will be nice to be able to run explain queries to test CBO features. TPCDS has a rich enough schema and query set. So the patch loads a dump of TPCDS (Scale 1) stats. 1. TestCBO shows a way to load stats from a dump and run explain on a tpcds query. The output is currently dumped to Sys.out. This can be improved by hooking to QTestUtil, but hopefully this is a good start. 2. Uncovered a couple of issues in the process of testing this: a) PartitionPruner fails on 'true' constants. E.g., you will get an error for {code:sql} SELECT * FROM t WHERE partCol > 100 AND true {code} This gets exposed because the predicates coming out of Optiq can contain 'true' predicates. b) OpTraitsRulesProcFactory:checkBucketedTable checks that the number of files = numBuckets. This fails because there are no dataFiles. So I have altered it to catch exceptions and assume bucketMapJoinConvertible = false if an exception is encountered here. Uploading with these changes in this patch for now. Will carve them out as separate patches. [~ashutoshc], [~hagleitn] can you please take a look. was: Since a lot of the CBO work is focused on planning, it will be nice to be able to run explain queries to test CBO features. TPCDS has a rich enough schema and query set. So the patch loads a dump of TPCDS (Scale 1) stats. 1. TestCBO shows a way to load stats from a dump and run explain on a tpcds query. The output is currently dumped to Sys.out. This can be improved by hooking to QTestUtil, but hopefully this is a good start. 2. Uncovered a couple of issues in the process of testing this: a) PartitionPruner fails on 'true' constants. E.g.,
you will get an error for {code} select * from t where partCol > 100 and true {code} This gets exposed because the predicates coming out of Optiq can contain 'true' predicates. b) OpTraitsRulesProcFactory:checkBucketedTable checks that the number of files = numBuckets. This fails because there are no dataFiles. So I have altered it to catch exceptions and assume bucketMapJoinConvertible = false if an exception is encountered here. Uploading with these changes in this patch for now. Will carve them out as separate patches. [~ashutoshc], [~hagleitn] can you please take a look. CBO: provide a mechanism to test CBO features based on table stats only (w/o table data) Key: HIVE-7324 URL: https://issues.apache.org/jira/browse/HIVE-7324 Project: Hive Issue Type: Sub-task Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-7324.1.patch Since a lot of the CBO work is focused on planning, it will be nice to be able to run explain queries to test CBO features. TPCDS has a rich enough schema and query set. So the patch loads a dump of TPCDS (Scale 1) stats. 1. TestCBO shows a way to load stats from a dump and run explain on a tpcds query. The output is currently dumped to Sys.out. This can be improved by hooking to QTestUtil, but hopefully this is a good start. 2. Uncovered a couple of issues in the process of testing this: a) PartitionPruner fails on 'true' constants. E.g., you will get an error for {code:sql} SELECT * FROM t WHERE partCol > 100 AND true {code} This gets exposed because the predicates coming out of Optiq can contain 'true' predicates. b) OpTraitsRulesProcFactory:checkBucketedTable checks that the number of files = numBuckets. This fails because there are no dataFiles. So I have altered it to catch exceptions and assume bucketMapJoinConvertible = false if an exception is encountered here. Uploading with these changes in this patch for now. Will carve them out as separate patches. [~ashutoshc], [~hagleitn] can you please take a look. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
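The 'true' constant problem in point (a) above amounts to missing boolean constant folding: a conjunction containing the literal TRUE should reduce to its remaining conjuncts. A minimal illustrative sketch of that folding rule (class and method names are invented, not Hive's PartitionPruner code):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not Hive's PartitionPruner): fold boolean constants out
// of a conjunction, so "partCol > 100 AND true" reduces to "partCol > 100".
public class PredicateFolder {

    static String foldAnd(List<String> conjuncts) {
        List<String> kept = new ArrayList<>();
        for (String c : conjuncts) {
            if (c.trim().equalsIgnoreCase("true")) {
                continue;        // TRUE is the identity of AND; drop it
            }
            if (c.trim().equalsIgnoreCase("false")) {
                return "false";  // FALSE annihilates the whole conjunction
            }
            kept.add(c);
        }
        return kept.isEmpty() ? "true" : String.join(" AND ", kept);
    }

    public static void main(String[] args) {
        System.out.println(foldAnd(List.of("partCol > 100", "true")));
    }
}
```

A pruner that folds constants this way never sees a bare 'true' predicate, which is the failure mode the comment describes.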
[jira] [Commented] (HIVE-860) Persistent distributed cache
[ https://issues.apache.org/jira/browse/HIVE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14048892#comment-14048892 ] Hive QA commented on HIVE-860: -- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12653325/HIVE-860.patch {color:red}ERROR:{color} -1 due to 26 failed/errored test(s), 5656 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join30 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_filters org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_cast org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_empty_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_compute_stats_long org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby7_map_skew org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_cube1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_unused org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_columnarserde org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_inputddl5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_leadlag_queries org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mi org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_create org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_ppd_decimal org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppr_pushdown2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view org.apache.hadoop.hive.cli.TestContribCliDriver.testCliDriver_udtf_output_on_close org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.hwi.TestHWISessionManager.testHiveDriver {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/647/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/647/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-647/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 26 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12653325 Persistent distributed cache Key: HIVE-860 URL: https://issues.apache.org/jira/browse/HIVE-860 Project: Hive Issue Type: Improvement Affects Versions: 0.12.0 Reporter: Zheng Shao Assignee: Brock Noland Fix For: 0.14.0 Attachments: HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch DistributedCache is shared across multiple jobs, if the hdfs file name is the same. We need to make sure Hive puts the same file into the same location every time and does not overwrite it if the file content is the same. We can achieve 2 different results: A1. Files added with the same name, timestamp, and md5 in the same session will have a single copy in distributed cache. A2. Files added with the same name, timestamp, and md5 will have a single copy in distributed cache. A2 has a bigger benefit in sharing but may raise a question on when Hive should clean it up in hdfs. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
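The A1/A2 sharing rules above boil down to a content-derived cache key: files agreeing on name, timestamp, and md5 map to the same location and therefore share one copy. A hypothetical sketch of such a key (class name and key format are invented for illustration, not taken from the patch):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of the sharing rule in HIVE-860: two added files land in
// the same distributed-cache location only when name, timestamp, and md5 all
// agree, so identical files share a single copy across jobs.
public class CacheKey {

    static String keyFor(String name, long timestamp, String md5) {
        try {
            MessageDigest d = MessageDigest.getInstance("MD5");
            d.update((name + "|" + timestamp + "|" + md5).getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : d.digest()) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();  // stable path suffix: same inputs, same location
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);  // MD5 is always present in the JDK
        }
    }

    public static void main(String[] args) {
        System.out.println(keyFor("custom-udf.jar", 1404172800000L, "d41d8cd98f00b204e9800998ecf8427e"));
    }
}
```

Because the key is deterministic, re-adding an unchanged file is a no-op; the open question the issue raises (when to clean up shared files in HDFS) is about the lifetime of these keyed locations.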
[jira] [Updated] (HIVE-7292) Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Hammerbacher updated HIVE-7292: Description: Spark as an open-source data analytics cluster computing framework has gained significant momentum recently. Many Hive users already have Spark installed as their computing backbone. To take advantages of Hive, they still need to have either MapReduce or Tez on their cluster. This initiative will provide user a new alternative so that those user can consolidate their backend. Secondly, providing such an alternative further increases Hive's adoption as it exposes Spark users to a viable, feature-rich de facto standard SQL tools on Hadoop. Finally, allowing Hive to run on Spark also has performance benefits. Hive queries, especially those involving multiple reducer stages, will run faster, thus improving user experience as Tez does. This is an umbrella JIRA which will cover many coming subtask. Design doc will be attached here shortly, and will be on the wiki as well. Feedback from the community is greatly appreciated! was: Spark as an open-source data analytics cluster computing framework has gained significant momentum recently. Many Hive users already have Spark installed as their computing backbone. To take advantages of Hive, they still need to have either MapReduce or Tez on their cluster. This initiative will provide user a new alternative so that those user can consolidate their backend. Secondly, providing such an alternative further increases Hive's adoption as it exposes Spark users to a viable, feature-rich de facto standard SQL tools on Hadoop. Finally, allowing Hive to run on Spark also has performance benefits. Hive queries, especially those involving multiple reducer stages, will run faster, thus improving user experience as Tez does. This is an umber JIRA which will cover many coming subtask. Design doc will be attached here shortly, and will be on the wiki as well. 
Feedback from the community is greatly appreciated! Hive on Spark - Key: HIVE-7292 URL: https://issues.apache.org/jira/browse/HIVE-7292 Project: Hive Issue Type: Improvement Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: Hive-on-Spark.pdf Spark as an open-source data analytics cluster computing framework has gained significant momentum recently. Many Hive users already have Spark installed as their computing backbone. To take advantages of Hive, they still need to have either MapReduce or Tez on their cluster. This initiative will provide user a new alternative so that those user can consolidate their backend. Secondly, providing such an alternative further increases Hive's adoption as it exposes Spark users to a viable, feature-rich de facto standard SQL tools on Hadoop. Finally, allowing Hive to run on Spark also has performance benefits. Hive queries, especially those involving multiple reducer stages, will run faster, thus improving user experience as Tez does. This is an umbrella JIRA which will cover many coming subtask. Design doc will be attached here shortly, and will be on the wiki as well. Feedback from the community is greatly appreciated! -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7292) Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14048936#comment-14048936 ] niraj rai commented on HIVE-7292: - I am in OOO, so, the replying to the email might get delayed. Please reach out to me at (408) 799-8605 if you need something urgent. Regards Niraj Hive on Spark - Key: HIVE-7292 URL: https://issues.apache.org/jira/browse/HIVE-7292 Project: Hive Issue Type: Improvement Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: Hive-on-Spark.pdf Spark as an open-source data analytics cluster computing framework has gained significant momentum recently. Many Hive users already have Spark installed as their computing backbone. To take advantages of Hive, they still need to have either MapReduce or Tez on their cluster. This initiative will provide user a new alternative so that those user can consolidate their backend. Secondly, providing such an alternative further increases Hive's adoption as it exposes Spark users to a viable, feature-rich de facto standard SQL tools on Hadoop. Finally, allowing Hive to run on Spark also has performance benefits. Hive queries, especially those involving multiple reducer stages, will run faster, thus improving user experience as Tez does. This is an umbrella JIRA which will cover many coming subtask. Design doc will be attached here shortly, and will be on the wiki as well. Feedback from the community is greatly appreciated! -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7090) Support session-level temporary tables in Hive
[ https://issues.apache.org/jira/browse/HIVE-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049027#comment-14049027 ] Jason Dere commented on HIVE-7090: -- I don't think the failure in TestHCatLoader is related, this passes locally for me Support session-level temporary tables in Hive -- Key: HIVE-7090 URL: https://issues.apache.org/jira/browse/HIVE-7090 Project: Hive Issue Type: Bug Components: SQL Reporter: Gunther Hagleitner Assignee: Jason Dere Attachments: HIVE-7090.1.patch, HIVE-7090.2.patch, HIVE-7090.3.patch, HIVE-7090.4.patch, HIVE-7090.5.patch, HIVE-7090.6.patch It's common to see sql scripts that create some temporary table as an intermediate result, run some additional queries against it and then clean up at the end. We should support temporary tables properly, meaning automatically manage the life cycle and make sure the visibility is restricted to the creating connection/session. Without these it's common to see left over tables in meta-store or weird errors with clashing tmp table names. Proposed syntax: CREATE TEMPORARY TABLE CTAS, CTL, INSERT INTO, should all be supported as usual. Knowing that a user wants a temp table can enable us to further optimize access to it. E.g.: temp tables should be kept in memory where possible, compactions and merging table files aren't required, ... -- This message was sent by Atlassian JIRA (v6.2#6252)
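The session-scoped life cycle described in HIVE-7090 can be sketched as a per-session registry (a hypothetical illustration, not Hive's actual SessionState implementation): names resolve only within the creating session and are dropped wholesale when it closes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of session-scoped temp-table bookkeeping: each session
// keeps its own name -> location map, so names cannot clash across sessions
// and everything is cleaned up when the session closes.
public class TempTableRegistry {

    private final Map<String, Map<String, String>> bySession = new HashMap<>();

    public void create(String session, String table, String location) {
        bySession.computeIfAbsent(session, s -> new HashMap<>()).put(table, location);
    }

    public String lookup(String session, String table) {
        // Tables created by other sessions are simply invisible here.
        return bySession.getOrDefault(session, Map.of()).get(table);
    }

    public void closeSession(String session) {
        bySession.remove(session);  // automatic cleanup: no leftover metastore entries
    }

    public static void main(String[] args) {
        TempTableRegistry r = new TempTableRegistry();
        r.create("session-1", "tmp_result", "/tmp/session-1/tmp_result");
        System.out.println(r.lookup("session-1", "tmp_result"));  // visible here
        System.out.println(r.lookup("session-2", "tmp_result"));  // invisible elsewhere
    }
}
```

Resolving temp-table names through such a registry before falling back to the metastore is one way to get both the isolation and the automatic cleanup the issue asks for.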
[jira] [Commented] (HIVE-7090) Support session-level temporary tables in Hive
[ https://issues.apache.org/jira/browse/HIVE-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049026#comment-14049026 ] Jason Dere commented on HIVE-7090: -- [~brocknoland], does the patch look okay? Support session-level temporary tables in Hive -- Key: HIVE-7090 URL: https://issues.apache.org/jira/browse/HIVE-7090 Project: Hive Issue Type: Bug Components: SQL Reporter: Gunther Hagleitner Assignee: Jason Dere Attachments: HIVE-7090.1.patch, HIVE-7090.2.patch, HIVE-7090.3.patch, HIVE-7090.4.patch, HIVE-7090.5.patch, HIVE-7090.6.patch It's common to see sql scripts that create some temporary table as an intermediate result, run some additional queries against it and then clean up at the end. We should support temporary tables properly, meaning automatically manage the life cycle and make sure the visibility is restricted to the creating connection/session. Without these it's common to see left over tables in meta-store or weird errors with clashing tmp table names. Proposed syntax: CREATE TEMPORARY TABLE CTAS, CTL, INSERT INTO, should all be supported as usual. Knowing that a user wants a temp table can enable us to further optimize access to it. E.g.: temp tables should be kept in memory where possible, compactions and merging table files aren't required, ... -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7205) Wrong results when union all of grouping followed by group by with correlation optimization
[ https://issues.apache.org/jira/browse/HIVE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049060#comment-14049060 ] Hive QA commented on HIVE-7205: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12651916/HIVE-7205.2.patch.txt {color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 5657 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer11 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer14 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer15 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_multiMapJoin2 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/648/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/648/console Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-648/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 16 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12651916 Wrong results when union all of grouping followed by group by with correlation optimization --- Key: HIVE-7205 URL: https://issues.apache.org/jira/browse/HIVE-7205 Project: Hive Issue Type: Bug Affects Versions: 0.12.0, 0.13.0, 0.13.1 Reporter: dima machlin Assignee: Navis Priority: Critical Attachments: HIVE-7205.1.patch.txt, HIVE-7205.2.patch.txt use case: table TBL (a string, b string) contains the single row 'a','a'. The following query: {code:sql} select b, sum(cc) from ( select b,count(1) as cc from TBL group by b union all select a as b,count(1) as cc from TBL group by a ) z group by b {code} returns two rows, 'a 1' and 'a 1', when hive.optimize.correlation=true; if we set hive.optimize.correlation=false, it returns the correct result: 'a 2'. The plan with correlation optimization: {code:sql} ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_UNION (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL b (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL a) b) (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL a) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION sum (TOK_TABLE_OR_COL cc (TOK_GROUPBY (TOK_TABLE_OR_COL b STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: 
Stage: Stage-1 Map Reduce Alias -> Map Operator Tree: null-subquery1:z-subquery1:TBL TableScan alias: TBL Select Operator expressions: expr: b type: string outputColumnNames: b Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: b type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key expressions:
[jira] [Commented] (HIVE-7307) Lack of synchronization for TxnHandler#getDbConn()
[ https://issues.apache.org/jira/browse/HIVE-7307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049066#comment-14049066 ] Alan Gates commented on HIVE-7307: -- [~ted_yu], not sure why we should be synchronizing calls to connPool. Are you worried that they are not thread safe? Lack of synchronization for TxnHandler#getDbConn() -- Key: HIVE-7307 URL: https://issues.apache.org/jira/browse/HIVE-7307 Project: Hive Issue Type: Bug Reporter: Ted Yu Priority: Minor TxnHandler#getDbConn() accesses connPool without holding lock on TxnHandler.class {code} Connection dbConn = connPool.getConnection(); dbConn.setAutoCommit(false); {code} null check should be performed on the return value, dbConn. -- This message was sent by Atlassian JIRA (v6.2#6252)
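The null check requested in the report can be sketched as follows. ConnPool here is a hypothetical stand-in interface, not the real connection pool used by TxnHandler:

```java
// Hypothetical sketch of the null check suggested in HIVE-7307. ConnPool is a
// stand-in functional interface; a real pool would return java.sql.Connection.
public class SafeGetConn {

    interface ConnPool {
        Object getConnection();
    }

    static Object getDbConn(ConnPool pool) {
        Object conn = pool.getConnection();
        if (conn == null) {
            // Fail fast instead of hitting a NullPointerException later,
            // e.g. on dbConn.setAutoCommit(false).
            throw new IllegalStateException("connection pool returned null");
        }
        return conn;
    }

    public static void main(String[] args) {
        System.out.println(getDbConn(() -> "mock-connection"));
    }
}
```

Whether synchronization is also needed depends on the pool: most JDBC connection pools hand out connections thread-safely, which is the point Alan Gates raises above.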
[jira] [Resolved] (HIVE-7307) Lack of synchronization for TxnHandler#getDbConn()
[ https://issues.apache.org/jira/browse/HIVE-7307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu resolved HIVE-7307. -- Resolution: Later Lack of synchronization for TxnHandler#getDbConn() -- Key: HIVE-7307 URL: https://issues.apache.org/jira/browse/HIVE-7307 Project: Hive Issue Type: Bug Reporter: Ted Yu Priority: Minor TxnHandler#getDbConn() accesses connPool without holding lock on TxnHandler.class {code} Connection dbConn = connPool.getConnection(); dbConn.setAutoCommit(false); {code} null check should be performed on the return value, dbConn. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7090) Support session-level temporary tables in Hive
[ https://issues.apache.org/jira/browse/HIVE-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049124#comment-14049124 ] Alan Gates commented on HIVE-7090: -- I believe we need to solve the views issue, as being able to create a view on a table when others can see the view and not the table is bogus. Other than that I'm +1 on the patch. Support session-level temporary tables in Hive -- Key: HIVE-7090 URL: https://issues.apache.org/jira/browse/HIVE-7090 Project: Hive Issue Type: Bug Components: SQL Reporter: Gunther Hagleitner Assignee: Jason Dere Attachments: HIVE-7090.1.patch, HIVE-7090.2.patch, HIVE-7090.3.patch, HIVE-7090.4.patch, HIVE-7090.5.patch, HIVE-7090.6.patch It's common to see sql scripts that create some temporary table as an intermediate result, run some additional queries against it and then clean up at the end. We should support temporary tables properly, meaning automatically manage the life cycle and make sure the visibility is restricted to the creating connection/session. Without these it's common to see left over tables in meta-store or weird errors with clashing tmp table names. Proposed syntax: CREATE TEMPORARY TABLE CTAS, CTL, INSERT INTO, should all be supported as usual. Knowing that a user wants a temp table can enable us to further optimize access to it. E.g.: temp tables should be kept in memory where possible, compactions and merging table files aren't required, ... -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Issue Comment Deleted] (HIVE-7292) Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-7292: --- Comment: was deleted (was: I am in OOO, so, the replying to the email might get delayed. Please reach out to me at (408) 799-8605 if you need something urgent. Regards Niraj ) Hive on Spark - Key: HIVE-7292 URL: https://issues.apache.org/jira/browse/HIVE-7292 Project: Hive Issue Type: Improvement Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: Hive-on-Spark.pdf Spark as an open-source data analytics cluster computing framework has gained significant momentum recently. Many Hive users already have Spark installed as their computing backbone. To take advantages of Hive, they still need to have either MapReduce or Tez on their cluster. This initiative will provide user a new alternative so that those user can consolidate their backend. Secondly, providing such an alternative further increases Hive's adoption as it exposes Spark users to a viable, feature-rich de facto standard SQL tools on Hadoop. Finally, allowing Hive to run on Spark also has performance benefits. Hive queries, especially those involving multiple reducer stages, will run faster, thus improving user experience as Tez does. This is an umbrella JIRA which will cover many coming subtask. Design doc will be attached here shortly, and will be on the wiki as well. Feedback from the community is greatly appreciated! -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7292) Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049138#comment-14049138 ] niraj rai commented on HIVE-7292: - I am in OOO, so, the replying to the email might get delayed. Hive on Spark - Key: HIVE-7292 URL: https://issues.apache.org/jira/browse/HIVE-7292 Project: Hive Issue Type: Improvement Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: Hive-on-Spark.pdf Spark as an open-source data analytics cluster computing framework has gained significant momentum recently. Many Hive users already have Spark installed as their computing backbone. To take advantages of Hive, they still need to have either MapReduce or Tez on their cluster. This initiative will provide user a new alternative so that those user can consolidate their backend. Secondly, providing such an alternative further increases Hive's adoption as it exposes Spark users to a viable, feature-rich de facto standard SQL tools on Hadoop. Finally, allowing Hive to run on Spark also has performance benefits. Hive queries, especially those involving multiple reducer stages, will run faster, thus improving user experience as Tez does. This is an umbrella JIRA which will cover many coming subtask. Design doc will be attached here shortly, and will be on the wiki as well. Feedback from the community is greatly appreciated! -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7090) Support session-level temporary tables in Hive
[ https://issues.apache.org/jira/browse/HIVE-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049146#comment-14049146 ] Jason Dere commented on HIVE-7090: -- Will add the view check as a followup item, I think this can be done during semantic analysis of the view creation. Support session-level temporary tables in Hive -- Key: HIVE-7090 URL: https://issues.apache.org/jira/browse/HIVE-7090 Project: Hive Issue Type: Bug Components: SQL Reporter: Gunther Hagleitner Assignee: Jason Dere Attachments: HIVE-7090.1.patch, HIVE-7090.2.patch, HIVE-7090.3.patch, HIVE-7090.4.patch, HIVE-7090.5.patch, HIVE-7090.6.patch It's common to see sql scripts that create some temporary table as an intermediate result, run some additional queries against it and then clean up at the end. We should support temporary tables properly, meaning automatically manage the life cycle and make sure the visibility is restricted to the creating connection/session. Without these it's common to see left over tables in meta-store or weird errors with clashing tmp table names. Proposed syntax: CREATE TEMPORARY TABLE CTAS, CTL, INSERT INTO, should all be supported as usual. Knowing that a user wants a temp table can enable us to further optimize access to it. E.g.: temp tables should be kept in memory where possible, compactions and merging table files aren't required, ... -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7282) HCatLoader fail to load Orc map with null key
[ https://issues.apache.org/jira/browse/HIVE-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049228#comment-14049228 ] Sushanth Sowmyan commented on HIVE-7282: While this protects the difference between orc and rcfile from HCat, HIVE-5020 is about the differences in behaviour between rcfile and orc in how they handle nulls in maps, and should not be closed until hive has a consistent behaviour. I would actually prefer to solve this in a consistent manner in hive before applying this to hcat, as explained in comments in that jira. I'll try to revive the discussion there. HCatLoader fail to load Orc map with null key - Key: HIVE-7282 URL: https://issues.apache.org/jira/browse/HIVE-7282 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Assignee: Daniel Dai Fix For: 0.14.0 Attachments: HIVE-7282-1.patch, HIVE-7282-2.patch Here is the stack: Get exception: AttemptID:attempt_1403634189382_0011_m_00_0 Info:Error: org.apache.pig.backend.executionengine.ExecException: ERROR 6018: Error converting read value to tuple at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:76) at org.apache.hive.hcatalog.pig.HCatLoader.getNext(HCatLoader.java:58) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211) at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533) at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80) at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1557) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162) Caused by: java.lang.NullPointerException at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToPigMap(PigHCatUtil.java:469) at org.apache.hive.hcatalog.pig.PigHCatUtil.extractPigObject(PigHCatUtil.java:404) at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:456) at org.apache.hive.hcatalog.pig.PigHCatUtil.transformToTuple(PigHCatUtil.java:374) at org.apache.hive.hcatalog.pig.HCatBaseLoader.getNext(HCatBaseLoader.java:64) ... 13 more -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7127) Handover more details on exception in hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049232#comment-14049232 ] Hive QA commented on HIVE-7127: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12652341/HIVE-7127.5.patch.txt {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5657 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/649/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/649/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-649/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12652341 Handover more details on exception in hiveserver2 - Key: HIVE-7127 URL: https://issues.apache.org/jira/browse/HIVE-7127 Project: Hive Issue Type: Improvement Components: JDBC Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-7127.1.patch.txt, HIVE-7127.2.patch.txt, HIVE-7127.4.patch.txt, HIVE-7127.5.patch.txt Currently, JDBC hands over exception message and error codes. But it's not helpful for debugging. 
{noformat}
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: ParseException line 1:0 cannot recognize input near 'createa' 'asd' 'EOF'
 at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:121)
 at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:109)
 at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:231)
 at org.apache.hive.beeline.Commands.execute(Commands.java:736)
 at org.apache.hive.beeline.Commands.sql(Commands.java:657)
 at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:889)
 at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:744)
 at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:459)
 at org.apache.hive.beeline.BeeLine.main(BeeLine.java:442)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
{noformat}
With this patch, the JDBC client can get more details from hiveserver2.
{noformat}
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: ParseException line 1:0 cannot recognize input near 'createa' 'asd' 'EOF'
 at org.apache.hive.service.cli.operation.SQLOperation.prepare(Unknown Source)
 at org.apache.hive.service.cli.operation.SQLOperation.run(Unknown Source)
 at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(Unknown Source)
 at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(Unknown Source)
 at org.apache.hive.service.cli.CLIService.executeStatementAsync(Unknown Source)
 at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(Unknown Source)
 at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(Unknown Source)
 at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(Unknown Source)
 at org.apache.thrift.ProcessFunction.process(Unknown Source)
 at org.apache.thrift.TBaseProcessor.process(Unknown Source)
 at org.apache.hive.service.auth.TSetIpAddressProcessor.process(Unknown Source)
 at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
 at java.lang.Thread.run(Unknown Source)
{noformat}
-- This message was sent by Atlassian JIRA (v6.2#6252)
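Once the server hands over the cause chain as above, a client can surface it by walking `getCause()`. The sketch below is illustrative only: it uses plain `java.sql.SQLException` in place of `HiveSQLException` so it runs without Hive on the classpath, and the message strings are taken from the example above.

```java
import java.sql.SQLException;

// Minimal sketch of how a JDBC client can surface a server-side cause chain.
// SQLException stands in for HiveSQLException (illustrative, not Hive's API).
public class CauseChainDemo {
    public static void main(String[] args) {
        SQLException serverSide =
            new SQLException("ParseException line 1:0 cannot recognize input near 'createa'");
        SQLException clientSide =
            new SQLException("Error while compiling statement", serverSide);

        // Walk the chain from the client-visible exception down to the root cause.
        for (Throwable t = clientSide; t != null; t = t.getCause()) {
            System.out.println(t.getMessage());
        }
    }
}
```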
[jira] [Commented] (HIVE-5020) HCat reading null-key map entries causes NPE
[ https://issues.apache.org/jira/browse/HIVE-5020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049238#comment-14049238 ] Sushanth Sowmyan commented on HIVE-5020: Sorry for the late response to this JIRA, and thanks for the input, all. I'd initially wanted to give it time for more people to respond, and then this fell by the wayside. Thrift structures do not support null map keys. I agree that sortedness is not important for maps, and in fact, we should not guarantee it for something that's just called a map. And while I'd like to see a use case for null keys supported, it looks like the conventional Hive semantics for maps ignore null keys, and changing things so that RCFile users suddenly start getting null keys is a recipe for trouble for a lot of users. So having ORC match RCFile behaviour, and making that the standard Hive behaviour, might make more sense. [~owen.omalley]/[~prasanth_j], could you comment on what you think the impact of changing ORC behaviour that way might be? HCat should adopt whatever behaviour we standardize on for Hive, and can follow after that. HCat reading null-key map entries causes NPE Key: HIVE-5020 URL: https://issues.apache.org/jira/browse/HIVE-5020 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Currently, if someone has a null key in a map, HCatInputFormat will terminate with an NPE while trying to read it.
{noformat}
java.lang.NullPointerException
 at java.lang.String.compareTo(String.java:1167)
 at java.lang.String.compareTo(String.java:92)
 at java.util.TreeMap.put(TreeMap.java:545)
 at org.apache.hcatalog.data.HCatRecordSerDe.serializeMap(HCatRecordSerDe.java:222)
 at org.apache.hcatalog.data.HCatRecordSerDe.serializeField(HCatRecordSerDe.java:198)
 at org.apache.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:53)
 at org.apache.hcatalog.data.LazyHCatRecord.get(LazyHCatRecord.java:97)
 at org.apache.hcatalog.mapreduce.HCatRecordReader.nextKeyValue(HCatRecordReader.java:203)
{noformat}
This is because we use a TreeMap to preserve the order of elements in the map when reading from the underlying storage/serde. This problem is easily fixed in a number of ways:
a) Switch to HashMap, which allows null keys. That does not preserve the order of keys, which should not be important for map fields, but if we desire order, there is a solution for that too: LinkedHashMap, which would both retain order and allow us to insert null keys into the map.
b) Ignore null-keyed entries: check if the field we read is null, and if it is, ignore that item in the record altogether. This way, HCat is robust in what it does: it does not terminate with an NPE, and it does not pass on null keys in maps that might be problematic to layers above us that are not used to seeing nulls as keys in maps.
Why do I bring up the second fix? First, I bring it up because of the way we discovered this bug. When reading from an RCFile, we do not notice this bug. If the same query that produced the RCFile instead produces an Orcfile, and we try reading from it, we see this problem. RCFile seems to be quietly stripping any null-key entries, whereas Orc retains them. This is why we didn't notice this problem for a long while and suddenly, now, we do. Now, if we fix our code to allow null map keys through to layers above, we expose those layers to this change, which may then cause them to break. (Technically, this is stretching the case, because we already break now if they care.)
More importantly, though, we now have a case where the same data will be exposed differently depending on whether it was stored as ORC or as RCFile. And as a layer that is supposed to make storage invisible to the end user, HCat should attempt to provide some consistency in how data behaves to the end user.
Secondly, whether or not nulls should be supported as map keys seems to be almost a religious view. Some people see it from the perspective of a mapping, which lends itself to a "Sure, if we encounter a null, we map it to this other value" kind of view, whereas other people see it as a lookup index, which lends itself to a "Null as a key makes no sense - what kind of lookup do you expect to perform?" kind of view. Both views have their points, and it makes sense to see if we need to support it.
That said... there is another important concern at hand here: nulls in map keys might be due to bad data (corruption or loading error), and by stripping them, we might be silently hiding that from the user. So silent stripping is bad. This is an important point that does steer me towards the former approach, of passing it on to layers above, and
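The TreeMap-vs-LinkedHashMap trade-off behind option (a) can be demonstrated in plain JDK Java. This is an illustrative sketch, not HCat code: a TreeMap with natural String ordering throws an NPE on a null key (the failure in the stack trace above), while a LinkedHashMap both preserves insertion order and accepts a null key.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class NullKeyMapDemo {
    public static void main(String[] args) {
        // LinkedHashMap: keeps insertion order AND tolerates a null key.
        Map<String, String> ordered = new LinkedHashMap<>();
        ordered.put("k1", "v1");
        ordered.put(null, "orphan");        // accepted
        System.out.println(ordered.size()); // prints 2

        // TreeMap with natural ordering: null key triggers
        // String.compareTo(null) inside put(), i.e. an NPE.
        Map<String, String> sorted = new TreeMap<>();
        sorted.put("k1", "v1");
        try {
            sorted.put(null, "orphan");
        } catch (NullPointerException e) {
            System.out.println("TreeMap rejected null key");
        }
    }
}
```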
[jira] [Updated] (HIVE-7325) Support non-constant expressions for MAP type indices.
[ https://issues.apache.org/jira/browse/HIVE-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mala Chikka Kempanna updated HIVE-7325: --- Summary: Support non-constant expressions for MAP type indices. (was: Support non-constant expressions for MAP indexes.)
Support non-constant expressions for MAP type indices. -- Key: HIVE-7325 URL: https://issues.apache.org/jira/browse/HIVE-7325 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Mala Chikka Kempanna Fix For: 0.14.0
Here is my sample:
CREATE TABLE RECORD(RecordID string, BatchDate string, Country string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,D:BatchDate,D:Country") TBLPROPERTIES ("hbase.table.name" = "RECORD");
CREATE TABLE KEY_RECORD(KeyValue String, RecordId map<string,string>) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, K:") TBLPROPERTIES ("hbase.table.name" = "KEY_RECORD");
The following join statement doesn't work.
SELECT a.*, b.* from KEY_RECORD a join RECORD b WHERE a.RecordId[b.RecordID] is not null;
FAILED: SemanticException 2:16 Non-constant expression for map indexes not supported. Error encountered near token 'RecordID'
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7325) Support non-constant expressions for MAP indexes.
Mala Chikka Kempanna created HIVE-7325: -- Summary: Support non-constant expressions for MAP indexes. Key: HIVE-7325 URL: https://issues.apache.org/jira/browse/HIVE-7325 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Mala Chikka Kempanna Fix For: 0.14.0
Here is my sample:
CREATE TABLE RECORD(RecordID string, BatchDate string, Country string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,D:BatchDate,D:Country") TBLPROPERTIES ("hbase.table.name" = "RECORD");
CREATE TABLE KEY_RECORD(KeyValue String, RecordId map<string,string>) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, K:") TBLPROPERTIES ("hbase.table.name" = "KEY_RECORD");
The following join statement doesn't work.
SELECT a.*, b.* from KEY_RECORD a join RECORD b WHERE a.RecordId[b.RecordID] is not null;
FAILED: SemanticException 2:16 Non-constant expression for map indexes not supported. Error encountered near token 'RecordID'
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7325) Support non-constant expressions for MAP type indices.
[ https://issues.apache.org/jira/browse/HIVE-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-7325: -- Description: Here is my sample:
{code}
CREATE TABLE RECORD(RecordID string, BatchDate string, Country string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,D:BatchDate,D:Country") TBLPROPERTIES ("hbase.table.name" = "RECORD");
CREATE TABLE KEY_RECORD(KeyValue String, RecordId map<string,string>) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, K:") TBLPROPERTIES ("hbase.table.name" = "KEY_RECORD");
{code}
The following join statement doesn't work.
{code}
SELECT a.*, b.* from KEY_RECORD a join RECORD b WHERE a.RecordId[b.RecordID] is not null;
{code}
FAILED: SemanticException 2:16 Non-constant expression for map indexes not supported. Error encountered near token 'RecordID'
was: Here is my sample: CREATE TABLE RECORD(RecordID string, BatchDate string, Country string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,D:BatchDate,D:Country") TBLPROPERTIES ("hbase.table.name" = "RECORD"); CREATE TABLE KEY_RECORD(KeyValue String, RecordId map<string,string>) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, K:") TBLPROPERTIES ("hbase.table.name" = "KEY_RECORD"); The following join statement doesn't work. SELECT a.*, b.* from KEY_RECORD a join RECORD b WHERE a.RecordId[b.RecordID] is not null; FAILED: SemanticException 2:16 Non-constant expression for map indexes not supported. Error encountered near token 'RecordID'
Support non-constant expressions for MAP type indices. -- Key: HIVE-7325 URL: https://issues.apache.org/jira/browse/HIVE-7325 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Mala Chikka Kempanna Fix For: 0.14.0
Here is my sample:
{code}
CREATE TABLE RECORD(RecordID string, BatchDate string, Country string) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,D:BatchDate,D:Country") TBLPROPERTIES ("hbase.table.name" = "RECORD");
CREATE TABLE KEY_RECORD(KeyValue String, RecordId map<string,string>) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, K:") TBLPROPERTIES ("hbase.table.name" = "KEY_RECORD");
{code}
The following join statement doesn't work.
{code}
SELECT a.*, b.* from KEY_RECORD a join RECORD b WHERE a.RecordId[b.RecordID] is not null;
{code}
FAILED: SemanticException 2:16 Non-constant expression for map indexes not supported. Error encountered near token 'RecordID'
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-5976) Decouple input formats from STORED as keywords
[ https://issues.apache.org/jira/browse/HIVE-5976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049302#comment-14049302 ] Hive QA commented on HIVE-5976: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12653346/HIVE-5976.4.patch {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 5673 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_file_format org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_partition_wise_fileformat3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_storage_format_descriptor org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/650/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/650/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-650/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12653346 Decouple input formats from STORED as keywords -- Key: HIVE-5976 URL: https://issues.apache.org/jira/browse/HIVE-5976 Project: Hive Issue Type: Task Reporter: Brock Noland Assignee: Brock Noland Attachments: HIVE-5976.2.patch, HIVE-5976.3.patch, HIVE-5976.3.patch, HIVE-5976.4.patch, HIVE-5976.patch, HIVE-5976.patch, HIVE-5976.patch, HIVE-5976.patch As noted in HIVE-5783, we hard code the input formats mapped to keywords. It'd be nice if there was a registration system so we didn't need to do that. -- This message was sent by Atlassian JIRA (v6.2#6252)
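The registration system proposed in HIVE-5976 can be sketched roughly as below. All class and method names here are hypothetical illustrations of the idea (mapping STORED AS keywords to format descriptors instead of hard-coding them), not Hive's actual descriptor API; only the ORC format class names are real Hive classes, used as plain strings.

```java
import java.util.HashMap;
import java.util.Map;

public class StorageFormatRegistry {
    // A descriptor pairs the input and output format class names for a keyword.
    static final class Descriptor {
        final String inputFormat, outputFormat;
        Descriptor(String in, String out) { inputFormat = in; outputFormat = out; }
    }

    private final Map<String, Descriptor> formats = new HashMap<>();

    // Register a STORED AS keyword; lookup is case-insensitive.
    void register(String keyword, String inputFormat, String outputFormat) {
        formats.put(keyword.toUpperCase(), new Descriptor(inputFormat, outputFormat));
    }

    Descriptor lookup(String keyword) {
        return formats.get(keyword.toUpperCase());
    }

    public static void main(String[] args) {
        StorageFormatRegistry registry = new StorageFormatRegistry();
        registry.register("ORC",
            "org.apache.hadoop.hive.ql.io.orc.OrcInputFormat",
            "org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat");
        System.out.println(registry.lookup("orc").inputFormat);
    }
}
```

With such a registry, adding a new file format means one registration call rather than a parser change.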
[jira] [Commented] (HIVE-494) Select columns by index instead of name
[ https://issues.apache.org/jira/browse/HIVE-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049363#comment-14049363 ] Hive QA commented on HIVE-494: -- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12653349/HIVE-494.3.patch.txt {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 5675 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_optimization org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority org.apache.hive.hcatalog.streaming.TestStreaming.testRemainingTransactions org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/652/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/652/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-652/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12653349 Select columns by index instead of name --- Key: HIVE-494 URL: https://issues.apache.org/jira/browse/HIVE-494 Project: Hive Issue Type: Wish Components: Clients, Query Processor Reporter: Adam Kramer Assignee: Navis Priority: Minor Labels: SQL Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-494.D1641.1.patch, HIVE-494.2.patch.txt, HIVE-494.3.patch.txt, HIVE-494.D12153.1.patch SELECT mytable[0], mytable[2] FROM some_table_name mytable; ...should return the first and third columns, respectively, from mytable regardless of their column names. The need for names specifically is kind of silly when they just get translated into numbers anyway. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7262: --- Status: Patch Available (was: In Progress) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize -- Key: HIVE-7262 URL: https://issues.apache.org/jira/browse/HIVE-7262 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7262.1.patch, HIVE-7262.2.patch In ptf.q, create the part table with STORED AS ORC and SET hive.vectorized.execution.enabled=true; Queries fail to find the BLOCK__OFFSET__INSIDE__FILE virtual column during vectorization and suffer an exception. ERROR vector.VectorizationContext (VectorizationContext.java:getInputColumnIndex(186)) - The column BLOCK__OFFSET__INSIDE__FILE is not in the vectorization context column map. Jitendra pointed to the routine that returns the VectorizationContext in Vectorize.java needing to add virtual columns to the map, too. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7262: --- Attachment: HIVE-7262.2.patch Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize -- Key: HIVE-7262 URL: https://issues.apache.org/jira/browse/HIVE-7262 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7262.1.patch, HIVE-7262.2.patch In ptf.q, create the part table with STORED AS ORC and SET hive.vectorized.execution.enabled=true; Queries fail to find the BLOCK__OFFSET__INSIDE__FILE virtual column during vectorization and suffer an exception. ERROR vector.VectorizationContext (VectorizationContext.java:getInputColumnIndex(186)) - The column BLOCK__OFFSET__INSIDE__FILE is not in the vectorization context column map. Jitendra pointed to the routine that returns the VectorizationContext in Vectorize.java needing to add virtual columns to the map, too. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7262: --- Status: In Progress (was: Patch Available) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize -- Key: HIVE-7262 URL: https://issues.apache.org/jira/browse/HIVE-7262 Project: Hive Issue Type: Bug Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7262.1.patch, HIVE-7262.2.patch In ptf.q, create the part table with STORED AS ORC and SET hive.vectorized.execution.enabled=true; Queries fail to find the BLOCK__OFFSET__INSIDE__FILE virtual column during vectorization and suffer an exception. ERROR vector.VectorizationContext (VectorizationContext.java:getInputColumnIndex(186)) - The column BLOCK__OFFSET__INSIDE__FILE is not in the vectorization context column map. Jitendra pointed to the routine that returns the VectorizationContext in Vectorize.java needing to add virtual columns to the map, too. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-7326) Hive complains invalid column reference with group by having aggregate predicates
Hari Sankar Sivarama Subramaniyan created HIVE-7326: --- Summary: Hive complains invalid column reference with group by having aggregate predicates Key: HIVE-7326 URL: https://issues.apache.org/jira/browse/HIVE-7326 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan CREATE TABLE TestV1_Staples ( Item_Count INT, Ship_Priority STRING, Order_Priority STRING, Order_Status STRING, Order_Quantity DOUBLE, Sales_Total DOUBLE, Discount DOUBLE, Tax_Rate DOUBLE, Ship_Mode STRING, Fill_Time DOUBLE, Gross_Profit DOUBLE, Price DOUBLE, Ship_Handle_Cost DOUBLE, Employee_Name STRING, Employee_Dept STRING, Manager_Name STRING, Employee_Yrs_Exp DOUBLE, Employee_Salary DOUBLE, Customer_Name STRING, Customer_State STRING, Call_Center_Region STRING, Customer_Balance DOUBLE, Customer_Segment STRING, Prod_Type1 STRING, Prod_Type2 STRING, Prod_Type3 STRING, Prod_Type4 STRING, Product_Name STRING, Product_Container STRING, Ship_Promo STRING, Supplier_Name STRING, Supplier_Balance DOUBLE, Supplier_Region STRING, Supplier_State STRING, Order_ID STRING, Order_Year INT, Order_Month INT, Order_Day INT, Order_Date_ STRING, Order_Quarter STRING, Product_Base_Margin DOUBLE, Product_ID STRING, Receive_Time DOUBLE, Received_Date_ STRING, Ship_Date_ STRING, Ship_Charge DOUBLE, Total_Cycle_Time DOUBLE, Product_In_Stock STRING, PID INT, Market_Segment STRING ); Query that works: SELECT customer_name, SUM(customer_balance), SUM(order_quantity) FROM default.testv1_staples s1 GROUP BY customer_name HAVING ( (COUNT(s1.discount) = 822) AND (SUM(customer_balance) = 4074689.00041) ); Query that fails: SELECT customer_name, SUM(customer_balance), SUM(order_quantity) FROM default.testv1_staples s1 GROUP BY customer_name HAVING ( (SUM(customer_balance) = 4074689.00041) AND (COUNT(s1.discount) = 822) ); -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7326) Hive complains invalid column reference with 'having' aggregate predicates
[ https://issues.apache.org/jira/browse/HIVE-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-7326: Summary: Hive complains invalid column reference with 'having' aggregate predicates (was: Hive complains invalid column reference with group by having aggregate predicates) Hive complains invalid column reference with 'having' aggregate predicates -- Key: HIVE-7326 URL: https://issues.apache.org/jira/browse/HIVE-7326 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan CREATE TABLE TestV1_Staples ( Item_Count INT, Ship_Priority STRING, Order_Priority STRING, Order_Status STRING, Order_Quantity DOUBLE, Sales_Total DOUBLE, Discount DOUBLE, Tax_Rate DOUBLE, Ship_Mode STRING, Fill_Time DOUBLE, Gross_Profit DOUBLE, Price DOUBLE, Ship_Handle_Cost DOUBLE, Employee_Name STRING, Employee_Dept STRING, Manager_Name STRING, Employee_Yrs_Exp DOUBLE, Employee_Salary DOUBLE, Customer_Name STRING, Customer_State STRING, Call_Center_Region STRING, Customer_Balance DOUBLE, Customer_Segment STRING, Prod_Type1 STRING, Prod_Type2 STRING, Prod_Type3 STRING, Prod_Type4 STRING, Product_Name STRING, Product_Container STRING, Ship_Promo STRING, Supplier_Name STRING, Supplier_Balance DOUBLE, Supplier_Region STRING, Supplier_State STRING, Order_ID STRING, Order_Year INT, Order_Month INT, Order_Day INT, Order_Date_ STRING, Order_Quarter STRING, Product_Base_Margin DOUBLE, Product_ID STRING, Receive_Time DOUBLE, Received_Date_ STRING, Ship_Date_ STRING, Ship_Charge DOUBLE, Total_Cycle_Time DOUBLE, Product_In_Stock STRING, PID INT, Market_Segment STRING ); Query that works: SELECT customer_name, SUM(customer_balance), SUM(order_quantity) FROM default.testv1_staples s1 GROUP BY customer_name HAVING ( (COUNT(s1.discount) = 822) AND (SUM(customer_balance) = 4074689.00041) ); Query that fails: SELECT customer_name, SUM(customer_balance), SUM(order_quantity) FROM 
default.testv1_staples s1 GROUP BY customer_name HAVING ( (SUM(customer_balance) = 4074689.00041) AND (COUNT(s1.discount) = 822) ); -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7294) sql std auth - authorize show grant statements
[ https://issues.apache.org/jira/browse/HIVE-7294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049471#comment-14049471 ] Hive QA commented on HIVE-7294: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12653354/HIVE-7294.2.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5663 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/654/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/654/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-654/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12653354 sql std auth - authorize show grant statements -- Key: HIVE-7294 URL: https://issues.apache.org/jira/browse/HIVE-7294 Project: Hive Issue Type: Bug Components: Authorization, SQLStandardAuthorization Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-7294.1.patch, HIVE-7294.2.patch A non-admin user should be allowed to run show grant commands only for themselves or roles they belong to. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7127) Handover more details on exception in hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7127: Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks Szehon Ho, for the review! Handover more details on exception in hiveserver2 - Key: HIVE-7127 URL: https://issues.apache.org/jira/browse/HIVE-7127 Project: Hive Issue Type: Improvement Components: JDBC Reporter: Navis Assignee: Navis Priority: Trivial Fix For: 0.14.0 Attachments: HIVE-7127.1.patch.txt, HIVE-7127.2.patch.txt, HIVE-7127.4.patch.txt, HIVE-7127.5.patch.txt Currently, JDBC hands over exception message and error codes. But it's not helpful for debugging. {noformat} org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: ParseException line 1:0 cannot recognize input near 'createa' 'asd' 'EOF' at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:121) at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:109) at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:231) at org.apache.hive.beeline.Commands.execute(Commands.java:736) at org.apache.hive.beeline.Commands.sql(Commands.java:657) at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:889) at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:744) at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:459) at org.apache.hive.beeline.BeeLine.main(BeeLine.java:442) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:160) {noformat} With this patch, JDBC client can get more details on hiveserver2. 
{noformat} Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: ParseException line 1:0 cannot recognize input near 'createa' 'asd' 'EOF' at org.apache.hive.service.cli.operation.SQLOperation.prepare(Unknown Source) at org.apache.hive.service.cli.operation.SQLOperation.run(Unknown Source) at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(Unknown Source) at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(Unknown Source) at org.apache.hive.service.cli.CLIService.executeStatementAsync(Unknown Source) at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(Unknown Source) at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(Unknown Source) at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(Unknown Source) at org.apache.thrift.ProcessFunction.process(Unknown Source) at org.apache.thrift.TBaseProcessor.process(Unknown Source) at org.apache.hive.service.auth.TSetIpAddressProcessor.process(Unknown Source) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7314) Wrong results of UDF when hive.cache.expr.evaluation is set
[ https://issues.apache.org/jira/browse/HIVE-7314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049510#comment-14049510 ] Ashutosh Chauhan commented on HIVE-7314: +1 Wrong results of UDF when hive.cache.expr.evaluation is set --- Key: HIVE-7314 URL: https://issues.apache.org/jira/browse/HIVE-7314 Project: Hive Issue Type: Bug Affects Versions: 0.12.0, 0.13.0, 0.13.1 Reporter: dima machlin Assignee: Navis Attachments: HIVE-7314.1.patch.txt It seems that the expression caching doesn't work when using a UDF inside another UDF or a Hive function. For example: tbl has one row: 'a','b'. The following query: {code:sql} select concat(custUDF(a),' ', custUDF(b)) from tbl; {code} returns 'a a'; it seems to cache custUDF(a) and reuse it for custUDF(b). The same query without the concat works fine. Replacing the concat with another custom UDF also returns 'a a'.
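A minimal sketch of the caching pitfall described above, with hypothetical names (this is not Hive's actual ExprNodeEvaluator cache): the fix is to key cached results on the full (expression, input) pair rather than on the evaluator alone, so custUDF(b) cannot reuse the value computed for custUDF(a).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class ExprCacheDemo {
    // The bug pattern: caching a result by the evaluator alone lets f(b)
    // wrongly reuse the value computed for f(a). A correct cache keys on
    // the full (expression, input) pair, as below.
    private final Map<String, String> cache = new HashMap<>();

    String evalCached(String exprName, String input, Function<String, String> udf) {
        // Key includes the input, so "custUDF(a)" and "custUDF(b)" are distinct entries.
        return cache.computeIfAbsent(exprName + "(" + input + ")", k -> udf.apply(input));
    }

    public static void main(String[] args) {
        ExprCacheDemo demo = new ExprCacheDemo();
        Function<String, String> identityUdf = s -> s;
        String a = demo.evalCached("custUDF", "a", identityUdf);
        String b = demo.evalCached("custUDF", "b", identityUdf);
        System.out.println(a + " " + b); // "a b", not the buggy "a a"
    }
}
```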
[jira] [Updated] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-7262: --- Issue Type: Sub-task (was: Bug) Parent: HIVE-7318 Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize -- Key: HIVE-7262 URL: https://issues.apache.org/jira/browse/HIVE-7262 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7262.1.patch, HIVE-7262.2.patch In ptf.q, create the part table with STORED AS ORC and SET hive.vectorized.execution.enabled=true; Queries fail to find the BLOCK__OFFSET__INSIDE__FILE virtual column during vectorization and suffer an exception. ERROR vector.VectorizationContext (VectorizationContext.java:getInputColumnIndex(186)) - The column BLOCK__OFFSET__INSIDE__FILE is not in the vectorization context column map. Jitendra pointed out that the routine in Vectorize.java that returns the VectorizationContext needs to add virtual columns to the map, too.
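A repro sketch along the lines described above might look like the following; the table and column names (part_orc, p_mfgr, p_name) are assumptions modeled on the part table used by ptf.q, not a verbatim excerpt from that test.

```sql
SET hive.vectorized.execution.enabled=true;

-- Assumed subset of the part table columns from ptf.q; STORED AS ORC makes
-- the query eligible for vectorization.
CREATE TABLE part_orc (p_partkey INT, p_name STRING, p_mfgr STRING) STORED AS ORC;

-- A windowed (PTF) query over the ORC table exercises the virtual-column
-- lookup that fails during vectorization.
SELECT p_mfgr, p_name,
       rank() OVER (PARTITION BY p_mfgr ORDER BY p_name) AS r
FROM part_orc;
```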
[jira] [Issue Comment Deleted] (HIVE-7292) Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-7292: --- Comment: was deleted (was: I am in OOO, so, the replying to the email might get delayed. ) Hive on Spark - Key: HIVE-7292 URL: https://issues.apache.org/jira/browse/HIVE-7292 Project: Hive Issue Type: Improvement Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: Hive-on-Spark.pdf Spark, as an open-source data analytics cluster computing framework, has gained significant momentum recently. Many Hive users already have Spark installed as their computing backbone. To take advantage of Hive, they still need to have either MapReduce or Tez on their cluster. This initiative will provide users a new alternative so that they can consolidate their backends. Second, providing such an alternative further increases Hive's adoption, as it exposes Spark users to a viable, feature-rich, de facto standard SQL tool on Hadoop. Finally, allowing Hive to run on Spark also has performance benefits: Hive queries, especially those involving multiple reducer stages, will run faster, improving the user experience as Tez does. This is an umbrella JIRA which will cover many coming subtasks. The design doc will be attached here shortly and will be on the wiki as well. Feedback from the community is greatly appreciated!
[jira] [Commented] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049519#comment-14049519 ] Hive QA commented on HIVE-7262: --- {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12653480/HIVE-7262.2.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5672 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/656/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/656/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-656/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12653480 Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize -- Key: HIVE-7262 URL: https://issues.apache.org/jira/browse/HIVE-7262 Project: Hive Issue Type: Sub-task Reporter: Matt McCline Assignee: Matt McCline Attachments: HIVE-7262.1.patch, HIVE-7262.2.patch In ptf.q, create the part table with STORED AS ORC and SET hive.vectorized.execution.enabled=true; Queries fail to find the BLOCK__OFFSET__INSIDE__FILE virtual column during vectorization and suffer an exception. ERROR vector.VectorizationContext (VectorizationContext.java:getInputColumnIndex(186)) - The column BLOCK__OFFSET__INSIDE__FILE is not in the vectorization context column map.
Jitendra pointed out that the routine in Vectorize.java that returns the VectorizationContext needs to add virtual columns to the map, too.
Re: Review Request 22996: HIVE-7090 Support session-level temporary tables in Hive
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22996/#review47170 --- Hey Jason, looks good! Nice work! I have a question or two below and a few nits. itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniMr.java https://reviews.apache.org/r/22996/#comment82798 When the error message does not contain the text we are looking for, putting the actual text in the assertion message is useful. That is, when this assertion fails we won't have any idea what the actual message was, so the person debugging will have to make a code change and re-run the test to see what happened. ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java https://reviews.apache.org/r/22996/#comment82794 I am sure this is a stupid question, but why are we subclassing HMSC? ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java https://reviews.apache.org/r/22996/#comment82782 nit: Is "Partition columns are not supported on temporary tables and source table in CREATE TABLE LIKE is partitioned." more clear? ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java https://reviews.apache.org/r/22996/#comment82783 It looks to me like these can be private since they are not accessed outside this class? ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java https://reviews.apache.org/r/22996/#comment82780 These // comments should be javadoc style. ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java https://reviews.apache.org/r/22996/#comment82779 I understand it's coded today such that these three conf.get() calls will not return null. However, I believe we should use Preconditions.checkNotNull here to ensure that once that assumption is no longer true we don't give the dev/user a terrible error message. ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java https://reviews.apache.org/r/22996/#comment82785 nit: Is "Cannot create directory" more clear?
ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java https://reviews.apache.org/r/22996/#comment82786 Setter is not being used. - Brock Noland On June 28, 2014, 12:35 a.m., Jason Dere wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22996/ --- (Updated June 28, 2014, 12:35 a.m.) Review request for hive, Gunther Hagleitner, Navis Ryu, and Harish Butani. Bugs: HIVE-7090 https://issues.apache.org/jira/browse/HIVE-7090 Repository: hive-git Description --- Temp tables managed in memory by SessionState. SessionHiveMetaStoreClient overrides table-related methods in HiveMetaStore to access the temp tables saved in the SessionState when appropriate. Diffs - itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcWithMiniMr.java 9fb7550 itests/qtest/testconfiguration.properties 1462ecd metastore/if/hive_metastore.thrift cc802c6 metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 9e8d912 ql/src/java/org/apache/hadoop/hive/ql/Context.java abc4290 ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java d8d900b ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 4d35176 ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 3df2690 ql/src/java/org/apache/hadoop/hive/ql/parse/ColumnStatsSemanticAnalyzer.java 1270520 ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g f934ac4 ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 71471f4 ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 83d09c0 ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableDesc.java 2537b75 ql/src/java/org/apache/hadoop/hive/ql/plan/CreateTableLikeDesc.java cb5d64c ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 2143d0c ql/src/test/org/apache/hadoop/hive/ql/exec/tez/TestTezTask.java 43125f7 ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 98c3cc3 
ql/src/test/org/apache/hadoop/hive/ql/parse/TestMacroSemanticAnalyzer.java 91de8da ql/src/test/org/apache/hadoop/hive/ql/parse/authorization/TestHiveAuthorizationTaskFactory.java 20d08b3 ql/src/test/queries/clientnegative/temp_table_authorize_create_tbl.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_column_stats.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_create_like_partitions.q PRE-CREATION ql/src/test/queries/clientnegative/temp_table_index.q PRE-CREATION
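The Preconditions.checkNotNull suggestion in the review above can be sketched as follows. The conf map and key are hypothetical stand-ins for the HiveConf lookups in SessionState, and java.util.Objects.requireNonNull is used here as the JDK analogue of Guava's Preconditions.checkNotNull, so the sketch stays self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class ConfGuardDemo {
    // Hypothetical stand-in for conf.get(key): fail fast with a descriptive
    // message instead of letting a null propagate into a confusing NPE later.
    static String requireConf(Map<String, String> conf, String key) {
        return Objects.requireNonNull(conf.get(key),
            "Configuration value for '" + key + "' is null");
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hive.exec.scratchdir", "/tmp/hive");
        // Succeeds for a present key; a missing key would throw a
        // NullPointerException naming the offending key.
        System.out.println(requireConf(conf, "hive.exec.scratchdir"));
    }
}
```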
[jira] [Updated] (HIVE-860) Persistent distributed cache
[ https://issues.apache.org/jira/browse/HIVE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-860: -- Attachment: HIVE-860.patch Running this one again. Some of the failures are quite strange. Persistent distributed cache Key: HIVE-860 URL: https://issues.apache.org/jira/browse/HIVE-860 Project: Hive Issue Type: Improvement Affects Versions: 0.12.0 Reporter: Zheng Shao Assignee: Brock Noland Fix For: 0.14.0 Attachments: HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch DistributedCache is shared across multiple jobs if the HDFS file name is the same. We need to make sure Hive puts the same file into the same location every time and does not overwrite it if the file content is the same. We can achieve two different results: A1. Files added with the same name, timestamp, and md5 in the same session will have a single copy in the distributed cache. A2. Files added with the same name, timestamp, and md5 will have a single copy in the distributed cache. A2 has a bigger benefit in sharing but may raise a question of when Hive should clean it up in HDFS.
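The sharing scheme described above (same name and md5 map to one copy) can be sketched as a content-addressed path: identical files land at the same HDFS location, so re-adding them is a no-op and the cache entry is reused across jobs. The directory layout below is a hypothetical illustration, not Hive's actual scheme.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class CachePathDemo {
    // Derive a stable cache location from the file name plus a content hash,
    // so identical files map to the same path and need not be re-uploaded.
    static String cachePath(String baseDir, String fileName, byte[] content) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) hex.append(String.format("%02x", b));
            return baseDir + "/" + hex + "/" + fileName;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is always available in the JDK
        }
    }

    public static void main(String[] args) {
        byte[] jar = "same bytes".getBytes(StandardCharsets.UTF_8);
        // Identical content yields an identical path, so a second "add" reuses it.
        System.out.println(cachePath("/user/hive/cache", "udf.jar", jar));
    }
}
```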
[jira] [Commented] (HIVE-7303) IllegalMonitorStateException when stmtHandle is null in HiveStatement
[ https://issues.apache.org/jira/browse/HIVE-7303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049558#comment-14049558 ] Brock Noland commented on HIVE-7303: Thank you [~navis]! Do you think we should implement the unwrap* functions? What would the use case be there? IllegalMonitorStateException when stmtHandle is null in HiveStatement - Key: HIVE-7303 URL: https://issues.apache.org/jira/browse/HIVE-7303 Project: Hive Issue Type: Bug Components: JDBC Reporter: Navis Attachments: HIVE-7303.1.patch.txt From http://www.mail-archive.com/dev@hive.apache.org/msg75617.html Unlock can be called even when it's not locked in some situations.
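The lock/unlock pairing problem can be sketched with a ReentrantLock; the field and method names below are hypothetical, not HiveStatement's actual members. Calling unlock() on a lock the current thread does not hold throws IllegalMonitorStateException, so the guard checks isHeldByCurrentThread() before unlocking.

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockPairingDemo {
    private final ReentrantLock transportLock = new ReentrantLock();

    // Unsafe pattern: an unlock() in a finally block that does not correspond
    // to a successful lock() throws IllegalMonitorStateException.
    // Safe pattern: only unlock when this thread actually holds the lock.
    void safeUnlock() {
        if (transportLock.isHeldByCurrentThread()) {
            transportLock.unlock();
        }
    }

    boolean runCriticalSection(Runnable body) {
        transportLock.lock();
        try {
            body.run();
            return true;
        } finally {
            safeUnlock(); // lock is held here, so this always pairs correctly
        }
    }

    public static void main(String[] args) {
        LockPairingDemo demo = new LockPairingDemo();
        demo.safeUnlock(); // no-op instead of IllegalMonitorStateException
        System.out.println(demo.runCriticalSection(() -> {}));
    }
}
```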
[jira] [Commented] (HIVE-7291) Refactor TestParser to understand test-property file
[ https://issues.apache.org/jira/browse/HIVE-7291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14049625#comment-14049625 ] Brock Noland commented on HIVE-7291: +1 Refactor TestParser to understand test-property file Key: HIVE-7291 URL: https://issues.apache.org/jira/browse/HIVE-7291 Project: Hive Issue Type: Sub-task Components: Testing Infrastructure Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-7291.2.patch, HIVE-7291.3.patch, HIVE-7291.4.patch, HIVE-7291.patch, trunk-mr2.properties NO PRECOMMIT TESTS