[jira] [Updated] (HIVE-3781) not all meta events call metastore event listener
[ https://issues.apache.org/jira/browse/HIVE-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3781: - Attachment: hive.3781.4.patch not all meta events call metastore event listener - Key: HIVE-3781 URL: https://issues.apache.org/jira/browse/HIVE-3781 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.9.0 Reporter: Sudhanshu Arora Assignee: Navis Attachments: hive.3781.3.patch, hive.3781.4.patch, HIVE-3781.D7731.1.patch, HIVE-3781.D7731.2.patch An event listener must be called for any DDL activity. For example, create_index and drop_index today do not call the metastore event listener. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2693) Add DECIMAL data type
[ https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-2693: - Status: Open (was: Patch Available) comments on phabricator Add DECIMAL data type - Key: HIVE-2693 URL: https://issues.apache.org/jira/browse/HIVE-2693 Project: Hive Issue Type: New Feature Components: Query Processor, Types Affects Versions: 0.10.0 Reporter: Carl Steinbach Assignee: Prasad Mujumdar Attachments: 2693_7.patch, 2693_8.patch, 2693_fix_all_tests1.patch, HIVE-2693-10.patch, HIVE-2693-11.patch, HIVE-2693-12-SortableSerDe.patch, HIVE-2693-13.patch, HIVE-2693-14.patch, HIVE-2693-15.patch, HIVE-2693-16.patch, HIVE-2693-1.patch.txt, HIVE-2693-all.patch, HIVE-2693.D7683.1.patch, HIVE-2693-fix.patch, HIVE-2693.patch, HIVE-2693-take3.patch, HIVE-2693-take4.patch Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice template for how to do this.
[jira] [Commented] (HIVE-2693) Add DECIMAL data type
[ https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13542080#comment-13542080 ] Namit Jain commented on HIVE-2693: -- Looks mostly good. Most of the comments are minor - the only major ones are around lack of testing. Add DECIMAL data type - Key: HIVE-2693 URL: https://issues.apache.org/jira/browse/HIVE-2693 Project: Hive Issue Type: New Feature Components: Query Processor, Types Affects Versions: 0.10.0 Reporter: Carl Steinbach Assignee: Prasad Mujumdar Attachments: 2693_7.patch, 2693_8.patch, 2693_fix_all_tests1.patch, HIVE-2693-10.patch, HIVE-2693-11.patch, HIVE-2693-12-SortableSerDe.patch, HIVE-2693-13.patch, HIVE-2693-14.patch, HIVE-2693-15.patch, HIVE-2693-16.patch, HIVE-2693-1.patch.txt, HIVE-2693-all.patch, HIVE-2693.D7683.1.patch, HIVE-2693-fix.patch, HIVE-2693.patch, HIVE-2693-take3.patch, HIVE-2693-take4.patch Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice template for how to do this.
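[Editor's note] The motivation for a dedicated DECIMAL type is exact base-10 arithmetic, which binary doubles cannot provide. A minimal, self-contained sketch in Java (Hive's implementation language) illustrating the difference; this is not the patch's code, just the standard java.math.BigDecimal behavior:

```java
import java.math.BigDecimal;

public class DecimalDemo {
    public static void main(String[] args) {
        // Binary floating point cannot represent 0.1 or 0.2 exactly.
        System.out.println(0.1 + 0.2); // 0.30000000000000004

        // Exact decimal arithmetic, the semantics a DECIMAL type needs.
        BigDecimal sum = new BigDecimal("0.1").add(new BigDecimal("0.2"));
        System.out.println(sum); // 0.3
    }
}
```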
[jira] [Updated] (HIVE-3781) not all meta events call metastore event listener
[ https://issues.apache.org/jira/browse/HIVE-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3781: - Status: Open (was: Patch Available) TestMetaStoreEventListener is failing not all meta events call metastore event listener - Key: HIVE-3781 URL: https://issues.apache.org/jira/browse/HIVE-3781 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.9.0 Reporter: Sudhanshu Arora Assignee: Navis Attachments: hive.3781.3.patch, hive.3781.4.patch, HIVE-3781.D7731.1.patch, HIVE-3781.D7731.2.patch An event listener must be called for any DDL activity. For example, create_index and drop_index today do not call the metastore event listener.
[jira] [Updated] (HIVE-3781) not all meta events call metastore event listener
[ https://issues.apache.org/jira/browse/HIVE-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-3781: -- Attachment: HIVE-3781.D7731.3.patch navis updated the revision HIVE-3781 [jira] not all meta events call metastore event listener. Reviewers: JIRA Rebased to trunk and confirmed TestMetaStoreEventListener succeeded REVISION DETAIL https://reviews.facebook.net/D7731 AFFECTED FILES metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreEventListener.java metastore/src/java/org/apache/hadoop/hive/metastore/events/AddIndexEvent.java metastore/src/java/org/apache/hadoop/hive/metastore/events/PreEventContext.java metastore/src/java/org/apache/hadoop/hive/metastore/events/AlterIndexEvent.java metastore/src/java/org/apache/hadoop/hive/metastore/events/DropIndexEvent.java metastore/src/java/org/apache/hadoop/hive/metastore/events/PreAddIndexEvent.java metastore/src/java/org/apache/hadoop/hive/metastore/events/PreAlterIndexEvent.java metastore/src/java/org/apache/hadoop/hive/metastore/events/PreDropIndexEvent.java metastore/src/test/org/apache/hadoop/hive/metastore/DummyListener.java metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java To: JIRA, navis not all meta events call metastore event listener - Key: HIVE-3781 URL: https://issues.apache.org/jira/browse/HIVE-3781 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.9.0 Reporter: Sudhanshu Arora Assignee: Navis Attachments: hive.3781.3.patch, hive.3781.4.patch, HIVE-3781.D7731.1.patch, HIVE-3781.D7731.2.patch, HIVE-3781.D7731.3.patch An event listener must be called for any DDL activity. 
For example, create_index and drop_index today do not call the metastore event listener.
[jira] [Updated] (HIVE-3781) not all meta events call metastore event listener
[ https://issues.apache.org/jira/browse/HIVE-3781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-3781: Status: Patch Available (was: Open) not all meta events call metastore event listener - Key: HIVE-3781 URL: https://issues.apache.org/jira/browse/HIVE-3781 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.9.0 Reporter: Sudhanshu Arora Assignee: Navis Attachments: hive.3781.3.patch, hive.3781.4.patch, HIVE-3781.D7731.1.patch, HIVE-3781.D7731.2.patch, HIVE-3781.D7731.3.patch An event listener must be called for any DDL activity. For example, create_index and drop_index today do not call the metastore event listener.
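[Editor's note] The pattern at issue can be sketched with self-contained stand-in types. These are simplified illustrations, not Hive's actual API: the real base class is org.apache.hadoop.hive.metastore.MetaStoreEventListener, and the patch's affected files show it adds AddIndexEvent, AlterIndexEvent, DropIndexEvent and their Pre* counterparts. The point of the fix is that every DDL path, index operations included, must fire the registered listeners:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for Hive's ListenerEvent hierarchy.
class IndexEvent {
    final String indexName;
    IndexEvent(String indexName) { this.indexName = indexName; }
}

// Illustrative stand-in for MetaStoreEventListener: no-op defaults,
// so concrete listeners override only the events they care about.
abstract class EventListener {
    void onAddIndex(IndexEvent e) {}
    void onDropIndex(IndexEvent e) {}
}

class RecordingListener extends EventListener {
    final List<String> seen = new ArrayList<>();
    @Override void onAddIndex(IndexEvent e) { seen.add("ADD:" + e.indexName); }
    @Override void onDropIndex(IndexEvent e) { seen.add("DROP:" + e.indexName); }
}

public class ListenerSketch {
    public static void main(String[] args) {
        RecordingListener listener = new RecordingListener();
        // Before the fix, index DDL bypassed the listeners entirely;
        // after it, each index operation notifies them, as in:
        listener.onAddIndex(new IndexEvent("idx_city"));
        listener.onDropIndex(new IndexEvent("idx_city"));
        System.out.println(listener.seen); // [ADD:idx_city, DROP:idx_city]
    }
}
```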
[jira] [Created] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp datatype.
Pieterjan Vriends created HIVE-3850: --- Summary: hour() function returns 12 hour clock value when using timestamp datatype. Key: HIVE-3850 URL: https://issues.apache.org/jira/browse/HIVE-3850 Project: Hive Issue Type: Improvement Components: UDF Reporter: Pieterjan Vriends Priority: Minor Apparently UDFHour.java has two evaluate() functions: one that accepts a Text object as a parameter and one that takes a TimestampWritable object. The first returns the value of Calendar.HOUR_OF_DAY and the second the value of Calendar.HOUR. I couldn't find any information in the documentation on the overloads of the evaluate() function. I spent quite some time finding out why my statement didn't return a 24-hour clock value.
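[Editor's note] The mismatch the reporter describes comes down to two java.util.Calendar fields: Calendar.HOUR is the 12-hour-clock field and Calendar.HOUR_OF_DAY the 24-hour one. A minimal demo (not Hive code) of the two fields:

```java
import java.util.Calendar;

public class HourDemo {
    public static void main(String[] args) {
        Calendar cal = Calendar.getInstance();
        // 5:30 PM on an arbitrary date.
        cal.set(2013, Calendar.JANUARY, 2, 17, 30, 0);
        System.out.println(cal.get(Calendar.HOUR));        // 5  (12-hour clock)
        System.out.println(cal.get(Calendar.HOUR_OF_DAY)); // 17 (24-hour clock)
    }
}
```

So the Text overload (HOUR_OF_DAY) yields 17 where the TimestampWritable overload (HOUR) yields 5 for the same instant, which is exactly the inconsistency reported.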
[jira] [Updated] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp
[ https://issues.apache.org/jira/browse/HIVE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pieterjan Vriends updated HIVE-3850: Summary: hour() function returns 12 hour clock value when using timestamp (was: hour() function returns 12 hour clock value when using timestamp datatype.) hour() function returns 12 hour clock value when using timestamp Key: HIVE-3850 URL: https://issues.apache.org/jira/browse/HIVE-3850 Project: Hive Issue Type: Improvement Components: UDF Reporter: Pieterjan Vriends Priority: Minor Apparently UDFHour.java has two evaluate() functions: one that accepts a Text object as a parameter and one that takes a TimestampWritable object. The first returns the value of Calendar.HOUR_OF_DAY and the second the value of Calendar.HOUR. I couldn't find any information in the documentation on the overloads of the evaluate() function. I spent quite some time finding out why my statement didn't return a 24-hour clock value.
[jira] [Updated] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp
[ https://issues.apache.org/jira/browse/HIVE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pieterjan Vriends updated HIVE-3850: Priority: Major (was: Minor) hour() function returns 12 hour clock value when using timestamp Key: HIVE-3850 URL: https://issues.apache.org/jira/browse/HIVE-3850 Project: Hive Issue Type: Improvement Components: UDF Reporter: Pieterjan Vriends Apparently UDFHour.java has two evaluate() functions: one that accepts a Text object as a parameter and one that takes a TimestampWritable object. The first returns the value of Calendar.HOUR_OF_DAY and the second the value of Calendar.HOUR. I couldn't find any information in the documentation on the overloads of the evaluate() function. I spent quite some time finding out why my statement didn't return a 24-hour clock value.
[jira] [Updated] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp
[ https://issues.apache.org/jira/browse/HIVE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pieterjan Vriends updated HIVE-3850: Affects Version/s: 0.9.0 hour() function returns 12 hour clock value when using timestamp Key: HIVE-3850 URL: https://issues.apache.org/jira/browse/HIVE-3850 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.9.0 Reporter: Pieterjan Vriends Apparently UDFHour.java has two evaluate() functions: one that accepts a Text object as a parameter and one that takes a TimestampWritable object. The first returns the value of Calendar.HOUR_OF_DAY and the second the value of Calendar.HOUR. I couldn't find any information in the documentation on the overloads of the evaluate() function. I spent quite some time finding out why my statement didn't return a 24-hour clock value.
[jira] [Updated] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp
[ https://issues.apache.org/jira/browse/HIVE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pieterjan Vriends updated HIVE-3850: Description: Apparently UDFHour.java has two evaluate() functions: one that accepts a Text object as a parameter and one that takes a TimestampWritable object. The first returns the value of Calendar.HOUR_OF_DAY and the second the value of Calendar.HOUR. I couldn't find any information in the documentation on the overloads of the evaluate() function. I spent quite some time finding out why my statement didn't return a 24-hour clock value. Shouldn't both functions return the same? was: Apparently UDFHour.java has two evaluate() functions: one that accepts a Text object as a parameter and one that takes a TimestampWritable object. The first returns the value of Calendar.HOUR_OF_DAY and the second the value of Calendar.HOUR. I couldn't find any information in the documentation on the overloads of the evaluate() function. I spent quite some time finding out why my statement didn't return a 24-hour clock value. hour() function returns 12 hour clock value when using timestamp Key: HIVE-3850 URL: https://issues.apache.org/jira/browse/HIVE-3850 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.9.0 Reporter: Pieterjan Vriends Apparently UDFHour.java has two evaluate() functions: one that accepts a Text object as a parameter and one that takes a TimestampWritable object. The first returns the value of Calendar.HOUR_OF_DAY and the second the value of Calendar.HOUR. I couldn't find any information in the documentation on the overloads of the evaluate() function. I spent quite some time finding out why my statement didn't return a 24-hour clock value. Shouldn't both functions return the same?
[jira] [Commented] (HIVE-2439) Upgrade antlr version to 3.4
[ https://issues.apache.org/jira/browse/HIVE-2439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13542136#comment-13542136 ] Thiruvel Thirumoolan commented on HIVE-2439: [~namit] This is intended to simplify life for users of Hive, HCatalog and Pig. As HCat/Pig use antlr 3.4, anyone using all the components has to work around it in unfriendly and complicated ways. Thomas Weise raised this issue before: http://markmail.org/thread/xltnc5ak2saurdbu. A web search for 'hive pig antlr' also brings up workarounds like using jarjar. While upgrading antlr, I also fixed problems in Hive.g that didn't surface with 3.0.1. [~ashutoshc] Feel free to add if I have missed anything. Upgrade antlr version to 3.4 Key: HIVE-2439 URL: https://issues.apache.org/jira/browse/HIVE-2439 Project: Hive Issue Type: Improvement Affects Versions: 0.8.0 Reporter: Ashutosh Chauhan Assignee: Thiruvel Thirumoolan Fix For: 0.10.0, 0.9.1, 0.11.0 Attachments: HIVE-2439_branch9_2.patch, HIVE-2439_branch9_3.patch, HIVE-2439_branch9.patch, hive-2439_incomplete.patch, HIVE-2439_trunk.patch Upgrade antlr version to 3.4
[jira] [Commented] (HIVE-446) Implement TRUNCATE
[ https://issues.apache.org/jira/browse/HIVE-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13542139#comment-13542139 ] Hudson commented on HIVE-446: - Integrated in Hive-trunk-h0.21 #1890 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1890/]) HIVE-446 Implement TRUNCATE (Navis via namit) (Revision 1427681) Result = FAILURE namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1427681 Files : * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/DDLWork.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/TruncateTableDesc.java * /hive/trunk/ql/src/test/queries/clientnegative/truncate_table_failure1.q * /hive/trunk/ql/src/test/queries/clientnegative/truncate_table_failure2.q * /hive/trunk/ql/src/test/queries/clientnegative/truncate_table_failure3.q * /hive/trunk/ql/src/test/queries/clientnegative/truncate_table_failure4.q * /hive/trunk/ql/src/test/queries/clientpositive/truncate_table.q * /hive/trunk/ql/src/test/results/clientnegative/truncate_table_failure1.q.out * /hive/trunk/ql/src/test/results/clientnegative/truncate_table_failure2.q.out * /hive/trunk/ql/src/test/results/clientnegative/truncate_table_failure3.q.out * /hive/trunk/ql/src/test/results/clientnegative/truncate_table_failure4.q.out * /hive/trunk/ql/src/test/results/clientpositive/truncate_table.q.out Implement TRUNCATE -- Key: HIVE-446 URL: https://issues.apache.org/jira/browse/HIVE-446 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Prasad Chakka Assignee: 
Navis Fix For: 0.11.0 Attachments: HIVE-446.D7371.1.patch, HIVE-446.D7371.2.patch, HIVE-446.D7371.3.patch, HIVE-446.D7371.4.patch truncate the data but leave the table and metadata intact.
Hive-trunk-h0.21 - Build # 1890 - Failure
Changes for Build #1886 Changes for Build #1887 Changes for Build #1888 Changes for Build #1889 Changes for Build #1890 [namit] HIVE-446 Implement TRUNCATE (Navis via namit) 1 tests failed. REGRESSION: org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_stats_aggregator_error_1 Error Message: Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit. Stack Trace: junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please note the time in the report does not reflect the time until the VM exit. at net.sf.antcontrib.logic.ForTask.doSequentialIteration(ForTask.java:259) at net.sf.antcontrib.logic.ForTask.doToken(ForTask.java:268) at net.sf.antcontrib.logic.ForTask.doTheTasks(ForTask.java:324) at net.sf.antcontrib.logic.ForTask.execute(ForTask.java:244) The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1890) Status: Failure Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1890/ to view the results.
Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false #248
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/248/ -- [...truncated 5669 lines...] [ivy:resolve] downloading http://repo1.maven.org/maven2/org/apache/ftpserver/ftplet-api/1.0.0/ftplet-api-1.0.0.jar ... [ivy:resolve] (22kB) [ivy:resolve] .. (0kB) [ivy:resolve] [SUCCESSFUL ] org.apache.ftpserver#ftplet-api;1.0.0!ftplet-api.jar(bundle) (14ms) [ivy:resolve] downloading http://repo1.maven.org/maven2/org/apache/mina/mina-core/2.0.0-M5/mina-core-2.0.0-M5.jar ... [ivy:resolve] ... (622kB) [ivy:resolve] .. (0kB) [ivy:resolve] [SUCCESSFUL ] org.apache.mina#mina-core;2.0.0-M5!mina-core.jar(bundle) (122ms) [ivy:resolve] downloading http://repo1.maven.org/maven2/org/apache/ftpserver/ftpserver-core/1.0.0/ftpserver-core-1.0.0.jar ... [ivy:resolve] . (264kB) [ivy:resolve] .. (0kB) [ivy:resolve] [SUCCESSFUL ] org.apache.ftpserver#ftpserver-core;1.0.0!ftpserver-core.jar(bundle) (19ms) [ivy:resolve] downloading http://repo1.maven.org/maven2/org/apache/ftpserver/ftpserver-deprecated/1.0.0-M2/ftpserver-deprecated-1.0.0-M2.jar ... [ivy:resolve] .. (31kB) [ivy:resolve] .. (0kB) [ivy:resolve] [SUCCESSFUL ] org.apache.ftpserver#ftpserver-deprecated;1.0.0-M2!ftpserver-deprecated.jar (13ms) [ivy:resolve] downloading http://repo1.maven.org/maven2/org/slf4j/slf4j-api/1.5.2/slf4j-api-1.5.2.jar ... [ivy:resolve] . (16kB) [ivy:resolve] .. (0kB) [ivy:resolve] [SUCCESSFUL ] org.slf4j#slf4j-api;1.5.2!slf4j-api.jar (11ms) ivy-retrieve-hadoop-shim: [echo] Project: shims [javac] Compiling 16 source files to https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/248/artifact/hive/build/shims/classes [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java uses unchecked or unsafe operations. 
[javac] Note: Recompile with -Xlint:unchecked for details. [echo] Building shims 0.20S build_shims: [echo] Project: shims [echo] Compiling https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/shims/src/common/java;/home/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/common-secure/java;/home/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/0.20S/java against hadoop 1.0.0 (https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/248/artifact/hive/build/hadoopcore/hadoop-1.0.0) ivy-init-settings: [echo] Project: shims ivy-resolve-hadoop-shim: [echo] Project: shims [ivy:resolve] :: loading settings :: file = https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/ivy/ivysettings.xml [ivy:resolve] downloading http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-core/1.0.0/hadoop-core-1.0.0.jar ... [ivy:resolve] . (3652kB) [ivy:resolve] .. (0kB) [ivy:resolve] [SUCCESSFUL ] org.apache.hadoop#hadoop-core;1.0.0!hadoop-core.jar (100ms) [ivy:resolve] downloading http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-tools/1.0.0/hadoop-tools-1.0.0.jar ... [ivy:resolve] .. (281kB) [ivy:resolve] .. (0kB) [ivy:resolve] [SUCCESSFUL ] org.apache.hadoop#hadoop-tools;1.0.0!hadoop-tools.jar (31ms) [ivy:resolve] downloading http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-test/1.0.0/hadoop-test-1.0.0.jar ... [ivy:resolve] (2471kB) [ivy:resolve] .. (0kB) [ivy:resolve] [SUCCESSFUL ] org.apache.hadoop#hadoop-test;1.0.0!hadoop-test.jar (63ms) [ivy:resolve] downloading http://repo1.maven.org/maven2/commons-codec/commons-codec/1.4/commons-codec-1.4.jar ...
[jira] [Commented] (HIVE-3488) Issue trying to use the thick client (embedded) from windows.
[ https://issues.apache.org/jira/browse/HIVE-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1354#comment-1354 ] Rémy DUBOIS commented on HIVE-3488: --- Reminder. Issue trying to use the thick client (embedded) from windows. - Key: HIVE-3488 URL: https://issues.apache.org/jira/browse/HIVE-3488 Project: Hive Issue Type: Bug Components: Windows Affects Versions: 0.8.1 Reporter: Rémy DUBOIS Priority: Critical I'm trying to execute a very simple SELECT query against my remote hive server. If I'm doing a SELECT * from table, everything works well. If I'm trying to execute a SELECT name from table, this error appears: {code:java} Job Submission failed with exception 'java.io.IOException(cannot find dir = /user/hive/warehouse/test/city=paris/out.csv in pathToPartitionInfo: [hdfs://cdh-four:8020/user/hive/warehouse/test/city=paris])' 12/09/19 17:18:44 ERROR exec.Task: Job Submission failed with exception 'java.io.IOException(cannot find dir = /user/hive/warehouse/test/city=paris/out.csv in pathToPartitionInfo: [hdfs://cdh-four:8020/user/hive/warehouse/test/city=paris])' java.io.IOException: cannot find dir = /user/hive/warehouse/test/city=paris/out.csv in pathToPartitionInfo: [hdfs://cdh-four:8020/user/hive/warehouse/test/city=paris] at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:290) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:257) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CombineHiveInputSplit.init(CombineHiveInputFormat.java:104) at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:407) at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:989) at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:981) at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170) at 
org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:891) at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:844) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Unknown Source) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:844) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:818) at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:452) at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931) at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:191) at org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:187) {code} Indeed, this dir (/user/hive/warehouse/test/city=paris/out.csv) can't be found since it is my data file, not a directory. Could you please help me?
[jira] [Updated] (HIVE-3812) TestCase TestJdbcDriver fails with IBM Java 6
[ https://issues.apache.org/jira/browse/HIVE-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Renata Ghisloti Duarte de Souza updated HIVE-3812: -- Fix Version/s: 0.8.1 0.10.0 Status: Patch Available (was: Open) TestCase TestJdbcDriver fails with IBM Java 6 - Key: HIVE-3812 URL: https://issues.apache.org/jira/browse/HIVE-3812 Project: Hive Issue Type: Bug Components: JDBC, Tests Affects Versions: 0.9.0, 0.8.1, 0.8.0, 0.10.0 Environment: Apache Ant 1.7.1 IBM JDK 6 Reporter: Renata Ghisloti Duarte de Souza Priority: Minor Fix For: 0.10.0, 0.8.1 Attachments: HIVE-3812.1_0.8.1.patch.txt, HIVE-3812.1_trunk.patch.txt When running testcase TestJdbcDriver with IBM Java 6, it fails with the following error: failure message=expected:[[{}, 1], [{[c=d, a=b]}, 2]] but was:[[{}, 1], [{[a=b, c=d]}, 2]]; type=junit.framework.ComparisonFailurejunit.framework.ComparisonFailure: expected:[[{}, 1], [{[c=d, a=b]}, 2]] but was:[[{}, 1], [{[a=b, c=d]}, 2]]; at junit.framework.Assert.assertEquals(Assert.java:85) at junit.framework.Assert.assertEquals(Assert.java:91) at org.apache.hadoop.hive.jdbc.TestJdbcDriver.testDataTypes(TestJdbcDriver.java:380)
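[Editor's note] The failure above compares the stringified form of a map column, and HashMap iteration order is unspecified, so it can legitimately differ between JVM implementations (here, IBM vs. Oracle). Comparing maps with Map.equals(), which is order-insensitive, avoids the problem. A self-contained demo (not the patch's code; LinkedHashMap is used only to force two different iteration orders):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MapOrderDemo {
    public static void main(String[] args) {
        Map<String, String> expected = new LinkedHashMap<>();
        expected.put("c", "d");
        expected.put("a", "b");

        Map<String, String> actual = new LinkedHashMap<>();
        actual.put("a", "b");
        actual.put("c", "d");

        // String forms depend on iteration order...
        System.out.println(expected); // {c=d, a=b}
        System.out.println(actual);   // {a=b, c=d}
        // ...but Map.equals() compares entries, not order.
        System.out.println(expected.equals(actual)); // true
    }
}
```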
[jira] [Commented] (HIVE-3272) RetryingRawStore will perform partial transaction on retry
[ https://issues.apache.org/jira/browse/HIVE-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13542282#comment-13542282 ] Kevin Wilfong commented on HIVE-3272: - Yes, this is a totally separate issue from HIVE-3826. HIVE-3826 will happen even when the RetryingRawStore tries only once (never retries). RetryingRawStore will perform partial transaction on retry -- Key: HIVE-3272 URL: https://issues.apache.org/jira/browse/HIVE-3272 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.10.0 Reporter: Kevin Wilfong Priority: Critical By the time the RetryingRawStore retries a command, the transaction encompassing it has already been rolled back. This means it will perform the remainder of the raw store commands outside of a transaction (unless another transaction encapsulates it, which is definitely not always the case) and then fail when it tries to commit the transaction, as none is open.
[jira] [Commented] (HIVE-3803) explain dependency should show the dependencies hierarchically in presence of views
[ https://issues.apache.org/jira/browse/HIVE-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13542296#comment-13542296 ] Kevin Wilfong commented on HIVE-3803: - One really minor comment on the diff, otherwise looks good. explain dependency should show the dependencies hierarchically in presence of views --- Key: HIVE-3803 URL: https://issues.apache.org/jira/browse/HIVE-3803 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Namit Jain Assignee: Namit Jain Attachments: hive.3803.1.patch, hive.3803.2.patch, hive.3803.3.patch, hive.3803.4.patch, hive.3803.5.patch It should also include tables whose partitions are being accessed
write access to the Hive wiki?
Hi All! Could I have write access to the Hive wiki? I'd like to fix some documentation errors. The most immediate is the Avro SerDe page, which contains incorrect table creation statements. -- Sean
[jira] [Commented] (HIVE-3652) Join optimization for star schema
[ https://issues.apache.org/jira/browse/HIVE-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13542361#comment-13542361 ] Vikram Dixit K commented on HIVE-3652: -- [~amareshwari] I am quite interested in this jira and was wondering what phase you are in with respect to design/implementation. I would like to collaborate with you on this if possible. Please let me know. Thanks, Vikram. Join optimization for star schema - Key: HIVE-3652 URL: https://issues.apache.org/jira/browse/HIVE-3652 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Currently, if we join one fact table with multiple dimension tables, it results in multiple mapreduce jobs, one for each join with a dimension table, because the join is on different keys for each dimension. Usually all the dimension tables are small and can fit into memory, so a map-side join can be used to join with the fact table. In this issue I want to look at optimizing such a query to generate a single mapreduce job so that the mapper loads the dimension tables into memory and joins with the fact table on the different keys as well.
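[Editor's note] The proposed optimization can be sketched outside of MapReduce: hold each small dimension table in an in-memory hash map and, in a single pass over the fact rows, probe a different map on a different key for each dimension. The data and column names below are hypothetical, and this is only an illustration of the single-pass idea, not Hive's implementation:

```java
import java.util.HashMap;
import java.util.Map;

public class StarJoinSketch {
    public static void main(String[] args) {
        // Small dimension tables loaded fully into memory (hypothetical data).
        Map<Integer, String> dimCity = new HashMap<>();
        dimCity.put(1, "paris");
        Map<Integer, String> dimProduct = new HashMap<>();
        dimProduct.put(7, "widget");

        // One pass over the fact rows joins on a *different* key per
        // dimension, which is what a single map-side job would do.
        int[][] factRows = { {1, 7, 100} }; // {cityId, productId, amount}
        for (int[] row : factRows) {
            String joined = dimCity.get(row[0]) + ","
                          + dimProduct.get(row[1]) + "," + row[2];
            System.out.println(joined); // paris,widget,100
        }
    }
}
```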
[jira] [Commented] (HIVE-3552) HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a high number of grouping set keys
[ https://issues.apache.org/jira/browse/HIVE-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542382#comment-13542382 ] Kevin Wilfong commented on HIVE-3552: - The patch needs to be updated; it's not applying cleanly. Some minor style comments on Phabricator. HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a high number of grouping set keys - Key: HIVE-3552 URL: https://issues.apache.org/jira/browse/HIVE-3552 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Namit Jain Attachments: hive.3552.10.patch, hive.3552.11.patch, hive.3552.1.patch, hive.3552.2.patch, hive.3552.3.patch, hive.3552.4.patch, hive.3552.5.patch, hive.3552.6.patch, hive.3552.7.patch, hive.3552.8.patch, hive.3552.9.patch This is a follow-up for HIVE-3433. Had an offline discussion with Sambavi - she pointed out a scenario where the implementation in HIVE-3433 will not scale. Assume that the user is performing a cube on many columns, say 8 columns. Then each row would generate 256 rows for the hash table, which may kill the current group by implementation. A better implementation would be to add an additional MR job: in the first MR job, perform the group by assuming there was no cube. In another MR job, perform the cube. The assumption is that the group by would have decreased the output data significantly, and the rows would appear in the order of grouping keys, which has a higher probability of hitting the hash table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
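The 256-rows-per-input figure comes from cube expansion: a CUBE over n grouping columns emits one output row per subset of those columns, i.e. 2^n rows per input row. A minimal sketch of that expansion (column names invented; Hive tracks the masked columns with a grouping-set id rather than literal NULLs):

```python
from itertools import combinations

def cube_rows(row, group_cols):
    """Expand one input row into the rows a CUBE over group_cols emits:
    one row per subset of the grouping columns, with absent columns
    masked to None."""
    expanded = []
    for r in range(len(group_cols) + 1):
        for subset in combinations(group_cols, r):
            chosen = set(subset)
            expanded.append(tuple(row[c] if c in chosen else None
                                  for c in group_cols))
    return expanded

group_cols = ["c%d" % i for i in range(1, 9)]   # 8 grouping columns
row = {c: 1 for c in group_cols}
expanded = cube_rows(row, group_cols)           # 2^8 = 256 rows per input row
```

This is why the proposed two-job plan helps: the first job's plain group by shrinks the input before any row suffers the 2^n blow-up.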
[jira] [Updated] (HIVE-3552) HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a high number of grouping set keys
[ https://issues.apache.org/jira/browse/HIVE-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-3552: Status: Open (was: Patch Available) HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a high number of grouping set keys - Key: HIVE-3552 URL: https://issues.apache.org/jira/browse/HIVE-3552 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Namit Jain Attachments: hive.3552.10.patch, hive.3552.11.patch, hive.3552.1.patch, hive.3552.2.patch, hive.3552.3.patch, hive.3552.4.patch, hive.3552.5.patch, hive.3552.6.patch, hive.3552.7.patch, hive.3552.8.patch, hive.3552.9.patch This is a follow-up for HIVE-3433. Had an offline discussion with Sambavi - she pointed out a scenario where the implementation in HIVE-3433 will not scale. Assume that the user is performing a cube on many columns, say 8 columns. Then each row would generate 256 rows for the hash table, which may kill the current group by implementation. A better implementation would be to add an additional MR job: in the first MR job, perform the group by assuming there was no cube. In another MR job, perform the cube. The assumption is that the group by would have decreased the output data significantly, and the rows would appear in the order of grouping keys, which has a higher probability of hitting the hash table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3803) explain dependency should show the dependencies hierarchically in presence of views
[ https://issues.apache.org/jira/browse/HIVE-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542383#comment-13542383 ] Kevin Wilfong commented on HIVE-3803: - Can you file the patch with the test updates? explain dependency should show the dependencies hierarchically in presence of views --- Key: HIVE-3803 URL: https://issues.apache.org/jira/browse/HIVE-3803 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Namit Jain Assignee: Namit Jain Attachments: hive.3803.1.patch, hive.3803.2.patch, hive.3803.3.patch, hive.3803.4.patch, hive.3803.5.patch It should also include tables whose partitions are being accessed -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3803) explain dependency should show the dependencies hierarchically in presence of views
[ https://issues.apache.org/jira/browse/HIVE-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-3803: Status: Open (was: Patch Available) explain dependency should show the dependencies hierarchically in presence of views --- Key: HIVE-3803 URL: https://issues.apache.org/jira/browse/HIVE-3803 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Namit Jain Assignee: Namit Jain Attachments: hive.3803.1.patch, hive.3803.2.patch, hive.3803.3.patch, hive.3803.4.patch, hive.3803.5.patch It should also include tables whose partitions are being accessed -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3585) Integrate Trevni as another columnar oriented file format
[ https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542442#comment-13542442 ] Jakob Homan commented on HIVE-3585: --- Also, He, I'm assuming your -1 is not intended to be a veto? I don't believe it would hold up technically. Trevni is essentially a variation on Avro. Not letting people read their Trevni-encoded data in Hive just because there's already another columnar format doesn't seem like a good way forward. Integrate Trevni as another columnar oriented file format - Key: HIVE-3585 URL: https://issues.apache.org/jira/browse/HIVE-3585 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.10.0 Reporter: alex gemini Assignee: Jakob Homan Priority: Minor Add the new Avro module Trevni as another columnar format. A new columnar format needs a columnar SerDe; fastutil seems a good choice. The Shark project uses the fastutil library as its columnar SerDe library, but it seems too large (almost 15 MB) for just a few primitive array collections. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3585) Integrate Trevni as another columnar oriented file format
[ https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542432#comment-13542432 ] Mark Wagner commented on HIVE-3585: --- I'm taking this over for Jakob. Please add me as a contributor so that I can assign this ticket to myself. Integrate Trevni as another columnar oriented file format - Key: HIVE-3585 URL: https://issues.apache.org/jira/browse/HIVE-3585 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.10.0 Reporter: alex gemini Assignee: Jakob Homan Priority: Minor Add the new Avro module Trevni as another columnar format. A new columnar format needs a columnar SerDe; fastutil seems a good choice. The Shark project uses the fastutil library as its columnar SerDe library, but it seems too large (almost 15 MB) for just a few primitive array collections. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Build failed in Jenkins: Hive-0.10.0-SNAPSHOT-h0.20.1 #22
See https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/22/ -- [...truncated 41919 lines...] [junit] Hadoop job information for null: number of mappers: 0; number of reducers: 0 [junit] 2013-01-02 14:17:24,131 null map = 100%, reduce = 100% [junit] Ended Job = job_local_0001 [junit] Execution completed successfully [junit] Mapred Local Task Succeeded . Convert the Join into MapJoin [junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-01-02_14-17-21_059_3270679034972611333/-mr-1 [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] Hive history file=/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/tmp/hive_job_log_jenkins_201301021417_745834635.txt [junit] Copying file: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] OK [junit] PREHOOK: query: create table testhivedrivertable (num int) [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: create table testhivedrivertable (num int) [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: load data local inpath 
'/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt' into table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Output: default@testhivedrivertable [junit] Copying data from file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt [junit] Loading data to table default.testhivedrivertable [junit] Table default.testhivedrivertable stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 5812, raw_data_size: 0] [junit] POSTHOOK: query: load data local inpath '/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt' into table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: select * from testhivedrivertable limit 10 [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-01-02_14-17-25_499_8521399548346007948/-mr-1 [junit] POSTHOOK: query: select * from testhivedrivertable limit 10 [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-01-02_14-17-25_499_8521399548346007948/-mr-1 [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] Hive history 
file=/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/tmp/hive_job_log_jenkins_201301021417_1668767561.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] OK [junit] PREHOOK: query: create table testhivedrivertable (num int) [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: create table testhivedrivertable (num int) [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable
[jira] [Updated] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp datatype
[ https://issues.apache.org/jira/browse/HIVE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pieterjan Vriends updated HIVE-3850: Summary: hour() function returns 12 hour clock value when using timestamp datatype (was: hour() function returns 12 hour clock value when using timestamp) hour() function returns 12 hour clock value when using timestamp datatype - Key: HIVE-3850 URL: https://issues.apache.org/jira/browse/HIVE-3850 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.9.0 Reporter: Pieterjan Vriends Apparently UDFHour.java has two evaluate() functions: one accepts a Text object as a parameter and one takes a TimestampWritable object. The first function returns the value of Calendar.HOUR_OF_DAY and the second that of Calendar.HOUR. In the documentation I couldn't find any information on the overloading of the evaluate function. I spent quite some time finding out why my statement didn't return a 24-hour clock value. Shouldn't both functions return the same? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
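The discrepancy the reporter hit mirrors Java's Calendar.HOUR_OF_DAY (24-hour clock, 0-23) versus Calendar.HOUR (12-hour clock, 0-11 within AM/PM). A small Python analogue of the two behaviours, not Hive's actual UDF code:

```python
from datetime import datetime

def hour_of_day(ts):
    """Like Calendar.HOUR_OF_DAY: 24-hour clock, 0-23."""
    return ts.hour

def hour_12(ts):
    """Like Calendar.HOUR: 12-hour clock, 0-11 within AM/PM."""
    return ts.hour % 12

ts = datetime(2012, 9, 4, 14, 30)   # 2:30 PM
```

For 14:30, the HOUR_OF_DAY-style path yields 14 while the HOUR-style path yields 2, which is exactly the difference between the two evaluate() overloads described above.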
[jira] [Created] (HIVE-3851) Add isFinalMapRed from MapredWork to EXPLAIN EXTENDED output
Kevin Wilfong created HIVE-3851: --- Summary: Add isFinalMapRed from MapredWork to EXPLAIN EXTENDED output Key: HIVE-3851 URL: https://issues.apache.org/jira/browse/HIVE-3851 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Priority: Minor A flag indicating that a map reduce job produces final output (ignoring moves/merges) will be added as part of HIVE-933. It would be good to include this in the output of EXPLAIN EXTENDED. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: HIVE-3528
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/7431/ --- (Updated Jan. 2, 2013, 11:11 p.m.) Review request for hive and Jakob Homan. Changes --- Added a clientpositive test for all nullable types. Subsumed HIVE-3538, with changes in anticipation of AVRO-997's stricter handling of enums. Description (updated) --- Changes AvroSerDe to properly give the non-null schema to serialization routines when using Nullable complex types. Properly restores the enum-ness of Avro Enums prior to serialization. Diffs (updated) - /trunk/data/files/csv.txt PRE-CREATION /trunk/ql/src/test/queries/clientpositive/avro_nullable_fields.q PRE-CREATION /trunk/ql/src/test/results/clientpositive/avro_nullable_fields.q.out PRE-CREATION /trunk/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerializer.java 1426606 /trunk/serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroDeserializer.java 1426606 /trunk/serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroObjectInspectorGenerator.java 1426606 /trunk/serde/src/test/org/apache/hadoop/hive/serde2/avro/TestAvroSerializer.java 1426606 Diff: https://reviews.apache.org/r/7431/diff/ Testing (updated) --- Adds tests that check each of the Avro types that Serialization needs to use a user-provided schema, both as top level fields and as nested members of a complex type. Adds a client positive test that reads in a CSV table with NULLs, copies that data into an Avro backed table, then reads the data out of the table. Thanks, Sean Busbey
[jira] [Resolved] (HIVE-3538) Avro SerDe can't handle Nullable Enums
[ https://issues.apache.org/jira/browse/HIVE-3538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey resolved HIVE-3538. --- Resolution: Duplicate Subsumed by HIVE-3528 Avro SerDe can't handle Nullable Enums -- Key: HIVE-3538 URL: https://issues.apache.org/jira/browse/HIVE-3538 Project: Hive Issue Type: Bug Reporter: Sean Busbey Attachments: HIVE-3538.tests.txt If a field has a schema that unions NULL with an enum, Avro fails to resolve the union because Avro SerDe doesn't restore enumness. Since the enum datum is a String, avro internals check the union for a string schema, which is not present. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3528) Avro SerDe doesn't handle serializing Nullable types that require access to a Schema
[ https://issues.apache.org/jira/browse/HIVE-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542533#comment-13542533 ] Sean Busbey commented on HIVE-3528: --- Update with clientpositive .q test and subsumed enum handling (HIVE-3538) to [review board #7431|https://reviews.apache.org/r/7431/] Avro SerDe doesn't handle serializing Nullable types that require access to a Schema Key: HIVE-3528 URL: https://issues.apache.org/jira/browse/HIVE-3528 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Sean Busbey Labels: avro Attachments: HIVE-3528.1.patch.txt Deserialization properly handles hiding Nullable Avro types, including complex types like record, map, array, etc. However, when Serialization attempts to write out these types it erroneously makes use of the UNION schema that contains NULL and the other type. This results in Schema mis-match errors for Record, Array, Enum, Fixed, and Bytes. Here's a [review board of unit tests that express the problem|https://reviews.apache.org/r/7431/], as well as one that supports the case that it's only when the schema is needed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
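The fix under review hinges on handing serialization the non-null branch of a [null, T] union rather than the union itself. A schema-only sketch of that selection step (this is not the AvroSerDe code; the union shapes below are illustrative JSON-style Avro schemas):

```python
def non_null_branch(union_schema):
    """Given an Avro-style union such as ["null", "int"], return the single
    non-null branch that the serialization routines should be handed."""
    branches = [b for b in union_schema if b != "null"]
    if len(branches) != 1:
        raise ValueError("expected exactly one non-null branch: %r" % union_schema)
    return branches[0]

# A nullable record type, as a plain data structure.
record_union = ["null", {"type": "record", "name": "r", "fields": []}]
```

Writing a datum against `record_union` itself is what produced the schema mis-match errors; writing it against `non_null_branch(record_union)` is the behaviour the patch aims for.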
[jira] [Assigned] (HIVE-3585) Integrate Trevni as another columnar oriented file format
[ https://issues.apache.org/jira/browse/HIVE-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach reassigned HIVE-3585: Assignee: Mark Wagner (was: Jakob Homan) Integrate Trevni as another columnar oriented file format - Key: HIVE-3585 URL: https://issues.apache.org/jira/browse/HIVE-3585 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.10.0 Reporter: alex gemini Assignee: Mark Wagner Priority: Minor Add the new Avro module Trevni as another columnar format. A new columnar format needs a columnar SerDe; fastutil seems a good choice. The Shark project uses the fastutil library as its columnar SerDe library, but it seems too large (almost 15 MB) for just a few primitive array collections. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3699) Multiple insert overwrite into multiple tables query stores same results in all tables
[ https://issues.apache.org/jira/browse/HIVE-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542568#comment-13542568 ] Shanzhong Zhu commented on HIVE-3699: - Any updates on this item? We are also observing a similar issue. In the following query, the result of test17 was supposed to be empty, but test17 seems to have the same results as test18. FROM ( SELECT info.product, info.sid, info.id, t.persona, info.service FROM info_table info JOIN main_tbl t ON info.service=t.service WHERE (info.id BETWEEN 17 AND 18) AND t.dt='2012-11-20' AND t.m='XXX1' AND t.g = 'XXX2' AND t.s = 'XXX3' ) u INSERT OVERWRITE TABLE test18 PARTITION (dt='2012-11-20', service) SELECT u.product, u.sid, u.id, u.persona, u.service WHERE u.id=18 INSERT OVERWRITE TABLE test17 PARTITION (dt='2012-11-20', service) SELECT u.product, u.sid, u.id, u.persona, u.service WHERE u.id=17; Multiple insert overwrite into multiple tables query stores same results in all tables -- Key: HIVE-3699 URL: https://issues.apache.org/jira/browse/HIVE-3699 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.9.0 Environment: Cloudera 4.1 on Amazon Linux (rebranded Centos 6): hive-0.9.0+150-1.cdh4.1.1.p0.4.el6.noarch Reporter: Alexandre Fouché (Note: This might be related to HIVE-2750) I am doing a query with multiple INSERT OVERWRITE to multiple tables in order to scan the dataset only once, and I end up with all these tables having the same content! It seems the GROUP BY query that returns results is overwriting all the temp tables. Weirdly enough, if I add further GROUP BY queries into additional temp tables, grouped by a different field, then all temp tables, even the ones that would otherwise have had wrong content, are correctly populated.
This is the misbehaving query: FROM nikon INSERT OVERWRITE TABLE e1 SELECT qs_cs_s_aid AS Emplacements, COUNT(*) AS Impressions WHERE qs_cs_s_cat='PRINT' GROUP BY qs_cs_s_aid INSERT OVERWRITE TABLE e2 SELECT qs_cs_s_aid AS Emplacements, COUNT(*) AS Vues WHERE qs_cs_s_cat='VIEW' GROUP BY qs_cs_s_aid ; It launches only one MR job and here are the results. Why does table 'e1' contain results from table 'e2'?! Table 'e1' should have been empty (see the individual SELECTs further below). hive> SELECT * from e1; OK NULL 2 1627575 25 1627576 70 1690950 22 1690952 42 1696705 199 1696706 66 1696730 229 1696759 85 1696893 218 Time taken: 0.229 seconds hive> SELECT * from e2; OK NULL 2 1627575 25 1627576 70 1690950 22 1690952 42 1696705 199 1696706 66 1696730 229 1696759 85 1696893 218 Time taken: 0.11 seconds Here are the results of the individual queries (only the second query returns a result set): hive> SELECT qs_cs_s_aid AS Emplacements, COUNT(*) AS Impressions FROM nikon WHERE qs_cs_s_cat='PRINT' GROUP BY qs_cs_s_aid; (...) OK - There are no results, this is normal Time taken: 41.471 seconds hive> SELECT qs_cs_s_aid AS Emplacements, COUNT(*) AS Vues FROM nikon WHERE qs_cs_s_cat='VIEW' GROUP BY qs_cs_s_aid; (...) OK NULL 2 1627575 25 1627576 70 1690950 22 1690952 42 1696705 199 1696706 66 1696730 229 1696759 85 1696893 218 Time taken: 39.607 seconds -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
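The expected multi-insert semantics: the source is scanned once and each INSERT branch filters independently, so e1 should stay empty when no row matches its WHERE. A Python sketch of the intended behaviour (the reported bug is that both sinks received the same branch's rows); the row data below is invented to echo the report:

```python
def multi_insert(rows, branches):
    """branches: {table_name: predicate}. One scan of rows feeds every
    branch, mirroring Hive's multi-insert plan (FROM src INSERT ... INSERT ...)."""
    sinks = {name: [] for name in branches}
    for row in rows:                        # the source is scanned once
        for name, predicate in branches.items():
            if predicate(row):              # each branch filters on its own
                sinks[name].append(row)
    return sinks

rows = [{"aid": 1627575, "cat": "VIEW"}, {"aid": 1627576, "cat": "VIEW"}]
sinks = multi_insert(rows, {
    "e1": lambda r: r["cat"] == "PRINT",    # matches nothing: e1 stays empty
    "e2": lambda r: r["cat"] == "VIEW",
})
```

With only 'VIEW' rows in the source, the correct outcome is an empty e1 and a populated e2, which is what the standalone SELECTs above confirm.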
Re: write access to the Hive wiki?
Hi Sean, I added you to the Hive wiki ACL. Thanks. Carl On Wed, Jan 2, 2013 at 11:48 AM, Sean Busbey bus...@cloudera.com wrote: Hi All! Could I have write access to the Hive wiki? I'd like to fix some documentation errors. The most immediate is the Avro SerDe page, which contains incorrect table creation statements. -- Sean
[jira] [Commented] (HIVE-3428) Fix log4j configuration errors when running hive on hadoop23
[ https://issues.apache.org/jira/browse/HIVE-3428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542594#comment-13542594 ] Gunther Hagleitner commented on HIVE-3428: -- The patch doesn't seem to provide configuration for hadoop.mr.rev=20S (the Hadoop 1 line). More importantly, if the only difference between the files is Hadoop's deprecated EventCounter, it seems better to create a shim class for that. This way there's only one conf file and Hive will pick the right file from the classpath at runtime. Fix log4j configuration errors when running hive on hadoop23 Key: HIVE-3428 URL: https://issues.apache.org/jira/browse/HIVE-3428 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Zhenxiao Luo Assignee: Zhenxiao Luo Attachments: HIVE-3428.1.patch.txt, HIVE-3428.2.patch.txt, HIVE-3428.3.patch.txt, HIVE-3428.4.patch.txt, HIVE-3428.5.patch.txt, HIVE-3428.6.patch.txt There are log4j configuration errors when running hive on hadoop23; some of them may fail testcases, since the following log4j error messages could be printed to the console, or to an output file, which diffs from the expected output: [junit] log4j:ERROR Could not find value for key log4j.appender.NullAppender [junit] log4j:ERROR Could not instantiate appender named NullAppender. [junit] 12/09/04 11:34:42 WARN conf.HiveConf: hive-site.xml not found on CLASSPATH -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3562) Some limit can be pushed down to map stage
[ https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-3562: Status: Patch Available (was: Open) Some limit can be pushed down to map stage -- Key: HIVE-3562 URL: https://issues.apache.org/jira/browse/HIVE-3562 Project: Hive Issue Type: Bug Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-3562.D5967.1.patch, HIVE-3562.D5967.2.patch, HIVE-3562.D5967.3.patch Queries with a limit clause (with a reasonable number), for example {noformat} select * from src order by key limit 10; {noformat} make the operator tree TS-SEL-RS-EXT-LIMIT-FS. But LIMIT can be partially calculated in RS, reducing the size of shuffling: TS-SEL-RS(TOP-N)-EXT-LIMIT-FS -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3562) Some limit can be pushed down to map stage
[ https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-3562: -- Attachment: HIVE-3562.D5967.3.patch navis updated the revision HIVE-3562 [jira] Some limit can be pushed down to map stage. Reviewers: JIRA, tarball Addressed comments Prevent multi-GBY single-RS case REVISION DETAIL https://reviews.facebook.net/D5967 AFFECTED FILES common/src/java/org/apache/hadoop/hive/conf/HiveConf.java conf/hive-default.xml.template ql/src/java/org/apache/hadoop/hive/ql/exec/ExtractOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/ForwardOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/SelectOperator.java ql/src/java/org/apache/hadoop/hive/ql/io/HiveKey.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/LimitPushdownOptimizer.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java ql/src/test/queries/clientpositive/limit_pushdown.q ql/src/test/results/clientpositive/limit_pushdown.q.out To: JIRA, tarball, navis Cc: njain Some limit can be pushed down to map stage -- Key: HIVE-3562 URL: https://issues.apache.org/jira/browse/HIVE-3562 Project: Hive Issue Type: Bug Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-3562.D5967.1.patch, HIVE-3562.D5967.2.patch, HIVE-3562.D5967.3.patch Queries with a limit clause (with a reasonable number), for example {noformat} select * from src order by key limit 10; {noformat} make the operator tree TS-SEL-RS-EXT-LIMIT-FS. But LIMIT can be partially calculated in RS, reducing the size of shuffling: TS-SEL-RS(TOP-N)-EXT-LIMIT-FS -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
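The RS(TOP-N) idea above: for a query like "order by key limit 10", each mapper only needs to emit its n best keys, so the ReduceSink can keep a bounded heap instead of shuffling every row. A generic sketch of that map-side pruning (not Hive's ReduceSinkOperator code; the row data is invented):

```python
import heapq

def top_n_keys(mapper_output, n):
    """Map-side top-N for 'ORDER BY key LIMIT n': retain only the n
    smallest keys in a bounded max-heap (keys negated, since heapq is
    a min-heap), so at most n rows per mapper are shuffled."""
    heap = []
    for key, value in mapper_output:
        if len(heap) < n:
            heapq.heappush(heap, (-key, value))
        elif -heap[0][0] > key:              # current worst key loses
            heapq.heapreplace(heap, (-key, value))
    return sorted((-k, v) for k, v in heap)  # emit in key order

rows = [(k, "v%d" % k) for k in range(100, 0, -1)]  # 100 rows, worst first
shuffled = top_n_keys(rows, 10)                     # only 10 rows shuffled
```

The reducer still applies the final LIMIT, but it now receives at most n rows per mapper rather than the full input, which is the shuffle-size reduction the issue describes.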
Re: write access to the Hive wiki?
What is your id? On 1/3/13 1:18 AM, Sean Busbey bus...@cloudera.com wrote: Hi All! Could I have write access to the Hive wiki? I'd like to fix some documentation errors. The most immediate is the Avro SerDe page, which contains incorrect table creation statements. -- Sean
Hive-trunk-h0.21 - Build # 1891 - Fixed
Changes for Build #1886 Changes for Build #1887 Changes for Build #1888 Changes for Build #1889 Changes for Build #1890 [namit] HIVE-446 Implement TRUNCATE (Navis via namit) Changes for Build #1891 All tests passed The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1891) Status: Fixed Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1891/ to view the results.
[jira] [Updated] (HIVE-933) Infer bucketing/sorting properties
[ https://issues.apache.org/jira/browse/HIVE-933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-933: --- Attachment: HIVE-933.7.patch.txt Infer bucketing/sorting properties -- Key: HIVE-933 URL: https://issues.apache.org/jira/browse/HIVE-933 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Kevin Wilfong Attachments: HIVE-933.1.patch.txt, HIVE-933.2.patch.txt, HIVE-933.3.patch.txt, HIVE-933.4.patch.txt, HIVE-933.5.patch.txt, HIVE-933.6.patch.txt, HIVE-933.7.patch.txt This is a long-term plan, and may require major changes. From the query, we can figure out the sorting/bucketing properties, and change the metadata of the destination at that time. However, this means that different partitions may have different metadata. Currently, the query plan is the same for all the partitions of the table - we can do the following: 1. In the first cut, have a simple approach where you take the union of all metadata, and create the most defensive plan. 2. Enhance mapredWork() to include partition-specific operator trees. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-3852) Multi-groupby optimization fails when same distinct column is used twice or more
Navis created HIVE-3852: --- Summary: Multi-groupby optimization fails when same distinct column is used twice or more Key: HIVE-3852 URL: https://issues.apache.org/jira/browse/HIVE-3852 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial {code} FROM INPUT INSERT OVERWRITE TABLE dest1 SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), count(distinct substr(INPUT.value,5)) GROUP BY INPUT.key INSERT OVERWRITE TABLE dest2 SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), avg(distinct substr(INPUT.value,5)) GROUP BY INPUT.key; {code} fails with exception FAILED: IndexOutOfBoundsException Index: 0,Size: 0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3852) Multi-groupby optimization fails when same distinct column is used twice or more
[ https://issues.apache.org/jira/browse/HIVE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-3852: Status: Patch Available (was: Open) Multi-groupby optimization fails when same distinct column is used twice or more Key: HIVE-3852 URL: https://issues.apache.org/jira/browse/HIVE-3852 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-3852.D7737.1.patch {code} FROM INPUT INSERT OVERWRITE TABLE dest1 SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), count(distinct substr(INPUT.value,5)) GROUP BY INPUT.key INSERT OVERWRITE TABLE dest2 SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), avg(distinct substr(INPUT.value,5)) GROUP BY INPUT.key; {code} fails with exception FAILED: IndexOutOfBoundsException Index: 0,Size: 0 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3852) Multi-groupby optimization fails when same distinct column is used twice or more
[ https://issues.apache.org/jira/browse/HIVE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-3852:
------------------------------

    Attachment: HIVE-3852.D7737.1.patch

navis requested code review of "HIVE-3852 [jira] Multi-groupby optimization fails when same distinct column is used twice or more".

Reviewers: JIRA

  DPAL-1951 Multi-groupby optimization fails when same distinct column is used twice or more

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D7737

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/test/queries/clientpositive/groupby10.q
  ql/src/test/results/clientpositive/groupby10.q.out

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/18621/

To: JIRA, navis
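The affected file above is Hive's SemanticAnalyzer, which plans multi-insert group-bys. A minimal, hypothetical sketch (in Python, not Hive's actual Java code) of the bookkeeping the bug concerns: when several aggregates in one query reference the same DISTINCT expression, the planner must map them all to a single shuffled column rather than registering duplicate (or zero) entries, a mismatch that would surface as an IndexOutOfBoundsException when the column list is consulted.

```python
def assign_distinct_columns(aggregates):
    """Map each (func, expr) aggregate to a shared column index per distinct
    expression, so e.g. sum(distinct e) and count(distinct e) reuse one column."""
    index_of = {}  # distinct expression -> shuffled column index
    plan = []
    for func, expr in aggregates:
        if expr not in index_of:
            index_of[expr] = len(index_of)  # register each expression once
        plan.append((func, index_of[expr]))
    return plan, list(index_of)

# Both aggregates over the same expression resolve to column 0:
plan, columns = assign_distinct_columns(
    [("sum", "substr(value,5)"), ("count", "substr(value,5)")])
```

Failing to deduplicate (or deduplicating the column list but not the per-aggregate lookups) is exactly the kind of inconsistency that makes a lookup run past an empty or short list.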
[jira] [Commented] (HIVE-446) Implement TRUNCATE
[ https://issues.apache.org/jira/browse/HIVE-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542688#comment-13542688 ]

Phabricator commented on HIVE-446:
----------------------------------

mgrover has commented on the revision "HIVE-446 [jira] Implement TRUNCATE":

  I am happy to take care of all the rmr changes in HIVE-3701.

REVISION DETAIL
  https://reviews.facebook.net/D7371

To: JIRA, navis
Cc: njain, mgrover

Implement TRUNCATE
------------------

                 Key: HIVE-446
                 URL: https://issues.apache.org/jira/browse/HIVE-446
             Project: Hive
          Issue Type: New Feature
          Components: Query Processor
            Reporter: Prasad Chakka
            Assignee: Navis
             Fix For: 0.11.0
         Attachments: HIVE-446.D7371.1.patch, HIVE-446.D7371.2.patch, HIVE-446.D7371.3.patch, HIVE-446.D7371.4.patch

Truncate the data but leave the table and metadata intact.
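The one-line description above ("truncate the data but leave the table and metadata intact") can be illustrated with a small sketch. This is not Hive's implementation (which goes through the filesystem APIs referenced in the rmr discussion); it is a hypothetical stand-in showing the intended semantics: data files and partition directories under the table location are removed, while the table directory itself (standing in for the metadata) survives.

```python
import os
import shutil
import tempfile

def truncate_table(table_dir):
    """Delete a table's data files and partition subdirectories, but leave
    the table directory itself intact (TRUNCATE semantics, sketched)."""
    for name in os.listdir(table_dir):
        path = os.path.join(table_dir, name)
        if os.path.isdir(path):
            shutil.rmtree(path)  # partition subdirectory
        else:
            os.remove(path)      # data file

# Demo on a throwaway directory standing in for a table location:
d = tempfile.mkdtemp()
with open(os.path.join(d, "part-00000"), "w") as f:
    f.write("data")
os.mkdir(os.path.join(d, "ds=2013-01-01"))  # a fake partition
truncate_table(d)
```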
[jira] [Commented] (HIVE-3562) Some limit can be pushed down to map stage
[ https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542690#comment-13542690 ]

Phabricator commented on HIVE-3562:
-----------------------------------

tarball has commented on the revision "HIVE-3562 [jira] Some limit can be pushed down to map stage":

  Looks good to me. +1

REVISION DETAIL
  https://reviews.facebook.net/D5967

To: JIRA, tarball, navis
Cc: njain

Some limit can be pushed down to map stage
------------------------------------------

                 Key: HIVE-3562
                 URL: https://issues.apache.org/jira/browse/HIVE-3562
             Project: Hive
          Issue Type: Bug
            Reporter: Navis
            Assignee: Navis
            Priority: Trivial
         Attachments: HIVE-3562.D5967.1.patch, HIVE-3562.D5967.2.patch, HIVE-3562.D5967.3.patch

Queries with a LIMIT clause (with a reasonable number), for example

{noformat}
select * from src order by key limit 10;
{noformat}

produce the operator tree TS-SEL-RS-EXT-LIMIT-FS. But the LIMIT can be partially computed in the RS, reducing the amount of data shuffled: TS-SEL-RS(TOP-N)-EXT-LIMIT-FS
[jira] [Commented] (HIVE-3562) Some limit can be pushed down to map stage
[ https://issues.apache.org/jira/browse/HIVE-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542689#comment-13542689 ]

Sivaramakrishnan Narayanan commented on HIVE-3562:
--------------------------------------------------

Looks good to me. +1
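The RS(TOP-N) idea in HIVE-3562 can be sketched outside Hive. Assuming a toy map/reduce split (all names here are illustrative, not Hive's), each mapper's ReduceSink keeps only its local top-N rows by the sort key, so the shuffle carries at most N rows per mapper; the reduce side merges the sorted partial lists and applies the final LIMIT.

```python
import heapq

def map_side_top_n(rows, key, n):
    """Map-side ReduceSink with top-N: emit only the n smallest rows by key,
    so the shuffle carries at most n rows from this mapper."""
    return heapq.nsmallest(n, rows, key=key)  # returns a sorted list

def reduce_side_limit(partial_results, key, n):
    """Merge the sorted per-mapper top-n lists and apply the final LIMIT."""
    merged = heapq.merge(*partial_results, key=key)
    return [row for _, row in zip(range(n), merged)]

# Two mappers, global "order by key limit 2":
mapper1 = map_side_top_n([(5, "e"), (1, "a"), (3, "c")], key=lambda r: r[0], n=2)
mapper2 = map_side_top_n([(2, "b"), (4, "d")], key=lambda r: r[0], n=2)
result = reduce_side_limit([mapper1, mapper2], key=lambda r: r[0], n=2)
```

Note the correctness argument: any row outside a mapper's local top-N cannot be in the global top-N, so dropping it before the shuffle is safe.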
[jira] [Commented] (HIVE-3652) Join optimization for star schema
[ https://issues.apache.org/jira/browse/HIVE-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542723#comment-13542723 ]

Amareshwari Sriramadasu commented on HIVE-3652:
-----------------------------------------------

[~vikram.dixit] I'm not working on it right now, and may not get time in the next month either. Please feel free to work on it if interested.

Join optimization for star schema
---------------------------------

                 Key: HIVE-3652
                 URL: https://issues.apache.org/jira/browse/HIVE-3652
             Project: Hive
          Issue Type: Improvement
          Components: Query Processor
            Reporter: Amareshwari Sriramadasu
            Assignee: Amareshwari Sriramadasu

Currently, if we join one fact table with multiple dimension tables, the query results in a separate MapReduce job for each join with a dimension table, because the join is on a different key for each dimension. Usually all the dimension tables are small and fit in memory, so a map-side join can be used to join each of them with the fact table. In this issue I want to look at optimizing such a query to generate a single MapReduce job, so that the mapper loads the dimension tables into memory and joins them with the fact table on the different keys.
[jira] [Assigned] (HIVE-3652) Join optimization for star schema
[ https://issues.apache.org/jira/browse/HIVE-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amareshwari Sriramadasu reassigned HIVE-3652:
---------------------------------------------

    Assignee:  (was: Amareshwari Sriramadasu)
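The single-job plan proposed in HIVE-3652 can be sketched as follows. This is a hypothetical illustration, not Hive's map-join operator: each mapper loads the small dimension tables into in-memory hash maps (one per fact-table join column) and joins every fact row against all of them in a single pass, instead of running one MapReduce job per dimension.

```python
def star_join(fact_rows, dimensions):
    """Map-side star join, sketched.

    fact_rows:  list of dicts (rows of the fact table)
    dimensions: {fact_column: {dim_key: dim_attributes}} -- one in-memory
                hash map per dimension, keyed by the fact column it joins on
    """
    out = []
    for row in fact_rows:
        joined = dict(row)
        # One hash lookup per dimension, all in the same pass over the fact row:
        for col, dim_table in dimensions.items():
            joined.update(dim_table.get(row[col], {}))
        out.append(joined)
    return out

# A fact row joined against two dimensions on two different keys:
fact = [{"cust_id": 1, "prod_id": 10, "amount": 5.0}]
dims = {
    "cust_id": {1: {"cust_name": "alice"}},
    "prod_id": {10: {"prod_name": "widget"}},
}
rows = star_join(fact, dims)
```

The point of the optimization is visible in the structure: the fact table is scanned once, and the number of jobs no longer grows with the number of dimension tables.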