[jira] [Commented] (HIVE-2933) analyze command throw NPE when table doesn't exists
[ https://issues.apache.org/jira/browse/HIVE-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254442#comment-13254442 ] Namit Jain commented on HIVE-2933: -- Can you add a test ? analyze command throw NPE when table doesn't exists --- Key: HIVE-2933 URL: https://issues.apache.org/jira/browse/HIVE-2933 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.8.1 Reporter: alex gemini Priority: Minor Attachments: HIVE-2933-0.8.1-1.patch analyze command throw NPE when table doesn't exists -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2904) ant gen-test failed
[ https://issues.apache.org/jira/browse/HIVE-2904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254453#comment-13254453 ] Namit Jain commented on HIVE-2904: -- Otherwise, it looks good. Will commit after that ant gen-test failed --- Key: HIVE-2904 URL: https://issues.apache.org/jira/browse/HIVE-2904 Project: Hive Issue Type: Bug Affects Versions: 0.8.1 Reporter: Sho Shimauchi Labels: patch Attachments: HIVE-2904.1.patch, HIVE-2904.D2487.1.patch, HIVE-2904.D2487.2.patch When I ran the commands introduced in Getting Started page, ant gen-test failed with the following error.
{quote}
$ ant gen-test
Buildfile: /Users/sho/src/apache/hive/ql/build.xml
test-conditions:
[echo] Project: ql
test-init:
[echo] Project: ql
[mkdir] Created dir: /Users/sho/src/apache/hive/build/ql/test/data
[mkdir] Created dir: /Users/sho/src/apache/hive/build/ql/test/logs/clientpositive
[mkdir] Created dir: /Users/sho/src/apache/hive/build/ql/test/logs/clientnegative
[mkdir] Created dir: /Users/sho/src/apache/hive/build/ql/test/logs/positive
[mkdir] Created dir: /Users/sho/src/apache/hive/build/ql/test/logs/negative
[mkdir] Created dir: /Users/sho/src/apache/hive/build/ql/test/data/warehouse
[mkdir] Created dir: /Users/sho/src/apache/hive/build/ql/test/data/metadb
gen-test:
[echo] ql
[qtestgen] Template Path:/Users/sho/src/apache/hive/ql/src/test/templates
[qtestgen] 2012/03/25 15:27:10 org.apache.velocity.runtime.log.JdkLogChute log
[qtestgen] ???: FileResourceLoader : adding path '/Users/sho/src/apache/hive/ql/src/test/templates'
[qtestgen] Generated /Users/sho/src/apache/hive/build/ql/test/src/org/apache/hadoop/hive/ql/parse/TestParse.java from template TestParse.vm
[qtestgen] Template Path:/Users/sho/src/apache/hive/ql/src/test/templates
[qtestgen] 2012/03/25 15:27:10 org.apache.velocity.runtime.log.JdkLogChute log
[qtestgen] ???: FileResourceLoader : adding path '/Users/sho/src/apache/hive/ql/src/test/templates'
[qtestgen] Generated /Users/sho/src/apache/hive/build/ql/test/src/org/apache/hadoop/hive/ql/parse/TestParseNegative.java from template TestParseNegative.vm
[qtestgen] Template Path:/Users/sho/src/apache/hive/ql/src/test/templates
[qtestgen] 2012/03/25 15:27:10 org.apache.velocity.runtime.log.JdkLogChute log
[qtestgen] ???: FileResourceLoader : adding path '/Users/sho/src/apache/hive/ql/src/test/templates'
[qtestgen] Generated /Users/sho/src/apache/hive/build/ql/test/src/org/apache/hadoop/hive/cli/TestCliDriver.java from template TestCliDriver.vm
BUILD FAILED
/Users/sho/src/apache/hive/ql/build.xml:116: Problem: failed to create task or type if
Cause: The name is undefined.
Action: Check the spelling.
Action: Check that any custom tasks/types have been declared.
Action: Check that any presetdef/macrodef declarations have taken place.
{quote}
Getting Started: https://cwiki.apache.org/confluence/display/Hive/GettingStarted+EclipseSetup
[jira] [Commented] (HIVE-2942) substr on string containing UTF-8 characters produces StringIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HIVE-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254459#comment-13254459 ] Namit Jain commented on HIVE-2942: -- +1 running tests substr on string containing UTF-8 characters produces StringIndexOutOfBoundsException - Key: HIVE-2942 URL: https://issues.apache.org/jira/browse/HIVE-2942 Project: Hive Issue Type: Bug Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2942.D2727.1.patch After HIVE-2792, the substr function produces a StringIndexOutOfBoundsException when called on a string containing UTF-8 characters without the length argument being present. E.g. select substr(str, 1) from table1; now fails with that exception if str contains a UTF-8 character for any row in the table.
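The issue does not state the root cause, but one plausible way a substring call goes out of bounds on such input is that a position counted in characters gets applied to a UTF-8 byte buffer (or the reverse). The following minimal Python sketch illustrates only that general pitfall; it is not Hive's implementation, and the function name `substr_from` is hypothetical.

```python
def substr_from(data: bytes, start: int) -> str:
    """Return the suffix of UTF-8 `data` starting at 1-based character `start`."""
    if start < 1:
        raise ValueError("start is 1-based and must be >= 1")
    char_index = 0
    for byte_offset, byte in enumerate(data):
        # Bytes of the form 0b10xxxxxx continue a character; all others begin one.
        if (byte & 0xC0) != 0x80:
            char_index += 1
            if char_index == start:
                return data[byte_offset:].decode("utf-8")
    return ""  # start lies past the last character


text = "naïve café"
encoded = text.encode("utf-8")
# Character count and byte count diverge once a multi-byte character appears,
# so an offset valid for one representation may be invalid for the other.
print(len(text), len(encoded))
print(substr_from(encoded, 4))
```

Translating the character position to a byte offset before slicing keeps the slice on a character boundary and inside the buffer.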
[jira] [Commented] (HIVE-2929) race condition in DAG execute tasks for hive
[ https://issues.apache.org/jira/browse/HIVE-2929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13248114#comment-13248114 ] Namit Jain commented on HIVE-2929: -- https://reviews.facebook.net/differential/diff/8571/ race condition in DAG execute tasks for hive Key: HIVE-2929 URL: https://issues.apache.org/jira/browse/HIVE-2929 Project: Hive Issue Type: Bug Reporter: Namit Jain select ... ( SubQuery involving MapReduce union all SubQuery involving MapReduce ); or select ... (SubQuery involving MapReduce) join (SubQuery involving MapReduce) ; If both the subQueries finish at nearly the same time, there is a race condition in which the results of the subQuery finishing last will be completely missed.
[jira] [Commented] (HIVE-2931) conf settings may be ignored
[ https://issues.apache.org/jira/browse/HIVE-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13248643#comment-13248643 ] Namit Jain commented on HIVE-2931: -- https://reviews.facebook.net/differential/diff/8583/ conf settings may be ignored Key: HIVE-2931 URL: https://issues.apache.org/jira/browse/HIVE-2931 Project: Hive Issue Type: Bug Reporter: Namit Jain Assignee: Namit Jain This is a pretty serious problem. If a conf variable is changed, Hive may not pick up the variable unless the metastore variables are changed. When any session variables are changed, it might be simpler to update the corresponding Hive conf.
[jira] [Commented] (HIVE-2909) SHOW COLUMNS table_name; to provide a comma-delimited list of columns.
[ https://issues.apache.org/jira/browse/HIVE-2909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242575#comment-13242575 ] Namit Jain commented on HIVE-2909: -- https://cwiki.apache.org/Hive/phabricatorcodereview.html SHOW COLUMNS table_name; to provide a comma-delimited list of columns. -- Key: HIVE-2909 URL: https://issues.apache.org/jira/browse/HIVE-2909 Project: Hive Issue Type: New Feature Reporter: Adam Kramer Assignee: Dikang Gu Priority: Minor Due to the way that SELECT * and partitioning works, it is frequently obnoxious to insert data into tables of the same schema. This could be fixed in a number of ways, all murky; this feature request reduces the obnoxicity of the current situation. SHOW COLUMNS foo; OK bar, baz, tball, ds ...then I could just copy the first three and not the last.
[jira] [Commented] (HIVE-2911) Move global .hiverc file
[ https://issues.apache.org/jira/browse/HIVE-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242951#comment-13242951 ] Namit Jain commented on HIVE-2911: -- This is a backward incompatible change. Move global .hiverc file Key: HIVE-2911 URL: https://issues.apache.org/jira/browse/HIVE-2911 Project: Hive Issue Type: Improvement Components: CLI Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: HIVE-2911.D2529.1.patch, HIVE-2911.D2529.2.patch Currently, the .hiverc files are loaded from:
{code}
$HIVE_HOME/bin/.hiverc
~/.hiverc
{code}
It seems more ops-friendly to have it in the config directory.
{code}
$HIVE_HOME/bin/.hiverc - for backwards compatibility
$HIVE_CONF_DIR/.hiverc
~/.hiverc
{code}
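The proposed lookup order can be sketched as a small path-building function. This is a minimal illustration in Python, not the Hive CLI code: the function name `hiverc_candidates` is hypothetical, and skipping a location whose environment variable is unset is an assumption the issue does not discuss.

```python
import posixpath
from typing import List, Mapping


def hiverc_candidates(env: Mapping[str, str], home: str) -> List[str]:
    """List .hiverc locations in the order proposed in the issue."""
    paths = []
    hive_home = env.get("HIVE_HOME")
    if hive_home:
        # Legacy location, kept for backwards compatibility.
        paths.append(posixpath.join(hive_home, "bin", ".hiverc"))
    conf_dir = env.get("HIVE_CONF_DIR")
    if conf_dir:
        # New global location in the config directory.
        paths.append(posixpath.join(conf_dir, ".hiverc"))
    # Per-user file comes last.
    paths.append(posixpath.join(home, ".hiverc"))
    return paths


example_env = {"HIVE_HOME": "/opt/hive", "HIVE_CONF_DIR": "/etc/hive/conf"}
for path in hiverc_candidates(example_env, "/home/alice"):
    print(path)
```

The example paths (`/opt/hive`, `/etc/hive/conf`, `/home/alice`) are placeholders chosen for illustration.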
[jira] [Commented] (HIVE-2866) Cache local map reduce job errors for additional logging
[ https://issues.apache.org/jira/browse/HIVE-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13243019#comment-13243019 ] Namit Jain commented on HIVE-2866: -- +1 running tests Cache local map reduce job errors for additional logging Key: HIVE-2866 URL: https://issues.apache.org/jira/browse/HIVE-2866 Project: Hive Issue Type: Improvement Components: Logging Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2866.D2277.1.patch Using the CachingPrintStream we are storing errors that occur locally in Hive, but because local map reduce jobs are run in a separate JVM, we are not storing the errors that occur in them. We can use this same construct to store errors written to the subprocess's error stream. This way, when we log failed queries, these will give us a decent idea of why those queries failed. See related issues: https://issues.apache.org/jira/browse/HIVE-2832 https://issues.apache.org/jira/browse/HIVE-2858
[jira] [Commented] (HIVE-2852) hive.stats.autogather should default to false.
[ https://issues.apache.org/jira/browse/HIVE-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235383#comment-13235383 ] Namit Jain commented on HIVE-2852: -- +1 hive.stats.autogather should default to false. -- Key: HIVE-2852 URL: https://issues.apache.org/jira/browse/HIVE-2852 Project: Hive Issue Type: Task Components: Configuration Affects Versions: 0.8.1 Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Minor Fix For: 0.9.0 Attachments: hive-2852.patch.1.txt hive.stats.autogather is set to true. A majority of people are not using indexing or this feature; as a result, INSERT OVERWRITE TABLE statements are creating derby stat_db directories in the user's CWD. This should be disabled for now.
[jira] [Commented] (HIVE-2852) hive.stats.autogather should default to false.
[ https://issues.apache.org/jira/browse/HIVE-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235846#comment-13235846 ] Namit Jain commented on HIVE-2852: -- A lot of test files need to be updated for this one. hive.stats.autogather should default to false. -- Key: HIVE-2852 URL: https://issues.apache.org/jira/browse/HIVE-2852 Project: Hive Issue Type: Task Components: Configuration Affects Versions: 0.8.1 Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Minor Fix For: 0.9.0 Attachments: hive-2852.patch.1.txt hive.stats.autogather is set to true. A majority of people are not using indexing or this feature; as a result, INSERT OVERWRITE TABLE statements are creating derby stat_db directories in the user's CWD. This should be disabled for now.
[jira] [Commented] (HIVE-2084) Upgrade datanucleus from 2.0.3 to 3.0.1
[ https://issues.apache.org/jira/browse/HIVE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235266#comment-13235266 ] Namit Jain commented on HIVE-2084: -- @Alan, I can't seem to find the test which was failing for the old JDO upgrade. There was some problem when we upgraded last time. Upgrade datanucleus from 2.0.3 to 3.0.1 --- Key: HIVE-2084 URL: https://issues.apache.org/jira/browse/HIVE-2084 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Ning Zhang Assignee: Sushanth Sowmyan Labels: datanucleus Attachments: HIVE-2084.1.patch.txt, HIVE-2084.2.patch.txt, HIVE-2084.D2397.1.patch, HIVE-2084.patch It seems that datanucleus 2.2.3 does a better job of caching. Getting the same set of partition objects a second time takes about 1/4 of the time it took the first time, while with 2.0.3 the second execution took almost the same amount of time as the first. We should retest the test case mentioned in HIVE-1853, HIVE-1862.
[jira] [Commented] (HIVE-2084) Upgrade datanucleus from 2.0.3 to 3.0.1
[ https://issues.apache.org/jira/browse/HIVE-2084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235268#comment-13235268 ] Namit Jain commented on HIVE-2084: -- OK, it is HIVE-1862. Did you run that test with the new JDO ? Upgrade datanucleus from 2.0.3 to 3.0.1 --- Key: HIVE-2084 URL: https://issues.apache.org/jira/browse/HIVE-2084 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Ning Zhang Assignee: Sushanth Sowmyan Labels: datanucleus Attachments: HIVE-2084.1.patch.txt, HIVE-2084.2.patch.txt, HIVE-2084.D2397.1.patch, HIVE-2084.patch It seems that datanucleus 2.2.3 does a better job of caching. Getting the same set of partition objects a second time takes about 1/4 of the time it took the first time, while with 2.0.3 the second execution took almost the same amount of time as the first. We should retest the test case mentioned in HIVE-1853, HIVE-1862.
[jira] [Commented] (HIVE-2845) Add support for index joins in Hive
[ https://issues.apache.org/jira/browse/HIVE-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235371#comment-13235371 ] Namit Jain commented on HIVE-2845: -- This may not be very useful in the current hive setup, since the point lookup on the hive index is not very fast. However, this opens up a very wide variety of applications. Consider the scenario when one of the tables is stored in HBase. In that case, a join with the other table can be reduced to a map-only job, with the mapper doing a point lookup for every row. This is very different from map-join, where one of the tables is so small that it fits in memory. Add support for index joins in Hive --- Key: HIVE-2845 URL: https://issues.apache.org/jira/browse/HIVE-2845 Project: Hive Issue Type: New Feature Reporter: Namit Jain Labels: gsoc, gsoc2012 Hive supports indexes, which are currently used only for filters. It would be very useful to add support for index-based joins in Hive. If two tables A and B are being joined, and an index exists on the join key of A, B can be scanned (by the mappers), and for each row in B, a lookup for the corresponding row in A can be performed. This can be very useful for some use cases.
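The scan-and-lookup pattern described in this issue can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not a proposed Hive design: an in-memory dict stands in for the index on A's join key (in the scenario from the comment it would be a Hive index or an HBase point lookup), and the names `build_index` and `index_join` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Tuple[str, str]  # (join_key, payload)


def build_index(table_a: Iterable[Row]) -> Dict[str, List[str]]:
    """Stand-in for an index on A's join key: key -> payloads of matching rows."""
    index: Dict[str, List[str]] = defaultdict(list)
    for key, payload in table_a:
        index[key].append(payload)
    return index


def index_join(index: Dict[str, List[str]],
               table_b: Iterable[Row]) -> Iterator[Tuple[str, str, str]]:
    """Scan B once; for each row, do a point lookup into the index on A."""
    for key, b_payload in table_b:
        for a_payload in index.get(key, ()):
            yield key, a_payload, b_payload


table_a = [("k1", "a1"), ("k2", "a2"), ("k1", "a3")]
table_b = [("k1", "b1"), ("k3", "b3")]
joined = list(index_join(build_index(table_a), table_b))
print(joined)
```

Only B is scanned in full; A is touched solely through lookups, which is why the cost depends on how fast a point lookup on the index is.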
[jira] [Commented] (HIVE-2875) Renaming partition changes partition location prefix
[ https://issues.apache.org/jira/browse/HIVE-2875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13231597#comment-13231597 ] Namit Jain commented on HIVE-2875: -- A lot of tests are failing for me. Can you debug this ? Renaming partition changes partition location prefix Key: HIVE-2875 URL: https://issues.apache.org/jira/browse/HIVE-2875 Project: Hive Issue Type: Bug Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2875.D2349.1.patch Renaming a partition changes the location of the partition to the default location of the table, followed by the partition specification. It should just change the partition specification of the path. If the path does not end with the old partition specification, we should probably throw an exception because renaming a partition should not change the path so dramatically, and not changing the path to reflect the new partition name could leave the partition in a very confusing state.
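The behaviour requested in this issue, replacing only the trailing partition specification and refusing to guess otherwise, can be sketched as follows. This is a minimal Python illustration, not the metastore code; the function name `rename_partition_location` and the example paths are hypothetical.

```python
def rename_partition_location(location: str, old_spec: str, new_spec: str) -> str:
    """Swap the trailing partition spec of `location`, keeping its prefix."""
    trimmed = location.rstrip("/")
    suffix = "/" + old_spec.strip("/")
    if not trimmed.endswith(suffix):
        # The path does not follow the partition naming, so rewriting it
        # would move the data somewhere unexpected; fail instead.
        raise ValueError(
            f"location {location!r} does not end with partition spec {old_spec!r}"
        )
    prefix = trimmed[: len(trimmed) - len(suffix)]
    return prefix + "/" + new_spec.strip("/")


renamed = rename_partition_location(
    "/custom/dir/tbl/ds=2012-01-01", "ds=2012-01-01", "ds=2012-01-02"
)
print(renamed)
```

The custom prefix (`/custom/dir/tbl`) survives the rename, which is the point of the bug report: the location should not fall back to the table's default directory.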
[jira] [Commented] (HIVE-2871) Add a new hook to run at the beginning and end of the Driver.run method
[ https://issues.apache.org/jira/browse/HIVE-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230756#comment-13230756 ] Namit Jain commented on HIVE-2871: -- +1 running tests Add a new hook to run at the beginning and end of the Driver.run method --- Key: HIVE-2871 URL: https://issues.apache.org/jira/browse/HIVE-2871 Project: Hive Issue Type: Improvement Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2871.D2331.1.patch, HIVE-2871.D2331.2.patch Driver.run is the highest level method which all queries go through, whether they come from Hive Server, the CLI, or any other entry. We also do not have any hooks before the compilation method is called, and having hooks in Driver.run would provide this. Having hooks in Driver.run will allow, for example, being able to overwrite config values used throughout query processing, including compilation, and at the other end, cleaning up any resources/logging any final values just before returning to the user.
[jira] [Commented] (HIVE-2837) insert into external tables should not be allowed
[ https://issues.apache.org/jira/browse/HIVE-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221942#comment-13221942 ] Namit Jain commented on HIVE-2837: -- I don't want to do it table-by-table. The only other option I can think of is to add another configuration variable. insert into external tables should not be allowed - Key: HIVE-2837 URL: https://issues.apache.org/jira/browse/HIVE-2837 Project: Hive Issue Type: Bug Reporter: Namit Jain Assignee: Namit Jain This is a very risky thing to allow, since external tables can point to any user location, and an insert can potentially corrupt some other tables.
[jira] [Commented] (HIVE-2833) Fix test failures caused by HIVE-2716
[ https://issues.apache.org/jira/browse/HIVE-2833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221946#comment-13221946 ] Namit Jain commented on HIVE-2833: -- running tests Fix test failures caused by HIVE-2716 - Key: HIVE-2833 URL: https://issues.apache.org/jira/browse/HIVE-2833 Project: Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Kevin Wilfong Attachments: HIVE-2716.D2055.1.patch, HIVE-2833.D2055.2.patch
[jira] [Commented] (HIVE-2833) Fix test failures caused by HIVE-2716
[ https://issues.apache.org/jira/browse/HIVE-2833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221129#comment-13221129 ] Namit Jain commented on HIVE-2833: -- Enis/Ashutosh, we can try coming up with the testcase. But, in the meantime, can I upload a patch which reverts this ? This will unblock us (facebook branch). Fix test failures caused by HIVE-2716 - Key: HIVE-2833 URL: https://issues.apache.org/jira/browse/HIVE-2833 Project: Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Enis Soztutar
[jira] [Commented] (HIVE-2833) Fix test failures caused by HIVE-2716
[ https://issues.apache.org/jira/browse/HIVE-2833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221435#comment-13221435 ] Namit Jain commented on HIVE-2833: -- Looks good to me, but I will wait for Ashutosh/Enis before I commit this. Please let us know if you see any problems with this. Fix test failures caused by HIVE-2716 - Key: HIVE-2833 URL: https://issues.apache.org/jira/browse/HIVE-2833 Project: Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Kevin Wilfong Attachments: HIVE-2716.D2055.1.patch, HIVE-2833.D2055.2.patch
[jira] [Commented] (HIVE-2833) Fix test failures caused by HIVE-2716
[ https://issues.apache.org/jira/browse/HIVE-2833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13220557#comment-13220557 ] Namit Jain commented on HIVE-2833: -- Can you please work on this as a very high priority ? We need a fix asap for this. Fix test failures caused by HIVE-2716 - Key: HIVE-2833 URL: https://issues.apache.org/jira/browse/HIVE-2833 Project: Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Enis Soztutar
[jira] [Commented] (HIVE-2612) support hive table/partitions exists in more than one region
[ https://issues.apache.org/jira/browse/HIVE-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205688#comment-13205688 ] Namit Jain commented on HIVE-2612: -- Changed 'cluster' to 'region' and ran the testclidriver tests. Overwrote a bunch of test files. Kevin, can you change the upgrade files, since the schema has changed ? support hive table/partitions exists in more than one region Key: HIVE-2612 URL: https://issues.apache.org/jira/browse/HIVE-2612 Project: Hive Issue Type: New Feature Components: Metastore Reporter: He Yongqiang Assignee: Namit Jain Attachments: HIVE-2612.1.patch, HIVE-2612.2.patch.txt, HIVE-2612.3.patch.txt, HIVE-2612.4.patch.txt, HIVE-2612.D1569.1.patch, HIVE-2612.D1569.2.patch, HIVE-2612.D1569.3.patch, HIVE-2612.D1569.4.patch, HIVE-2612.D1569.5.patch, HIVE-2612.D1569.6.patch, HIVE-2612.D1569.7.patch, hive.2612.5.patch 1) add region object into hive metastore 2) each partition/table has a primary region and a list of living regions, and also data location in each region
[jira] [Commented] (HIVE-2785) support use cluster
[ https://issues.apache.org/jira/browse/HIVE-2785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205700#comment-13205700 ] Namit Jain commented on HIVE-2785: -- The semantics are as follows: use region; Use the default region based on the primary region of the table being queried. Choose the jobtracker, dfs from the region provided. use region region_name; Use the region specified. The query should only succeed if all the partitions of the table are present in the primary region support use cluster --- Key: HIVE-2785 URL: https://issues.apache.org/jira/browse/HIVE-2785 Project: Hive Issue Type: New Feature Reporter: Namit Jain use cluster; use cluster cluster_name; should be supported
[jira] [Commented] (HIVE-2786) Throw an error if the user tries to insert a table into a region other than the primary region
[ https://issues.apache.org/jira/browse/HIVE-2786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205706#comment-13205706 ] Namit Jain commented on HIVE-2786: -- The semantics are as follows: If the table/partition is present in the primary region, and the insert is happening in the secondary region, verify that the table/partition is the same (the file size should be exactly the same). Otherwise, the operation fails. If the table/partition is present in the secondary region, and the insert is happening in the primary region, verify that the table/partition is the same (the file size should be exactly the same). Otherwise, the secondary region needs to be deleted. Throw an error if the user tries to insert a table into a region other than the primary region -- Key: HIVE-2786 URL: https://issues.apache.org/jira/browse/HIVE-2786 Project: Hive Issue Type: New Feature Reporter: Namit Jain Assignee: Namit Jain By default, the user can only insert into the primary region. Add an option to insert into the secondary region also. The config variable is 'hive.insert.secondary.regions' - default for that variable is false
[jira] [Commented] (HIVE-2612) support hive table/partitions coexistes in more than one clusters
[ https://issues.apache.org/jira/browse/HIVE-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13202726#comment-13202726 ] Namit Jain commented on HIVE-2612: -- All the existing APIs will continue to work. support hive table/partitions coexistes in more than one clusters - Key: HIVE-2612 URL: https://issues.apache.org/jira/browse/HIVE-2612 Project: Hive Issue Type: New Feature Components: Metastore Reporter: He Yongqiang Assignee: Namit Jain Attachments: HIVE-2612.1.patch, HIVE-2612.2.patch.txt, HIVE-2612.D1569.1.patch, HIVE-2612.D1569.2.patch, HIVE-2612.D1569.3.patch, HIVE-2612.D1569.4.patch, HIVE-2612.D1569.5.patch, HIVE-2612.D1569.6.patch 1) add cluster object into hive metastore 2) each partition/table has a creation cluster and a list of living clusters, and also data location in each cluster
[jira] [Commented] (HIVE-2612) support hive table/partitions coexistes in more than one clusters
[ https://issues.apache.org/jira/browse/HIVE-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13202724#comment-13202724 ] Namit Jain commented on HIVE-2612: -- Can everyone concerned please take a look ? Anyone not using clusters needs to run the scripts provided in this patch to upgrade the metastore. The time taken for the upgrade depends on the size of the metastore (number of tables/partitions), but it should be fairly small - it is less than 10 minutes for the facebook cluster. support hive table/partitions coexistes in more than one clusters - Key: HIVE-2612 URL: https://issues.apache.org/jira/browse/HIVE-2612 Project: Hive Issue Type: New Feature Components: Metastore Reporter: He Yongqiang Assignee: Namit Jain Attachments: HIVE-2612.1.patch, HIVE-2612.2.patch.txt, HIVE-2612.D1569.1.patch, HIVE-2612.D1569.2.patch, HIVE-2612.D1569.3.patch, HIVE-2612.D1569.4.patch, HIVE-2612.D1569.5.patch, HIVE-2612.D1569.6.patch 1) add cluster object into hive metastore 2) each partition/table has a creation cluster and a list of living clusters, and also data location in each cluster
[jira] [Commented] (HIVE-2612) support hive table/partitions coexistes in more than one clusters
[ https://issues.apache.org/jira/browse/HIVE-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13200101#comment-13200101 ] Namit Jain commented on HIVE-2612: -- https://cwiki.apache.org/confluence/display/Hive/Hive+across+Multiple+Data+Centers+%28Physical+Clusters%29 is the correct link to the wiki support hive table/partitions coexistes in more than one clusters - Key: HIVE-2612 URL: https://issues.apache.org/jira/browse/HIVE-2612 Project: Hive Issue Type: New Feature Components: Metastore Reporter: He Yongqiang Assignee: Namit Jain Attachments: HIVE-2612.1.patch, HIVE-2612.D1569.1.patch 1) add cluster object into hive metastore 2) each partition/table has a creation cluster and a list of living clusters, and also data location in each cluster
[jira] [Commented] (HIVE-2747) UNION ALL with subquery which selects NULL and performs group by fails
[ https://issues.apache.org/jira/browse/HIVE-2747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13199099#comment-13199099 ]
Namit Jain commented on HIVE-2747:

create table tmp_tx1 (key string, value string);
insert overwrite table tmp_tx1 select * from src where key 10;
create table tmp_tx2 (key int, value string);
insert overwrite table tmp_tx2 select * from src where key 10;
select * from ( select key, count(1) from tmp_tx1 group by key union all select key, count(1) from tmp_tx2 group by key ) u;

The above also fails. Basically, if the types in the union don't match, we get a run-time error. The query:

select * from ( select key, count(1) from tmp_tx1 group by key union all select cast(key as string) as key, count(1) from tmp_tx2 group by key ) u;

works fine.

UNION ALL with subquery which selects NULL and performs group by fails
Key: HIVE-2747 URL: https://issues.apache.org/jira/browse/HIVE-2747 Project: Hive Issue Type: Bug Reporter: Kevin Wilfong Assignee: Kevin Wilfong

Queries like the following

from (select key, value, count(1) as count from src group by key, value union all select NULL as key, value, count(1) as count from src group by value) a select count(*);

fail with the exception

java.lang.NullPointerException
at org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector.toString(StructObjectInspector.java:60)
at java.lang.String.valueOf(String.java:2826)
at java.lang.StringBuilder.append(StringBuilder.java:115)
at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:110)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:427)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
at org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:98)
... 18 more

This should at least provide a more informative error message if not work. It works without the group by.
[jira] [Commented] (HIVE-2747) UNION ALL with subquery which selects NULL and performs group by fails
[ https://issues.apache.org/jira/browse/HIVE-2747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13199166#comment-13199166 ]
Namit Jain commented on HIVE-2747:

Even an error message for now is fine. The implicit type conversion can be done later.

UNION ALL with subquery which selects NULL and performs group by fails
Key: HIVE-2747 URL: https://issues.apache.org/jira/browse/HIVE-2747 Project: Hive Issue Type: Bug Reporter: Kevin Wilfong

Queries like the following

from (select key, value, count(1) as count from src group by key, value union all select NULL as key, value, count(1) as count from src group by value) a select count(*);

fail with the exception

java.lang.NullPointerException
at org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector.toString(StructObjectInspector.java:60)
at java.lang.String.valueOf(String.java:2826)
at java.lang.StringBuilder.append(StringBuilder.java:115)
at org.apache.hadoop.hive.ql.exec.UnionOperator.initializeOp(UnionOperator.java:110)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:427)
at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
at org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:98)
... 18 more

This should at least provide a more informative error message if not work. It works without the group by.
[jira] [Commented] (HIVE-2612) support hive table/partitions coexistes in more than one clusters
[ https://issues.apache.org/jira/browse/HIVE-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13198175#comment-13198175 ] Namit Jain commented on HIVE-2612: -- https://cwiki.apache.org/confluence/display/Hive/Hive+across+Multiple+Data+Centers+(Physical+Clusters) Added a new document which explains some of the thinking and the design. Please comment support hive table/partitions coexistes in more than one clusters - Key: HIVE-2612 URL: https://issues.apache.org/jira/browse/HIVE-2612 Project: Hive Issue Type: New Feature Components: Metastore Reporter: He Yongqiang Assignee: Namit Jain Attachments: HIVE-2612.1.patch 1) add cluster object into hive metastore 2) each partition/table has a creation cluster and a list of living clusters, and also data location in each cluster
[jira] [Commented] (HIVE-2762) Alter Table Partition Concatenate Fails On Certain Characters
[ https://issues.apache.org/jira/browse/HIVE-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13198342#comment-13198342 ] Namit Jain commented on HIVE-2762: -- +1 Running tests Alter Table Partition Concatenate Fails On Certain Characters - Key: HIVE-2762 URL: https://issues.apache.org/jira/browse/HIVE-2762 Project: Hive Issue Type: Bug Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2762.1.patch.txt, HIVE-2762.2.patch.txt, HIVE-2762.D1533.1.patch, HIVE-2762.D1533.2.patch, HIVE-2762.D1533.3.patch Alter table partition concatenate creates a Java URI object for the location of a partition. If the partition name contains certain characters, such as } or space ' ', the object constructor fails, causing the query to fail.
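The failure described in HIVE-2762 can be reproduced with the standard java.net.URI class alone. The following is a minimal sketch, assuming a made-up HDFS location (namenode:8020, /warehouse/t) and hypothetical helper names; it shows only general JDK behavior and is not taken from the attached patches.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class PartitionUriSketch {
    // True if the single-argument constructor accepts the string as-is.
    static boolean parses(String location) {
        try {
            new URI(location);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // The multi-argument constructor percent-encodes illegal path characters
    // instead of rejecting them. Returns the encoded (raw) path.
    static String encodedPath(String scheme, String authority, String path) {
        try {
            return new URI(scheme, authority, path, null, null).getRawPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String base = "hdfs://namenode:8020/warehouse/t/"; // hypothetical location
        System.out.println(parses(base + "ds=2012-01-01")); // plain partition name
        System.out.println(parses(base + "ds=a b"));        // contains a space
        System.out.println(parses(base + "ds=a}b"));        // contains '}'
        System.out.println(encodedPath("hdfs", "namenode:8020", "/warehouse/t/ds=a b"));
    }
}
```

Building the URI from its components is one possible way to avoid the constructor failure; whether the actual fix does this is not stated in the message above.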
[jira] [Commented] (HIVE-2612) support hive table/partitions coexistes in more than one clusters
[ https://issues.apache.org/jira/browse/HIVE-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13196422#comment-13196422 ]
Namit Jain commented on HIVE-2612:

bq. A table T1's primary cluster is C1 meaning :1) C1 contains all data that is available in all other clusters. Does this mean that if T1's primary cluster is C1, then all of the partitions in T1 must also have their primary partition set to C1? If that's the case then primary cluster should probably be a table level property, and the list of replica clusters can be a table/partition level property.

I agree

support hive table/partitions coexistes in more than one clusters
Key: HIVE-2612 URL: https://issues.apache.org/jira/browse/HIVE-2612 Project: Hive Issue Type: New Feature Components: Metastore Reporter: He Yongqiang Assignee: Namit Jain

1) add cluster object into hive metastore
2) each partition/table has a creation cluster and a list of living clusters, and also data location in each cluster
[jira] [Commented] (HIVE-2612) support hive table/partitions coexistes in more than one clusters
[ https://issues.apache.org/jira/browse/HIVE-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13196447#comment-13196447 ]
Namit Jain commented on HIVE-2612:

bq. write is only allowed in this cluster for table C1. but need to allow exceptions here. What are the exceptions?

Currently, there should be no exceptions. Eventually, if we provide something in hive to do a cross-cluster write, that should be like an exception. There may be a hive command like, Replicate T@P from cluster1 to cluster2.

bq. all data changes to T1 happened in the primary cluster should be replicated to other clusters if there are any secondary clusters. but there should be a conf to disable it as there are some exception situations.

This question should not be relevant now. A much simpler way to visualize this is: for every table, there is a primary cluster, and a list of secondary clusters. All the partitions belong to the primary cluster, and may belong to one or more secondary clusters. Every hive session has a current cluster, and the read happens from the current cluster. An error is thrown if the partition is missing from the current cluster, but is present in the primary cluster. I will write a new wiki, and attach it - it might be simpler to understand that way. Dynamic partitions should not require anything different.

bq. overwrite database name for the purpose of cluster name. And allow a table co-exist in multiple databases. But that require to promote table to top level citizen, and degrade database. For example, show tables used to scan all tables in current db, but now need to scan all tables in all databases. I don't think this is an option since it breaks backwards compatibility and effectively changes the whole notion of what a db/schema is. A lot of people in the community already depend on this feature.

Agreed.

bq. add a cluster parameter to existing thrift interfaces. This sounds like the best option to me. I think Thrift supports API evolution via default values for missing parameters, but setting a default value in this case may be a little tricky.

Agreed

bq. Also, instead of modifying the Thrift interface, is it possible that you could instead leverage the work that's being done in HIVE-2720?

Will look into it

support hive table/partitions coexistes in more than one clusters
Key: HIVE-2612 URL: https://issues.apache.org/jira/browse/HIVE-2612 Project: Hive Issue Type: New Feature Components: Metastore Reporter: He Yongqiang Assignee: Namit Jain

1) add cluster object into hive metastore
2) each partition/table has a creation cluster and a list of living clusters, and also data location in each cluster
[jira] [Commented] (HIVE-2750) Hive multi group by single reducer optimization causes invalid column reference error
[ https://issues.apache.org/jira/browse/HIVE-2750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193555#comment-13193555 ] Namit Jain commented on HIVE-2750: -- +1 Hive multi group by single reducer optimization causes invalid column reference error - Key: HIVE-2750 URL: https://issues.apache.org/jira/browse/HIVE-2750 Project: Hive Issue Type: Bug Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2750.D1455.1.patch After the optimization, if two query blocks have the same distinct clause and the same group by keys, but the first query block does not reference all the rows the second query block does, an invalid column reference error is raised for the columns unreferenced in the first query block. E.g. FROM src INSERT OVERWRITE TABLE dest_g2 SELECT substr(src.key,1,1), count(DISTINCT src.key) WHERE substr(src.key,1,1) = 5 GROUP BY substr(src.key,1,1) INSERT OVERWRITE TABLE dest_g3 SELECT substr(src.key,1,1), count(DISTINCT src.key), count(src.value) WHERE substr(src.key,1,1) 5 GROUP BY substr(src.key,1,1); This results in an invalid column reference error on src.value
[jira] [Commented] (HIVE-2674) get_partitions_ps throws TApplicationException if table doesn't exist
[ https://issues.apache.org/jira/browse/HIVE-2674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189362#comment-13189362 ] Namit Jain commented on HIVE-2674: -- +1 get_partitions_ps throws TApplicationException if table doesn't exist - Key: HIVE-2674 URL: https://issues.apache.org/jira/browse/HIVE-2674 Project: Hive Issue Type: Bug Components: Metastore Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2674.D987.1.patch, HIVE-2674.D987.2.patch If the table passed to get_partitions_ps doesn't exist, an NPE is thrown by getPartitionPsQueryResults. There should be a check here, which throws a NoSuchObjectException if the table doesn't exist.
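The check requested in HIVE-2674 follows a common pattern: look the table up first and raise a descriptive exception instead of letting a null propagate. A minimal sketch, assuming a toy in-memory "metastore"; the class, the map, and the local NoSuchObjectException are hypothetical stand-ins and are not Hive's actual classes or the code in the attached patches.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartitionLookupSketch {
    // Hypothetical stand-in for the metastore's NoSuchObjectException.
    static class NoSuchObjectException extends Exception {
        NoSuchObjectException(String message) { super(message); }
    }

    // Toy metastore: table name -> partition names.
    private final Map<String, List<String>> tables = new HashMap<>();

    void addTable(String name, List<String> partitions) {
        tables.put(name, partitions);
    }

    // Checks that the table exists before touching its partitions.
    List<String> getPartitions(String table) throws NoSuchObjectException {
        List<String> partitions = tables.get(table);
        if (partitions == null) {
            throw new NoSuchObjectException("table not found: " + table);
        }
        return partitions;
    }

    // True if looking up the table fails with the descriptive exception.
    static boolean failsCleanly(PartitionLookupSketch store, String table) {
        try {
            store.getPartitions(table);
            return false;
        } catch (NoSuchObjectException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        PartitionLookupSketch store = new PartitionLookupSketch();
        store.addTable("t1", Arrays.asList("ds=1", "ds=2"));
        System.out.println(failsCleanly(store, "t1"));
        System.out.println(failsCleanly(store, "missing"));
    }
}
```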
[jira] [Commented] (HIVE-2674) get_partitions_ps throws TApplicationException if table doesn't exist
[ https://issues.apache.org/jira/browse/HIVE-2674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189364#comment-13189364 ] Namit Jain commented on HIVE-2674: -- running tests get_partitions_ps throws TApplicationException if table doesn't exist - Key: HIVE-2674 URL: https://issues.apache.org/jira/browse/HIVE-2674 Project: Hive Issue Type: Bug Components: Metastore Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2674.D987.1.patch, HIVE-2674.D987.2.patch If the table passed to get_partitions_ps doesn't exist, an NPE is thrown by getPartitionPsQueryResults. There should be a check here, which throws a NoSuchObjectException if the table doesn't exist.
[jira] [Commented] (HIVE-2589) Newly created partition should inherit properties from table
[ https://issues.apache.org/jira/browse/HIVE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13185209#comment-13185209 ] Namit Jain commented on HIVE-2589: -- @Ashutosh, usually we don't commit our patches ourselves. The chances of a mistake are high. Newly created partition should inherit properties from table Key: HIVE-2589 URL: https://issues.apache.org/jira/browse/HIVE-2589 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.9.0 Attachments: hive-2589.patch, hive-2589.patch, hive-2589_1.patch, hive-2589_2.patch This will make all the info contained in table properties available to partitions.
[jira] [Commented] (HIVE-2682) Clean-up logs
[ https://issues.apache.org/jira/browse/HIVE-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13184505#comment-13184505 ]
Namit Jain commented on HIVE-2682:

Sorry Ashutosh, didn't notice that you are looking at it.

Clean-up logs
Key: HIVE-2682 URL: https://issues.apache.org/jira/browse/HIVE-2682 Project: Hive Issue Type: Wish Components: Logging Affects Versions: 0.8.1, 0.9.0 Reporter: Rajat Goel Assignee: Rajat Goel Priority: Trivial Labels: logging Attachments: HIVE-2682.D1035.1.patch, HIVE-2682.D1035.2.patch, HIVE-2682.D1035.3.patch, hive-2682.patch Original Estimate: 24h Remaining Estimate: 24h

Just wanted to cleanup some logs being printed at wrong loglevel -
1. org.apache.hadoop.hive.ql.exec.CommonJoinOperator prints table 0 has 1000 rows for join key [...] as WARNING. Is it really that?
2. org.apache.hadoop.hive.ql.exec.GroupByOperator prints Hash Table completed flushed and Begin Hash Table flush at close: size = 21 as WARNING. It shouldn't be.
3. org.apache.hadoop.hive.ql.stats.jdbc.JDBCStatsPublisher prints Warning. Invalid statistic. which looks fishy.
[jira] [Commented] (HIVE-2695) Add PRINTF() Udf
[ https://issues.apache.org/jira/browse/HIVE-2695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13184518#comment-13184518 ] Namit Jain commented on HIVE-2695: -- +1 Add PRINTF() Udf Key: HIVE-2695 URL: https://issues.apache.org/jira/browse/HIVE-2695 Project: Hive Issue Type: New Feature Components: UDF Reporter: Carl Steinbach Assignee: Zhenxiao Luo Attachments: HIVE-2695.D1155.1.patch, HIVE-2695.D1161.1.patch, HIVE-2695.D1173.1.patch
[jira] [Commented] (HIVE-2589) Newly created partition should inherit properties from table
[ https://issues.apache.org/jira/browse/HIVE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13184539#comment-13184539 ] Namit Jain commented on HIVE-2589: -- Sorry for the delay, looks good +1 Newly created partition should inherit properties from table Key: HIVE-2589 URL: https://issues.apache.org/jira/browse/HIVE-2589 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.9.0 Attachments: hive-2589.patch, hive-2589.patch, hive-2589_1.patch, hive-2589_2.patch This will make all the info contained in table properties available to partitions.
[jira] [Commented] (HIVE-2674) get_partitions_ps throws TApplicationException if table doesn't exist
[ https://issues.apache.org/jira/browse/HIVE-2674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13184546#comment-13184546 ] Namit Jain commented on HIVE-2674: -- The code changes look good, but there are a bunch of conflicts. Can you refresh ? get_partitions_ps throws TApplicationException if table doesn't exist - Key: HIVE-2674 URL: https://issues.apache.org/jira/browse/HIVE-2674 Project: Hive Issue Type: Bug Components: Metastore Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2674.D987.1.patch If the table passed to get_partitions_ps doesn't exist, an NPE is thrown by getPartitionPsQueryResults. There should be a check here, which throws a NoSuchObjectException if the table doesn't exist.
[jira] [Commented] (HIVE-2504) Warehouse table subdirectories should inherit the group permissions of the warehouse parent directory
[ https://issues.apache.org/jira/browse/HIVE-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13184550#comment-13184550 ] Namit Jain commented on HIVE-2504: -- +1 Warehouse table subdirectories should inherit the group permissions of the warehouse parent directory - Key: HIVE-2504 URL: https://issues.apache.org/jira/browse/HIVE-2504 Project: Hive Issue Type: Bug Components: Metastore Reporter: Carl Steinbach Assignee: Chinna Rao Lalam Attachments: HIVE-2504.patch When the Hive Metastore creates a subdirectory in the Hive warehouse for a new table it does so with the default HDFS permissions. Since the default dfs.umask value is 022, this means that the new subdirectory will not inherit the group write permissions of the hive warehouse directory. We should make the umask used by Warehouse.mkdirs() configurable, and set it to use a default value of 002.
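The difference between the two umask values in HIVE-2504 is plain bit arithmetic: the effective mode is the requested mode with the umask bits cleared. A small worked example, assuming a requested mode of 777 for the new directory (the requested mode is an assumption for illustration; the message above does not state it):

```java
public class UmaskSketch {
    // Effective permissions are the requested mode with the umask bits cleared.
    static int effectiveMode(int requested, int umask) {
        return requested & ~umask;
    }

    // True if the group-write bit (octal 020) is set in the mode.
    static boolean groupWritable(int mode) {
        return (mode & 020) != 0;
    }

    public static void main(String[] args) {
        int requested = 0777; // assumed requested mode, in octal
        int withDefault = effectiveMode(requested, 022);
        int withProposed = effectiveMode(requested, 002);
        System.out.println(Integer.toOctalString(withDefault) + " group-writable: " + groupWritable(withDefault));
        System.out.println(Integer.toOctalString(withProposed) + " group-writable: " + groupWritable(withProposed));
    }
}
```

With umask 022 the result is 755, which drops group write; with umask 002 the result is 775, which keeps it.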
[jira] [Commented] (HIVE-2621) Allow multiple group bys with the same input data and spray keys to be run on the same reducer.
[ https://issues.apache.org/jira/browse/HIVE-2621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174891#comment-13174891 ]
Namit Jain commented on HIVE-2621:

Let me take a look at the code again. But the general flow should be as follows:

if hive.multigroupby.singlereducer is true (which should always be), find common distincts. (or the check hive.multigroupby.singlereducer can be done inside find common distincts function itself)
if common distincts == null
  old (current) approach - map side aggr should be used
else:
  new code path

What do you think ? That way, we are guaranteed that the existing behavior is not changed. This new parameter only affects distincts, and it is very easy to turn it off.

I know the code is kind of messy here, but can you spend some time to modularize it, and reuse as much as possible ?

Allow multiple group bys with the same input data and spray keys to be run on the same reducer.
Key: HIVE-2621 URL: https://issues.apache.org/jira/browse/HIVE-2621 Project: Hive Issue Type: New Feature Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2621.1.patch.txt, HIVE-2621.D567.1.patch, HIVE-2621.D567.2.patch, HIVE-2621.D567.3.patch

Currently, when a user runs a query, such as a multi-insert, where each insertion subclause consists of a simple query followed by a group by, the group bys for each clause are run on a separate reducer. This requires writing the data for each group by clause to an intermediate file, and then reading it back. This uses a significant amount of the total CPU consumed by the query for an otherwise simple query.

If the subclauses are grouped by their distinct expressions and group by keys, with all of the group by expressions for a group of subclauses run on a single reducer, this would reduce the amount of reading/writing to intermediate files for some queries.

To do this, for each group of subclauses, in the mapper we would execute the filters for each subclause 'or'd together (provided each subclause has a filter) followed by a reduce sink. In the reducer, the child operators would be each subclause's filter followed by the group by and any subsequent operations.

Note that this would require turning off map aggregation, so we would need to make using this type of plan configurable.
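The plan described in HIVE-2621 (OR the subclause filters before the shuffle, then re-apply each subclause's own filter on the single reducer) can be sketched with plain Java predicates. This is an illustration of the idea only, under simplifying assumptions: rows are integers, the "group by" is a plain count, and all class and method names are hypothetical rather than Hive operators.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class SharedReducerSketch {
    // "Mapper" side: keep a row if any subclause filter accepts it.
    static List<Integer> mapSide(List<Integer> rows, List<Predicate<Integer>> filters) {
        Predicate<Integer> any = row -> false;
        for (Predicate<Integer> f : filters) {
            any = any.or(f);
        }
        List<Integer> shuffled = new ArrayList<>();
        for (Integer row : rows) {
            if (any.test(row)) {
                shuffled.add(row);
            }
        }
        return shuffled;
    }

    // "Reducer" side: each subclause re-applies its own filter, then aggregates (here: counts).
    static int[] reduceSide(List<Integer> shuffled, List<Predicate<Integer>> filters) {
        int[] counts = new int[filters.size()];
        for (int i = 0; i < filters.size(); i++) {
            for (Integer row : shuffled) {
                if (filters.get(i).test(row)) {
                    counts[i]++;
                }
            }
        }
        return counts;
    }

    // Rows 0..9 with two subclauses: "x < 3" and "x >= 7".
    static int[] demo() {
        List<Integer> rows = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
        List<Predicate<Integer>> filters = new ArrayList<>();
        filters.add(x -> x < 3);
        filters.add(x -> x >= 7);
        return reduceSide(mapSide(rows, filters), filters);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```

The input is read and shuffled once for both subclauses, which is the saving the description is after; rows matching no subclause are dropped before the shuffle.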
[jira] [Commented] (HIVE-2642) fix Hive-2566 and make union optimization more aggressive
[ https://issues.apache.org/jira/browse/HIVE-2642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13174211#comment-13174211 ] Namit Jain commented on HIVE-2642: -- 1 general comment about the new test union26.q -- Reduce the test output, I mean, you don't need to load all 500 rows for this test. It makes the test output really difficult to review. Again, all the above 3 are not blockers - I am still reviewing, I will file an enhancement for all the follow-ups. fix Hive-2566 and make union optimization more aggressive -- Key: HIVE-2642 URL: https://issues.apache.org/jira/browse/HIVE-2642 Project: Hive Issue Type: Improvement Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HIVE-2642.D735.1.patch Hive-2566 did some optimizations to union, but caused some problems and then got reverted. This is to get it back and fix the problems we saw, and also make union optimization more aggressive.
[jira] [Commented] (HIVE-2642) fix Hive-2566 and make union optimization more aggressive
[ https://issues.apache.org/jira/browse/HIVE-2642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172854#comment-13172854 ] Namit Jain commented on HIVE-2642: -- Look at union22.q.out. For a map-join followed by a union, an extra stage is introduced. We don't have to optimize this - just wanted to make sure it is intentional. fix Hive-2566 and make union optimization more aggressive -- Key: HIVE-2642 URL: https://issues.apache.org/jira/browse/HIVE-2642 Project: Hive Issue Type: Improvement Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HIVE-2642.D735.1.patch Hive-2566 did some optimizations to union, but caused some problems and then got reverted. This is to get it back and fix the problems we saw, and also make union optimization more aggressive.
[jira] [Commented] (HIVE-2660) Need better exception handling in RCFile tolerate corruptions mode
[ https://issues.apache.org/jira/browse/HIVE-2660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13170829#comment-13170829 ]
Namit Jain commented on HIVE-2660:

+1

Need better exception handling in RCFile tolerate corruptions mode
Key: HIVE-2660 URL: https://issues.apache.org/jira/browse/HIVE-2660 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Ramkumar Vadali Assignee: Ramkumar Vadali Priority: Minor Attachments: HIVE-2660.patch

The exception handling in nextKeyValueTolerateCorruptions treats IOException as follows:
- if EOFException, corrupt, can be tolerated
- If CheckSumException, corrupt, can be tolerated
- else not a corruption, re-throw

But the compression code can also throw IOException in case of corruption, which will get re-thrown in this case. The correct way of handling IOException is:
- if BlockMissingException, re-throw.
- if not BlockMissingException - corruption, can be tolerated
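The proposed rule in HIVE-2660 inverts the classification: instead of listing the exceptions that count as corruption, it lists the one that does not. A minimal sketch, assuming a local BlockMissingException class as a stand-in so the example compiles without Hadoop; the method names are hypothetical and this is not the code from HIVE-2660.patch.

```java
import java.io.EOFException;
import java.io.IOException;

public class CorruptionToleranceSketch {
    // Stand-in for HDFS's BlockMissingException, declared here only for the sketch.
    static class BlockMissingException extends IOException {
        BlockMissingException(String message) { super(message); }
    }

    // Proposed rule: everything except a missing block is treated as tolerable corruption.
    static boolean isTolerableCorruption(IOException e) {
        return !(e instanceof BlockMissingException);
    }

    // Re-throws a missing block; swallows anything classified as corruption.
    static void handle(IOException e) throws IOException {
        if (!isTolerableCorruption(e)) {
            throw e;
        }
        // Tolerated: the caller would skip the corrupt record and continue.
    }

    public static void main(String[] args) {
        System.out.println(isTolerableCorruption(new EOFException()));
        System.out.println(isTolerableCorruption(new IOException("bad compressed data")));
        System.out.println(isTolerableCorruption(new BlockMissingException("blk_1")));
    }
}
```

Under this rule a generic IOException raised by the compression code is tolerated, which is the case the older whitelist missed.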
[jira] [Commented] (HIVE-2617) Insert overwrite table db.tname fails if partition already exists
[ https://issues.apache.org/jira/browse/HIVE-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169655#comment-13169655 ] Namit Jain commented on HIVE-2617: -- +1 Insert overwrite table db.tname fails if partition already exists -- Key: HIVE-2617 URL: https://issues.apache.org/jira/browse/HIVE-2617 Project: Hive Issue Type: Bug Components: Metastore Reporter: Aniket Mokashi Assignee: Chinna Rao Lalam Attachments: HIVE-2617.1.patch, HIVE-2617.D843.1.patch, HIVE-2617.patch Insert Overwrite table db.tname fails if partition already exists. For example- insert overwrite table db.tname PARTITION(part='p') select .. from t2 where part='p'; fails if partition 'p' already exists. Workaround is - use db; and then fire the command. From the source code- alterPartition(tbl.getTableName(), new Partition(tbl, tpart)); takes String tablename as argument and loses db information. Table table = newTable(tablename) is called to retrieve table from name. But, it relies on currentDatabase value (hence the workaround).
[jira] [Commented] (HIVE-2602) add support for insert partition overwrite(...) if not exists
[ https://issues.apache.org/jira/browse/HIVE-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169978#comment-13169978 ] Namit Jain commented on HIVE-2602: -- 1. Make the tests deterministic. Add an order by when you are selecting 5 rows (limit 5). 2. Throw a semantic error for 'if not exists' for dynamic partitions - it might be confusing to document this behavior for dynamic partitions, so let us not allow it. add support for insert partition overwrite(...) if not exists - Key: HIVE-2602 URL: https://issues.apache.org/jira/browse/HIVE-2602 Project: Hive Issue Type: New Feature Reporter: Namit Jain Assignee: Chinna Rao Lalam Attachments: HIVE-2602.1.patch, HIVE-2602.2.patch, HIVE-2602.D579.1.patch, HIVE-2602.D879.1.patch, HIVE-2602.patch INSERT OVERWRITE TABLE X PARTITION (a=b, c=d) IF NOT EXISTS ... The partition should be created and written if and only if it's not there already. The support can be added for dynamic partitions in the future, but this jira is for adding this support for static partitions.
[jira] [Commented] (HIVE-2654) hive.querylog.location requires parent directory to be exist or else folder creation fails
[ https://issues.apache.org/jira/browse/HIVE-2654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169984#comment-13169984 ] Namit Jain commented on HIVE-2654: -- +1 hive.querylog.location requires parent directory to be exist or else folder creation fails Key: HIVE-2654 URL: https://issues.apache.org/jira/browse/HIVE-2654 Project: Hive Issue Type: Bug Components: Query Processor Environment: Hadoop 0.20.1, Hive0.9.0 and SUSE Linux Enterprise Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5). Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-2654.D885.1.patch, HIVE-2654.patch If the value of hive.querylog.location is '/tmp/root/hive123/test' and the parent directories do not exist, creation of the folder fails.
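The symptom in HIVE-2654 matches the difference between File.mkdir() and File.mkdirs() in the JDK: the former creates only the last path component, the latter also creates missing parents. A minimal sketch, assuming the directory was being created with a single-level call (the message above does not say which call Hive used); the class name and the temporary base directory are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class QueryLogDirSketch {
    // Returns {result of mkdir(), result of mkdirs()} for a nested path
    // whose parent directories do not exist yet.
    static boolean[] createNested() {
        try {
            File base = Files.createTempDirectory("querylog-sketch").toFile();
            File nested = new File(base, "root/hive123/test");
            boolean single = nested.mkdir();     // parents are missing
            boolean recursive = nested.mkdirs(); // creates parents as needed
            return new boolean[] { single, recursive };
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] results = createNested();
        System.out.println("mkdir: " + results[0] + ", mkdirs: " + results[1]);
    }
}
```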
[jira] [Commented] (HIVE-2589) Newly created partition should inherit properties from table
[ https://issues.apache.org/jira/browse/HIVE-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169985#comment-13169985 ] Namit Jain commented on HIVE-2589: -- @Ashutosh, sorry about the delay. Can you refresh ? I am getting some conflicts while applying the patch. I will take a look. Newly created partition should inherit properties from table Key: HIVE-2589 URL: https://issues.apache.org/jira/browse/HIVE-2589 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 0.9.0 Attachments: hive-2589.patch, hive-2589.patch, hive-2589_1.patch This will make all the info contained in table properties available to partitions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2640) Add alterPartition to AlterHandler interface
[ https://issues.apache.org/jira/browse/HIVE-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13168990#comment-13168990 ] Namit Jain commented on HIVE-2640: -- +1 Add alterPartition to AlterHandler interface Key: HIVE-2640 URL: https://issues.apache.org/jira/browse/HIVE-2640 Project: Hive Issue Type: Improvement Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2640.D699.1.patch, HIVE-2640.D699.2.patch Adding alterPartition to the AlterHandler interface would allow for customized functionality to be executed as part of altering a partition, much like it is already allowed for alterTable. Based on the name of the interface, and a comment in the AlterHandler code, it looks like alterPartition was meant to be included along with alterTable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2611) Make index table output of create index command if index is table based
[ https://issues.apache.org/jira/browse/HIVE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13168998#comment-13168998 ] Namit Jain commented on HIVE-2611: -- +1 Make index table output of create index command if index is table based --- Key: HIVE-2611 URL: https://issues.apache.org/jira/browse/HIVE-2611 Project: Hive Issue Type: Improvement Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2611.1.patch.txt, HIVE-2611.D705.1.patch, HIVE-2611.D705.2.patch, HIVE-2611.D705.3.patch If an index is table based, when that index is created a table is created to contain that index. This should be listed in the output of the command. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2602) add support for insert partition overwrite(...) if not exists
[ https://issues.apache.org/jira/browse/HIVE-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169000#comment-13169000 ] Namit Jain commented on HIVE-2602: -- I am getting a lot of merge conflicts. Can you refresh ? add support for insert partition overwrite(...) if not exists - Key: HIVE-2602 URL: https://issues.apache.org/jira/browse/HIVE-2602 Project: Hive Issue Type: New Feature Reporter: Namit Jain Assignee: Chinna Rao Lalam Attachments: HIVE-2602.1.patch, HIVE-2602.D579.1.patch, HIVE-2602.patch INSERT OVERWRITE TABLE X PARTITION (a=b, c=d) IF NOT EXISTS ... The partition should be created and written if and only if it's not there already. The support can be added for dynamic partitions in the future, but this jira is for adding this support for static partitions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2617) Insert overwrite table db.tname fails if partition already exists
[ https://issues.apache.org/jira/browse/HIVE-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13169103#comment-13169103 ]

Namit Jain commented on HIVE-2617:
----------------------------------

The code changes look good, but the test is failing for me. Can you reduce the testcase by selecting fewer rows from the table or creating new data files with fewer rows ?

Insert overwrite table db.tname fails if partition already exists
------------------------------------------------------------------

Key: HIVE-2617
URL: https://issues.apache.org/jira/browse/HIVE-2617
Project: Hive
Issue Type: Bug
Components: Metastore
Reporter: Aniket Mokashi
Assignee: Chinna Rao Lalam
Attachments: HIVE-2617.D843.1.patch, HIVE-2617.patch

Insert Overwrite table db.tname fails if partition already exists.
For example,
insert overwrite table db.tname PARTITION(part='p') select .. from t2 where part='p';
fails if partition 'p' already exists.
The workaround is to run
use db;
and then fire the command.
From the source code:
alterPartition(tbl.getTableName(), new Partition(tbl, tpart));
takes the String tablename as its argument and loses the db information.
Table table = newTable(tablename)
is called to retrieve the table from its name, but it relies on the currentDatabase value (hence the workaround).
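The failure mode described above, where the database qualifier is dropped before the table is looked up again, can be illustrated with a short Python sketch. `resolve_table` is a hypothetical function and is not part of Hive.

```python
def resolve_table(name, current_database):
    """Split an optionally qualified table name into (database, table)."""
    if "." in name:
        db, table = name.split(".", 1)
        return db, table
    # An unqualified name falls back to the session's current database.
    return current_database, name

# Passing the qualified name keeps the database.
qualified = resolve_table("db.tname", "default")

# Passing only the bare table name silently resolves against the
# current database, which is why "use db;" works around the bug.
bare = resolve_table("tname", "default")
```

Here `qualified` resolves to the `db` database, while `bare` resolves to `default`.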
[jira] [Commented] (HIVE-2611) Make index table output of create index command if index is table based
[ https://issues.apache.org/jira/browse/HIVE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166374#comment-13166374 ] Namit Jain commented on HIVE-2611: -- https://cwiki.apache.org/Hive/phabricatorcodereview.html Make index table output of create index command if index is table based --- Key: HIVE-2611 URL: https://issues.apache.org/jira/browse/HIVE-2611 Project: Hive Issue Type: Improvement Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2611.1.patch.txt If an index is table based, when that index is created a table is created to contain that index. This should be listed in the output of the command. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2611) Make index table output of create index command if index is table based
[ https://issues.apache.org/jira/browse/HIVE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166373#comment-13166373 ] Namit Jain commented on HIVE-2611: -- Kevin, can you use phabricator to submit a diff ? Make index table output of create index command if index is table based --- Key: HIVE-2611 URL: https://issues.apache.org/jira/browse/HIVE-2611 Project: Hive Issue Type: Improvement Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2611.1.patch.txt If an index is table based, when that index is created a table is created to contain that index. This should be listed in the output of the command. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2611) Make index table output of create index command if index is table based
[ https://issues.apache.org/jira/browse/HIVE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166376#comment-13166376 ] Namit Jain commented on HIVE-2611: -- +1 Make index table output of create index command if index is table based --- Key: HIVE-2611 URL: https://issues.apache.org/jira/browse/HIVE-2611 Project: Hive Issue Type: Improvement Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2611.1.patch.txt If an index is table based, when that index is created a table is created to contain that index. This should be listed in the output of the command. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2611) Make index table output of create index command if index is table based
[ https://issues.apache.org/jira/browse/HIVE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166470#comment-13166470 ] Namit Jain commented on HIVE-2611: -- Can you refresh ? I am getting some merge conflicts Make index table output of create index command if index is table based --- Key: HIVE-2611 URL: https://issues.apache.org/jira/browse/HIVE-2611 Project: Hive Issue Type: Improvement Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2611.1.patch.txt, HIVE-2611.D705.1.patch, HIVE-2611.D705.2.patch If an index is table based, when that index is created a table is created to contain that index. This should be listed in the output of the command. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2520) left semi join will duplicate data
[ https://issues.apache.org/jira/browse/HIVE-2520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166493#comment-13166493 ]

Namit Jain commented on HIVE-2520:
----------------------------------

+1

left semi join will duplicate data
----------------------------------

Key: HIVE-2520
URL: https://issues.apache.org/jira/browse/HIVE-2520
Project: Hive
Issue Type: Bug
Affects Versions: 0.7.0
Reporter: binlijin
Assignee: binlijin
Priority: Critical
Labels: patch
Attachments: HIVE-2520.D717.1.patch, hive-2520.2.patch, hive-2520.patch

CREATE TABLE sales (name STRING, id INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
CREATE TABLE things (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

The 'sales' table has data in a file, sales.txt, and the data is:
Joe 2
Hank 2

The 'things' table has data in two files, things.txt and things2.txt.
The content of things.txt is:
2 Tie
The content of things2.txt is:
2 Tie

SELECT * FROM sales LEFT SEMI JOIN things ON (sales.id = things.id);
will output:
Joe 2
Joe 2
Hank 2
Hank 2
so the result is wrong.

In CommonJoinOperator, left semi join should use
genObject(null, 0, new IntermediateObject(new ArrayList[numAliases], 0), true);
to generate data, but now it uses
genUniqueJoinObject(0, 0);
to generate data.
This patch will solve this problem.
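The expected result can be checked against a small Python model of the two join behaviors. This is a sketch of the semantics only, using the data from the report; the function names are hypothetical and do not correspond to Hive operators.

```python
def left_semi_join(left, right, left_key, right_key):
    """Emit each left row at most once if any right row matches."""
    right_keys = {right_key(r) for r in right}
    return [l for l in left if left_key(l) in right_keys]

def per_match_join(left, right, left_key, right_key):
    """Emit the left row once per matching right row (the reported bug)."""
    return [l for l in left for r in right if left_key(l) == right_key(r)]

sales = [("Joe", 2), ("Hank", 2)]
things = [(2, "Tie"), (2, "Tie")]   # things.txt and things2.txt combined

semi = left_semi_join(sales, things, lambda s: s[1], lambda t: t[0])
duplicated = per_match_join(sales, things, lambda s: s[1], lambda t: t[0])
```

`semi` contains Joe and Hank once each, while `duplicated` reproduces the four rows shown in the report.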
[jira] [Commented] (HIVE-2640) Add alterPartition to AlterHandler interface
[ https://issues.apache.org/jira/browse/HIVE-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166576#comment-13166576 ] Namit Jain commented on HIVE-2640: -- Can you refresh ? Applying the patch fails Add alterPartition to AlterHandler interface Key: HIVE-2640 URL: https://issues.apache.org/jira/browse/HIVE-2640 Project: Hive Issue Type: Improvement Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2640.D699.1.patch Adding alterPartition to the AlterHandler interface would allow for customized functionality to be executed as part of altering a partition, much like it is already allowed for alterTable. Based on the name of the interface, and a comment in the AlterHandler code, it looks like alterPartition was meant to be included along with alterTable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2628) move one line log from MapOperator to HiveContextAwareRecordReader
[ https://issues.apache.org/jira/browse/HIVE-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166581#comment-13166581 ] Namit Jain commented on HIVE-2628: -- +1 move one line log from MapOperator to HiveContextAwareRecordReader -- Key: HIVE-2628 URL: https://issues.apache.org/jira/browse/HIVE-2628 Project: Hive Issue Type: Improvement Reporter: He Yongqiang Assignee: He Yongqiang Attachments: HIVE-2628.D615.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2526) lastAceesTime is always zero when executed through describe extended table_name unlike show table extende like table_name where lastAccessTime is updated.
[ https://issues.apache.org/jira/browse/HIVE-2526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166588#comment-13166588 ]

Namit Jain commented on HIVE-2526:
----------------------------------

If we want to maintain the last access time, it should be at the metadata level (in the metastore), and not computed dynamically.
Also, if you want to maintain the last access time, it should be done for both hive access and metastore access.

lastAceesTime is always zero when executed through describe extended table_name unlike show table extende like table_name where lastAccessTime is updated.
---------------------------------------------------------------------------------------------------------------------------------------------------------

Key: HIVE-2526
URL: https://issues.apache.org/jira/browse/HIVE-2526
Project: Hive
Issue Type: Bug
Affects Versions: 0.9.0
Environment: Linux : SuSE 11 SP1
Reporter: rohithsharma
Assignee: Priyadarshini
Priority: Minor
Attachments: HIVE-2526.patch

When the table is accessed (load), "show table extended like table_name" displays the updated lastAccessTime, but "describe extended table_name" always displays zero.
[jira] [Commented] (HIVE-1996) LOAD DATA INPATH fails when the table already contains a file of the same name
[ https://issues.apache.org/jira/browse/HIVE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13166594#comment-13166594 ]

Namit Jain commented on HIVE-1996:
----------------------------------

Yongqiang, can you take a look ?

LOAD DATA INPATH fails when the table already contains a file of the same name
-------------------------------------------------------------------------------

Key: HIVE-1996
URL: https://issues.apache.org/jira/browse/HIVE-1996
Project: Hive
Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Kirk True
Assignee: Chinna Rao Lalam
Attachments: HIVE-1996.1.Patch, HIVE-1996.2.Patch, HIVE-1996.Patch

Steps:
1. From the command line copy the kv2.txt data file into the current user's HDFS directory:
{{$ hadoop fs -copyFromLocal /path/to/hive/sources/data/files/kv2.txt kv2.txt}}
2. In Hive, create the table:
{{create table tst_src1 (key_ int, value_ string);}}
3. Load the data into the table from HDFS:
{{load data inpath './kv2.txt' into table tst_src1;}}
4. Repeat step 1
5. Repeat step 3

Expected:
To have kv2.txt renamed in HDFS and then copied to the destination as per HIVE-307.

Actual:
File is renamed, but {{Hive.copyFiles}} doesn't see the change in {{srcs}} as it continues to use the same array elements (with the un-renamed, old file names). It crashes with this error:
{noformat}
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:1725)
    at org.apache.hadoop.hive.ql.metadata.Table.copyFiles(Table.java:541)
    at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1173)
    at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:197)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1060)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:897)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:745)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
{noformat}
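The expected behavior, where a clashing source file is renamed and the copy step then works from the renamed list rather than the stale one, might be sketched in Python as below. `rename_conflicts` and the `_copy_N` suffix are assumptions made for illustration; they are not taken from the Hive source.

```python
def rename_conflicts(src_names, existing):
    """Return the names to copy, renamed wherever they clash.

    The caller must use the returned list for the copy; reusing
    src_names would refer to files that no longer carry those names.
    """
    taken = set(existing)
    result = []
    for name in src_names:
        candidate = name
        n = 1
        while candidate in taken:
            stem, dot, ext = name.rpartition(".")
            # Hypothetical naming scheme: insert _copy_N before the extension.
            candidate = f"{stem}_copy_{n}{dot}{ext}" if dot else f"{name}_copy_{n}"
            n += 1
        taken.add(candidate)
        result.append(candidate)
    return result
```

Loading kv2.txt into a table that already holds kv2.txt would then copy a file named kv2_copy_1.txt under this assumed scheme.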
[jira] [Commented] (HIVE-2602) add support for insert partition overwrite(...) if not exists
[ https://issues.apache.org/jira/browse/HIVE-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13165358#comment-13165358 ] Namit Jain commented on HIVE-2602: -- Sorry about the delay on my part for review add support for insert partition overwrite(...) if not exists - Key: HIVE-2602 URL: https://issues.apache.org/jira/browse/HIVE-2602 Project: Hive Issue Type: New Feature Reporter: Namit Jain Assignee: Chinna Rao Lalam Attachments: HIVE-2602.D579.1.patch, HIVE-2602.patch INSERT OVERWRITE TABLE X PARTITION (a=b, c=d) IF NOT EXISTS ... The partition should be created and written if and only if it's not there already. The support can be added for dynamic partitions in the future, but this jira is for adding this support for static partitions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2332) If all of the parameters of distinct functions are exists in group by columns, query fails in runtime
[ https://issues.apache.org/jira/browse/HIVE-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13165374#comment-13165374 ]

Namit Jain commented on HIVE-2332:
----------------------------------

Navis, please use the following instructions in future for patches:
https://cwiki.apache.org/Hive/phabricatorcodereview.html
We are planning to move to phabricator from review board.
I have already done this for this jira. Will review it soon.

If all of the parameters of distinct functions are exists in group by columns, query fails in runtime
------------------------------------------------------------------------------------------------------

Key: HIVE-2332
URL: https://issues.apache.org/jira/browse/HIVE-2332
Project: Hive
Issue Type: Bug
Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Critical
Fix For: 0.9.0
Attachments: HIVE-2332.1.patch.txt, HIVE-2332.2.patch.txt, HIVE-2332.D663.1.patch

select sum(key_int1), sum(distinct key_int1) from t1 group by key_int1;
fails with message..
{code}
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
{code}
hadoop says..
{code}
Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
    at java.util.ArrayList.RangeCheck(ArrayList.java:547)
    at java.util.ArrayList.get(ArrayList.java:322)
    at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:95)
    at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:86)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:252)
    at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.initEvaluatorsAndReturnStruct(ReduceSinkOperator.java:188)
    at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:197)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:744)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:85)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:744)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:532)
{code}
I think the problem is that there are fewer key expressions than key columns; the number of key expressions should be equal or greater.
Would it be solved by adding some key expressions? I'll try.
[jira] [Commented] (HIVE-2329) Not using map aggregation, fails to execute group-by after cluster-by with same key
[ https://issues.apache.org/jira/browse/HIVE-2329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13165871#comment-13165871 ]

Namit Jain commented on HIVE-2329:
----------------------------------

+1

Not using map aggregation, fails to execute group-by after cluster-by with same key
------------------------------------------------------------------------------------

Key: HIVE-2329
URL: https://issues.apache.org/jira/browse/HIVE-2329
Project: Hive
Issue Type: Bug
Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
Fix For: 0.9.0
Attachments: HIVE-2329.1.patch.txt, HIVE-2329.D657.1.patch

hive.map.aggr=false
select Q1.key_int1, sum(Q1.key_int1), sum(distinct Q1.key_int1) from (select * from t1 cluster by key_int1) Q1 group by Q1.key_int1
resulted..
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
from hadoop logs..
Caused by: java.lang.RuntimeException: cannot find field key from []
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:321)
    at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:119)
    at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:82)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:198)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:433)
I think the problem is caused by ReduceSinkDeDuplication, removing RS which was providing rs.key for GBY operation. If child of child RS is a GBY, we should bypass the optimization.
[jira] [Commented] (HIVE-2246) Dedupe tables' column schemas from partitions in the metastore db
[ https://issues.apache.org/jira/browse/HIVE-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13161747#comment-13161747 ]

Namit Jain commented on HIVE-2246:
----------------------------------

Note that there is a bug in the upgrade script.
After running this script, the column information for all the partitions is lost.
They all inherit the columns from the table definition.
It is not a serious problem, as the partition column information is not really used by Hive.
The only command whose results will change is:
describe table T partition P;

Dedupe tables' column schemas from partitions in the metastore db
-----------------------------------------------------------------

Key: HIVE-2246
URL: https://issues.apache.org/jira/browse/HIVE-2246
Project: Hive
Issue Type: Improvement
Components: Metastore
Reporter: Sohan Jain
Assignee: Sohan Jain
Fix For: 0.8.0
Attachments: HIVE-2246.2.patch, HIVE-2246.3.patch, HIVE-2246.4.patch, HIVE-2246.8.patch

Note: this patch proposes a schema change, and is therefore incompatible with the current metastore.
We can re-organize the JDO models to reduce space usage to keep the metastore scalable for the future. Currently, partitions are the fastest growing objects in the metastore, and the metastore keeps a separate copy of the columns list for each partition. We can normalize the metastore db by decoupling Columns from Storage Descriptors and not storing duplicate lists of the columns for each partition.
An idea is to create an additional level of indirection with a Column Descriptor that has a list of columns. A table has a reference to its latest Column Descriptor (note: a table may have more than one Column Descriptor in the case of schema evolution). Partitions and Indexes can reference the same Column Descriptors as their parent table.
Currently, the COLUMNS table in the metastore has roughly (number of partitions + number of tables) * (average number of columns per table) rows. We can reduce this to (number of tables) * (average number of columns per table) rows, while incurring a small cost proportional to the number of tables to store the Column Descriptors.
Please see the latest review board for additional implementation details.
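The row-count estimate above can be worked through with made-up numbers. The figures below (1,000 tables, 1,000,000 partitions, 20 columns per table) are purely hypothetical, and the sketch ignores the small per-table cost of the Column Descriptors and any extra descriptors created by schema evolution.

```python
def columns_rows_before(num_tables, num_partitions, avg_columns):
    """Rows in COLUMNS when every partition keeps its own column list."""
    return (num_partitions + num_tables) * avg_columns

def columns_rows_after(num_tables, avg_columns):
    """Rows in COLUMNS when partitions share their table's column list."""
    return num_tables * avg_columns

# Hypothetical metastore size, chosen only to show the scale of the change.
before = columns_rows_before(1_000, 1_000_000, 20)
after = columns_rows_after(1_000, 20)
```

Under these assumed numbers the count drops from 20,020,000 rows to 20,000 rows, since the partition term dominates the original estimate.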
[jira] [Commented] (HIVE-2602) add support for insert partition overwrite(...) if not exists
[ https://issues.apache.org/jira/browse/HIVE-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13161457#comment-13161457 ] Namit Jain commented on HIVE-2602: -- https://reviews.facebook.net/D579 Thanks Chinna. I will take a look add support for insert partition overwrite(...) if not exists - Key: HIVE-2602 URL: https://issues.apache.org/jira/browse/HIVE-2602 Project: Hive Issue Type: New Feature Reporter: Namit Jain Assignee: Chinna Rao Lalam Attachments: HIVE-2602.D579.1.patch, HIVE-2602.patch INSERT OVERWRITE TABLE X PARTITION (a=b, c=d) IF NOT EXISTS ... The partition should be created and written if and only if it's not there already. The support can be added for dynamic partitions in the future, but this jira is for adding this support for static partitions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2619) Add hook to run in meatastore's endFunction which can collect more fb303 counters
[ https://issues.apache.org/jira/browse/HIVE-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13160651#comment-13160651 ] Namit Jain commented on HIVE-2619: -- https://reviews.facebook.net/D561 Add hook to run in meatastore's endFunction which can collect more fb303 counters - Key: HIVE-2619 URL: https://issues.apache.org/jira/browse/HIVE-2619 Project: Hive Issue Type: Improvement Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2619.1.patch.txt Create the potential for hooks to run in the endFunction method of HMSHandler which take the name of a function and whether or not it succeeded. Also, override getCounters from fb303 to allow these hooks to add counters which they collect, should this be desired. These hooks can be similar to EventListeners, but they should be more generic. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2619) Add hook to run in meatastore's endFunction which can collect more fb303 counters
[ https://issues.apache.org/jira/browse/HIVE-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13160660#comment-13160660 ]

Namit Jain commented on HIVE-2619:
----------------------------------

I thought so too, but it did not for some reason.
John, do you know ?
I did:
ant arc-setup
arc diff --jira HIVE-2619

Add hook to run in meatastore's endFunction which can collect more fb303 counters
----------------------------------------------------------------------------------

Key: HIVE-2619
URL: https://issues.apache.org/jira/browse/HIVE-2619
Project: Hive
Issue Type: Improvement
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Attachments: HIVE-2619.1.patch.txt

Create the potential for hooks to run in the endFunction method of HMSHandler which take the name of a function and whether or not it succeeded. Also, override getCounters from fb303 to allow these hooks to add counters which they collect, should this be desired. These hooks can be similar to EventListeners, but they should be more generic.
[jira] [Commented] (HIVE-2618) Describe partition returns table columns but should return partition columns
[ https://issues.apache.org/jira/browse/HIVE-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13160661#comment-13160661 ]

Namit Jain commented on HIVE-2618:
----------------------------------

https://reviews.facebook.net/D105

Describe partition returns table columns but should return partition columns
-----------------------------------------------------------------------------

Key: HIVE-2618
URL: https://issues.apache.org/jira/browse/HIVE-2618
Project: Hive
Issue Type: Bug
Reporter: Kevin Wilfong
Assignee: Namit Jain

If a partitioned table and some partitions are created, and the table is then altered to add columns, calling describe on the partitions created before the columns were added shows the new columns, even though it should not. In particular, in the metastore, the partition will not have these columns.
[jira] [Commented] (HIVE-2619) Add hook to run in meatastore's endFunction which can collect more fb303 counters
[ https://issues.apache.org/jira/browse/HIVE-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13160663#comment-13160663 ] Namit Jain commented on HIVE-2619: -- It did show up for me - a little delayed. I should have been more patient. Add hook to run in meatastore's endFunction which can collect more fb303 counters - Key: HIVE-2619 URL: https://issues.apache.org/jira/browse/HIVE-2619 Project: Hive Issue Type: Improvement Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2619.1.patch.txt Create the potential for hooks to run in the endFunction method of HMSHandler which take the name of a function and whether or not it succeeded. Also, override getCounters from fb303 to allow these hooks to add counters which they collect, should this be desired. These hooks can be similar to EventListeners, but they should be more generic. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2605) Setting no_drop on a table should cascade to child partitions
[ https://issues.apache.org/jira/browse/HIVE-2605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13156352#comment-13156352 ] Namit Jain commented on HIVE-2605: -- The semantics of alter table T enable no_drop cascade; are as follows: if a table is marked as no_drop cascade, neither the table nor any of its partitions can be dropped. Setting no_drop on a table should cascade to child partitions - Key: HIVE-2605 URL: https://issues.apache.org/jira/browse/HIVE-2605 Project: Hive Issue Type: Improvement Reporter: Namit Jain Assignee: Namit Jain Attachments: HIVE-2605.D525.1.patch When NO_DROP is set on a table, it does not cascade to the partitions of the table. There should be an option to do so.
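The no_drop cascade semantics described in this comment can be sketched roughly as follows. This is a minimal in-memory model, not Hive's metastore code: the class NoDropCascadeSketch and all of its methods are hypothetical names introduced only for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory model of a table; these are NOT Hive's metastore classes.
final class NoDropCascadeSketch {
    private final Set<String> partitions = new HashSet<>();
    private boolean noDrop;
    private boolean cascade;

    void addPartition(String spec) {
        partitions.add(spec);
    }

    // Models "alter table T enable no_drop [cascade]".
    void enableNoDrop(boolean cascadeToPartitions) {
        noDrop = true;
        cascade = cascadeToPartitions;
    }

    // A table marked no_drop can never be dropped, with or without cascade.
    boolean canDropTable() {
        return !noDrop;
    }

    // Without cascade, the table-level flag does not protect the partitions.
    boolean canDropPartition(String spec) {
        return partitions.contains(spec) && !(noDrop && cascade);
    }
}
```

With plain no_drop only the table is protected; with cascade the same flag also blocks dropping any existing partition.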
[jira] [Commented] (HIVE-2568) HIVE-2246 upgrade script needs to drop foreign key in COLUMNS_OLD
[ https://issues.apache.org/jira/browse/HIVE-2568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13147973#comment-13147973 ] Namit Jain commented on HIVE-2568: -- +1 running tests HIVE-2246 upgrade script needs to drop foreign key in COLUMNS_OLD - Key: HIVE-2568 URL: https://issues.apache.org/jira/browse/HIVE-2568 Project: Hive Issue Type: Bug Affects Versions: 0.8.0 Reporter: Ning Zhang Assignee: Ning Zhang Fix For: 0.8.0, 0.9.0 Attachments: D369.1.patch One more bug in the MySQL metastore upgrade script: the foreign key in COLUMNS needs to be dropped, otherwise drop_partition will fail because the SDS row cannot be deleted due to the foreign key constraint. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2562) HIVE-2247 Changed the Thrift API causing compatibility issues.
[ https://issues.apache.org/jira/browse/HIVE-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13147197#comment-13147197 ] Namit Jain commented on HIVE-2562: -- Remove the files /tmp/*9000 from your machine. load_fs.q will succeed after that HIVE-2247 Changed the Thrift API causing compatibility issues. -- Key: HIVE-2562 URL: https://issues.apache.org/jira/browse/HIVE-2562 Project: Hive Issue Type: Bug Reporter: Kevin Wilfong Assignee: Weiyan Wang Attachments: HIVE-2562.patch HIVE-2247 Added a parameter to alter_partition in the Metastore Thrift API which has been causing compatibility issues with some scripts. We would like to change this to have two methods, one called alter_partition which takes the old parameters, and one called something else (I'll leave the naming up to you) which has the new parameters. The implementation of the old method should just call the new method with null for the new parameter. This will fix the compatibility issues. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
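The compatibility fix requested in the description (keep the old alter_partition signature and have it delegate to a new method, passing null for the new parameter) might look roughly like the sketch below. Only the name alter_partition comes from the issue; the class, the name alter_partition_with_extra, and the String parameter types are assumptions made for illustration and are not the actual Thrift-generated API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the metastore handler; not the Thrift-generated interface.
final class PartitionApiSketch {
    // Records each call so the delegation can be observed.
    final List<String> calls = new ArrayList<>();

    // Old signature, kept unchanged so existing clients and scripts keep working.
    void alter_partition(String db, String table, String newPartition) {
        // Delegate to the new method, passing null for the added parameter.
        alter_partition_with_extra(db, table, newPartition, null);
    }

    // New method carrying the additional parameter (name and type are assumed).
    void alter_partition_with_extra(String db, String table, String newPartition, String extra) {
        String shown = (extra == null) ? "<none>" : extra;
        calls.add(db + "." + table + ":" + newPartition + ":" + shown);
    }
}
```

Old callers compile and behave as before, while new callers can opt in to the extra argument.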
[jira] [Commented] (HIVE-1237) select coalesce(null) from src dies
[ https://issues.apache.org/jira/browse/HIVE-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145655#comment-13145655 ] Namit Jain commented on HIVE-1237: -- Actually, coalesce returns first non-null argument. With your patch, coalesce('x') will error out, which is not correct select coalesce(null) from src dies --- Key: HIVE-1237 URL: https://issues.apache.org/jira/browse/HIVE-1237 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.7.0, 0.8.0 Reporter: Namit Jain Assignee: Ashutosh Chauhan Attachments: hive-1237.patch, hive-1237_1.patch select coalesce(null) from src ; FAILED: Unknown exception: null -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
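For reference, the behaviour described in the comment (coalesce returns its first non-null argument, so a single non-null argument is valid and a lone null should yield null rather than an error) can be sketched as below. CoalesceSketch is a hypothetical helper written for this note; it is not Hive's UDF implementation.

```java
// Hypothetical helper illustrating coalesce semantics; not Hive's GenericUDF code.
final class CoalesceSketch {
    // Returns the first non-null argument, or null when every argument is null.
    static Object coalesce(Object... args) {
        if (args == null) {
            return null;
        }
        for (Object arg : args) {
            if (arg != null) {
                return arg;
            }
        }
        return null;
    }
}
```

Under this reading, coalesce('x') evaluates to 'x' and coalesce(null) evaluates to null; neither is an error.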
[jira] [Commented] (HIVE-2466) mapjoin_subquery dump small table (mapjoin table) to the same file
[ https://issues.apache.org/jira/browse/HIVE-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145670#comment-13145670 ] Namit Jain commented on HIVE-2466: -- +1 mapjoin_subquery dump small table (mapjoin table) to the same file --- Key: HIVE-2466 URL: https://issues.apache.org/jira/browse/HIVE-2466 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.7.1 Reporter: binlijin Assignee: binlijin Priority: Critical Attachments: D285.1.patch, D285.2.patch, hive-2466.1.patch, hive-2466.2.patch, hive-2466.3.patch, hive-2466.4.patch in mapjoin_subquery.q there is a query: SELECT /*+ MAPJOIN(z) */ subq.key1, z.value FROM (SELECT /*+ MAPJOIN(x) */ x.key as key1, x.value as value1, y.key as key2, y.value as value2 FROM src1 x JOIN src y ON (x.key = y.key)) subq JOIN srcpart z ON (subq.key1 = z.key and z.ds='2008-04-08' and z.hr=11); when dump x and z to a local file,there all dump to the same file, so we lost the data of x -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2466) mapjoin_subquery dump small table (mapjoin table) to the same file
[ https://issues.apache.org/jira/browse/HIVE-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145214#comment-13145214 ] Namit Jain commented on HIVE-2466: -- [javac] Compiling 680 source files to /data/users/njain/hive_commit2/build/ql/classes [javac] /data/users/njain/hive_commit2/ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java:191: cannot find symbol [javac] symbol : method generatePath(java.lang.String,java.lang.Byte,java.lang.String) [javac] location: class org.apache.hadoop.hive.ql.exec.Utilities [javac] String filePath = Utilities.generatePath(baseDir, pos, currentFileName); Can you refresh ? I am getting the above error in compiling. Also, create a arc diff entry for helping reviewing. mapjoin_subquery dump small table (mapjoin table) to the same file --- Key: HIVE-2466 URL: https://issues.apache.org/jira/browse/HIVE-2466 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.7.1 Reporter: binlijin Assignee: binlijin Priority: Blocker Attachments: D285.1.patch, hive-2466.1.patch, hive-2466.2.patch, hive-2466.3.patch in mapjoin_subquery.q there is a query: SELECT /*+ MAPJOIN(z) */ subq.key1, z.value FROM (SELECT /*+ MAPJOIN(x) */ x.key as key1, x.value as value1, y.key as key2, y.value as value2 FROM src1 x JOIN src y ON (x.key = y.key)) subq JOIN srcpart z ON (subq.key1 = z.key and z.ds='2008-04-08' and z.hr=11); when dump x and z to a local file,there all dump to the same file, so we lost the data of x -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-1237) select coalesce(null) from src dies
[ https://issues.apache.org/jira/browse/HIVE-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145216#comment-13145216 ] Namit Jain commented on HIVE-1237: -- +1 running tests select coalesce(null) from src dies --- Key: HIVE-1237 URL: https://issues.apache.org/jira/browse/HIVE-1237 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.7.0, 0.8.0 Reporter: Namit Jain Assignee: Ashutosh Chauhan Attachments: hive-1237.patch, hive-1237_1.patch select coalesce(null) from src ; FAILED: Unknown exception: null -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2196) Ensure HiveConf includes all properties defined in hive-default.xml
[ https://issues.apache.org/jira/browse/HIVE-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145219#comment-13145219 ] Namit Jain commented on HIVE-2196: -- @Chinna, this slipped through the cracks and is now generating a lot of merge conflicts. I think this is a really good idea - Can you refresh ? I will definitely go over it this time and try to get it in. Ensure HiveConf includes all properties defined in hive-default.xml --- Key: HIVE-2196 URL: https://issues.apache.org/jira/browse/HIVE-2196 Project: Hive Issue Type: Bug Components: Configuration Affects Versions: 0.8.0 Reporter: Carl Steinbach Assignee: Chinna Rao Lalam Attachments: HIVE-2196.1.patch, HIVE-2196.2.patch, HIVE-2196.3.patch, HIVE-2196.4.patch, HIVE-2196.5.patch, HIVE-2196.build.log, HIVE-2196.patch There are a bunch of properties that are defined in hive-default.xml but not in HiveConf.
[jira] [Commented] (HIVE-2178) Log related Check style Comments fixes
[ https://issues.apache.org/jira/browse/HIVE-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145223#comment-13145223 ] Namit Jain commented on HIVE-2178: -- +1 Log related Check style Comments fixes -- Key: HIVE-2178 URL: https://issues.apache.org/jira/browse/HIVE-2178 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.5.0, 0.8.0 Environment: Hadoop 0.20.1, Hive0.8.0 and SUSE Linux Enterprise Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5) Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: D291.1.patch, HIVE-2178.1.patch, HIVE-2178.2.patch, HIVE-2178.3.patch, HIVE-2178.4.patch, HIVE-2178.5.patch, HIVE-2178.patch Fix Log related Check style Comments -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2017) Driver.execute() should maintaining SessionState in case of runtime errors
[ https://issues.apache.org/jira/browse/HIVE-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13145224#comment-13145224 ] Namit Jain commented on HIVE-2017: -- @Chinna, can you refresh it again ? This is leading to conflicts - I will look at it Driver.execute() should maintaining SessionState in case of runtime errors -- Key: HIVE-2017 URL: https://issues.apache.org/jira/browse/HIVE-2017 Project: Hive Issue Type: Bug Reporter: Ning Zhang Assignee: Chinna Rao Lalam Attachments: HIVE-2017.1.patch, HIVE-2017.2.patch, HIVE-2017.3.patch, HIVE-2017.4.patch Here's a snippet from Driver.execute(): {code} // TODO: This error messaging is not very informative. Fix that. errorMessage = "FAILED: Execution Error, return code " + exitVal + " from " + tsk.getClass().getName(); SQLState = "08S01"; console.printError(errorMessage); if (running.size() != 0) { taskCleanup(); } return 9; {code} It simply returns in case of runtime errors without maintaining SessionState, which could cause the resource leak mentioned in HIVE-1959.
[jira] [Commented] (HIVE-2543) Compact index table's files merged in creation
[ https://issues.apache.org/jira/browse/HIVE-2543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13141832#comment-13141832 ] Namit Jain commented on HIVE-2543: -- +1 Compact index table's files merged in creation -- Key: HIVE-2543 URL: https://issues.apache.org/jira/browse/HIVE-2543 Project: Hive Issue Type: Bug Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-2543.1.patch.txt When a compact index is built there is the possibility of a merge task at the end of the task tree. If this happens, the index table's files will no longer be sorted. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2278) Support archiving for multiple partitions if the table is partitioned by multiple columns
[ https://issues.apache.org/jira/browse/HIVE-2278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13140458#comment-13140458 ] Namit Jain commented on HIVE-2278: -- ? ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java.orig ? ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java.orig ? ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java.orig Getting merge conflicts, can you resolve them and resubmit the patch. Support archiving for multiple partitions if the table is partitioned by multiple columns - Key: HIVE-2278 URL: https://issues.apache.org/jira/browse/HIVE-2278 Project: Hive Issue Type: New Feature Reporter: Namit Jain Assignee: Marcin Kurczych Attachments: HIVE-2278.2.patch, HIVE-2278.3.patch, HIVE-2278.4.patch, HIVE-2278.5.patch, HIVE-2278.5.patch, HIVE-2278.6.patch, HIVE-2278.6.patch, HIVE-2278.7.patch, HIVE-2278.8.patch, HIVE-2278.9.patch, archive_corrupt.rc, hive.2278.1.patch If a table is partitioned by ds,hr it should be possible to archive all the files in ds to reduce the number of files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2278) Support archiving for multiple partitions if the table is partitioned by multiple columns
[ https://issues.apache.org/jira/browse/HIVE-2278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13140902#comment-13140902 ] Namit Jain commented on HIVE-2278: -- The following tests failed: TestMetaStoreEventListener TestRemoteHiveMetaStore TestCliDriver: archive_corrupt archive_multi authorization_2 create_view_partitioned drop_multi_partitions escape1 inputddl6 Support archiving for multiple partitions if the table is partitioned by multiple columns - Key: HIVE-2278 URL: https://issues.apache.org/jira/browse/HIVE-2278 Project: Hive Issue Type: New Feature Reporter: Namit Jain Assignee: Marcin Kurczych Attachments: HIVE-2278.10.patch, HIVE-2278.2.patch, HIVE-2278.3.patch, HIVE-2278.4.patch, HIVE-2278.5.patch, HIVE-2278.5.patch, HIVE-2278.6.patch, HIVE-2278.6.patch, HIVE-2278.7.patch, HIVE-2278.8.patch, HIVE-2278.9.patch, archive_corrupt.rc, hive.2278.1.patch If a table is partitioned by ds,hr it should be possible to archive all the files in ds to reduce the number of files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2278) Support archiving for multiple partitions if the table is partitioned by multiple columns
[ https://issues.apache.org/jira/browse/HIVE-2278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13139904#comment-13139904 ] Namit Jain commented on HIVE-2278: -- +1 looks good - will get it in once HIVE-1003 is in, dont want to conflict 2 big patches Support archiving for multiple partitions if the table is partitioned by multiple columns - Key: HIVE-2278 URL: https://issues.apache.org/jira/browse/HIVE-2278 Project: Hive Issue Type: New Feature Reporter: Namit Jain Assignee: Marcin Kurczych Attachments: HIVE-2278.2.patch, HIVE-2278.3.patch, HIVE-2278.4.patch, HIVE-2278.5.patch, HIVE-2278.5.patch, HIVE-2278.6.patch, HIVE-2278.6.patch, HIVE-2278.7.patch, HIVE-2278.8.patch, HIVE-2278.9.patch, archive_corrupt.rc, hive.2278.1.patch If a table is partitioned by ds,hr it should be possible to archive all the files in ds to reduce the number of files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-1003) optimize metadata only queries
[ https://issues.apache.org/jira/browse/HIVE-1003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13138515#comment-13138515 ] Namit Jain commented on HIVE-1003: -- addressed comments optimize metadata only queries -- Key: HIVE-1003 URL: https://issues.apache.org/jira/browse/HIVE-1003 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Marcin Kurczych Attachments: D105.1.patch, HIVE-1003.1.patch, hive.1003.2.patch, hive.1003.3.patch, hive.1003.4.patch Queries like: select max(ds) from T where ds is a partitioning column should be optimized. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
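The idea behind this optimization is that an aggregate over a partitioning column can, in principle, be answered from the list of partition values held in the metastore, without scanning any data files. A minimal sketch under that assumption follows; MetadataOnlyMaxSketch and maxPartitionValue are hypothetical names, and the partition values are passed in as a plain list instead of being fetched from a metastore.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; the partition values would really come from the metastore.
final class MetadataOnlyMaxSketch {
    // Answers "select max(ds) from T" using only partition metadata.
    // Returns null when the table has no partitions.
    static String maxPartitionValue(List<String> partitionValues) {
        if (partitionValues.isEmpty()) {
            return null;
        }
        return Collections.max(partitionValues);
    }
}
```

One caveat of this simplification: a partition that exists in the metadata but holds no rows would still contribute its value here, whereas a full scan would ignore it.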
[jira] [Commented] (HIVE-2533) test load_fs.q failing
[ https://issues.apache.org/jira/browse/HIVE-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13138518#comment-13138518 ] Namit Jain commented on HIVE-2533: -- It is failing on trunk test load_fs.q failing -- Key: HIVE-2533 URL: https://issues.apache.org/jira/browse/HIVE-2533 Project: Hive Issue Type: Bug Reporter: Namit Jain -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-1975) insert overwrite directory Not able to insert data with multi level directory path
[ https://issues.apache.org/jira/browse/HIVE-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13138537#comment-13138537 ] Namit Jain commented on HIVE-1975: -- +1 insert overwrite directory Not able to insert data with multi level directory path Key: HIVE-1975 URL: https://issues.apache.org/jira/browse/HIVE-1975 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.5.0 Environment: Hadoop 0.20.1, Hive0.5.0 and SUSE Linux Enterprise Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5). Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-1975.1.patch, HIVE-1975.2.patch, HIVE-1975.3.patch, HIVE-1975.patch Below query execution is failed Ex: {noformat} insert overwrite directory '/HIVEFT25686/chinna/' select * from dept_j; {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2278) Support archiving for multiple partitions if the table is partitioned by multiple columns
[ https://issues.apache.org/jira/browse/HIVE-2278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13139039#comment-13139039 ] Namit Jain commented on HIVE-2278: -- commented on the review board - looks good otherwise Support archiving for multiple partitions if the table is partitioned by multiple columns - Key: HIVE-2278 URL: https://issues.apache.org/jira/browse/HIVE-2278 Project: Hive Issue Type: New Feature Reporter: Namit Jain Assignee: Marcin Kurczych Attachments: HIVE-2278.2.patch, HIVE-2278.3.patch, HIVE-2278.4.patch, HIVE-2278.5.patch, HIVE-2278.5.patch, HIVE-2278.6.patch, HIVE-2278.6.patch, HIVE-2278.7.patch, HIVE-2278.8.patch, archive_corrupt.rc, hive.2278.1.patch If a table is partitioned by ds,hr it should be possible to archive all the files in ds to reduce the number of files -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-1461) Clean up references to 'hive.metastore.local'
[ https://issues.apache.org/jira/browse/HIVE-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135221#comment-13135221 ] Namit Jain commented on HIVE-1461: -- +1 Clean up references to 'hive.metastore.local' - Key: HIVE-1461 URL: https://issues.apache.org/jira/browse/HIVE-1461 Project: Hive Issue Type: Bug Reporter: Paul Yang Assignee: Ashutosh Chauhan Priority: Minor Attachments: hive-1461.patch 'hive.metastore.local' should not be referred directly as a string. Instead, a HiveConf.ConfVar entry should be created and used. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2525) Introduce proper error messages, when explain fails on commands that otherwise could be succesful (commands that are not analyzed by Semantic Analyzer)
[ https://issues.apache.org/jira/browse/HIVE-2525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135227#comment-13135227 ] Namit Jain commented on HIVE-2525: -- hive explain dfs -ls PATH; FAILED: Parse Error: line 1:8 cannot recognize input near 'dfs' '-' 'ls' in statement The above error is a legitimate parse error. If we want to fix it with a better error message, we should catch the parser error and throw a better message (the way you have put it in CliDriver). The approach you have taken is difficult to maintain in the long run. Introduce proper error messages, when explain fails on commands that otherwise could be succesful (commands that are not analyzed by Semantic Analyzer) --- Key: HIVE-2525 URL: https://issues.apache.org/jira/browse/HIVE-2525 Project: Hive Issue Type: Improvement Reporter: Robert Surówka Assignee: Robert Surówka Priority: Trivial Attachments: HIVE-2525.1.patch
[jira] [Commented] (HIVE-2253) Merge failing of join tree in exceptional case
[ https://issues.apache.org/jira/browse/HIVE-2253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135237#comment-13135237 ] Namit Jain commented on HIVE-2253: -- @Navis, your code changes look good. Thanks for catching this. For testing, the convention we have adopted is to have a test file for such a case. Look at the *.q files in ql/src/test/queries/clientpositive. Add a new .q file (say, mergejoins.q). Add an explain plan in the .q file to check the number of map-reduce jobs etc. You can run the test by executing: ant test -Dtestcase=TestCliDriver -Dqfile=mergejoins.q Verify that the output file mergejoins.q.out is correct, and check it in ql/src/results/clientpositive Merge failing of join tree in exceptional case -- Key: HIVE-2253 URL: https://issues.apache.org/jira/browse/HIVE-2253 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.7.0 Environment: hadoop 0.20.2, hive 0.7.0 Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-2253-0.8.0.patch In some very exceptional cases, SemanticAnalyzer fails to merge join tree. Example is below. create table a (val1 int, val2 int) create table b (val1 int, val2 int) create table c (val1 int, val2 int) create table d (val1 int, val2 int) create table e (val1 int, val2 int) 1. all same(single) join key -- one MR, good select * from a join b on a.val1=b.val1 join c on a.val1=c.val1 join d on a.val1=d.val1 join e on a.val1=e.val1 2. two join keys -- expected to have two MR, but resulted in three MR select * from a join b on a.val1=b.val1 join c on a.val1=c.val1 join d on a.val1=d.val1 join e on a.val2=e.val2 3. by changing the join order, we could attain two MR as first-expectation. select * from a join e on a.val2=e.val2 join c on a.val1=c.val1 join d on a.val1=d.val1 join b on a.val1=b.val1
[jira] [Commented] (HIVE-2465) Primitive Data Types returning null if the data is out of range of the data type.
[ https://issues.apache.org/jira/browse/HIVE-2465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135238#comment-13135238 ] Namit Jain commented on HIVE-2465: -- The changes look good to me. However, it might be better to use LOG.warn or LOG.debug instead of LOG.info - this might generate a lot of logs. Primitive Data Types returning null if the data is out of range of the data type. - Key: HIVE-2465 URL: https://issues.apache.org/jira/browse/HIVE-2465 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.9.0 Environment: Hadoop 0.20.1, Hive0.9.0 and SUSE Linux Enterprise Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5) Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-2465.1.patch, HIVE-2465.patch Primitive Data Types returning null if the input data is out of range of the data type. In this case it is better to log the message with the proper message and actual data then user get to know some data is missing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
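The behaviour under discussion (an out-of-range value becomes null, with a log entry so the user can tell that data was dropped) can be sketched as below. This is an illustration only: OutOfRangeParseSketch is a hypothetical class, and java.util.logging at FINE level stands in for whichever logger and level the patch finally uses.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration; not Hive's LazyInteger or SerDe code.
final class OutOfRangeParseSketch {
    private static final Logger LOG =
            Logger.getLogger(OutOfRangeParseSketch.class.getName());

    // Parses an int column value; returns null when the text is malformed
    // or outside the int range, and records the offending value in the log.
    // Assumes raw is non-null.
    static Integer parseIntOrNull(String raw) {
        try {
            return Integer.valueOf(raw.trim());
        } catch (NumberFormatException e) {
            // Logged at a low level so that large inputs with many bad
            // values do not flood the logs.
            LOG.log(Level.FINE, "Value not representable as int, returning null: {0}", raw);
            return null;
        }
    }
}
```

For example, "2147483648" is one past the int maximum, so it comes back as null, while "42" parses normally.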
[jira] [Commented] (HIVE-1850) alter table set serdeproperties bypasses regexps checks (leaves table in a non-recoverable state?)
[ https://issues.apache.org/jira/browse/HIVE-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13135242#comment-13135242 ] Namit Jain commented on HIVE-1850: -- +1 alter table set serdeproperties bypasses regexps checks (leaves table in a non-recoverable state?) -- Key: HIVE-1850 URL: https://issues.apache.org/jira/browse/HIVE-1850 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.7.0 Environment: Trunk build from a few days ago, but seen once before with older version as well. Reporter: Terje Marthinussen Assignee: Amareshwari Sriramadasu Attachments: patch-1850-2.txt, patch-1850.txt {code} create table aa ( test STRING ) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' WITH SERDEPROPERTIES (input.regex = [^\\](.*), output.format.string = $1s); {code} This will fail. Great! {code} create table aa ( test STRING ) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' WITH SERDEPROPERTIES (input.regex = (.*), output.format.string = $1s); {code} Works, no problem there. {code} alter table aa set serdeproperties (input.regex = [^\\](.*), output.format.string = $1s); {code} Wups... I can set that without any problems! 
{code} alter table aa set serdeproperties (input.regex = (.*), output.format.string = $1s); FAILED: Hive Internal Error: java.util.regex.PatternSyntaxException(Unclosed character class near index 7 [^\](.*) ^) java.util.regex.PatternSyntaxException: Unclosed character class near index 7 [^\](.*) ^ at java.util.regex.Pattern.error(Pattern.java:1713) at java.util.regex.Pattern.clazz(Pattern.java:2254) at java.util.regex.Pattern.sequence(Pattern.java:1818) at java.util.regex.Pattern.expr(Pattern.java:1752) at java.util.regex.Pattern.compile(Pattern.java:1460) at java.util.regex.Pattern.init(Pattern.java:1133) at java.util.regex.Pattern.compile(Pattern.java:847) at org.apache.hadoop.hive.contrib.serde2.RegexSerDe.initialize(RegexSerDe.java:101) at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:199) at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:253) at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:484) at org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:161) at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:803) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTableSerdeProps(DDLSemanticAnalyzer.java:558) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:232) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:238) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:335) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:686) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:142) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:216) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:370) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at 
java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) {code} After this, all further commands on the table fails, including drop table :) 1. The alter table command should probably check the regexp just like the create table command does 2. Even though the regexp is bad, it should be possible to do things like set the regexp again or drop the table. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
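Point 1 of the report (alter table should check the regexp the same way create table does) amounts to compiling the pattern before the property is stored. A minimal sketch of such a check is shown below; SerdeRegexCheckSketch and isValidRegex are hypothetical names, and where the check would be wired into Hive's DDL path is not shown here.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical validation helper; not part of Hive's DDLSemanticAnalyzer.
final class SerdeRegexCheckSketch {
    // Returns true only if the value can be compiled as a Java regex,
    // so a bad input.regex can be rejected before it is persisted.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }
}
```

The pattern from the report, [^\](.*), fails this check with the same "Unclosed character class" error seen in the stack trace, while (.*) passes.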
[jira] [Commented] (HIVE-2519) Dynamic partition insert should enforce the order of the partition spec is the same as the one in schema
[ https://issues.apache.org/jira/browse/HIVE-2519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13134232#comment-13134232 ]

Namit Jain commented on HIVE-2519:
--
+1

Dynamic partition insert should enforce the order of the partition spec is the same as the one in schema

Key: HIVE-2519
URL: https://issues.apache.org/jira/browse/HIVE-2519
Project: Hive
Issue Type: Bug
Reporter: Ning Zhang
Assignee: Ning Zhang
Fix For: 0.9.0
Attachments: HIVE-2519.3.patch, HIVE-2519.patch

Suppose the table schema is (a string, b string) partitioned by (p1 string, p2 string). A dynamic partition insert is currently allowed to specify:
insert overwrite ... partition (p2=..., p1);
which creates the wrong HDFS directory structure, such as /.../p2=.../p1=
This contradicts the metastore's assumption about the HDFS directory structure.
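The check described in this issue amounts to comparing the partition keys in the insert statement with the partition columns of the table, position by position. The sketch below illustrates that comparison only; the class and method names are hypothetical, and treating column names as case-insensitive is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch (not Hive code): verifies that the keys of a
// partition spec appear in the same order as the table's partition columns.
final class PartitionSpecOrderCheck {
    private PartitionSpecOrderCheck() {}

    static boolean matchesSchemaOrder(List<String> schemaPartCols, List<String> specKeys) {
        if (schemaPartCols.size() != specKeys.size()) {
            return false;
        }
        for (int i = 0; i < schemaPartCols.size(); i++) {
            // Assumption: column names are compared case-insensitively.
            if (!schemaPartCols.get(i).equalsIgnoreCase(specKeys.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> schema = Arrays.asList("p1", "p2");
        // partition (p1=..., p2) follows the schema order
        System.out.println(matchesSchemaOrder(schema, Arrays.asList("p1", "p2")));
        // partition (p2=..., p1) is the problematic case from the description
        System.out.println(matchesSchemaOrder(schema, Arrays.asList("p2", "p1")));
    }
}
```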
[jira] [Commented] (HIVE-123) refactor DDL code (both DDLWork and DDLTask)
[ https://issues.apache.org/jira/browse/HIVE-123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13134245#comment-13134245 ]

Namit Jain commented on HIVE-123:
-
Thanks for taking this, Ashutosh.
Can you write a quick 10-line note on what this patch does? It will help tremendously in the review.

refactor DDL code (both DDLWork and DDLTask)

Key: HIVE-123
URL: https://issues.apache.org/jira/browse/HIVE-123
Project: Hive
Issue Type: Bug
Components: Query Processor
Reporter: Namit Jain
Assignee: Ashutosh Chauhan
Labels: cleanup, refactoring
Attachments: hive-123.patch

It might be good to break DDLTask into separate tasks. The abstract class DDLWork can have various subclasses (showTablesWork, DescribeTableWork, etc.), with a separate task for each of them. This will make them completely independent.
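One possible shape for the split proposed in the description is an abstract work class with one subclass per DDL operation, each paired with its own task. The sketch below is purely illustrative: every type name carries a Sketch suffix to mark it as hypothetical, the fields and the strings returned by execute are made up for the example, and none of this reflects what the attached patch actually does.

```java
// Hypothetical sketch (not Hive code): one work class and one task per DDL operation.
abstract class DdlWorkSketch {
}

final class ShowTablesWorkSketch extends DdlWorkSketch {
    final String pattern;

    ShowTablesWorkSketch(String pattern) {
        this.pattern = pattern;
    }
}

final class DescribeTableWorkSketch extends DdlWorkSketch {
    final String tableName;

    DescribeTableWorkSketch(String tableName) {
        this.tableName = tableName;
    }
}

// Each task only knows about its own work type, so tasks stay independent.
interface DdlTaskSketch<W extends DdlWorkSketch> {
    String execute(W work);
}

final class ShowTablesTaskSketch implements DdlTaskSketch<ShowTablesWorkSketch> {
    @Override
    public String execute(ShowTablesWorkSketch work) {
        return "show tables matching " + work.pattern;
    }
}

final class DescribeTableTaskSketch implements DdlTaskSketch<DescribeTableWorkSketch> {
    @Override
    public String execute(DescribeTableWorkSketch work) {
        return "describe " + work.tableName;
    }
}
```

With this layout, adding a new DDL operation means adding a new work/task pair rather than another branch inside a single task.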
[jira] [Commented] (HIVE-1975) insert overwrite directory Not able to insert data with multi level directory path
[ https://issues.apache.org/jira/browse/HIVE-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13134265#comment-13134265 ]

Namit Jain commented on HIVE-1975:
--
Some minor comments:
1. Can you add a testcase - use build as a temporary directory to add data?
2. Add some comments explaining that the multi-level directory move is not atomic. In your example, it is possible that /x/y/z/1/2/3 is created but /x/y/z/1/2/4 is not. It is the same as any multi-partition insert (dynamic insert), but it would be good to call it out explicitly.

insert overwrite directory Not able to insert data with multi level directory path

Key: HIVE-1975
URL: https://issues.apache.org/jira/browse/HIVE-1975
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.5.0
Environment: Hadoop 0.20.1, Hive0.5.0 and SUSE Linux Enterprise Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5).
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
Attachments: HIVE-1975.1.patch, HIVE-1975.2.patch, HIVE-1975.patch

Execution of the query below fails.
Ex:
{noformat}
insert overwrite directory '/HIVEFT25686/chinna/' select * from dept_j;
{noformat}
[jira] [Commented] (HIVE-1567) increase hive.mapjoin.maxsize to 10 million
[ https://issues.apache.org/jira/browse/HIVE-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13134276#comment-13134276 ]

Namit Jain commented on HIVE-1567:
--
+1

increase hive.mapjoin.maxsize to 10 million
---

Key: HIVE-1567
URL: https://issues.apache.org/jira/browse/HIVE-1567
Project: Hive
Issue Type: Improvement
Reporter: He Yongqiang
Assignee: Ashutosh Chauhan
Attachments: hive-1567.patch

I saw that on a very wide table, Hive can process 1 million rows in less than one minute (selecting all columns). Setting hive.mapjoin.maxsize to 100k is too restrictive. Let's increase this to 10 million.
[jira] [Commented] (HIVE-2466) mapjoin_subquery dump small table (mapjoin table) to the same file
[ https://issues.apache.org/jira/browse/HIVE-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13134321#comment-13134321 ]

Namit Jain commented on HIVE-2466:
--
reviewing now

mapjoin_subquery dump small table (mapjoin table) to the same file
---

Key: HIVE-2466
URL: https://issues.apache.org/jira/browse/HIVE-2466
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.7.1
Reporter: binlijin
Assignee: binlijin
Priority: Blocker
Attachments: hive-2466.1.patch, hive-2466.2.patch

In mapjoin_subquery.q there is a query:
SELECT /*+ MAPJOIN(z) */ subq.key1, z.value
FROM (SELECT /*+ MAPJOIN(x) */ x.key as key1, x.value as value1, y.key as key2, y.value as value2
      FROM src1 x JOIN src y ON (x.key = y.key)) subq
JOIN srcpart z ON (subq.key1 = z.key and z.ds='2008-04-08' and z.hr=11);
When x and z are dumped to local files, they are both dumped to the same file, so the data of x is lost.
[jira] [Commented] (HIVE-2466) mapjoin_subquery dump small table (mapjoin table) to the same file
[ https://issues.apache.org/jira/browse/HIVE-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13134440#comment-13134440 ]

Namit Jain commented on HIVE-2466:
--
A few high-level comments:
Instead of making the dump prefix optional, why don't you always have it in hashtablesinkdesc and mapjoindesc? This way, you can get rid of all the checks of the form "if dumpdescriptor is not null". The logic will be simpler: the names of the map files will be mapfile1, mapfile2, etc.
Also, it might be nicer to add the static function in PlanUtils.java instead of QBJoinTree.java.

mapjoin_subquery dump small table (mapjoin table) to the same file
---

Key: HIVE-2466
URL: https://issues.apache.org/jira/browse/HIVE-2466
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.7.1
Reporter: binlijin
Assignee: binlijin
Priority: Blocker
Attachments: hive-2466.1.patch, hive-2466.2.patch

In mapjoin_subquery.q there is a query:
SELECT /*+ MAPJOIN(z) */ subq.key1, z.value
FROM (SELECT /*+ MAPJOIN(x) */ x.key as key1, x.value as value1, y.key as key2, y.value as value2
      FROM src1 x JOIN src y ON (x.key = y.key)) subq
JOIN srcpart z ON (subq.key1 = z.key and z.ds='2008-04-08' and z.hr=11);
When x and z are dumped to local files, they are both dumped to the same file, so the data of x is lost.
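The suggestion to always assign a dump prefix can be pictured as a small generator that hands out a distinct prefix to every map join in a plan, so that no descriptor is ever left without one. This is a rough sketch under that reading of the comment: the class name is hypothetical, the counter-based scheme is an assumption, and only the mapfile1, mapfile2 naming comes from the comment itself.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch (not Hive code): every map join gets its own dump
// prefix, so two small tables are never written to the same file.
final class DumpFilePrefixSketch {
    private final AtomicInteger next = new AtomicInteger(1);

    /** Returns mapfile1, mapfile2, ... in the order the map joins are planned. */
    String nextPrefix() {
        return "mapfile" + next.getAndIncrement();
    }

    public static void main(String[] args) {
        DumpFilePrefixSketch prefixes = new DumpFilePrefixSketch();
        String forX = prefixes.nextPrefix(); // prefix for the inner MAPJOIN(x)
        String forZ = prefixes.nextPrefix(); // prefix for the outer MAPJOIN(z)
        System.out.println(forX + " " + forZ);
    }
}
```

Because a prefix is always present, code that reads it would not need a null check before building the dump file name.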