[jira] [Created] (HIVE-11804) Different describe formatted behavior depending on whether the table name is qualified with database name or not
Mark Grover created HIVE-11804:
--
Summary: Different describe formatted behavior depending on whether the table name is qualified with database name or not
Key: HIVE-11804
URL: https://issues.apache.org/jira/browse/HIVE-11804
Project: Hive
Issue Type: Bug
Components: Metastore
Reporter: Mark Grover

I have a simple text file based managed table on HDFS:
{quote}
show create table src;
+---+--+
|createtab_stmt |
+---+--+
| CREATE TABLE `src`( |
| `first` string, |
| `word` string) |
| PARTITIONED BY ( |
| `length` int) |
| ROW FORMAT SERDE |
| 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' |
| STORED AS INPUTFORMAT |
| 'org.apache.hadoop.mapred.TextInputFormat' |
| OUTPUTFORMAT |
| 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' |
| LOCATION |
| 'hdfs://name-node:8020/user/hive/warehouse/my.db/src' |
| TBLPROPERTIES ( |
| 'transient_lastDdlTime'='1441921577') |
+---+--+
{quote}
The describe formatted with the database name returns:
{quote}
describe formatted my.src first partition(length=1);
+-+---+---+---++-+--+--++-+--+--+
|col_name | data_type | min | max | num_nulls | distinct_count | avg_col_len | max_col_len | num_trues | num_falses | comment |
+-+---+---+---++-+--+--++-+--+--+
| # col_name | data_type | comment | | NULL | NULL| NULL | NULL | NULL | NULL| NULL |
| | NULL | NULL | NULL | NULL | NULL| NULL | NULL | NULL | NULL| NULL |
| first | string| from deserializer | NULL | NULL | NULL| NULL | NULL | NULL | NULL| NULL |
+-+---+---+---++-+--+--++-+--+--+
{quote}
while without it returns:
{quote}
describe formatted src first partition(length=1);
+---+---+---+--+
| col_name| data_type |comment|
+---+---+---+--+
| # col_name| data_type | comment |
| | NULL | NULL |
| first | string | |
| word | string | |
| | NULL | NULL |
| # Partition Information | NULL | NULL |
| # col_name| data_type | comment |
| | NULL
[jira] [Created] (HIVE-10041) Set defaults for HBASE_HOME in a smarter way
Mark Grover created HIVE-10041: -- Summary: Set defaults for HBASE_HOME in a smarter way Key: HIVE-10041 URL: https://issues.apache.org/jira/browse/HIVE-10041 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 1.1.0 Reporter: Mark Grover Assignee: Mark Grover Fix For: 1.1.0 Similar to SQOOP-2145, the hcat binary script doesn't do smart detection of HBASE_HOME. It assumes HBase is always located in /usr/lib/hbase; if it is not, the HBASE_HOME variable needs to be exported. The reason is that people often have tarballs like ~/hive-hcatalog, ~/hbase, etc., and it would be good to have their hcat scripts work out of the box. This doesn't regress anything because it only sets HBASE_HOME if it's not already set and the directory it's setting it to exists and is valid. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
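A minimal sketch of the detection described in HIVE-10041, not the code that went into the hcat script. The helper name detect_hbase_home is hypothetical, the candidate directories are passed in by the caller, and treating a directory as "valid" when it contains an executable bin/hbase is an assumption made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print a usable HBASE_HOME and return 0,
# or print nothing and return 1 if no candidate qualifies.
detect_hbase_home() {
  # Respect an explicit setting: never override what the user exported.
  if [ -n "${HBASE_HOME:-}" ]; then
    printf '%s\n' "$HBASE_HOME"
    return 0
  fi
  for candidate in "$@"; do
    # Assumption: a directory is "valid" if it holds an executable bin/hbase.
    if [ -d "$candidate" ] && [ -x "$candidate/bin/hbase" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Possible use near the top of a launcher script (candidate paths are examples):
#   if found=$(detect_hbase_home /usr/lib/hbase "$HOME/hbase"); then
#     export HBASE_HOME="$found"
#   fi
```

Because the function returns early when HBASE_HOME is already set, an existing environment keeps working unchanged, which matches the "doesn't regress anything" argument in the issue.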
[jira] [Created] (HIVE-9818) Print error messages on stderr instead of stdout
Mark Grover created HIVE-9818: - Summary: Print error messages on stderr instead of stdout Key: HIVE-9818 URL: https://issues.apache.org/jira/browse/HIVE-9818 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 1.0.0 Reporter: Mark Grover
The hcat script (see [here|https://github.com/apache/hive/blob/trunk/hcatalog/bin/hcat#L103], for example) prints error messages via a plain echo, which sends them to stdout. This is bad because downstream projects using it receive error messages as expected output instead of receiving nothing (see SQOOP-2147 for an example). We need to put error messages on stderr. Something like:
{code}
>&2 echo "This is an error message"
{code}
instead of the simple echo that we have been using so far.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
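A small sketch of why the stream matters to downstream callers such as the one in SQOOP-2147. The helper names err and failing_step are hypothetical and do not appear in the hcat script; the message text is the one from the issue.

```shell
#!/bin/sh
# Hypothetical helper: print all arguments as one message on stderr
# (file descriptor 2) instead of stdout.
err() {
  echo "$@" >&2
}

# Illustrative stand-in for a script step that fails. The diagnostic goes to
# stderr, so stdout stays empty for any caller that captures the output.
failing_step() {
  err "This is an error message"
  return 1
}

# A caller capturing stdout now gets nothing on failure:
#   out=$(failing_step)   # out is empty; the message still reaches the terminal
```

Wrapping the redirect in one function keeps each call site as short as a plain echo, which makes it harder to forget the redirect in new error paths.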
[jira] [Created] (HIVE-9000) LAST_VALUE Window function returns wrong results
Mark Grover created HIVE-9000: - Summary: LAST_VALUE Window function returns wrong results Key: HIVE-9000 URL: https://issues.apache.org/jira/browse/HIVE-9000 Project: Hive Issue Type: Bug Components: PTF-Windowing Affects Versions: 0.13.1 Reporter: Mark Grover Priority: Critical Fix For: 0.14.1
The LAST_VALUE windowing function has been returning bad results, as far as I can tell, from day 1. And it seems like the tests are also asserting that LAST_VALUE gives the wrong result. Here's the test output: https://github.com/apache/hive/blob/branch-0.14/ql/src/test/results/clientpositive/windowing_navfn.q.out#L587
The query is:
{code}
select t, s, i, last_value(i) over (partition by t order by s)
{code}
The result is:
{code}
t   s             i      last_value(i)
---
10  oscar allen   65662  65662
10  oscar carson  65549  65549
{code}
LAST_VALUE(i) should have returned 65549 in both records; instead it simply ends up returning i.
Another way you can make sure LAST_VALUE is bad is to verify its result against LEAD(i,1) over (partition by t order by s). LAST_VALUE, being the last value, should always be more (in terms of the specified 'order by s') than the lead by 1. While this doesn't directly apply to the above query, if the result set had more rows, you would clearly see records where lead is higher than last_value, which is semantically incorrect.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9000) LAST_VALUE Window function returns wrong results
[ https://issues.apache.org/jira/browse/HIVE-9000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Grover updated HIVE-9000: --
Description:
The LAST_VALUE windowing function has been returning bad results, as far as I can tell, from day 1. And it seems like the tests are also asserting that LAST_VALUE gives the wrong result. Here's the test output: https://github.com/apache/hive/blob/branch-0.14/ql/src/test/results/clientpositive/windowing_navfn.q.out#L587
The query is:
{code}
select t, s, i, last_value(i) over (partition by t order by s)
{code}
The result is:
{code}
t   s             i      last_value(i)
---
10  oscar allen   65662  65662
10  oscar carson  65549  65549
{code}
{{LAST_VALUE( i )}} should have returned 65549 in both records; instead it simply ends up returning i.
Another way you can make sure LAST_VALUE is bad is to verify its result against LEAD(i,1) over (partition by t order by s). LAST_VALUE, being the last value, should always be more (in terms of the specified 'order by s') than the lead by 1. While this doesn't directly apply to the above query, if the result set had more rows, you would clearly see records where lead is higher than last_value, which is semantically incorrect.
was:
The LAST_VALUE windowing function has been returning bad results, as far as I can tell, from day 1. And it seems like the tests are also asserting that LAST_VALUE gives the wrong result. Here's the test output: https://github.com/apache/hive/blob/branch-0.14/ql/src/test/results/clientpositive/windowing_navfn.q.out#L587
The query is:
{code}
select t, s, i, last_value(i) over (partition by t order by s)
{code}
The result is:
{code}
t   s             i      last_value(i)
---
10  oscar allen   65662  65662
10  oscar carson  65549  65549
{code}
LAST_VALUE(i) should have returned 65549 in both records; instead it simply ends up returning i.
Another way you can make sure LAST_VALUE is bad is to verify its result against LEAD(i,1) over (partition by t order by s). LAST_VALUE, being the last value, should always be more (in terms of the specified 'order by s') than the lead by 1. While this doesn't directly apply to the above query, if the result set had more rows, you would clearly see records where lead is higher than last_value, which is semantically incorrect.
LAST_VALUE Window function returns wrong results Key: HIVE-9000 URL: https://issues.apache.org/jira/browse/HIVE-9000 Project: Hive Issue Type: Bug Components: PTF-Windowing Affects Versions: 0.13.1 Reporter: Mark Grover Priority: Critical Fix For: 0.14.1
The LAST_VALUE windowing function has been returning bad results, as far as I can tell, from day 1. And it seems like the tests are also asserting that LAST_VALUE gives the wrong result. Here's the test output: https://github.com/apache/hive/blob/branch-0.14/ql/src/test/results/clientpositive/windowing_navfn.q.out#L587
The query is:
{code}
select t, s, i, last_value(i) over (partition by t order by s)
{code}
The result is:
{code}
t   s             i      last_value(i)
---
10  oscar allen   65662  65662
10  oscar carson  65549  65549
{code}
{{LAST_VALUE( i )}} should have returned 65549 in both records; instead it simply ends up returning i.
Another way you can make sure LAST_VALUE is bad is to verify its result against LEAD(i,1) over (partition by t order by s). LAST_VALUE, being the last value, should always be more (in terms of the specified 'order by s') than the lead by 1. While this doesn't directly apply to the above query, if the result set had more rows, you would clearly see records where lead is higher than last_value, which is semantically incorrect.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6593) Create a maven assembly for hive-jdbc
[ https://issues.apache.org/jira/browse/HIVE-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924974#comment-13924974 ] Mark Grover commented on HIVE-6593: --- Thanks Szehon for taking this up and everyone for their input. Cos or I can review the patch. Create a maven assembly for hive-jdbc - Key: HIVE-6593 URL: https://issues.apache.org/jira/browse/HIVE-6593 Project: Hive Issue Type: Improvement Components: Build Infrastructure Affects Versions: 0.12.0 Reporter: Mark Grover Assignee: Szehon Ho Attachments: HIVE-6593.patch Currently in Apache Bigtop we bundle and distribute Hive. In particular, so that users do not have to install the entirety of Hive on machines that are just jdbc clients, we have a special package which is a subset of hive, called hive-jdbc, that bundles only the jdbc driver jar and its dependencies. However, because Hive doesn't have an assembly for the jdbc jar, we have to hack and hardcode the list of jdbc jars and their dependencies: https://github.com/apache/bigtop/blob/master/bigtop-packages/src/rpm/hive/SPECS/hive.spec#L361 As Hive moves to Maven, it would be pretty fantastic if Hive could leverage the maven-assembly-plugin and generate a .tar.gz assembly for what's required for jdbc gateway machines. That way we can simply take that distribution and build a jdbc package from it without having to hard code jar names and dependencies. That would make the process much less error prone. NO PRECOMMIT TESTS -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HIVE-6593) Create a maven assembly for hive-jdbc
Mark Grover created HIVE-6593: - Summary: Create a maven assembly for hive-jdbc Key: HIVE-6593 URL: https://issues.apache.org/jira/browse/HIVE-6593 Project: Hive Issue Type: Improvement Components: Build Infrastructure Affects Versions: 0.12.0 Reporter: Mark Grover Currently in Apache Bigtop we bundle and distribute Hive. In particular, so that users do not have to install the entirety of Hive on machines that are just jdbc clients, we have a special package which is a subset of hive, called hive-jdbc, that bundles only the jdbc driver jar and its dependencies. However, because Hive doesn't have an assembly for the jdbc jar, we have to hack and hardcode the list of jdbc jars and their dependencies: https://github.com/apache/bigtop/blob/master/bigtop-packages/src/rpm/hive/SPECS/hive.spec#L361 As Hive moves to Maven, it would be pretty fantastic if Hive could leverage the maven-assembly-plugin and generate a .tar.gz assembly for what's required for jdbc gateway machines. That way we can simply take that distribution and build a jdbc package from it without having to hard code jar names and dependencies. That would make the process much less error prone. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-3819) Creating a table on Hive without Hadoop daemons running returns a misleading error
[ https://issues.apache.org/jira/browse/HIVE-3819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817348#comment-13817348 ] Mark Grover commented on HIVE-3819: --- Sorry, Xuefu, I haven't had the time. If you can't reproduce this, please go ahead and mark this as Can't Reproduce. Thanks for checking! Creating a table on Hive without Hadoop daemons running returns a misleading error -- Key: HIVE-3819 URL: https://issues.apache.org/jira/browse/HIVE-3819 Project: Hive Issue Type: Bug Components: CLI, Metastore Reporter: Mark Grover Assignee: Xuefu Zhang I was running hive without the underlying hadoop daemons running. Hadoop was configured to run in pseudo-distributed mode. However, when I tried to create a hive table, I got this rather misleading error: {code} FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask {code} We should look into making this error message less misleading (more about hadoop daemons not running instead of metastore client not being instantiable). -- This message was sent by Atlassian JIRA (v6.1#6144)
Re: Proposal to move Hive Apache Jenkins jobs to Bigtop Jenkins?
Done! On Tue, Oct 29, 2013 at 8:50 AM, Brock Noland br...@cloudera.com wrote: On Mon, Oct 28, 2013 at 10:51 PM, Roman Shaposhnik ro...@shaposhnik.org wrote: My username is brock. I gave you perms to manipulate the jobs. Go wild ;-) Would you be able to add a throttle category for us? Call it hive-unit perhaps? Thank you!! Brock
[jira] [Assigned] (HIVE-3844) Unix timestamps don't seem to be read correctly from HDFS as Timestamp column
[ https://issues.apache.org/jira/browse/HIVE-3844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover reassigned HIVE-3844: - Assignee: Venki Korukanti (was: Mark Grover) Venki, I am not. Assigned it to you. Unix timestamps don't seem to be read correctly from HDFS as Timestamp column - Key: HIVE-3844 URL: https://issues.apache.org/jira/browse/HIVE-3844 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.8.0 Reporter: Mark Grover Assignee: Venki Korukanti Serega Shepak pointed out that something like {code} select cast(date_occurrence as timestamp) from xvlr_data limit 10 {code} where date_occurrence has BIGINT type (timestamp in milliseconds) works. But it doesn't work if the declared type of the column is TIMESTAMP. The data in the date_occurrence column is a Unix timestamp in milliseconds. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-3819) Creating a table on Hive without Hadoop daemons running returns a misleading error
[ https://issues.apache.org/jira/browse/HIVE-3819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800063#comment-13800063 ] Mark Grover commented on HIVE-3819: --- Let merry to reproduce it. Creating a table on Hive without Hadoop daemons running returns a misleading error -- Key: HIVE-3819 URL: https://issues.apache.org/jira/browse/HIVE-3819 Project: Hive Issue Type: Bug Components: CLI, Metastore Reporter: Mark Grover Assignee: Xuefu Zhang I was running hive without the underlying hadoop daemons running. Hadoop was configured to run in pseudo-distributed mode. However, when I tried to create a hive table, I got this rather misleading error: {code} FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask {code} We should look into making this error message less misleading (more about hadoop daemons not running instead of metastore client not being instantiable). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-3819) Creating a table on Hive without Hadoop daemons running returns a misleading error
[ https://issues.apache.org/jira/browse/HIVE-3819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13800064#comment-13800064 ] Mark Grover commented on HIVE-3819: --- Stupid autocorrect! I meant to say let me try to reproduce it :-) Creating a table on Hive without Hadoop daemons running returns a misleading error -- Key: HIVE-3819 URL: https://issues.apache.org/jira/browse/HIVE-3819 Project: Hive Issue Type: Bug Components: CLI, Metastore Reporter: Mark Grover Assignee: Xuefu Zhang I was running hive without the underlying hadoop daemons running. Hadoop was configured to run in pseudo-distributed mode. However, when I tried to create a hive table, I got this rather misleading error: {code} FAILED: Error in metadata: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask {code} We should look into making this error message less misleading (more about hadoop daemons not running instead of metastore client not being instantiable). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HIVE-5534) Webhcatalog shell scripts have non-executable permissions
Mark Grover created HIVE-5534: - Summary: Webhcatalog shell scripts have non-executable permissions Key: HIVE-5534 URL: https://issues.apache.org/jira/browse/HIVE-5534 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0 Reporter: Mark Grover Fix For: 0.12.0
While playing with the hive 0.12 rc1, I noticed that the scripts for starting and stopping the webhcatalog server don't have the correct permissions.
{code}
localhost:hive-0.12.0 mgrover$ find . -name *webhcat*.sh | grep -v './src/' | xargs ls -lrt
-rw-r--r-- 1 mgrover staff 7141 Oct 9 18:50 ./hcatalog/sbin/webhcat_server.sh
-rw-r--r-- 1 mgrover staff 4243 Oct 9 18:50 ./hcatalog/sbin/webhcat_config.sh
{code}
This means that we can't start or stop the webhcat server using the tarball out of the box.
-- This message was sent by Atlassian JIRA (v6.1#6144)
Re: [VOTE] Apache Hive 0.12.0 Release Candidate 1
I wasn't able to start webhcat server. It seems to be related to file permissions. FWIW, it's not a regression (the same problem existed in Hive 0.11). Having said that, it makes webhcat pretty unusable out of the box. I created HIVE-5534 to track this. On Sun, Oct 13, 2013 at 4:37 PM, Carl Steinbach c...@apache.org wrote: +1 (binding) Regarding the 3 day deadline for voting, that is what is in the hive bylaws. I also see that has been followed in last few releases I checked. 3 days is the minimum length of the voting period, not the maximum. Thanks. Carl
[jira] [Updated] (HIVE-5534) HCatalog shell scripts have non-executable permissions
[ https://issues.apache.org/jira/browse/HIVE-5534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Grover updated HIVE-5534: -- Priority: Critical (was: Major) Summary: HCatalog shell scripts have non-executable permissions (was: Webhcatalog shell scripts have non-executable permissions)
HCatalog shell scripts have non-executable permissions -- Key: HIVE-5534 URL: https://issues.apache.org/jira/browse/HIVE-5534 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0 Reporter: Mark Grover Priority: Critical Fix For: 0.12.0
While playing with the hive 0.12 rc1, I noticed that the scripts for starting and stopping the webhcatalog server don't have the correct permissions.
{code}
localhost:hive-0.12.0 mgrover$ find . -name *webhcat*.sh | grep -v './src/' | xargs ls -lrt
-rw-r--r-- 1 mgrover staff 7141 Oct 9 18:50 ./hcatalog/sbin/webhcat_server.sh
-rw-r--r-- 1 mgrover staff 4243 Oct 9 18:50 ./hcatalog/sbin/webhcat_config.sh
{code}
This means that we can't start or stop the webhcat server using the tarball out of the box.
-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5534) Webhcatalog shell scripts have non-executable permissions
[ https://issues.apache.org/jira/browse/HIVE-5534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13794386#comment-13794386 ]
Mark Grover commented on HIVE-5534: ---
Poking more, it seems like this is a problem with all hcatalog scripts, so starting the hcatalog server is affected as well:
{code}
-rw-r--r-- 1 mgrover staff 2048 Oct 9 18:50 ./hcatalog/share/hcatalog/scripts/hcat_server_stop.sh
-rw-r--r-- 1 mgrover staff 3003 Oct 9 18:50 ./hcatalog/share/hcatalog/scripts/hcat_server_start.sh
-rw-r--r-- 1 mgrover staff 4698 Oct 9 18:50 ./hcatalog/share/hcatalog/scripts/hcat_server_install.sh
-rw-r--r-- 1 mgrover staff 7141 Oct 9 18:50 ./hcatalog/sbin/webhcat_server.sh
-rw-r--r-- 1 mgrover staff 4243 Oct 9 18:50 ./hcatalog/sbin/webhcat_config.sh
-rw-r--r-- 1 mgrover staff 10013 Oct 9 18:50 ./hcatalog/sbin/update-hcatalog-env.sh
-rw-r--r-- 1 mgrover staff 4540 Oct 9 18:50 ./hcatalog/sbin/hcat_server.sh
-rw-r--r-- 1 mgrover staff 2543 Oct 9 18:50 ./hcatalog/libexec/hcat-config.sh
{code}
I updated the title of the JIRA to reflect the above.
Webhcatalog shell scripts have non-executable permissions - Key: HIVE-5534 URL: https://issues.apache.org/jira/browse/HIVE-5534 Project: Hive Issue Type: Bug Components: WebHCat Affects Versions: 0.11.0, 0.12.0 Reporter: Mark Grover Fix For: 0.12.0
While playing with the hive 0.12 rc1, I noticed that the scripts for starting and stopping the webhcatalog server don't have the correct permissions.
{code}
localhost:hive-0.12.0 mgrover$ find . -name *webhcat*.sh | grep -v './src/' | xargs ls -lrt
-rw-r--r-- 1 mgrover staff 7141 Oct 9 18:50 ./hcatalog/sbin/webhcat_server.sh
-rw-r--r-- 1 mgrover staff 4243 Oct 9 18:50 ./hcatalog/sbin/webhcat_config.sh
{code}
This means that we can't start or stop the webhcat server using the tarball out of the box.
-- This message was sent by Atlassian JIRA (v6.1#6144)
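For anyone hitting this with an unpacked tarball, a local workaround can be sketched as follows. This is not the fix for the release build itself, and the helper name make_scripts_executable is hypothetical. It assumes every *.sh file under the given directory is meant to be run directly.

```shell
#!/bin/sh
# Hypothetical helper: add the execute bit to every *.sh file under a directory.
make_scripts_executable() {
  # -type f skips directories; quoting the pattern stops the shell from
  # expanding the glob before find sees it.
  find "$1" -type f -name '*.sh' -exec chmod a+x {} +
}

# Example, run from the top of the unpacked tarball:
#   make_scripts_executable ./hcatalog
```

Files that do not end in .sh are left alone, so documentation and configuration files keep their original modes.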
Re: [VOTE] Apache Hive 0.12.0 Release Candidate 1
Poking more, it looks like this affects not just webhcatalog scripts but all hcatalog scripts. Consequently, it looks like starting hcatalog server wouldn't work either. I will stop spamming multiple mailing lists now and post more updates, if any, on the JIRA (HIVE-5534) itself. On Mon, Oct 14, 2013 at 12:18 PM, Mark Grover m...@apache.org wrote: I wasn't able to start webhcat server. It seems to be related to file permissions. FWIW, it's not a regression (the same problem existed in Hive 0.11). Having said that, it makes webhcat pretty unusable out of the box. I created HIVE-5534 to track this. On Sun, Oct 13, 2013 at 4:37 PM, Carl Steinbach c...@apache.org wrote: +1 (binding) Regarding the 3 day deadline for voting, that is what is in the hive bylaws. I also see that has been followed in last few releases I checked. 3 days is the minimum length of the voting period, not the maximum. Thanks. Carl
Re: [VOTE] Apache Hive 0.12.0 Release Candidate 0
Thejas, Thanks for working on Hive 0.12 release! I work on Apache Bigtop http://bigtop.apache.org and we build rpm and deb packages by building and packaging the source tarballs. Most components (if not all) release a source tarball. Releasing a source tarball would make Hive consistent with other projects in terms of what is released, and would make life easier for those users who may not want binaries (like some Hive developers and Bigtop). I don't know how much work it will be, but both I personally and the larger Bigtop community would greatly appreciate it if Hive released a source tarball for 0.12 release. Would love to hear what you think. Thanks again! Mark On Tue, Oct 8, 2013 at 3:56 PM, Thejas Nair the...@hortonworks.com wrote: On Tue, Oct 8, 2013 at 8:18 AM, Brock Noland br...@cloudera.com wrote: Hi Thejas, Again thank you very much for all the hard work! Two items of discussion: The tag contains .gitignore files so I believe the source tarball (src/ directory) should as well. It is strange that other files with a . prefix do get included (.checkstyle, .arcconfig), but .gitignore doesn't get included. This might be a wider item than the current release. However, our source tarball actually contains all the hive-*.jar files in addition to all the libraries. Beyond that the source tarball actually doesn't match the tag structure, the src directory of the source tarball does. I think we should change this at some point so the source tarball structure exactly matches the tag. Yes, I think we should address this for the next release. It might take some time to get this done right. Brock On Mon, Oct 7, 2013 at 11:02 PM, Thejas Nair the...@hortonworks.com wrote: Carl pointed out some issues with the RC. I will be rolling out a new RC to address those (hopefully sometime tomorrow). If anybody finds additional issues, please let me know, so that I can address those as well in the next RC. 
HIVE-5489 - NOTICE copyright dates are out of date HIVE-5488 - some files are missing apache license headers On Mon, Oct 7, 2013 at 4:38 PM, Thejas Nair the...@hortonworks.com wrote: Yes, that is the correct tag. Thanks for pointing it out. I also update the tag as it was a little behind what is in the RC (found some issues with maven-publish). I have also updated the release vote email template in hive HowToRelease wiki page, to include note about the tag . Thanks, Thejas On Mon, Oct 7, 2013 at 4:26 PM, Brock Noland br...@cloudera.com wrote: Hi Thejas, Thank you very much for the hard work! I believe the vote email should contain a link to the tag we are voting on. I assume the tag is: release-0.12.0-rc0 ( http://svn.apache.org/viewvc/hive/tags/release-0.12.0-rc0/). Is that correct? Brock On Mon, Oct 7, 2013 at 6:02 PM, Thejas Nair the...@hortonworks.com wrote: Apache Hive 0.12.0 Release Candidate 0 is available here: http://people.apache.org/~thejas/hive-0.12.0-rc0/ Maven artifacts are available here: https://repository.apache.org/content/repositories/orgapachehive-138/ This release has 406 issues fixed. This includes several new features such as data types date and varchar, optimizer improvements, ORC format improvements and many bug fixes. Hcatalog packages have now moved to org.apache.hive.hcatalog (from org.apache.hcatalog), and the maven packages are published under org.apache.hive.hcatalog. Voting will conclude in 72 hours. Hive PMC Members: Please test and vote. Thanks, Thejas -- CONFIDENTIALITY NOTICE NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. 
If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You. -- Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
[jira] [Commented] (HIVE-3976) Support specifying scale and precision with Hive decimal type
[ https://issues.apache.org/jira/browse/HIVE-3976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765119#comment-13765119 ] Mark Grover commented on HIVE-3976: --- Hi Xuefu, Thanks for posting the document. Overall, looks good! It's great that you are working on this. One minor question: in the document you mention "Specifically, for +/-, the result scale is s1 + s2, and for multiplication, max(s1, s2)." Did you mean the other way around (max for +/-, s1+s2 for multiplication)? In the error handling case, you mention rounding instead of using null. That definitely seems like a plausible idea. However, I think it may be a slippery slope. For example, we may be using a rounding policy here by default. This policy may not work for all users and then we may have to expose configuration/table specific options to specify such rounding policies. There might be value in masking the values that don't conform as nulls and putting the onus on the users to explicitly round them to conform to the column type. However, if mysql or another popular database has set a precedent, what you are suggesting may be ok. Big +1 on changing arithmetic operation UDFs to GenericUDFs like you and Jason already suggested. Support specifying scale and precision with Hive decimal type - Key: HIVE-3976 URL: https://issues.apache.org/jira/browse/HIVE-3976 Project: Hive Issue Type: Improvement Components: Query Processor, Types Reporter: Mark Grover Assignee: Xuefu Zhang Attachments: remove_prec_scale.diff HIVE-2693 introduced support for Decimal datatype in Hive. However, the current implementation has unlimited precision and provides no way to specify precision and scale when creating the table. For example, MySQL allows users to specify scale and precision of the decimal datatype when creating the table: {code} CREATE TABLE numbers (a DECIMAL(20,2)); {code} Hive should support something similar too. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4003) NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
[ https://issues.apache.org/jira/browse/HIVE-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762017#comment-13762017 ] Mark Grover commented on HIVE-4003: --- [~appodictic] or [~brocknoland] would one of you mind committing this? NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java - Key: HIVE-4003 URL: https://issues.apache.org/jira/browse/HIVE-4003 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Thomas Adam Assignee: Mark Grover Attachments: HIVE-4003.patch, HIVE-4003.patch Utilities.java seems to be throwing an NPE. Change contributed by Thomas Adam. Reference: https://github.com/tecbot/hive/commit/1e29d88837e4101a76e870a716aadb729437355b#commitcomment-2588350 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4003) NullPointerException in exec.Utilities
[ https://issues.apache.org/jira/browse/HIVE-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762057#comment-13762057 ] Mark Grover commented on HIVE-4003: --- Thank you! NullPointerException in exec.Utilities -- Key: HIVE-4003 URL: https://issues.apache.org/jira/browse/HIVE-4003 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Thomas Adam Assignee: Mark Grover Priority: Blocker Fix For: 0.12.0, 0.13.0 Attachments: HIVE-4003.patch, HIVE-4003.patch Utilities.java seems to be throwing an NPE. Change contributed by Thomas Adam. Reference: https://github.com/tecbot/hive/commit/1e29d88837e4101a76e870a716aadb729437355b#commitcomment-2588350 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Proposing a 0.11.1
Hi folks, Any update on this? We are considering including Hive 0.11* in Bigtop 0.7 and it would be very useful and much appreciated to get a little more context into what the Hive 0.11.1 release would look like. Thanks in advance! Mark On Tue, Aug 13, 2013 at 9:24 PM, Edward Capriolo edlinuxg...@gmail.com wrote: I am feeling more like we should release a 12.0 rather than backport things into 11.X. On Wed, Aug 14, 2013 at 12:08 AM, Navis류승우 navis@nexr.com wrote: If this is only for addressing npath problem, we got three months for that. Would it be enough time for releasing 0.12.0? ps. IMHO, n-path seemed too generic name to be patented. I hate Teradata. 2013/8/14 Edward Capriolo edlinuxg...@gmail.com: Should we get the npath rename in? Do we have a jira for this? If not I will take it. On Tue, Aug 13, 2013 at 1:58 PM, Mark Wagner wagner.mar...@gmail.com wrote: It'd be good to get both HIVE-3953 and HIVE-4789 in there. 3953 has been committed to trunk and it looks like 4789 is close. Thanks, Mark On Tue, Aug 13, 2013 at 10:02 AM, Owen O'Malley omal...@apache.org wrote: All, I'd like to create an 0.11.1 with some fixes in it. I plan to put together a release candidate over the next week. I'm in the process of putting together the list of bugs that I want to include, but I wanted to solicit the jiras that others thought would be important for an 0.11.1. Thanks, Owen
Re: Proposing a 0.11.1
Hi Owen, Sounds good. Thanks for the update! Mark On Mon, Aug 26, 2013 at 12:56 PM, Owen O'Malley omal...@apache.org wrote: Hi Mark, I haven't made any progress on it yet. I hope to make progress on it this week. I will certainly include the npath changes. On a separate thread, I'll start a discussion about starting to lock down 0.12.0. -- Owen
[jira] [Commented] (HIVE-4003) NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
[ https://issues.apache.org/jira/browse/HIVE-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13738355#comment-13738355 ] Mark Grover commented on HIVE-4003: --- Thanks, Brock, for taking a look. I will rebase this. NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java - Key: HIVE-4003 URL: https://issues.apache.org/jira/browse/HIVE-4003 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Thomas Adam Assignee: Mark Grover Attachments: HIVE-4003.patch Utilities.java seems to be throwing a NPE. Change contributed by Thomas Adam. Reference: https://github.com/tecbot/hive/commit/1e29d88837e4101a76e870a716aadb729437355b#commitcomment-2588350 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Proposing a 0.11.1
+ d...@bigtop.apache.org Hi Owen, I work on Apache Bigtop http://bigtop.apache.org and we were recently discussing (http://search-hadoop.com/m/A8Jne2SAnHq1/bigtop+0.7+bomsubj=Re+DISCUSS+BOM+for+release+0+7+0+of+Bigtop) inclusion of Hive 0.11 in the next release of Bigtop - Bigtop 0.7. However, we learned during that discussion that there is a request to remove the NPath UDF from Hive code (ASF Board minutes available here: http://www.apache.org/foundation/records/minutes/2013/board_minutes_2013_06_19.txt ). I was wondering if Hive 0.11.1 release would address that? Or, if a later Hive release would or if the Hive PMC has determined that there is no issue to be addressed after all? Thanks! Mark On Tue, Aug 13, 2013 at 10:12 AM, Konstantin Boudnik c...@apache.org wrote: Hi. Can I suggest to include HIVE-3772 (performance fix; committed to trunk) HIVE-4583 (OpenJDK7 support; all subtasks seem to be done) Thanks, Cos On Tue, Aug 13, 2013 at 10:02AM, Owen O'Malley wrote: All, I'd like to create an 0.11.1 with some fixes in it. I plan to put together a release candidate over the next week. I'm in the process of putting together the list of bugs that I want to include, but I wanted to solicit the jiras that others thought would be important for an 0.11.1. Thanks, Owen
[jira] [Updated] (HIVE-4003) NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
[ https://issues.apache.org/jira/browse/HIVE-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover updated HIVE-4003: -- Status: Patch Available (was: Open) NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java - Key: HIVE-4003 URL: https://issues.apache.org/jira/browse/HIVE-4003 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Thomas Adam Assignee: Mark Grover Attachments: HIVE-4003.patch, HIVE-4003.patch Utilities.java seems to be throwing a NPE. Change contributed by Thomas Adam. Reference: https://github.com/tecbot/hive/commit/1e29d88837e4101a76e870a716aadb729437355b#commitcomment-2588350 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4003) NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
[ https://issues.apache.org/jira/browse/HIVE-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover updated HIVE-4003: -- Attachment: HIVE-4003.patch I am guessing I have to keep the name of the patch same, so tests can be run. Correcting the name to be HIVE-4003.patch now. NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java - Key: HIVE-4003 URL: https://issues.apache.org/jira/browse/HIVE-4003 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Thomas Adam Assignee: Mark Grover Attachments: HIVE-4003.patch, HIVE-4003.patch Utilities.java seems to be throwing a NPE. Change contributed by Thomas Adam. Reference: https://github.com/tecbot/hive/commit/1e29d88837e4101a76e870a716aadb729437355b#commitcomment-2588350 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4003) NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
[ https://issues.apache.org/jira/browse/HIVE-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover updated HIVE-4003: -- Attachment: (was: HIVE-4003.2.patch) NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java - Key: HIVE-4003 URL: https://issues.apache.org/jira/browse/HIVE-4003 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Thomas Adam Assignee: Mark Grover Attachments: HIVE-4003.patch, HIVE-4003.patch Utilities.java seems to be throwing a NPE. Change contributed by Thomas Adam. Reference: https://github.com/tecbot/hive/commit/1e29d88837e4101a76e870a716aadb729437355b#commitcomment-2588350 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4003) NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
[ https://issues.apache.org/jira/browse/HIVE-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover updated HIVE-4003: -- Attachment: HIVE-4003.2.patch Uploaded a new rebased patch. NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java - Key: HIVE-4003 URL: https://issues.apache.org/jira/browse/HIVE-4003 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Thomas Adam Assignee: Mark Grover Attachments: HIVE-4003.patch, HIVE-4003.patch Utilities.java seems to be throwing a NPE. Change contributed by Thomas Adam. Reference: https://github.com/tecbot/hive/commit/1e29d88837e4101a76e870a716aadb729437355b#commitcomment-2588350 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5046) Hcatalog's bin/hcat script doesn't respect HIVE_HOME
[ https://issues.apache.org/jira/browse/HIVE-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13737008#comment-13737008 ] Mark Grover commented on HIVE-5046: --- Thanks Brock! Hcatalog's bin/hcat script doesn't respect HIVE_HOME Key: HIVE-5046 URL: https://issues.apache.org/jira/browse/HIVE-5046 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Mark Grover Assignee: Mark Grover Fix For: 0.12.0 Attachments: HIVE-5046.1.patch https://github.com/apache/hive/blob/trunk/hcatalog/bin/hcat#L81 The quoted snippet (see below) intends to set HIVE_HOME if it's not set (i.e. HIVE_HOME is currently null). {code} if [ -n ${HIVE_HOME} ]; then {code} However, {{-n}} checks if the variable is _not_ null. So, the above code ends up setting HIVE_HOME to the default value if it is actually set already, overriding the set value. This condition needs to be negated. Moreover, {{-n}} checks requires the string being tested to be enclosed in quotes. Reference: http://tldp.org/LDP/abs/html/comparison-ops.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4388) HBase tests fail against Hadoop 2
[ https://issues.apache.org/jira/browse/HIVE-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13737106#comment-13737106 ] Mark Grover commented on HIVE-4388: --- Brock, thanks for looking into this. I was reviewing the patch and saw that you have several references to {{getFamilyMap()}}. This method's return type was changed in newer version of HBase. Even though HBASE-9142 introduces the original method back in 0.95.2, it's deprecated. Do you think it makes more sense to use {{getFamilyCellMap()}} here instead? HBase tests fail against Hadoop 2 - Key: HIVE-4388 URL: https://issues.apache.org/jira/browse/HIVE-4388 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Gunther Hagleitner Assignee: Brock Noland Attachments: HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388.patch, HIVE-4388-wip.txt Currently we're building by default against 0.92. When you run against hadoop 2 (-Dhadoop.mr.rev=23) builds fail because of: HBASE-5963. HIVE-3861 upgrades the version of hbase used. This will get you past the problem in HBASE-5963 (which was fixed in 0.94.1) but fails with: HBASE-6396. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-5046) Hcatalog's bin/hcat script doesn't respect HIVE_HOME
Mark Grover created HIVE-5046: - Summary: Hcatalog's bin/hcat script doesn't respect HIVE_HOME Key: HIVE-5046 URL: https://issues.apache.org/jira/browse/HIVE-5046 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Mark Grover Assignee: Mark Grover https://github.com/apache/hive/blob/trunk/hcatalog/bin/hcat#L81 The quoted snippet (see below) intends to set HIVE_HOME if it's not set (i.e. HIVE_HOME is currently null). {code} if [ -n ${HIVE_HOME} ]; then {code} However, {{-n}} checks if the variable is _not_ null. So, the condition is checking for the boolean inverse of what it's supposed to be checking for. Moreover, {{-n}} checks requires the string being tested to be enclosed in quotes. Reference: http://tldp.org/LDP/abs/html/comparison-ops.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
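To make the quoting point in the issue above concrete, here is a minimal sketch of how an unquoted {{-n}} test behaves with an empty value. The function names are made up for illustration and are not taken from bin/hcat; this only demonstrates the general shell behavior, not the actual patch.

```shell
#!/bin/sh
# Illustration only: the function names are hypothetical, not from bin/hcat.

# Unquoted test, in the style of the snippet quoted in the issue. When the
# value is empty the expansion vanishes and the test becomes `[ -n ]`, which
# is true because a lone argument is treated as a non-empty string.
check_unquoted() {
  value=$1
  if [ -n ${value} ]; then
    echo "non-empty"
  else
    echo "empty"
  fi
}

# Quoted test: an empty value stays a real (empty) argument, so -n is false.
check_quoted() {
  value=$1
  if [ -n "${value}" ]; then
    echo "non-empty"
  else
    echo "empty"
  fi
}

check_unquoted ""          # prints "non-empty" (the pitfall)
check_quoted ""            # prints "empty"
check_quoted "/opt/hive"   # prints "non-empty"
```

So without quotes the test succeeds whether or not the variable holds a value, which is why the quoting matters in addition to the negation.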
[jira] [Updated] (HIVE-5046) Hcatalog's bin/hcat script doesn't respect HIVE_HOME
[ https://issues.apache.org/jira/browse/HIVE-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover updated HIVE-5046: -- Description: https://github.com/apache/hive/blob/trunk/hcatalog/bin/hcat#L81 The quoted snippet (see below) intends to set HIVE_HOME if it's not set (i.e. HIVE_HOME is currently null). {code} if [ -n ${HIVE_HOME} ]; then {code} However, {{-n}} checks if the variable is _not_ null. So, the above code ends up setting HIVE_HOME to the default value if it is actually set already, overriding the set value. This condition needs to be negated. Moreover, {{-n}} checks requires the string being tested to be enclosed in quotes. Reference: http://tldp.org/LDP/abs/html/comparison-ops.html was: https://github.com/apache/hive/blob/trunk/hcatalog/bin/hcat#L81 The quoted snippet (see below) intends to set HIVE_HOME if it's not set (i.e. HIVE_HOME is currently null). {code} if [ -n ${HIVE_HOME} ]; then {code} However, {{-n}} checks if the variable is _not_ null. So, the condition is checking for the boolean inverse of what it's supposed to be checking for. Moreover, {{-n}} checks requires the string being tested to be enclosed in quotes. Reference: http://tldp.org/LDP/abs/html/comparison-ops.html Hcatalog's bin/hcat script doesn't respect HIVE_HOME Key: HIVE-5046 URL: https://issues.apache.org/jira/browse/HIVE-5046 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Mark Grover Assignee: Mark Grover https://github.com/apache/hive/blob/trunk/hcatalog/bin/hcat#L81 The quoted snippet (see below) intends to set HIVE_HOME if it's not set (i.e. HIVE_HOME is currently null). {code} if [ -n ${HIVE_HOME} ]; then {code} However, {{-n}} checks if the variable is _not_ null. So, the above code ends up setting HIVE_HOME to the default value if it is actually set already, overriding the set value. This condition needs to be negated. 
Moreover, a {{-n}} check requires the string being tested to be enclosed in quotes. Reference: http://tldp.org/LDP/abs/html/comparison-ops.html
[jira] [Updated] (HIVE-5046) Hcatalog's bin/hcat script doesn't respect HIVE_HOME
[ https://issues.apache.org/jira/browse/HIVE-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover updated HIVE-5046: -- Attachment: HIVE-5046.1.patch Hcatalog's bin/hcat script doesn't respect HIVE_HOME Key: HIVE-5046 URL: https://issues.apache.org/jira/browse/HIVE-5046 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Mark Grover Assignee: Mark Grover Attachments: HIVE-5046.1.patch https://github.com/apache/hive/blob/trunk/hcatalog/bin/hcat#L81 The quoted snippet (see below) intends to set HIVE_HOME if it's not set (i.e. HIVE_HOME is currently null). {code} if [ -n ${HIVE_HOME} ]; then {code} However, {{-n}} checks if the variable is _not_ null. So, the above code ends up setting HIVE_HOME to the default value if it is actually set already, overriding the set value. This condition needs to be negated. Moreover, {{-n}} checks requires the string being tested to be enclosed in quotes. Reference: http://tldp.org/LDP/abs/html/comparison-ops.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-5046) Hcatalog's bin/hcat script doesn't respect HIVE_HOME
[ https://issues.apache.org/jira/browse/HIVE-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover updated HIVE-5046: -- Status: Patch Available (was: Open) Hcatalog's bin/hcat script doesn't respect HIVE_HOME Key: HIVE-5046 URL: https://issues.apache.org/jira/browse/HIVE-5046 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Mark Grover Assignee: Mark Grover Attachments: HIVE-5046.1.patch https://github.com/apache/hive/blob/trunk/hcatalog/bin/hcat#L81 The quoted snippet (see below) intends to set HIVE_HOME if it's not set (i.e. HIVE_HOME is currently null). {code} if [ -n ${HIVE_HOME} ]; then {code} However, {{-n}} checks if the variable is _not_ null. So, the above code ends up setting HIVE_HOME to the default value if it is actually set already, overriding the set value. This condition needs to be negated. Moreover, {{-n}} checks requires the string being tested to be enclosed in quotes. Reference: http://tldp.org/LDP/abs/html/comparison-ops.html -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-5046) Hcatalog's bin/hcat script doesn't respect HIVE_HOME
[ https://issues.apache.org/jira/browse/HIVE-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13735566#comment-13735566 ] Mark Grover commented on HIVE-5046: --- Eugene, thanks for posting. I don't think so. The intent of the snippet you mentioned is to do something with the variable ({{HIVE_IN_PATH}}) if it is defined. That something is to make use of {{HIVE_IN_PATH}} to populate {{HIVE_DIR}}. On the other hand, the intent of the snippet 12 lines later (that this JIRA addresses) is to do something if the variable is not defined. That something is to set HIVE_HOME to some default value. This snippet, however, checks if the variable is not null instead of checking if the variable is null. Hcatalog's bin/hcat script doesn't respect HIVE_HOME Key: HIVE-5046 URL: https://issues.apache.org/jira/browse/HIVE-5046 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.11.0 Reporter: Mark Grover Assignee: Mark Grover Attachments: HIVE-5046.1.patch https://github.com/apache/hive/blob/trunk/hcatalog/bin/hcat#L81 The quoted snippet (see below) intends to set HIVE_HOME if it's not set (i.e. HIVE_HOME is currently null). {code} if [ -n ${HIVE_HOME} ]; then {code} However, {{-n}} checks if the variable is _not_ null. So, the above code ends up setting HIVE_HOME to the default value if it is actually set already, overriding the set value. This condition needs to be negated. Moreover, a {{-n}} check requires the string being tested to be enclosed in quotes. Reference: http://tldp.org/LDP/abs/html/comparison-ops.html
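The two intents described in the comment above could be sketched roughly as follows. This is not the code from bin/hcat: the function names are hypothetical, the use of dirname to derive a directory is an assumption made only for illustration, and the default location is passed in as a placeholder argument.

```shell
#!/bin/sh
# Illustration only: names and the dirname step are assumptions, not bin/hcat.

# Intent 1: act only when the variable IS defined (-n on a quoted value).
# Here a directory is derived from a path, standing in for HIVE_DIR.
hive_dir_from_path() {
  hive_in_path=$1
  if [ -n "${hive_in_path}" ]; then
    dirname "${hive_in_path}"
  fi
}

# Intent 2: act only when the variable is NOT defined (-z on a quoted value).
# A caller-supplied default is used; an existing value is left alone.
hive_home_or_default() {
  current=$1
  default=$2
  if [ -z "${current}" ]; then
    printf '%s\n' "${default}"
  else
    printf '%s\n' "${current}"
  fi
}

hive_dir_from_path "/usr/bin/hive"             # prints /usr/bin
hive_home_or_default "" "/usr/lib/hive"        # prints /usr/lib/hive
hive_home_or_default "/opt/hive" "/usr/lib/hive"   # prints /opt/hive
```

The point of the contrast is that the first branch is guarded by {{-n}} and the second by {{-z}} (or a negated {{-n}}); using {{-n}} for both is what overrides an already-set HIVE_HOME.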
Re: VOTE: moving hive from forest to Apache CMS
+1 (non-binding) On Sun, Jul 21, 2013 at 11:08 AM, Jarek Jarcec Cecho jar...@apache.org wrote: +1 (non-binding) Jarcec On Sun, Jul 21, 2013 at 01:53:39PM -0400, Edward Capriolo wrote: http://hive.apache.org is generated by forest, a rather cumbersome and confusing way to run a website. Forest is difficult to maintain and publish updates with. As a nail in the coffin forest does not even work well with recent versions of java. This vote is to move the site to: Apache CMS (https://www.apache.org/dev/cms.html) and away from forest. Brock Noland has offered to move the site, and I am offering to help him and look it over. Vote +1 if you support the move to Apache CMS. (This is the one case where cutting down a forest is a very good idea :) Edward
Re: [ANNOUNCE] New Hive Committer - Gunther Hagleitner
Many congratulations, Gunther! On Sun, Jul 21, 2013 at 10:55 AM, Shreepadma Venugopalan shreepa...@cloudera.com wrote: Congratulations, Gunther! On Sun, Jul 21, 2013 at 10:29 AM, Thejas Nair the...@hortonworks.com wrote: Congrats Gunther ! Great to see more bandwidth to get the patch available counts down ! On Jul 21, 2013 9:56 AM, Clark Yang (杨卓荦) yangzhuo...@gmail.com wrote: Congratulations Gunther! 2013/7/22 Brock Noland br...@cloudera.com Congratulations Gunther!! Cheers, Zhuoluo (Clark) Yang
Re: [ANNOUNCE] New Hive Committer - Brock Noland
Congrats, Brock! On Tue, Jul 16, 2013 at 12:31 PM, Thejas Nair the...@hortonworks.com wrote: Congrats Brock! On Tue, Jul 16, 2013 at 11:41 AM, Shreepadma Venugopalan shreepa...@cloudera.com wrote: Many congrats, Brock! On Tue, Jul 16, 2013 at 7:33 AM, Ricky Saltzer ri...@cloudera.com wrote: Congrats!! On Tue, Jul 16, 2013 at 10:30 AM, Alexander Alten-Lorenz wget.n...@gmail.com wrote: Congratulations, Brock! On Jul 15, 2013, at 10:29 PM, Carl Steinbach c...@apache.org wrote: The Apache Hive PMC has passed a vote to make Brock Noland a new committer on the project. Brock, please submit your ICLA to the Apache Software Foundation as described here: http://www.apache.org/licenses/#clas Thanks. Carl -- Ricky Saltzer Tools Developer http://www.cloudera.com
[jira] [Commented] (HIVE-4403) Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters
[ https://issues.apache.org/jira/browse/HIVE-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13674027#comment-13674027 ] Mark Grover commented on HIVE-4403: --- Chu, that's fine. Nothing required from your side. Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters Key: HIVE-4403 URL: https://issues.apache.org/jira/browse/HIVE-4403 Project: Hive Issue Type: Bug Affects Versions: 0.10.0, 0.11.0 Reporter: Mark Grover Assignee: Chu Tong Fix For: 0.12.0 Attachments: HIVE-4403.patch, HIVE-4403.patch While working on BIGTOP-885, I saw that Hive was giving a bunch of warnings related to overriding final parameters in job.conf. This was on a pseudo distributed cluster. FWIW, I didn't see this happen on a fully-distributed cluster. Perhaps, Hive's job.conf is overriding some final parameters it shouldn't. Here is what the warnings looked like: {code} 2013-04-19 14:20:32,304 WARN [main] conf.Configuration (Configuration.java:loadProperty(2032)) - file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 2013-04-19 14:20:32,367 WARN [main] conf.Configuration (Configuration.java:loadProperty(2032)) - file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 
{code} To reproduce, run a query like: {code} CREATE TABLE u_data ( userid INT, movieid INT, rating INT, unixtime STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE; {code} Load some data into u_data, here is some sample data: https://github.com/apache/bigtop/blob/master/bigtop-tests/test-artifacts/hive/src/main/resources/seed_data_files/ml-data/u.data Run a simple query on that data (on YARN/MR2) {code} INSERT OVERWRITE DIRECTORY '/tmp/count' SELECT COUNT(1) FROM u_data {code}
[jira] [Commented] (HIVE-4403) Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters
[ https://issues.apache.org/jira/browse/HIVE-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13672703#comment-13672703 ] Mark Grover commented on HIVE-4403: --- Ashutosh +1'ed it. So, I guess there is no need for Phabricator now:-) Thanks Chu and Ashutosh! Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters Key: HIVE-4403 URL: https://issues.apache.org/jira/browse/HIVE-4403 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Mark Grover Assignee: Chu Tong Attachments: HIVE-4403.patch, HIVE-4403.patch While working on BIGTOP-885, I saw that Hive was giving a bunch of warnings related to overriding final parameters in job.conf. This was on a pseudo distributed cluster. FWIW, I didn't see this happen on a fully-distributed cluster. Perhaps, Hive's job.conf is overriding some final parameters it shouldn't. Here is what the warnings looked like: {code} 2013-04-19 14:20:32,304 WARN [main] conf.Configuration (Configuration.java:loadProperty(2032)) - file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 2013-04-19 14:20:32,367 WARN [main] conf.Configuration (Configuration.java:loadProperty(2032)) - file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 
{code} To reproduce, run a query like: {code} CREATE TABLE u_data ( userid INT, movieid INT, rating INT, unixtime STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE; {code} Load some data into u_data, here is some sample data: https://github.com/apache/bigtop/blob/master/bigtop-tests/test-artifacts/hive/src/main/resources/seed_data_files/ml-data/u.data Run a simple query on that data (on YARN/MR2) {code} INSERT OVERWRITE DIRECTORY '/tmp/count' SELECT COUNT(1) FROM u_data {code}
[jira] [Commented] (HIVE-4403) Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters
[ https://issues.apache.org/jira/browse/HIVE-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13669521#comment-13669521 ] Mark Grover commented on HIVE-4403: --- Chu, sorry about the delay. I can test patched Hive to see if this resolves my original problem, please give me a few days. In the meanwhile, even though this is a small patch, do you mind uploading it on Reviewboard or Phabricator? Thanks for contributing! Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters Key: HIVE-4403 URL: https://issues.apache.org/jira/browse/HIVE-4403 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Mark Grover Attachments: HIVE-4403.patch While working on BIGTOP-885, I saw that Hive was giving a bunch of warnings related to overriding final parameters in job.conf. This was on a pseudo distributed cluster. FWIW, I didn't see this happen on a fully-distributed cluster. Perhaps, Hive's job.conf is overriding some final parameters it shouldn't. Here is what the warnings looked like: {code} 2013-04-19 14:20:32,304 WARN [main] conf.Configuration (Configuration.java:loadProperty(2032)) - file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 2013-04-19 14:20:32,367 WARN [main] conf.Configuration (Configuration.java:loadProperty(2032)) - file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 
{code} To reproduce, run a query like: {code} CREATE TABLE u_data ( userid INT, movieid INT, rating INT, unixtime STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE; {code} Load some data into u_data, here is some sample data: https://github.com/apache/bigtop/blob/master/bigtop-tests/test-artifacts/hive/src/main/resources/seed_data_files/ml-data/u.data Run a simple query on that data (on YARN/MR2) {code} INSERT OVERWRITE DIRECTORY '/tmp/count' SELECT COUNT(1) FROM u_data {code}
[jira] [Commented] (HIVE-4070) Like operator in Hive is case sensitive while in MySQL (and most likely other DBs) it's case insensitive
[ https://issues.apache.org/jira/browse/HIVE-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13666785#comment-13666785 ] Mark Grover commented on HIVE-4070: --- I agree with Edward that having a property like that is risky. I also like John's idea of having two different like UDFs. We can only change the behavior (if at all) in a major release. I think in this case, we are just going to have to pick one kind of like and implement another one. Yeah, people coming from MySQL land may get confused, but the best we can do is document it and ask them to use the other like. John, it would be great if you could create a JIRA for the case-insensitive like. Thanks! Like operator in Hive is case sensitive while in MySQL (and most likely other DBs) it's case insensitive Key: HIVE-4070 URL: https://issues.apache.org/jira/browse/HIVE-4070 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 0.10.0 Reporter: Mark Grover Assignee: Mark Grover Priority: Trivial Hive's like operator seems to be case sensitive. See https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLike.java#L164 However, MySQL's like operator is case insensitive. I don't have other DB's (like PostgreSQL) installed and handy but I am guessing their LIKE is case insensitive as well.
[jira] [Commented] (HIVE-3384) HIVE JDBC module won't compile under JDK1.7 as new methods added in JDBC specification
[ https://issues.apache.org/jira/browse/HIVE-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13649880#comment-13649880 ] Mark Grover commented on HIVE-3384: --- I verified that this was committed on trunk (Thanks!). Is there anything left to be done regarding this patch? If not, can we please update the JIRA to resolved status? HIVE JDBC module won't compile under JDK1.7 as new methods added in JDBC specification -- Key: HIVE-3384 URL: https://issues.apache.org/jira/browse/HIVE-3384 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.10.0 Reporter: Weidong Bian Assignee: Chris Drome Priority: Minor Fix For: 0.11.0 Attachments: D6873-0.9.1.patch, D6873.1.patch, D6873.2.patch, D6873.3.patch, D6873.4.patch, D6873.5.patch, D6873.6.patch, D6873.7.patch, HIVE-3384-0.10.patch, HIVE-3384-2012-12-02.patch, HIVE-3384-2012-12-04.patch, HIVE-3384.2.patch, HIVE-3384-branch-0.9.patch, HIVE-3384.patch, HIVE-JDK7-JDBC.patch jdbc module couldn't be compiled with jdk7 as it adds some abstract method in the JDBC specification some error info: error: HiveCallableStatement is not abstract and does not override abstract method <T>getObject(String,Class<T>) in CallableStatement . . .
Re: [VOTE] Apache Hive 0.11.0 Release Candidate 0
Thanks for the release candidate, Ashutosh. Would it also make sense to conclude the voting after the weekend so those busy during the week could also chime in? Mark On Tue, Apr 30, 2013 at 2:07 AM, Alexander Alten-Lorenz wget.n...@gmail.com wrote: Yes, a bit large. Looks like that the hive-0.11.0.tar.gz have a build and test env included. On Apr 30, 2013, at 9:34 AM, Carl Steinbach cwsteinb...@gmail.com wrote: I think the source tarball must be corrupted. It's 664MB in size, which is roughly 630MB larger than the 0.10.0 release tarball. I haven't been able to take a look at it yet because the apache archive site keeps throttling my connection midway through the download. On Mon, Apr 29, 2013 at 10:42 PM, Ashutosh Chauhan hashut...@apache.org wrote: Hey all, I am excited to announce availability of Apache Hive 0.11.0 Release Candidate 0 at: http://people.apache.org/~hashutosh/hive-0.11.0-rc0/ Maven artifacts are available here: https://repository.apache.org/content/repositories/orgapachehive-154/ This release has many goodies including HiveServer2, windowing and analytical functions, decimal data type, better query planning, performance enhancements and various bug fixes. In total, we resolved more than 350 issues. Full list of fixed issues can be found at: http://s.apache.org/8Fr Voting will conclude in 72 hours. Hive PMC Members: Please test and vote. Thanks, Ashutosh (On behalf of Hive contributors who made 0.11 a possibility) -- Alexander Alten-Lorenz http://mapredit.blogspot.com German Hadoop LinkedIn Group: http://goo.gl/N8pCF
[jira] [Created] (HIVE-4403) Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters
Mark Grover created HIVE-4403: - Summary: Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters Key: HIVE-4403 URL: https://issues.apache.org/jira/browse/HIVE-4403 Project: Hive Issue Type: Bug Reporter: Mark Grover While working on BIGTOP-885, I saw that Hive was giving a bunch of warnings related to overriding final parameters in job.conf. Perhaps, Hive's job.conf is overriding some final parameters it shouldn't. Here is what the warnings looked like: {code} 2013-04-19 14:20:32,304 WARN [main] conf.Configuration (Configuration.java:loadProperty(2032)) - file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 2013-04-19 14:20:32,367 WARN [main] conf.Configuration (Configuration.java:loadProperty(2032)) - file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. {code} To reproduce, run a query like: {code} CREATE TABLE u_data ( userid INT, movieid INT, rating INT, unixtime STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE; {code} Load some data into u_data, here is some sample data: https://github.com/apache/bigtop/blob/master/bigtop-tests/test-artifacts/hive/src/main/resources/seed_data_files/ml-data/u.data Run a simple query on that data (on YARN/MR2) {code} INSERT OVERWRITE DIRECTORY '/tmp/count' SELECT COUNT(1) FROM u_data {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
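The warnings in HIVE-4403 come from Hadoop's handling of final parameters: once a property has been loaded as final, a later resource that tries to set it is ignored and a warning is logged. The toy model below sketches that behavior only. FinalAwareConf is a hypothetical class, not Hadoop's Configuration, and the property values used with it are made up for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of "final" configuration parameters; not Hadoop code.
final class FinalAwareConf {
    private final Map<String, String> values = new HashMap<>();
    private final Set<String> finals = new HashSet<>();
    private int ignored = 0;

    // Loads one property. Returns false when the value is ignored because
    // an earlier resource already declared the property final.
    boolean load(String name, String value, boolean isFinal) {
        if (finals.contains(name)) {
            ignored++;
            System.err.println("an attempt to override final parameter: " + name + "; Ignoring.");
            return false;
        }
        values.put(name, value);
        if (isFinal) {
            finals.add(name);
        }
        return true;
    }

    String get(String name) {
        return values.get(name);
    }

    int ignoredCount() {
        return ignored;
    }
}
```

In this model, loading mapreduce.job.end-notification.max.attempts as final and then loading it again from a job-level file leaves the first value in place and counts one ignored override, which is the pattern the two log lines in the report suggest.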
[jira] [Updated] (HIVE-4403) Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters
[ https://issues.apache.org/jira/browse/HIVE-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover updated HIVE-4403: -- Affects Version/s: 0.10.0 Running Hive queries on Yarn (MR2) gives warnings related to overriding final parameters Key: HIVE-4403 URL: https://issues.apache.org/jira/browse/HIVE-4403 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Mark Grover While working on BIGTOP-885, I saw that Hive was giving a bunch of warnings related to overriding final parameters in job.conf. Perhaps, Hive's job.conf is overriding some final parameters it shouldn't. Here is what the warnings looked like: {code} 2013-04-19 14:20:32,304 WARN [main] conf.Configuration (Configuration.java:loadProperty(2032)) - file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 2013-04-19 14:20:32,367 WARN [main] conf.Configuration (Configuration.java:loadProperty(2032)) - file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. {code} To reproduce, run a query like: {code} CREATE TABLE u_data ( userid INT, movieid INT, rating INT, unixtime STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE; {code} Load some data into u_data, here is some sample data: https://github.com/apache/bigtop/blob/master/bigtop-tests/test-artifacts/hive/src/main/resources/seed_data_files/ml-data/u.data Run a simple query on that data (on YARN/MR2) {code} INSERT OVERWRITE DIRECTORY '/tmp/count' SELECT COUNT(1) FROM u_data {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4362) Allow Hive unit tests to run against fully-distributed cluster
Mark Grover created HIVE-4362: - Summary: Allow Hive unit tests to run against fully-distributed cluster Key: HIVE-4362 URL: https://issues.apache.org/jira/browse/HIVE-4362 Project: Hive Issue Type: Improvement Components: Testing Infrastructure Affects Versions: 0.10.0 Reporter: Mark Grover Assignee: Mark Grover Fix For: 0.11.0 It seems like Hive unit tests can run in (Hadoop) local mode or miniMR mode. It would be nice (especially for projects like Apache Bigtop) to be able to run Hive tests in fully distributed mode. This JIRA tracks the introduction of such functionality. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Running Hive unit tests on a fully-distributed cluster
Hi all, Is there a way to run Hive unit tests against a fully-distributed Hadoop cluster? I poked around the code and was only able to find two modes - local mode and miniMR. Is there a property or something similar that I can set to do so? I have been contributing recently to Apache Bigtop (http://bigtop.apache.org), a project responsible for performing packaging and interoperability testing of various projects in the Hadoop ecosystem, including Apache Hive. I would really like Bigtop to leverage Hive's unit test framework but the catch is that Bigtop needs to run tests in non-local mode. For now, I have created a JIRA (HIVE-4362) for this but if such a way already exists please let me know. Mark
Re: Running Hive unit tests on a fully-distributed cluster
Thanks, Ashutosh! I will take a look. On Mon, Apr 15, 2013 at 5:55 PM, Ashutosh Chauhan hashut...@apache.org wrote: Thanks Mark for looking into this. Getting our test harness to run on a fully-distributed Hadoop cluster will be immensely valuable. There are a few things to consider. https://issues.apache.org/jira/browse/HIVE-2670 provides a framework to run tests on a fully distributed cluster, but it doesn't yet support running .q files. I believe it comes with its own smaller set of tests. The other thing to consider is TestMinimrCliDriver, which runs .q tests in distributed mode but only a handful of those. Maybe we can provide an option to have that run all tests. Ashutosh On Mon, Apr 15, 2013 at 5:26 PM, Mark Grover grover.markgro...@gmail.com wrote: Hi all, Is there a way to run Hive unit tests against a fully-distributed Hadoop cluster? I poked around the code and was only able to find two modes - local mode and miniMR. Is there a property or something similar that I can set to do so? I have been contributing recently to Apache Bigtop (http://bigtop.apache.org), a project responsible for performing packaging and interoperability testing of various projects in the Hadoop ecosystem, including Apache Hive. I would really like Bigtop to leverage Hive's unit test framework but the catch is that Bigtop needs to run tests in non-local mode. For now, I have created a JIRA (HIVE-4362) for this but if such a way already exists please let me know. Mark
Re: Hive compilation issues on branch-0.10 and trunk
Nitin, I have been able to build hive trunk with JDK 1.6. Did you try the workaround listed in HIVE-4231? Mark On Thu, Apr 11, 2013 at 2:42 AM, Nitin Pawar nitinpawar...@gmail.com wrote: Hello, I am trying to build Hive on both trunk and branch-0.10. I have tried both Sun JDK6 and JDK7, and each version runs into a different issue: with JDK6 I hit the issue mentioned at HIVE-4231, and with JDK7 the issue mentioned at HIVE-3384. Can somebody please help out with this? What would be the recommended JDK version going forward for development activities? -- Nitin Pawar
Re: HCatalog to Hive Committership
Francis, You may have already seen this but I thought I'd post anyways: https://cwiki.apache.org/Hive/becomingacommitter.html Mark On Fri, Apr 5, 2013 at 11:35 AM, Francis Liu tof...@apache.org wrote: Thanks for your response. This clarifies a lot. I just have two follow-up questions: 1. Does reviewing patches have any bearing? 2. Is it a requirement that the aggregate of patches submitted touch all the hive components? Also it'd be great to know if the other PMCs have a similar criteria? -Francis On Apr 4, 2013, at 4:06 PM, Edward Capriolo wrote: Anecdotally, The criteria is patches/features and a track record of being dedicated to hive. 10 smaller medium features might do it, or one big one and five other bugs, or two patches and crazy dedication on the mailing list. After enough work someone takes notice and nominates you. I did not remember that shepherd language :) but the best bet is hanging out on the hive IRC, I would volunteer but I am fairly bugged down ATM with things not hive. On Thu, Apr 4, 2013 at 4:18 PM, Francis Christopher Liu fc...@yahoo-inc.com wrote: Hi, Given that moving HCatalog into Hive is well underway. As an HCatalog committer I'd like to better understand the path to becoming a Hive committer. Based on what I gleaned from email discussions. I'd like to verify the following: HCatalog committers will be assigned shepherds - How do we go about getting assigned one? HCatalog committers are expected to become Hive Committers in 6-8 months - Will there be a review done in 6-8 mos for the existing HCatalog committers. What would be the criteria? What happens if we don't get voted/nominated? I and a few of my colleagues (also HCat committers) are interested in becoming Hive committers and it would greatly help our effort if we could clarify the things discussed in the list. Feel free to point out anything I missed or misunderstood. -Francis
[jira] [Commented] (HIVE-4174) Round UDF converts BigInts to double
[ https://issues.apache.org/jira/browse/HIVE-4174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621069#comment-13621069 ] Mark Grover commented on HIVE-4174: --- Indeed, thanks Chen Chun. And, congratulations on your first Hive commit! Round UDF converts BigInts to double Key: HIVE-4174 URL: https://issues.apache.org/jira/browse/HIVE-4174 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 0.10.0 Reporter: Mark Grover Assignee: Mark Grover Fix For: 0.11.0 Attachments: hive.4174.1.patch-nohcat, HIVE-4174.1.patch.txt, HIVE-4174.D9687.1.patch Chen Chun pointed out on the hive-user mailing list that round() in Hive 0.10 returns {code} select round(cast(1234560 as BIGINT)), round(cast(12345670 as BIGINT)) from test limit 1; //hive 0.10 1234560.0 1.234567E7 {code} This is not consistent with MySQL (http://dev.mysql.com/doc/refman/5.1/en/mathematical-functions.html#function_round), whose documentation states: {code} The return type is the same type as that of the first argument (assuming that it is integer, double, or decimal). This means that for an integer argument, the result is an integer (no decimal places) {code}
[jira] [Commented] (HIVE-4174) Round UDF converts BigInts to double
[ https://issues.apache.org/jira/browse/HIVE-4174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621137#comment-13621137 ] Mark Grover commented on HIVE-4174: --- It should be. [~namitjain] can you please add Chen Chun (chenchun dot feed at gmail dot com) to the project contributors list and assign this JIRA to him? Thanks! Round UDF converts BigInts to double Key: HIVE-4174 URL: https://issues.apache.org/jira/browse/HIVE-4174 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 0.10.0 Reporter: Mark Grover Assignee: Mark Grover Fix For: 0.11.0 Attachments: hive.4174.1.patch-nohcat, HIVE-4174.1.patch.txt, HIVE-4174.D9687.1.patch Chen Chun pointed out on the hive-user mailing list that round() in Hive 0.10 returns {code} select round(cast(1234560 as BIGINT)), round(cast(12345670 as BIGINT)) from test limit 1; //hive 0.10 1234560.0 1.234567E7 {code} This is not consistent with MySQL (http://dev.mysql.com/doc/refman/5.1/en/mathematical-functions.html#function_round), whose documentation states: {code} The return type is the same type as that of the first argument (assuming that it is integer, double, or decimal). This means that for an integer argument, the result is an integer (no decimal places) {code}
[jira] [Commented] (HIVE-4213) List bucketing error too restrictive
[ https://issues.apache.org/jira/browse/HIVE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621158#comment-13621158 ] Mark Grover commented on HIVE-4213: --- Hi [~gangtimliu], really sorry for dropping the ball on this. I do see what you mean by the legal cases but is it legal to do this: {code} set hive.mapred.supports.subdirectories=false; set mapred.input.dir.recursive=true; set hive.optimize.listbucketing=false; {code} List bucketing error too restrictive Key: HIVE-4213 URL: https://issues.apache.org/jira/browse/HIVE-4213 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Mark Grover Assignee: Gang Tim Liu Fix For: 0.11.0 With the introduction of List bucketing, we introduced a config validation step where we say: {code} SUPPORT_DIR_MUST_TRUE_FOR_LIST_BUCKETING(10199, "hive.mapred.supports.subdirectories must be true" + " if any one of following is true: hive.internal.ddl.list.bucketing.enable," + " hive.optimize.listbucketing and mapred.input.dir.recursive"), {code} This seems overly restrictive because there are use cases where people may want to set {{mapred.input.dir.recursive}} to {{true}} even when they don't care about list bucketing. Is that not true? 
For example, here is the unit test code for {{clientpositive/recursive_dir.q}} {code} CREATE TABLE fact_daily(x int) PARTITIONED BY (ds STRING); CREATE TABLE fact_tz(x int) PARTITIONED BY (ds STRING, hr STRING) LOCATION 'pfile:${system:test.tmp.dir}/fact_tz'; INSERT OVERWRITE TABLE fact_tz PARTITION (ds='1', hr='1') SELECT key+11 FROM src WHERE key=484; ALTER TABLE fact_daily SET TBLPROPERTIES('EXTERNAL'='TRUE'); ALTER TABLE fact_daily ADD PARTITION (ds='1') LOCATION 'pfile:${system:test.tmp.dir}/fact_tz/ds=1'; set hive.mapred.supports.subdirectories=true; set mapred.input.dir.recursive=true; set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat; SELECT * FROM fact_daily WHERE ds='1'; SELECT count(1) FROM fact_daily WHERE ds='1'; {code} The unit test doesn't seem to be concerned about list bucketing but wants to set {{mapred.input.dir.recursive}} to {{true}}. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3951) Allow Decimal type columns in Regex Serde
[ https://issues.apache.org/jira/browse/HIVE-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover updated HIVE-3951: -- Attachment: HIVE-3951.2.patch Thanks for the comments, Ashutosh. Uploaded a new patch. Will update Review Board momentarily. Allow Decimal type columns in Regex Serde - Key: HIVE-3951 URL: https://issues.apache.org/jira/browse/HIVE-3951 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Affects Versions: 0.10.0 Reporter: Mark Grover Assignee: Mark Grover Fix For: 0.11.0 Attachments: HIVE-3951.1.patch, HIVE-3951.2.patch Decimal type in Hive was recently added by HIVE-2693. We should allow users to create tables with decimal type columns when using Regex Serde. HIVE-3004 did something similar for other primitive types. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
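To make the request in HIVE-3951 concrete, here is a minimal sketch of what regex-driven deserialization with a decimal column could look like: each capture group is converted according to the declared column type, with java.math.BigDecimal standing in for Hive's decimal. RegexRowParser and its parse method are hypothetical names, and returning all nulls for a non-matching line is a choice made for this sketch, not a statement about RegexSerDe.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; this is not the RegexSerDe source.
final class RegexRowParser {
    // Converts capture group i+1 of the regex into a value of columnTypes[i].
    static List<Object> parse(String regex, String[] columnTypes, String line) {
        Matcher m = Pattern.compile(regex).matcher(line);
        List<Object> row = new ArrayList<>();
        if (!m.matches()) {
            // Sketch-level choice: a line that does not match yields all nulls.
            for (int i = 0; i < columnTypes.length; i++) {
                row.add(null);
            }
            return row;
        }
        for (int i = 0; i < columnTypes.length; i++) {
            String text = m.group(i + 1);
            switch (columnTypes[i]) {
                case "int":
                    row.add(Integer.valueOf(text));
                    break;
                case "decimal":
                    // BigDecimal keeps the exact digits and scale of the input text.
                    row.add(new BigDecimal(text));
                    break;
                default:
                    row.add(text);
            }
        }
        return row;
    }
}
```

Using BigDecimal rather than double avoids rounding the parsed text, which is the usual reason to want a decimal column in the first place.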
Re: Review Request: HIVE-3951: Allow Decimal type columns in Regex Serde
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/9173/ --- (Updated April 1, 2013, 3:13 p.m.) Review request for hive. Description --- Add support in RegexSerDe for the newly added Decimal type. This addresses bug HIVE-3951. https://issues.apache.org/jira/browse/HIVE-3951 Diffs (updated) - ql/src/test/queries/clientpositive/serde_regex.q c3254ca ql/src/test/results/clientpositive/serde_regex.q.out a933538 serde/src/java/org/apache/hadoop/hive/serde2/RegexSerDe.java 9317a6c Diff: https://reviews.apache.org/r/9173/diff/ Testing --- Added a client positive test. Thanks, Mark Grover
[jira] [Commented] (HIVE-3951) Allow Decimal type columns in Regex Serde
[ https://issues.apache.org/jira/browse/HIVE-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618821#comment-13618821 ] Mark Grover commented on HIVE-3951: --- RB updated. Allow Decimal type columns in Regex Serde - Key: HIVE-3951 URL: https://issues.apache.org/jira/browse/HIVE-3951 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Affects Versions: 0.10.0 Reporter: Mark Grover Assignee: Mark Grover Fix For: 0.11.0 Attachments: HIVE-3951.1.patch, HIVE-3951.2.patch Decimal type in Hive was recently added by HIVE-2693. We should allow users to create tables with decimal type columns when using Regex Serde. HIVE-3004 did something similar for other primitive types.
[jira] [Commented] (HIVE-3951) Allow Decimal type columns in Regex Serde
[ https://issues.apache.org/jira/browse/HIVE-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13619518#comment-13619518 ] Mark Grover commented on HIVE-3951: --- Thanks, Ashutosh! Allow Decimal type columns in Regex Serde - Key: HIVE-3951 URL: https://issues.apache.org/jira/browse/HIVE-3951 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Affects Versions: 0.10.0 Reporter: Mark Grover Assignee: Mark Grover Fix For: 0.11.0 Attachments: HIVE-3951.1.patch, HIVE-3951.2.patch Decimal type in Hive was recently added by HIVE-2693. We should allow users to create tables with decimal type columns when using Regex Serde. HIVE-3004 did something similar for other primitive types.
[jira] [Assigned] (HIVE-3451) map-reduce jobs does not work for a partition containing sub-directories
[ https://issues.apache.org/jira/browse/HIVE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover reassigned HIVE-3451: - Assignee: Mark Grover (was: Gang Tim Liu) map-reduce jobs does not work for a partition containing sub-directories Key: HIVE-3451 URL: https://issues.apache.org/jira/browse/HIVE-3451 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Namit Jain Assignee: Mark Grover Fix For: 0.10.0 Attachments: HIVE-3451.patch Consider the following test: -- The test verifies that sub-directories are supported for versions of hadoop -- where MAPREDUCE-1501 is fixed. So, enable this test only for hadoop 23. -- INCLUDE_HADOOP_MAJOR_VERSIONS(0.23) CREATE TABLE fact_daily(x int) PARTITIONED BY (ds STRING); CREATE TABLE fact_tz(x int) PARTITIONED BY (ds STRING, hr STRING) LOCATION 'pfile:${system:test.tmp.dir}/fact_tz'; INSERT OVERWRITE TABLE fact_tz PARTITION (ds='1', hr='1') SELECT key+11 FROM src WHERE key=484; ALTER TABLE fact_daily SET TBLPROPERTIES('EXTERNAL'='TRUE'); ALTER TABLE fact_daily ADD PARTITION (ds='1') LOCATION 'pfile:${system:test.tmp.dir}/fact_tz/ds=1'; set mapred.input.dir.recursive=true; SELECT * FROM fact_daily WHERE ds='1'; SELECT count(1) FROM fact_daily WHERE ds='1'; Say, the above file was named: recursive_dir.q and we ran the test for hadoop 23: by executing: ant test -Dhadoop.mr.rev=23 -Dtest.print.classpath=true -Dhadoop.version=2.0.0-alpha -Dhadoop.security.version=2.0.0-alpha -Dtestcase=TestCliDriver -Dqfile=recursive_dir.q The select * from the table works fine, but the last command does not work since it requires a map-reduce job. This will prevent other features which are creating sub-directories to add any tests which requires a map-reduce job. The work-around is to issue queries which do not require map-reduce jobs. -- This message is automatically generated by JIRA. 
[jira] [Assigned] (HIVE-3451) map-reduce jobs does not work for a partition containing sub-directories
[ https://issues.apache.org/jira/browse/HIVE-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover reassigned HIVE-3451: - Assignee: Gang Tim Liu (was: Mark Grover) map-reduce jobs does not work for a partition containing sub-directories Key: HIVE-3451 URL: https://issues.apache.org/jira/browse/HIVE-3451 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Namit Jain Assignee: Gang Tim Liu Fix For: 0.10.0 Attachments: HIVE-3451.patch Consider the following test: -- The test verifies that sub-directories are supported for versions of hadoop -- where MAPREDUCE-1501 is fixed. So, enable this test only for hadoop 23. -- INCLUDE_HADOOP_MAJOR_VERSIONS(0.23) CREATE TABLE fact_daily(x int) PARTITIONED BY (ds STRING); CREATE TABLE fact_tz(x int) PARTITIONED BY (ds STRING, hr STRING) LOCATION 'pfile:${system:test.tmp.dir}/fact_tz'; INSERT OVERWRITE TABLE fact_tz PARTITION (ds='1', hr='1') SELECT key+11 FROM src WHERE key=484; ALTER TABLE fact_daily SET TBLPROPERTIES('EXTERNAL'='TRUE'); ALTER TABLE fact_daily ADD PARTITION (ds='1') LOCATION 'pfile:${system:test.tmp.dir}/fact_tz/ds=1'; set mapred.input.dir.recursive=true; SELECT * FROM fact_daily WHERE ds='1'; SELECT count(1) FROM fact_daily WHERE ds='1'; Say, the above file was named: recursive_dir.q and we ran the test for hadoop 23: by executing: ant test -Dhadoop.mr.rev=23 -Dtest.print.classpath=true -Dhadoop.version=2.0.0-alpha -Dhadoop.security.version=2.0.0-alpha -Dtestcase=TestCliDriver -Dqfile=recursive_dir.q The select * from the table works fine, but the last command does not work since it requires a map-reduce job. This will prevent other features which are creating sub-directories to add any tests which requires a map-reduce job. The work-around is to issue queries which do not require map-reduce jobs. -- This message is automatically generated by JIRA. 
[jira] [Created] (HIVE-4213) List bucketing error too restrictive
Mark Grover created HIVE-4213: - Summary: List bucketing error too restrictive Key: HIVE-4213 URL: https://issues.apache.org/jira/browse/HIVE-4213 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Mark Grover Fix For: 0.11.0 With the introduction of List bucketing, we introduced a config validation step where we say: {code} SUPPORT_DIR_MUST_TRUE_FOR_LIST_BUCKETING(10199, "hive.mapred.supports.subdirectories must be true" + " if any one of following is true: hive.internal.ddl.list.bucketing.enable," + " hive.optimize.listbucketing and mapred.input.dir.recursive"), {code} This seems overly restrictive because there are use cases where people may want to set {{mapred.input.dir.recursive}} to {{true}} even when they don't care about list bucketing. Is that not true? For example, here is the unit test code for {{clientpositive/recursive_dir.q}} {code} CREATE TABLE fact_daily(x int) PARTITIONED BY (ds STRING); CREATE TABLE fact_tz(x int) PARTITIONED BY (ds STRING, hr STRING) LOCATION 'pfile:${system:test.tmp.dir}/fact_tz'; INSERT OVERWRITE TABLE fact_tz PARTITION (ds='1', hr='1') SELECT key+11 FROM src WHERE key=484; ALTER TABLE fact_daily SET TBLPROPERTIES('EXTERNAL'='TRUE'); ALTER TABLE fact_daily ADD PARTITION (ds='1') LOCATION 'pfile:${system:test.tmp.dir}/fact_tz/ds=1'; set hive.mapred.supports.subdirectories=true; set mapred.input.dir.recursive=true; set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat; SELECT * FROM fact_daily WHERE ds='1'; SELECT count(1) FROM fact_daily WHERE ds='1'; {code} The unit test doesn't seem to be concerned about list bucketing but wants to set {{mapred.input.dir.recursive}} to {{true}}.
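The rule quoted in the HIVE-4213 error message can be restated as a small predicate, which makes it easier to see why a user who only wants recursive input directories is caught by it. This is a sketch of the rule as the message words it, not Hive's validation code; ListBucketingConfCheck and violatesRule are hypothetical names.

```java
// Hypothetical restatement of the rule in the quoted error message; not Hive code.
final class ListBucketingConfCheck {
    // Returns true when the combination would be rejected under the rule:
    // hive.mapred.supports.subdirectories must be true if any of the other
    // three settings is true.
    static boolean violatesRule(boolean supportsSubdirectories,
                                boolean ddlListBucketingEnabled,
                                boolean optimizeListBucketing,
                                boolean inputDirRecursive) {
        boolean needsSubdirectories =
                ddlListBucketingEnabled || optimizeListBucketing || inputDirRecursive;
        return needsSubdirectories && !supportsSubdirectories;
    }
}
```

Under this reading, setting only mapred.input.dir.recursive=true with hive.mapred.supports.subdirectories=false is rejected even though both list-bucketing flags are off, which is the case the report argues is too strict.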
[jira] [Commented] (HIVE-4213) List bucketing error too restrictive
[ https://issues.apache.org/jira/browse/HIVE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609262#comment-13609262 ] Mark Grover commented on HIVE-4213: --- [~gangtimliu] I would love to hear your thoughts on this! Thanks in advance! List bucketing error too restrictive Key: HIVE-4213 URL: https://issues.apache.org/jira/browse/HIVE-4213 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Mark Grover Fix For: 0.11.0 With the introduction of List bucketing, we introduced a config validation step where we say: {code} SUPPORT_DIR_MUST_TRUE_FOR_LIST_BUCKETING(10199, "hive.mapred.supports.subdirectories must be true" + " if any one of following is true: hive.internal.ddl.list.bucketing.enable," + " hive.optimize.listbucketing and mapred.input.dir.recursive"), {code} This seems overly restrictive because there are use cases where people may want to set {{mapred.input.dir.recursive}} to {{true}} even when they don't care about list bucketing. Is that not true? For example, here is the unit test code for {{clientpositive/recursive_dir.q}} {code} CREATE TABLE fact_daily(x int) PARTITIONED BY (ds STRING); CREATE TABLE fact_tz(x int) PARTITIONED BY (ds STRING, hr STRING) LOCATION 'pfile:${system:test.tmp.dir}/fact_tz'; INSERT OVERWRITE TABLE fact_tz PARTITION (ds='1', hr='1') SELECT key+11 FROM src WHERE key=484; ALTER TABLE fact_daily SET TBLPROPERTIES('EXTERNAL'='TRUE'); ALTER TABLE fact_daily ADD PARTITION (ds='1') LOCATION 'pfile:${system:test.tmp.dir}/fact_tz/ds=1'; set hive.mapred.supports.subdirectories=true; set mapred.input.dir.recursive=true; set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat; SELECT * FROM fact_daily WHERE ds='1'; SELECT count(1) FROM fact_daily WHERE ds='1'; {code} The unit test doesn't seem to be concerned about list bucketing but wants to set {{mapred.input.dir.recursive}} to {{true}}.
[jira] [Commented] (HIVE-4070) Like operator in Hive is case sensitive while in MySQL (and most likely other DBs) it's case insensitive
[ https://issues.apache.org/jira/browse/HIVE-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609278#comment-13609278 ] Mark Grover commented on HIVE-4070: --- Thanks Gwen. In that case, I am ok with documenting it and resolving this JIRA as won't fix. Is that ok with you as well, [~mackrorysd]? Like operator in Hive is case sensitive while in MySQL (and most likely other DBs) it's case insensitive Key: HIVE-4070 URL: https://issues.apache.org/jira/browse/HIVE-4070 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 0.10.0 Reporter: Mark Grover Assignee: Mark Grover Priority: Trivial Fix For: 0.11.0 Hive's like operator seems to be case sensitive. See https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLike.java#L164 However, MySQL's like operator is case insensitive. I don't have other DBs (like PostgreSQL) installed and handy but I am guessing their LIKE is case insensitive as well.
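The difference discussed in HIVE-4070 can be illustrated by translating a LIKE pattern into a regular expression and matching it with or without a case-insensitive flag. This is a simplified sketch under stated assumptions: LikeSketch, likeToRegex and like are hypothetical names, escape characters are not handled, and nothing here is taken from UDFLike.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of LIKE matching; not Hive's UDFLike implementation.
final class LikeSketch {
    // Translates a LIKE pattern to a regex: % becomes .*, _ becomes .,
    // and every other character is matched literally. Escapes are ignored.
    static String likeToRegex(String likePattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : likePattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return regex.toString();
    }

    // ignoreCase=false models the case-sensitive behavior described for Hive;
    // ignoreCase=true models the case-insensitive behavior described for MySQL.
    static boolean like(String value, String likePattern, boolean ignoreCase) {
        int flags = Pattern.DOTALL | (ignoreCase ? Pattern.CASE_INSENSITIVE : 0);
        return Pattern.compile(likeToRegex(likePattern), flags).matcher(value).matches();
    }
}
```

With this model, the value Hive does not match the pattern hi% when matching is case sensitive, and does match it once the case-insensitive flag is set.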
[jira] [Created] (HIVE-4208) Clientpositive test parenthesis_star_by is non-deterministic
Mark Grover created HIVE-4208: - Summary: Clientpositive test parenthesis_star_by is non-deterministic Key: HIVE-4208 URL: https://issues.apache.org/jira/browse/HIVE-4208 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.10.0 Reporter: Mark Grover Assignee: Mark Grover Fix For: 0.11.0 parenthesis_star_by is testing {{DISTRIBUTE BY}}; however, the order of rows returned by {{DISTRIBUTE BY}} is not deterministic and results in failures depending on Hadoop version.
[jira] [Updated] (HIVE-4208) Clientpositive test parenthesis_star_by is non-deterministic
[ https://issues.apache.org/jira/browse/HIVE-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover updated HIVE-4208: -- Attachment: HIVE-4208.1.patch Uploading patch, adding {{ORDER BY}} to make the order of rows in the result deterministic. Clientpositive test parenthesis_star_by is non-deterministic --- Key: HIVE-4208 URL: https://issues.apache.org/jira/browse/HIVE-4208 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.10.0 Reporter: Mark Grover Assignee: Mark Grover Fix For: 0.11.0 Attachments: HIVE-4208.1.patch parenthesis_star_by is testing {{DISTRIBUTE BY}}; however, the order of rows returned by {{DISTRIBUTE BY}} is not deterministic and results in failures depending on Hadoop version.
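The reasoning behind the HIVE-4208 patch can be shown with a toy model: rows are routed to reducers by key hash and the reducer outputs are concatenated, so the final row order depends on how rows were routed, while a sort afterwards removes that dependence. DistributeBySketch is a hypothetical class, and using the number of reducers as the source of variation is an assumption of this sketch; the report itself only says the order differs across Hadoop versions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical toy model of DISTRIBUTE BY; not how Hive or Hadoop route rows.
final class DistributeBySketch {
    // Sends each row to a bucket chosen by hash, then concatenates the buckets.
    static List<String> distribute(List<String> rows, int reducers) {
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < reducers; i++) {
            buckets.add(new ArrayList<>());
        }
        for (String row : rows) {
            buckets.get(Math.floorMod(row.hashCode(), reducers)).add(row);
        }
        List<String> output = new ArrayList<>();
        for (List<String> bucket : buckets) {
            output.addAll(bucket);
        }
        return output;
    }

    // Same distribution followed by a total sort, standing in for ORDER BY.
    static List<String> distributeThenSort(List<String> rows, int reducers) {
        List<String> output = distribute(rows, reducers);
        Collections.sort(output);
        return output;
    }
}
```

A test that compares output row by row is stable only against the sorted variant, which is why adding {{ORDER BY}} makes the expected output independent of the environment.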
[jira] [Updated] (HIVE-4208) Clientpositive test parenthesis_star_by is non-deterministic
[ https://issues.apache.org/jira/browse/HIVE-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover updated HIVE-4208: -- Status: Patch Available (was: Open) [~namitjain] since you were the original author of this test, would you mind reviewing it please? Clientpositive test parenthesis_star_by is non-deterministic --- Key: HIVE-4208 URL: https://issues.apache.org/jira/browse/HIVE-4208 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.10.0 Reporter: Mark Grover Assignee: Mark Grover Fix For: 0.11.0 Attachments: HIVE-4208.1.patch parenthesis_star_by is testing {{DISTRIBUTE BY}}; however, the order of rows returned by {{DISTRIBUTE BY}} is not deterministic and results in failures depending on Hadoop version.
[jira] [Created] (HIVE-4174) Round UDF converts BigInts to double
Mark Grover created HIVE-4174: - Summary: Round UDF converts BigInts to double Key: HIVE-4174 URL: https://issues.apache.org/jira/browse/HIVE-4174 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 0.10.0 Reporter: Mark Grover Assignee: Mark Grover Fix For: 0.11.0 Chen Chun pointed out on the hive-user mailing list that round() in Hive 0.10 returns {code} select round(cast(1234560 as BIGINT)), round(cast(12345670 as BIGINT)) from test limit 1; //hive 0.10 1234560.0 1.234567E7 {code} This is not consistent with MySQL (http://dev.mysql.com/doc/refman/5.1/en/mathematical-functions.html#function_round), whose documentation states: {code} The return type is the same type as that of the first argument (assuming that it is integer, double, or decimal). This means that for an integer argument, the result is an integer (no decimal places) {code}
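The two values in the HIVE-4174 report are what Java prints when a long is widened to double and converted to a string, which is consistent with round() returning a double for BIGINT input. The sketch below reproduces only that formatting effect. RoundRenderSketch is a hypothetical helper and makes no claim about how the round UDF itself is written.

```java
// Hypothetical helper showing how the same integer renders as double vs. long.
final class RoundRenderSketch {
    // Renders the value after widening it to double, as in the reported output.
    static String asDouble(long value) {
        return Double.toString((double) value);
    }

    // Renders the value with its integer type preserved, as MySQL's
    // documentation describes for integer arguments.
    static String asLong(long value) {
        return Long.toString(value);
    }
}
```

Double.toString switches to scientific notation once the magnitude reaches 10^7, which explains why 1234560 shows up as 1234560.0 while 12345670 shows up as 1.234567E7. Preserving the integer type sidesteps both the trailing .0 and the exponent form.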
[jira] [Commented] (HIVE-3963) Allow Hive to connect to RDBMS
[ https://issues.apache.org/jira/browse/HIVE-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599022#comment-13599022 ] Mark Grover commented on HIVE-3963: --- Thanks Maxime. Can you also please post the patch on reviewboard (or Phabricator) as well? Allow Hive to connect to RDBMS -- Key: HIVE-3963 URL: https://issues.apache.org/jira/browse/HIVE-3963 Project: Hive Issue Type: New Feature Components: Import/Export, JDBC, SQL, StorageHandler Affects Versions: 0.10.0, 0.9.1, 0.11.0 Reporter: Maxime LANCIAUX Fix For: 0.10.1 Attachments: patchfile I am thinking about something like : SELECT jdbcload('driver','url','user','password','sql') FROM dual; There is already a JIRA https://issues.apache.org/jira/browse/HIVE-1555 for JDBCStorageHandler -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4144) Add select database() command to show the current database
Mark Grover created HIVE-4144: - Summary: Add select database() command to show the current database Key: HIVE-4144 URL: https://issues.apache.org/jira/browse/HIVE-4144 Project: Hive Issue Type: Bug Components: SQL Reporter: Mark Grover A recent hive-user mailing list conversation asked about having a command to show the current database. http://mail-archives.apache.org/mod_mbox/hive-user/201303.mbox/%3CCAMGr+0i+CRY69m3id=DxthmUCWLf0NxpKMCtROb=uauh2va...@mail.gmail.com%3E MySQL seems to have a command to do so: {code} select database(); {code} http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_database We should look into having something similar in Hive. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: Request to review the change
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/9673/#review17649 --- Ship it! Ship It! - Mark Grover On Feb. 28, 2013, 4:05 a.m., Anandha L Ranganahan wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/9673/ --- (Updated Feb. 28, 2013, 4:05 a.m.) Review request for hive. Description --- Patch for issue https://issues.apache.org/jira/browse/HIVE-3850. Please review. This addresses bug HIVE-3850. https://issues.apache.org/jira/browse/HIVE-3850 Diffs - /trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFHour.java 115 /trunk/ql/src/test/queries/clientpositive/udf_hour.q 115 /trunk/ql/src/test/results/clientpositive/udf_hour.q.out 115 Diff: https://reviews.apache.org/r/9673/diff/ Testing --- Attached test case with results. Includes .q and .q.out Thanks, Anandha L Ranganahan
[jira] [Commented] (HIVE-3850) hour() function returns 12 hour clock value when using timestamp datatype
[ https://issues.apache.org/jira/browse/HIVE-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598049#comment-13598049 ] Mark Grover commented on HIVE-3850: --- +1 (non-committer) hour() function returns 12 hour clock value when using timestamp datatype - Key: HIVE-3850 URL: https://issues.apache.org/jira/browse/HIVE-3850 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 0.9.0, 0.10.0 Reporter: Pieterjan Vriends Fix For: 0.11.0 Attachments: hive-3850_1.patch, HIVE-3850.patch.txt Apparently UDFHour.java has two evaluate() functions: one that accepts a Text object as its parameter and one that takes a TimestampWritable object. The first returns the value of Calendar.HOUR_OF_DAY and the second that of Calendar.HOUR. In the documentation I couldn't find any information on the overload of the evaluate function. I spent quite some time finding out why my statement didn't return a 24 hour clock value. Shouldn't both functions return the same value?
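The difference between the two Calendar fields named in HIVE-3850 can be sketched in a few lines of Python. This is only an illustration of the two clock conventions, not the UDFHour code; the helper names `hour_of_day` and `hour_12` are invented here.

```python
from datetime import datetime


def hour_of_day(ts):
    """24-hour clock value, 0-23 (what Calendar.HOUR_OF_DAY reports)."""
    return ts.hour


def hour_12(ts):
    """12-hour clock value, 0-11 (what Calendar.HOUR reports)."""
    return ts.hour % 12


afternoon = datetime(2013, 1, 4, 15, 30, 0)
```

For the same instant, `hour_of_day(afternoon)` is 15 while `hour_12(afternoon)` is 3, which is the kind of mismatch the reporter saw between the string and timestamp overloads.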
Re: Hive-3963
Maxime, I posted a comment on the JIRA. Thanks! On Thu, Mar 7, 2013 at 5:57 AM, mlanciau mlanc...@gmail.com wrote: Hello ! I am working on https://issues.apache.org/jira/browse/HIVE-3963 to allow Hive's users to get data from databases to do join with big Hadoop/Hive table and small/reference table. So I have coded a UDTF LoadFromJDBC and it is working well. But I am sure it can be improved a lot ! I am looking for any comments/advices/help ! Thanks. -- Maxime LANCIAUX http://maximelanciauxbi.blogspot.fr/
[jira] [Commented] (HIVE-3963) Allow Hive to connect to RDBMS
[ https://issues.apache.org/jira/browse/HIVE-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13598050#comment-13598050 ] Mark Grover commented on HIVE-3963: --- Maxime, can you upload a patch and post it for review? Also, out of curiosity, when would a user use this instead of something like Apache Sqoop? Allow Hive to connect to RDBMS -- Key: HIVE-3963 URL: https://issues.apache.org/jira/browse/HIVE-3963 Project: Hive Issue Type: New Feature Components: Import/Export, JDBC, SQL, StorageHandler Affects Versions: 0.9.0, 0.10.0, 0.9.1, 0.11.0 Reporter: Maxime LANCIAUX I am thinking about something like: SELECT jdbcload('driver','url','user','password','sql') FROM dual; There is already a JIRA https://issues.apache.org/jira/browse/HIVE-1555 for JDBCStorageHandler
Re: [ANNOUNCE] Kevin Wilfong elected to Hive PMC
Congrats, Kevin. Keep up the good work! On Mon, Mar 4, 2013 at 11:54 AM, Carl Steinbach c...@apache.org wrote: On behalf of the Apache Hive PMC I am pleased to welcome Kevin Wilfong as a member of the Apache Hive PMC. Please join me in congratulating Kevin on his new role! Thanks. Carl
[jira] [Commented] (HIVE-4053) Add support for phonetic algorithms in Hive
[ https://issues.apache.org/jira/browse/HIVE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591449#comment-13591449 ] Mark Grover commented on HIVE-4053: --- Krishna, thanks for doing this. I don't have a whole lot of insight into these particular algorithms but do they always take the same parameters? What's the possibility of a new phonetic algorithm using a different set or number of parameters? If these functions always take same parameters, it may make sense to do (2). However, if not, (1) would be a good idea. Of course, you can still refactor the code and share amongst all different UDFs even when they are separate. To post a review on reviewboard, go to reviews.apache.org. Generate a diff file of your changes on top of hive trunk (using svn diff or git diff) and upload that diff (use hive repository when using svn diff output and hive-git repository when using git diff output). Please let me know if you have any further questions. Add support for phonetic algorithms in Hive --- Key: HIVE-4053 URL: https://issues.apache.org/jira/browse/HIVE-4053 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.10.0 Reporter: Krishna Labels: patch Fix For: 0.10.0 Attachments: FunctionRegistry.java, GenericUDFRefinedSoundex.java, HIVE-4053.1.patch.txt Following phonetic algorithms should be considered, which are very useful in search: Soundex: http://en.wikipedia.org/wiki/Soundex Refined Soundex: Refer to the comment on 22/Feb/13 23:51 Daitch–Mokotoff Soundex: http://en.wikipedia.org/wiki/Daitch%E2%80%93Mokotoff_Soundex Metaphone and Double Metaphone: http://en.wikipedia.org/wiki/Metaphone New York State Identification and Intelligence System (NYSIIS): http://en.wikipedia.org/wiki/New_York_State_Identification_and_Intelligence_System Caverphone: http://en.wikipedia.org/wiki/Caverphone -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4100) Improve regex_replace UDF to allow non-ascii characters
Mark Grover created HIVE-4100: - Summary: Improve regex_replace UDF to allow non-ascii characters Key: HIVE-4100 URL: https://issues.apache.org/jira/browse/HIVE-4100 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.10.0 Reporter: Mark Grover Assignee: Mark Grover Fix For: 0.11.0 There have been a few email threads on the user mailing list regarding the regex_replace UDF not supporting non-ASCII characters. We should validate that and improve the UDF to allow it. The Translate UDF will be a good reference since it does that by using code points instead of characters.
[jira] [Created] (HIVE-4070) Like operator in Hive is case sensitive while in MySQL (and most likely other DBs) it's case insensitive
Mark Grover created HIVE-4070: - Summary: Like operator in Hive is case sensitive while in MySQL (and most likely other DBs) it's case insensitive Key: HIVE-4070 URL: https://issues.apache.org/jira/browse/HIVE-4070 Project: Hive Issue Type: Bug Components: UDF Affects Versions: 0.10.0 Reporter: Mark Grover Assignee: Mark Grover Fix For: 0.11.0 Hive's like operator seems to be case sensitive. See https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFLike.java#L164 However, MySQL's like operator is case insensitive. I don't have other DBs (like PostgreSQL) installed and handy but I am guessing their LIKE is case insensitive as well.
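The two LIKE behaviours contrasted in HIVE-4070 can be sketched as follows. This is a simplified Python model, not the UDFLike implementation: it translates only the `%` and `_` wildcards, ignores escape characters, and the function names and the `case_sensitive` switch are assumptions made for the example.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern into an equivalent regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # any run of characters
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def sql_like(value, pattern, case_sensitive=True):
    """Return True if value matches the LIKE pattern as a whole."""
    flags = re.DOTALL
    if not case_sensitive:
        flags |= re.IGNORECASE
    return re.fullmatch(like_to_regex(pattern), value, flags) is not None
```

Under this model `sql_like("Hive", "hi%")` fails while `sql_like("Hive", "hi%", case_sensitive=False)` succeeds, which is the discrepancy the issue describes.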
Re: HIVE-4053 | Review request
Krishna, Can you please post a patch on the JIRA and post a review on reviewboard? You should also consider adding some unit tests. If you need help with any of this, please let us know. I will post this on JIRA as well for completeness. Mark On Fri, Feb 22, 2013 at 9:48 PM, Krishna research...@gmail.com wrote: Hi, I've implemented the 'Refined Soundex' algorithm using a GenericUDF and would like to share it for review by experts, as I'm a newbie. Change Details: A new Java class is created: GenericUDFRefinedSoundex.java. An entry is added to FunctionRegistry.java: registerGenericUDF("soundex_ref", GenericUDFRefinedSoundex.class); Both files are attached to the email. I'm planning to implement other phonetic algorithms and submit all as a single patch. I understand there are many other steps that I need to finish before a patch is ready but for now, if you could review the attached code and provide feedback, it'll be great. Here are the details of the Refined Soundex algorithm: The first letter is stored. Subsequent letters are replaced by numbers as defined below: * B, P = 1 * F, V = 2 * C, K, S = 3 * G, J = 4 * Q, X, Z = 5 * D, T = 6 * L = 7 * M, N = 8 * R = 9 * Other letters = 0 Consecutive letters belonging to the same group are replaced by one letter. Example: SELECT soundex_ref('Carren') FROM src LIMIT 1; C30908 Thanks, Krishna
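The Refined Soundex rules listed in Krishna's mail can be sketched in Python as below. This is an independent illustration, not the attached GenericUDFRefinedSoundex.java. One assumption is made so that the sketch reproduces the C30908 example from the mail: the first letter is both stored and encoded, since otherwise 'Carren' would come out as C0908. The function name `refined_soundex` is invented here.

```python
# Letter groups and their digits, taken from the list in the email.
GROUPS = {
    "BP": "1", "FV": "2", "CKS": "3", "GJ": "4", "QXZ": "5",
    "DT": "6", "L": "7", "MN": "8", "R": "9",
}
CODE = {letter: digit for letters, digit in GROUPS.items() for letter in letters}


def refined_soundex(word):
    """Encode a word: first letter, then one digit per run of same-group letters."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    out = [letters[0]]
    previous = None
    for c in letters:
        digit = CODE.get(c, "0")   # letters outside every group map to 0
        if digit != previous:      # collapse consecutive letters of one group
            out.append(digit)
        previous = digit
    return "".join(out)
```

Called as `refined_soundex("Carren")`, the sketch yields `C30908`, matching the example query in the mail.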
[jira] [Commented] (HIVE-4053) Add support for phonetic algorithms in Hive
[ https://issues.apache.org/jira/browse/HIVE-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585226#comment-13585226 ] Mark Grover commented on HIVE-4053: --- Can you please post a patch on the JIRA and post a review on reviewboard? You should also consider adding some unit tests. If you need help with any of this, please let us know. Add support for phonetic algorithms in Hive --- Key: HIVE-4053 URL: https://issues.apache.org/jira/browse/HIVE-4053 Project: Hive Issue Type: New Feature Components: UDF Reporter: Krishna Attachments: FunctionRegistry.java, GenericUDFRefinedSoundex.java Following phonetic algorithms should be considered, which are very useful in search: Soundex Refined Soundex Daitch–Mokotoff Soundex Metaphone and Double Metaphone New York State Identification and Intelligence System (NYSIIS) Caverphone -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3885) CLI command SHOW PARTITIONS could be extended to provide LOCATION information
[ https://issues.apache.org/jira/browse/HIVE-3885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13579844#comment-13579844 ] Mark Grover commented on HIVE-3885: --- Sanjay: I am replying to your question on the IRC channel. I think this will be a good starting point: https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ShowPartitionsDesc.java See who uses this class and how. CLI command SHOW PARTITIONS could be extended to provide LOCATION information --- Key: HIVE-3885 URL: https://issues.apache.org/jira/browse/HIVE-3885 Project: Hive Issue Type: New Feature Reporter: Sanjay Subramanian Priority: Minor SHOW PARTITIONS does not provide information on the HDFS location of the data. The workaround is to query the metadata DB. The following command will give you the HDFS file locations as stored in the metadata tables. echo "select t.TBL_NAME, p.PART_NAME, s.LOCATION from PARTITIONS p, SDS s, TBLS t where t.TBL_ID=p.TBL_ID and p.SD_ID=s.SD_ID" | mysql -uusername -ppassword hive_metastore_DB | grep HIVETABLENAME | less It would be useful if this could be encapsulated in a CLI command SHOW LOCATIONS that displays PARTITION_NAME and LOCATION.
[jira] [Commented] (HIVE-3951) Allow Decimal type columns in Regex Serde
[ https://issues.apache.org/jira/browse/HIVE-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13577324#comment-13577324 ] Mark Grover commented on HIVE-3951: --- This patch is ready for review. Would anyone be willing to please review? Allow Decimal type columns in Regex Serde - Key: HIVE-3951 URL: https://issues.apache.org/jira/browse/HIVE-3951 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Affects Versions: 0.10.0 Reporter: Mark Grover Assignee: Mark Grover Fix For: 0.11.0 Attachments: HIVE-3951.1.patch Decimal type in Hive was recently added by HIVE-2693. We should allow users to create tables with decimal type columns when using Regex Serde. HIVE-3004 did something similar for other primitive types. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HIVE-4003) NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
[ https://issues.apache.org/jira/browse/HIVE-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover reassigned HIVE-4003: - Assignee: Mark Grover NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java - Key: HIVE-4003 URL: https://issues.apache.org/jira/browse/HIVE-4003 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Thomas Adam Assignee: Mark Grover Fix For: 0.11.0 Attachments: HIVE-4003.patch Utilities.java seems to be throwing a NPE. Change contributed by Thomas Adam. Reference: https://github.com/tecbot/hive/commit/1e29d88837e4101a76e870a716aadb729437355b#commitcomment-2588350 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4003) NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
[ https://issues.apache.org/jira/browse/HIVE-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13575185#comment-13575185 ] Mark Grover commented on HIVE-4003: --- Ok, in that case, I can take it on from here. Thanks for your contribution, Thomas! NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java - Key: HIVE-4003 URL: https://issues.apache.org/jira/browse/HIVE-4003 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Thomas Adam Fix For: 0.11.0 Attachments: HIVE-4003.patch Utilities.java seems to be throwing a NPE. Change contributed by Thomas Adam. Reference: https://github.com/tecbot/hive/commit/1e29d88837e4101a76e870a716aadb729437355b#commitcomment-2588350 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3995) PostgreSQL upgrade scripts are not valid
[ https://issues.apache.org/jira/browse/HIVE-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13575186#comment-13575186 ] Mark Grover commented on HIVE-3995: --- Thanks, Namit! PostgreSQL upgrade scripts are not valid Key: HIVE-3995 URL: https://issues.apache.org/jira/browse/HIVE-3995 Project: Hive Issue Type: Bug Components: Metastore Reporter: Jarek Jarcec Cecho Fix For: 0.11.0 Attachments: postgre_update_issue.patch, postgre_update_issue.patch I've noticed that scripts for upgrading metastore backed up on PostgreSQL are not valid. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: Add support for pulling HBase columns with prefixes
On Feb. 5, 2013, 3:43 a.m., Mark Grover wrote: hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java, line 192 https://reviews.apache.org/r/9276/diff/1/?file=254957#file254957line192 This seems like a limited case of pattern matching. Swarnim, any way we can support generic regex matching instead? Swarnim Kulkarni wrote: Mark, in this case I specifically wanted to only allow strings that end with exactly the character * and using String#endsWith seemed simpler and more readable than a regex. Do you still want me to replace this with regex matching? Brock Noland wrote: I think the issue is that this would make it difficult to implement enhanced pattern matching later. Implementing it now, you'd only need to specify: col.* in the table configuration. Now the issue would be detecting if the particular column was a regex pattern. Because #, comma, and : are used as separators that would exclude those characters from being used. Swarnim Kulkarni wrote: Thanks Brock. Makes sense. To be sure I am understanding you right, the change now would be just to replace the parts[1].endsWith("*") with something more regexy that would still imply that the string ends with *. Correct? I think that should do it. Personally, I think having limited regex matching is just going to confuse people, so if you could implement (and test) full Java style regex matching (like we do for RegexSerDe for example), that would be fantastic. Of course, let me know if you have questions! Thanks for doing this, BTW! - Mark --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/9276/#review16080 --- On Feb. 3, 2013, 1:04 a.m., Swarnim Kulkarni wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/9276/ --- (Updated Feb. 3, 2013, 1:04 a.m.) Review request for hive. Description --- Added support for pulling hbase columns just by providing prefixes and a wildcard.
So a query now could look something like this: CREATE EXTERNAL TABLE hive_hbase_test ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,fam1:col*") TBLPROPERTIES ("hbase.table.name" = "TEST_HBASE_TABLE"); This would pull in all columns under column family fam1 which start with col. This gives a little more flexibility over the pull-all-columns format. This addresses bug HIVE-3725. https://issues.apache.org/jira/browse/HIVE-3725 Diffs - hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 7f37ba5 hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseCellMap.java a8ba9d9 hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java d35bb52 hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseSerDe.java e821282 Diff: https://reviews.apache.org/r/9276/diff/ Testing --- Added unit tests to demonstrate the new functionality. Also made sure that all existing unit tests passed. Thanks, Swarnim Kulkarni
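The two approaches debated in this review, a trailing-wildcard prefix match versus full regex matching on the column qualifier, can be compared with a small Python sketch. It is not the HBaseSerDe code; `select_by_prefix`, `select_by_regex` and the sample qualifiers are hypothetical names used only for illustration.

```python
import re


def select_by_prefix(qualifiers, spec):
    """Prefix matching: a spec ending in '*' selects qualifiers with that prefix."""
    if spec.endswith("*"):
        prefix = spec[:-1]
        return [q for q in qualifiers if q.startswith(prefix)]
    return [q for q in qualifiers if q == spec]


def select_by_regex(qualifiers, pattern):
    """Regex matching: the whole qualifier must match the pattern."""
    compiled = re.compile(pattern)
    return [q for q in qualifiers if compiled.fullmatch(q)]


qualifiers = ["col1", "col2", "colour", "other"]
```

Here `select_by_prefix(qualifiers, "col*")` and `select_by_regex(qualifiers, "col.*")` pick the same three columns, while a regex such as `col\d` can narrow the selection to `col1` and `col2`, something the prefix form cannot express.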
Re: Subcribe
Karthik, Please send an email to the subscribe address listed at http://hive.apache.org/mailing_lists.html#Developers On Sat, Feb 9, 2013 at 7:41 AM, karthik tunga karthik.tu...@gmail.com wrote: Hi, I would like to subscribe to this list. Cheers, Karthik
[jira] [Commented] (HIVE-3179) HBase Handler doesn't handle NULLs properly
[ https://issues.apache.org/jira/browse/HIVE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13575246#comment-13575246 ] Mark Grover commented on HIVE-3179: --- They timed out on my pseudo-distributed laptop but that most likely is an environment issue local to me. But looks like Lars mentioned that he had run the tests, so that should be ok. I will try to fix the environment, but don't wait on me. HBase Handler doesn't handle NULLs properly --- Key: HIVE-3179 URL: https://issues.apache.org/jira/browse/HIVE-3179 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.9.0, 0.10.0 Reporter: Lars Francke Priority: Critical Attachments: HIVE-3179.1.patch We found a quite severe issue in the HBase Handler which actually means that Hive potentially returns incorrect data if a column has NULL values in HBase (which means the cell doesn't even exist) In HBase Shell:
{noformat}
create 'hive_hbase_test', 'test'
put 'hive_hbase_test', '1', 'test:c1', 'c1-1'
put 'hive_hbase_test', '1', 'test:c2', 'c2-1'
put 'hive_hbase_test', '1', 'test:c3', 'c3-1'
put 'hive_hbase_test', '2', 'test:c1', 'c1-2'
{noformat}
In Hive:
{noformat}
DROP TABLE IF EXISTS hive_hbase_test;
CREATE EXTERNAL TABLE hive_hbase_test ( id int, c1 string, c2 string, c3 string ) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key#s,test:c1#s,test:c2#s,test:c3#s") TBLPROPERTIES("hbase.table.name" = "hive_hbase_test");
hive> select * from hive_hbase_test;
OK
1 c1-1 c2-1 c3-1
2 c1-2 NULL NULL
hive> select c1 from hive_hbase_test;
c1-1
c1-2
hive> select c1, c2 from hive_hbase_test;
c1-1 c2-1
c1-2 NULL
{noformat}
So far everything is correct but now:
{noformat}
hive> select c1, c2, c2 from hive_hbase_test;
c1-1 c2-1 c2-1
c1-2 NULL c2-1
{noformat}
Selecting c2 twice works the first time but the second time we actually get the value from the previous row.
{noformat}
hive> select c1, c3, c2, c2, c3, c3, c1 from hive_hbase_test;
c1-1 c3-1 c2-1 c2-1 c3-1 c3-1 c1-1
c1-2 NULL NULL c2-1 c3-1 c3-1 c1-2
{noformat}
We've narrowed this down to an early initialization of {{fieldsInited\[fieldID] = true}} in {{LazyHBaseRow#uncheckedGetField}} and we'll try to provide a patch which surely needs review.
Re: Review Request: HIVE-3179: HBase Handler doesn't handle NULLs properly
On Feb. 8, 2013, 3:33 p.m., Mark Grover wrote: hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java, line 127 https://reviews.apache.org/r/5542/diff/1/?file=116176#file116176line127 Lars, I noticed that the Javadoc here is inconsistent with the method signature. This is obviously unrelated to your change but if you have a chance, could you please correct the Javadoc here? Brock Noland wrote: Mark, I agree we should update the javadoc, however I don't think that should hold up a commit since it's unrelated to this change and this issue is critical. That is if Lars doesn't update the patch we can address that issue in a new JIRA. Would you agree? Brock Fair enough, that's fine by me. - Mark --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/5542/#review16351 --- On June 25, 2012, 5:30 a.m., Lars Francke wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/5542/ --- (Updated June 25, 2012, 5:30 a.m.) Review request for hive. Description --- This patch moves the initialization of the fieldsInited variable to the end of the loop because otherwise NULL values will return stale data on the second iteration. All other cases should be unaffected by this change. This addresses bug HIVE-3179. https://issues.apache.org/jira/browse/HIVE-3179 Diffs - hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java d35bb52 hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestLazyHBaseObject.java f91be4c Diff: https://reviews.apache.org/r/5542/diff/ Testing --- Debugged problem, added code to existing test to force wrong behavior which is fixed by this patch, ran the hbase-handler unit tests Thanks, Lars Francke
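The stale-data symptom described for HIVE-3179 can be reproduced with a toy model of a lazily initialized row. The Python below is a deliberately simplified sketch and is not the LazyHBaseRow code: the class `LazyRow`, its `mark_early` switch and the dictionary of cells are all assumptions made to show why setting the "initialized" flag before the cell lookup lets a missing cell return the previous row's value on the second read.

```python
class LazyRow:
    """Toy lazily-initialized row that reuses its value slots across rows."""

    def __init__(self, width, mark_early):
        self.values = [None] * width
        self.inited = [False] * width
        self.mark_early = mark_early   # True models the buggy ordering
        self.cells = {}

    def load(self, cells):
        """Point the row object at a new row; cached values are not cleared."""
        self.cells = cells
        self.inited = [False] * len(self.inited)

    def get(self, i):
        if not self.inited[i]:
            if self.mark_early:
                self.inited[i] = True      # flag set before the lookup
            if i not in self.cells:
                return None                # missing cell: NULL on first read
            self.values[i] = self.cells[i]
            self.inited[i] = True          # flag set after a successful lookup
        return self.values[i]


def read_c2_twice(mark_early):
    row = LazyRow(2, mark_early)
    row.load({0: "c1-1", 1: "c2-1"})
    row.get(1)
    row.load({0: "c1-2"})                  # second row has no c2 cell
    return [row.get(1), row.get(1)]
```

With `mark_early=True` the second row reads c2 as NULL and then as `c2-1`, mirroring the `NULL c2-1` output in the report; with the flag set only after the lookup, both reads are NULL.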
Re: Newbie
Hi Karthik, Welcome! My first Hive JIRA was a UDF (User Defined Function): HIVE-2418. I personally believe starting off with UDFs is a great and relatively easy way to start contributing to Hive. I did a quick search for UDF related JIRAs: https://issues.apache.org/jira/issues/?jql=project%20%3D%20HIVE%20AND%20resolution%20%3D%20Unresolved%20AND%20text%20~%20%22udfs%22 and here are a few JIRAs that I think might be useful. If one is claimed by someone else, you want to do it, and you haven't seen activity for a while, ask them on the JIRA whether you could take over. Otherwise, take an unassigned JIRA. https://issues.apache.org/jira/browse/HIVE-2361 https://issues.apache.org/jira/browse/HIVE-2482 https://issues.apache.org/jira/browse/HIVE-2710 (I think this would be an excellent first JIRA) https://issues.apache.org/jira/browse/HIVE-686 Do let us know if you have any other questions. Mark On Sat, Feb 9, 2013 at 12:56 PM, karthik tunga karthik.tu...@gmail.com wrote: Hey Guys, I am just getting started with Hive code. I was wondering if there are any open JIRA easy enough that I could fix ? Thanks. Cheers, Karthik
[jira] [Commented] (HIVE-3999) Mysql metastore upgrade script will end up with different schema than the full schema load
[ https://issues.apache.org/jira/browse/HIVE-3999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13574505#comment-13574505 ] Mark Grover commented on HIVE-3999: --- Thanks Namit! Mysql metastore upgrade script will end up with different schema than the full schema load -- Key: HIVE-3999 URL: https://issues.apache.org/jira/browse/HIVE-3999 Project: Hive Issue Type: Bug Components: Metastore Reporter: Jarek Jarcec Cecho Fix For: 0.11.0 Attachments: mysql_upgrade_issue.patch I've noticed that the file {{hive-schema-0.10.0.mysql.sql}} is creating table SDS with following column: {code} `IS_STOREDASSUBDIRECTORIES` bit(1) NOT NULL, {code} However the upgrade script {{011-HIVE-3649.mysql.sql}} will create the column differently: {code} ALTER TABLE `SDS` ADD `IS_STOREDASSUBDIRECTORIES` bit(1) ; {code} Thus user will get slightly different schema each time - once with NOT NULL and secondly with NULL definition. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3998) Oracle metastore update script will fail when upgrading from 0.9.0 to 0.10.0
[ https://issues.apache.org/jira/browse/HIVE-3998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13574504#comment-13574504 ] Mark Grover commented on HIVE-3998: --- Thanks Namit! Could you look at HIVE-3995 as well when you get a chance? It does the same change to PostgreSQL scripts. Oracle metastore update script will fail when upgrading from 0.9.0 to 0.10.0 Key: HIVE-3998 URL: https://issues.apache.org/jira/browse/HIVE-3998 Project: Hive Issue Type: Bug Components: Metastore Reporter: Jarek Jarcec Cecho Fix For: 0.11.0 Attachments: oracle_update_issue.patch The problem is in following file {{011-HIVE-3649.oracle.sql}} that contains following line: {code} alter table SDS add IS_STOREDASSUBDIRECTORIES NUMBER(1) NOT NULL; {code} This query will however fail if the table SDS have at least one row. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: HIVE-3179: HBase Handler doesn't handle NULLs properly
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/5542/#review16351 --- hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java https://reviews.apache.org/r/5542/#comment34816 Lars, I noticed that the Javadoc here is inconsistent with the method signature. This is obviously unrelated to your change but if you have a chance, could you please correct the Javadoc here? - Mark Grover On June 25, 2012, 5:30 a.m., Lars Francke wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/5542/ --- (Updated June 25, 2012, 5:30 a.m.) Review request for hive. Description --- This patch moves the initialization of the fieldsInited variable to the end of the loop because otherwise NULL values will return stale data on the second iteration. All other cases should be unaffected by this change. This addresses bug HIVE-3179. https://issues.apache.org/jira/browse/HIVE-3179 Diffs - hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java d35bb52 hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestLazyHBaseObject.java f91be4c Diff: https://reviews.apache.org/r/5542/diff/ Testing --- Debugged problem, added code to existing test to force wrong behavior which is fixed by this patch, ran the hbase-handler unit tests Thanks, Lars Francke
[jira] [Commented] (HIVE-3179) HBase Handler doesn't handle NULLs properly
[ https://issues.apache.org/jira/browse/HIVE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13574553#comment-13574553 ] Mark Grover commented on HIVE-3179: --- Running TestHBaseCliDriver tests... HBase Handler doesn't handle NULLs properly --- Key: HIVE-3179 URL: https://issues.apache.org/jira/browse/HIVE-3179 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.9.0, 0.10.0 Reporter: Lars Francke Priority: Critical Attachments: HIVE-3179.1.patch We found a quite severe issue in the HBase Handler which actually means that Hive potentially returns incorrect data if a column has NULL values in HBase (which means the cell doesn't even exist) In HBase Shell: {noformat} create 'hive_hbase_test', 'test' put 'hive_hbase_test', '1', 'test:c1', 'c1-1' put 'hive_hbase_test', '1', 'test:c2', 'c2-1' put 'hive_hbase_test', '1', 'test:c3', 'c3-1' put 'hive_hbase_test', '2', 'test:c1', 'c1-2' {noformat} In Hive: {noformat} DROP TABLE IF EXISTS hive_hbase_test; CREATE EXTERNAL TABLE hive_hbase_test ( id int, c1 string, c2 string, c3 string ) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES (hbase.columns.mapping = :key#s,test:c1#s,test:c2#s,test:c3#s) TBLPROPERTIES(hbase.table.name = hive_hbase_test); hive select * from hive_hbase_test; OK 1 c1-1c2-1c3-1 2 c1-2NULLNULL hive select c1 from hive_hbase_test; c1-1 c1-2 hive select c1, c2 from hive_hbase_test; c1-1 c2-1 c1-2 NULL {noformat} So far everything is correct but now: {noformat} hive select c1, c2, c2 from hive_hbase_test; c1-1 c2-1c2-1 c1-2 NULLc2-1 {noformat} Selecting c2 twice works the first time but the second time we actually get the value from the previous row. 
{noformat}
hive> select c1, c3, c2, c2, c3, c3, c1 from hive_hbase_test;
c1-1 c3-1 c2-1 c2-1 c3-1 c3-1 c1-1
c1-2 NULL NULL c2-1 c3-1 c3-1 c1-2
{noformat}
We've narrowed this down to an early initialization of {{fieldsInited\[fieldID] = true}} in {{LazyHBaseRow#uncheckedGetField}} and we'll try to provide a patch which surely needs review. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
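The stale-value behaviour reported above can be reproduced outside Hive with a small Java sketch. This is not the code from LazyHBaseRow or from HIVE-3179.1.patch: the class LazyRowSketch, its markEarly switch, and the helper readMissingCellTwice are hypothetical names made up for illustration, and Strings stand in for Hive's lazy field objects. It only assumes what the report states, namely that cached field objects are reused across rows and that the "inited" flag was being set before the cell lookup.

```java
import java.util.Arrays;

// Hypothetical stand-in for a lazily deserialized row; not Hive's LazyHBaseRow.
class LazyRowSketch {
    private final String[] fields;        // cached values, reused from row to row
    private final boolean[] fieldsInited; // reset for every new row
    private final boolean markEarly;      // true = flag set before the lookup (reported bug)
    private String[] cells;               // current row; null means the cell does not exist

    LazyRowSketch(int columns, boolean markEarly) {
        this.fields = new String[columns];
        this.fieldsInited = new boolean[columns];
        this.markEarly = markEarly;
    }

    // Point the row at new data. Only the flags are cleared, not the cached values.
    void init(String[] newCells) {
        cells = newCells;
        Arrays.fill(fieldsInited, false);
    }

    String getField(int id) {
        if (!fieldsInited[id]) {
            if (markEarly) {
                fieldsInited[id] = true;  // too early: a missing cell is now "inited"
            }
            String cell = cells[id];
            if (cell == null) {
                return null;              // cached value from the previous row is left in place
            }
            fields[id] = cell;
            fieldsInited[id] = true;      // flag set only after the value is stored
        }
        return fields[id];
    }

    // Reads a populated cell in row 1, then reads the same (missing) cell twice in row 2.
    static String[] readMissingCellTwice(boolean markEarly) {
        LazyRowSketch row = new LazyRowSketch(1, markEarly);
        row.init(new String[] {"c2-1"});
        row.getField(0);
        row.init(new String[] {null});
        return new String[] {row.getField(0), row.getField(0)};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(readMissingCellTwice(true)));  // [null, c2-1]
        System.out.println(Arrays.toString(readMissingCellTwice(false))); // [null, null]
    }
}
```

With markEarly set, the second read of the missing cell returns the previous row's value, which matches the `c1-2 NULL c2-1` row in the report. Setting the flag only after a value has been stored is one way to avoid that; the actual patch may well take a different route.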
[jira] [Updated] (HIVE-4003) NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
[ https://issues.apache.org/jira/browse/HIVE-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Grover updated HIVE-4003: -- Description: Utilities.java seems to be throwing an NPE. Change contributed by Thomas Adam. Reference: https://github.com/tecbot/hive/commit/1e29d88837e4101a76e870a716aadb729437355b#commitcomment-2588350 Affects Version/s: 0.10.0 Fix Version/s: 0.11.0 NullPointerException in ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java - Key: HIVE-4003 URL: https://issues.apache.org/jira/browse/HIVE-4003 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Thomas Adam Fix For: 0.11.0 Attachments: HIVE-4003.patch Utilities.java seems to be throwing an NPE. Change contributed by Thomas Adam. Reference: https://github.com/tecbot/hive/commit/1e29d88837e4101a76e870a716aadb729437355b#commitcomment-2588350
Re: Review Request: HIVE-3995 PostgreSQL upgrade scripts are not valid
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/9349/#review16332 --- /trunk/metastore/scripts/upgrade/postgres/011-HIVE-3649.postgres.sql https://reviews.apache.org/r/9349/#comment34802 For the record, false seems like the right value for existing records in the SDS table. Context: CreateTableDesc.java in HIVE-3649.patch.9 - Mark Grover On Feb. 7, 2013, 4:22 p.m., Jarek Cecho wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/9349/ --- (Updated Feb. 7, 2013, 4:22 p.m.) Review request for hive. Description --- I found issues in three files:
* 010-HIVE-3072.postgres.sql Only the escaping of double quotes was missing.
* 011-HIVE-3649.postgres.sql This alter statement tries to add a new not-null column without any default value. The statement will fail if the table already contains some data. I've checked the similar script in the mysql/ folder, and that script allows null values, so I've allowed null values here as well.
* 012-HIVE-1362.postgres.sql Just syntax fixes.
This addresses bug HIVE-3995. https://issues.apache.org/jira/browse/HIVE-3995 Diffs - /trunk/metastore/scripts/upgrade/postgres/010-HIVE-3072.postgres.sql 1443292 /trunk/metastore/scripts/upgrade/postgres/011-HIVE-3649.postgres.sql 1443292 /trunk/metastore/scripts/upgrade/postgres/012-HIVE-1362.postgres.sql 1443292 Diff: https://reviews.apache.org/r/9349/diff/ Testing --- I've tested the upgrade scripts by loading the file hive-schema-0.9.0.postgres.sql with some data and executing upgrade-0.9.0-to-0.10.0.postgres.sql on top of that. Thanks, Jarek Cecho
Re: Review Request: HIVE-3995 PostgreSQL upgrade scripts are not valid
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/9349/#review16333 --- Ship it! Ship It! - Mark Grover On Feb. 7, 2013, 4:22 p.m., Jarek Cecho wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/9349/ --- (Updated Feb. 7, 2013, 4:22 p.m.) Review request for hive. Description --- I found issues in three files:
* 010-HIVE-3072.postgres.sql Only the escaping of double quotes was missing.
* 011-HIVE-3649.postgres.sql This alter statement tries to add a new not-null column without any default value. The statement will fail if the table already contains some data. I've checked the similar script in the mysql/ folder, and that script allows null values, so I've allowed null values here as well.
* 012-HIVE-1362.postgres.sql Just syntax fixes.
This addresses bug HIVE-3995. https://issues.apache.org/jira/browse/HIVE-3995 Diffs - /trunk/metastore/scripts/upgrade/postgres/010-HIVE-3072.postgres.sql 1443292 /trunk/metastore/scripts/upgrade/postgres/011-HIVE-3649.postgres.sql 1443292 /trunk/metastore/scripts/upgrade/postgres/012-HIVE-1362.postgres.sql 1443292 Diff: https://reviews.apache.org/r/9349/diff/ Testing --- I've tested the upgrade scripts by loading the file hive-schema-0.9.0.postgres.sql with some data and executing upgrade-0.9.0-to-0.10.0.postgres.sql on top of that. Thanks, Jarek Cecho
[jira] [Commented] (HIVE-3995) PostgreSQL upgrade scripts are not valid
[ https://issues.apache.org/jira/browse/HIVE-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13574249#comment-13574249 ] Mark Grover commented on HIVE-3995: --- +1 (non-committer) Thanks Jarcec! I left a comment (no action required from your side) on Reviewboard that false seems to be the right value to put for existing rows. Perhaps [~gangtimliu] or [~namitjain], who were involved with HIVE-3649, can confirm. I like the new approach of adding the new column as nullable, updating the existing null values and then setting the column to not-null. The only other approach I can think of is to use a default value (of false), but that seems riskier in the long run. PostgreSQL upgrade scripts are not valid Key: HIVE-3995 URL: https://issues.apache.org/jira/browse/HIVE-3995 Project: Hive Issue Type: Bug Components: Metastore Reporter: Jarek Jarcec Cecho Attachments: postgre_update_issue.patch, postgre_update_issue.patch I've noticed that the scripts for upgrading a metastore backed by PostgreSQL are not valid.
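The three-step approach described in the comment (add the column as nullable, backfill existing rows, then tighten it to not-null) can be sketched as a small Java helper that prints the statements in order. This is an illustration only and not the contents of postgre_update_issue.patch: the class NotNullColumnMigrationSketch and its methods are hypothetical, the boolean column type is an assumption for PostgreSQL, and the table and column names are borrowed from the Oracle script quoted in HIVE-3998.

```java
import java.util.List;

// Hypothetical helper; emits PostgreSQL-style statements for the three-step pattern.
class NotNullColumnMigrationSketch {

    // Double-quote an identifier, doubling any embedded double quotes.
    static String quote(String identifier) {
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }

    // Statements are returned in the order they must run.
    static List<String> addNotNullColumn(String table, String column,
                                         String type, String backfillLiteral) {
        String t = quote(table);
        String c = quote(column);
        return List.of(
            // 1. Add the column as nullable so existing rows do not violate a constraint.
            "ALTER TABLE " + t + " ADD COLUMN " + c + " " + type + ";",
            // 2. Give every existing row a value.
            "UPDATE " + t + " SET " + c + " = " + backfillLiteral
                + " WHERE " + c + " IS NULL;",
            // 3. Only now forbid NULLs.
            "ALTER TABLE " + t + " ALTER COLUMN " + c + " SET NOT NULL;");
    }

    public static void main(String[] args) {
        // Column type "boolean" and backfill value "false" are assumptions for illustration.
        addNotNullColumn("SDS", "IS_STOREDASSUBDIRECTORIES", "boolean", "false")
            .forEach(System.out::println);
    }
}
```

Compared with adding the column with a DEFAULT, this keeps the backfill value out of the long-lived schema, which seems to be the concern raised in the comment about defaults being riskier over time.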
[jira] [Commented] (HIVE-3998) Oracle metastore update script will fail when upgrading from 0.9.0 to 0.10.0
[ https://issues.apache.org/jira/browse/HIVE-3998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13574251#comment-13574251 ] Mark Grover commented on HIVE-3998: --- +1 (non-committer) Oracle metastore update script will fail when upgrading from 0.9.0 to 0.10.0 Key: HIVE-3998 URL: https://issues.apache.org/jira/browse/HIVE-3998 Project: Hive Issue Type: Bug Components: Metastore Reporter: Jarek Jarcec Cecho Attachments: oracle_update_issue.patch The problem is in the file {{011-HIVE-3649.oracle.sql}}, which contains the following line: {code} alter table SDS add IS_STOREDASSUBDIRECTORIES NUMBER(1) NOT NULL; {code} This query will, however, fail if the table SDS has at least one row.
Re: Review Request: HIVE-3999 Mysql metastore upgrade script will end up with different schema than the full schema load
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/9365/#review16334 --- Ship it! Ship It! - Mark Grover On Feb. 7, 2013, 9:24 p.m., Jarek Cecho wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/9365/ --- (Updated Feb. 7, 2013, 9:24 p.m.) Review request for hive. Description --- I've applied a fix similar to the one in HIVE-3998. This addresses bug HIVE-3999. https://issues.apache.org/jira/browse/HIVE-3999 Diffs - /trunk/metastore/scripts/upgrade/mysql/011-HIVE-3649.mysql.sql 1443292 Diff: https://reviews.apache.org/r/9365/diff/ Testing --- Tested on a MySQL metastore. Thanks, Jarek Cecho