[jira] Commented: (HIVE-842) Authentication Infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12765362#action_12765362 ] Min Zhou commented on HIVE-842: --- @Edward Kerberos for authentication is a good way, I think; user/password is not needed here. This issue could be implemented in the future. btw, we've finished the development of the authorization infrastructure for Hive. Authentication Infrastructure for Hive -- Key: HIVE-842 URL: https://issues.apache.org/jira/browse/HIVE-842 Project: Hadoop Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Edward Capriolo This issue deals with the authentication (user name, password) infrastructure, not the authorization components that specify what a user should be able to do. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-78) Authorization infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12758112#action_12758112 ] Min Zhou commented on HIVE-78: -- @Namit Got your meaning. We are maintaining a version of our own; it will take a couple of weeks to adapt it to the trunk. Authorization infrastructure for Hive - Key: HIVE-78 URL: https://issues.apache.org/jira/browse/HIVE-78 Project: Hadoop Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Ashish Thusoo Assignee: Edward Capriolo Attachments: createuser-v1.patch, hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, hive-78.diff Allow hive to integrate with existing user repositories for authentication and authorization information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-78) Authorization infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12757616#action_12757616 ] Min Zhou commented on HIVE-78: -- sorry, {noformat} public class GenericAuthenticator extends Authenticator { public GenericAuthenticator (Hive db, User user); ... } {noformat} Authorization infrastructure for Hive - Key: HIVE-78 URL: https://issues.apache.org/jira/browse/HIVE-78 Project: Hadoop Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Ashish Thusoo Assignee: Edward Capriolo Attachments: hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, hive-78.diff Allow hive to integrate with existing user repositories for authentication and authorization information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-78) Authorization infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12757622#action_12757622 ] Min Zhou commented on HIVE-78: -- oops, my code wasn't on my machine. I just pasted yours and modified it into mine. Here is a patch showing my code for that. Authorization infrastructure for Hive - Key: HIVE-78 URL: https://issues.apache.org/jira/browse/HIVE-78 Project: Hadoop Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Ashish Thusoo Assignee: Edward Capriolo Attachments: createuser-v1.patch, hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, hive-78.diff Allow hive to integrate with existing user repositories for authentication and authorization information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-78) Authorization infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-78: - Attachment: createuser-v1.patch Authorization infrastructure for Hive - Key: HIVE-78 URL: https://issues.apache.org/jira/browse/HIVE-78 Project: Hadoop Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Ashish Thusoo Assignee: Edward Capriolo Attachments: createuser-v1.patch, hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, hive-78.diff Allow hive to integrate with existing user repositories for authentication and authorization information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-78) Authentication infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12756904#action_12756904 ] Min Zhou commented on HIVE-78: -- Let me guess: you are all talking about the CLI. But we are using HiveServer as a multi-user server, not one that supports only a single user like mysqld does. Authentication infrastructure for Hive -- Key: HIVE-78 URL: https://issues.apache.org/jira/browse/HIVE-78 Project: Hadoop Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Ashish Thusoo Assignee: Edward Capriolo Attachments: hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, hive-78.diff Allow hive to integrate with existing user repositories for authentication and authorization information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-78) Authentication infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12756949#action_12756949 ] Min Zhou commented on HIVE-78: -- I do not think the HiveServer in your mind is the same as mine, which supports multiple users, not only one. Authentication infrastructure for Hive -- Key: HIVE-78 URL: https://issues.apache.org/jira/browse/HIVE-78 Project: Hadoop Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Ashish Thusoo Assignee: Edward Capriolo Attachments: hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, hive-78.diff Allow hive to integrate with existing user repositories for authentication and authorization information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-78) Authentication infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12756951#action_12756951 ] Min Zhou commented on HIVE-78: -- From the words you commented: {noformat} Daemons like HiveService and HiveWebInterface will have to run as supergroup or a hive group? {noformat} Authentication infrastructure for Hive -- Key: HIVE-78 URL: https://issues.apache.org/jira/browse/HIVE-78 Project: Hadoop Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Ashish Thusoo Assignee: Edward Capriolo Attachments: hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, hive-78.diff Allow hive to integrate with existing user repositories for authentication and authorization information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-78) Authentication infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-78: - Attachment: hive-78-metadata-v1.patch Authentication infrastructure for Hive -- Key: HIVE-78 URL: https://issues.apache.org/jira/browse/HIVE-78 Project: Hadoop Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Ashish Thusoo Assignee: Edward Capriolo Attachments: hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, hive-78.diff Allow hive to integrate with existing user repositories for authentication and authorization information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-78) Authentication infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12756335#action_12756335 ] Min Zhou commented on HIVE-78: -- @Edward Sorry for my abuse of some words; I hope this will not affect our work. Can you point me to the jiras where it was decided that Hive will not store username/password information and Hadoop will? I think most companies are using hadoop versions from 0.17 to 0.20, which don't have good password security. Once a company adopts a particular version, upgrading is a very important issue; many companies will stick with a more stable version. Moreover, hadoop still does not have that feature, which may take a very long time to implement. Why should we wait for it rather than accomplish it ourselves? I think Hive needs to support user/password, at least for current versions of hadoop. Many companies using hive have reported that the current hive is inconvenient for multi-user scenarios, in terms of environment isolation, table sharing, security, etc. We must try to meet the requirements of most of them. Regarding the syntax, I guess we can do it in two steps. # support GRANT/REVOKE privileges to users. # support some sort of server administration privileges as Ashish mentioned. The GRANT statement enables system administrators to create Hive user accounts and to grant rights to accounts. To use GRANT, you must have the GRANT OPTION privilege, and you must have the privileges that you are granting. The REVOKE statement is related and enables administrators to remove account privileges. File hive-78-syntax-v1.patch modifies the syntax. Any comments on that?
Authentication infrastructure for Hive -- Key: HIVE-78 URL: https://issues.apache.org/jira/browse/HIVE-78 Project: Hadoop Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Ashish Thusoo Assignee: Edward Capriolo Attachments: hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, hive-78.diff Allow hive to integrate with existing user repositories for authentication and authorization information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-78) Authentication infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12755876#action_12755876 ] Min Zhou commented on HIVE-78: -- We will take over this issue; it should be finished in two weeks. Here are the sql statements that will be added: {noformat} CREATE USER; DROP USER; ALTER USER SET PASSWORD; GRANT; REVOKE {noformat} Metadata is stored in some sort of persistent medium, such as a mysql DBMS, through jdo. We will add three tables for this issue: USER, DBS_PRIV, TABLES_PRIV. Privileges can be granted at several levels, and each table above corresponds to a privilege level. # Global level Global privileges apply to all databases on a given server. These privileges are stored in the USER table. GRANT ALL ON *.* and REVOKE ALL ON *.* grant and revoke only global privileges. GRANT ALL ON *.* TO 'someuser'; GRANT SELECT, INSERT ON *.* TO 'someuser'; # Database level Database privileges apply to all objects in a given database. These privileges are stored in the DBS_PRIV table. GRANT ALL ON db_name.* and REVOKE ALL ON db_name.* grant and revoke only database privileges. GRANT ALL ON mydb.* TO 'someuser'; GRANT SELECT, INSERT ON mydb.* TO 'someuser'; Although we can't create DBs currently, this reserves a place until hive supports it. # Table level Table privileges apply to all columns in a given table. These privileges are stored in the TABLES_PRIV table. GRANT ALL ON db_name.tbl_name and REVOKE ALL ON db_name.tbl_name grant and revoke only table privileges. GRANT ALL ON mydb.mytbl TO 'someuser'; GRANT SELECT, INSERT ON mydb.mytbl TO 'someuser'; Hive account information is stored in the USER table, including username, password, and privileges. A user who has been granted any privilege, such as select/insert/drop on a particular table, always has the right to show that table.
Authentication infrastructure for Hive -- Key: HIVE-78 URL: https://issues.apache.org/jira/browse/HIVE-78 Project: Hadoop Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Ashish Thusoo Assignee: Edward Capriolo Attachments: hive-78.diff Allow hive to integrate with existing user repositories for authentication and authorization information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
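The three-level lookup proposed above (USER, then DBS_PRIV, then TABLES_PRIV, with privileges at a broader level implying them at narrower ones) can be sketched roughly as follows. This is an illustrative sketch, not the patch's actual metastore code; the class and method names are assumptions.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the proposed three-level privilege check:
// global (USER table), then database (DBS_PRIV), then table (TABLES_PRIV).
public class PrivilegeChecker {
    private final Set<String> global = new HashSet<>();                // global privs
    private final Map<String, Set<String>> dbPrivs = new HashMap<>();  // db -> privs
    private final Map<String, Set<String>> tblPrivs = new HashMap<>(); // "db.tbl" -> privs

    public void grantGlobal(String priv) { global.add(priv); }

    public void grantDb(String db, String priv) {
        dbPrivs.computeIfAbsent(db, k -> new HashSet<>()).add(priv);
    }

    public void grantTable(String db, String tbl, String priv) {
        tblPrivs.computeIfAbsent(db + "." + tbl, k -> new HashSet<>()).add(priv);
    }

    // A privilege granted at a broader level implies it at the narrower one,
    // and ALL implies every individual privilege, mirroring GRANT ALL ON *.*.
    public boolean canAccess(String db, String tbl, String priv) {
        if (global.contains("ALL") || global.contains(priv)) return true;
        Set<String> dp = dbPrivs.getOrDefault(db, Set.of());
        if (dp.contains("ALL") || dp.contains(priv)) return true;
        Set<String> tp = tblPrivs.getOrDefault(db + "." + tbl, Set.of());
        return tp.contains("ALL") || tp.contains(priv);
    }
}
```

In this sketch GRANT SELECT ON mydb.* maps to grantDb("mydb", "SELECT"), so the check falls through the three maps in the same order the comment lists the tables.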
[jira] Updated: (HIVE-78) Authentication infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-78: - Attachment: hive-78-syntax-v1.patch Authentication infrastructure for Hive -- Key: HIVE-78 URL: https://issues.apache.org/jira/browse/HIVE-78 Project: Hadoop Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Ashish Thusoo Assignee: Edward Capriolo Attachments: hive-78-syntax-v1.patch, hive-78.diff Allow hive to integrate with existing user repositories for authentication and authorization information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-78) Authentication infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12755882#action_12755882 ] Min Zhou commented on HIVE-78: -- We currently use separate mysql dbs to achieve an isolated CLI environment, which is not practical. An authentication infrastructure is urgently needed for us. Almost all statements would be influenced, for example SELECT INSERT SHOW TABLES SHOW PARTITIONS DESCRIBE TABLE MSCK CREATE TABLE CREATE FUNCTION -- we are considering how to control people creating udfs. DROP TABLE DROP FUNCTION LOAD along with GRANT/REVOKE themselves, and CREATE USER/DROP USER/SET PASSWORD. Even some non-sql commands like set, add file, add jar are included. Authentication infrastructure for Hive -- Key: HIVE-78 URL: https://issues.apache.org/jira/browse/HIVE-78 Project: Hadoop Hive Issue Type: New Feature Components: Server Infrastructure Reporter: Ashish Thusoo Assignee: Edward Capriolo Attachments: hive-78-syntax-v1.patch, hive-78.diff Allow hive to integrate with existing user repositories for authentication and authorization information. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-818) Create a Hive CLI that connects to hive ThriftServer
[ https://issues.apache.org/jira/browse/HIVE-818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752852#action_12752852 ] Min Zhou commented on HIVE-818: --- This feature looks pretty good to us; we were looking for a CLI-mode client for the hive server. Create a Hive CLI that connects to hive ThriftServer Key: HIVE-818 URL: https://issues.apache.org/jira/browse/HIVE-818 Project: Hadoop Hive Issue Type: New Feature Components: Clients, Server Infrastructure Reporter: Edward Capriolo Assignee: Edward Capriolo We should have an alternate CLI that works by interacting with the HiveServer; in this way it will be ready when/if we deprecate the current CLI. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-814) exception alter a int typed column to date/datetime/timestamp
exception altering an int typed column to date/datetime/timestamp - Key: HIVE-814 URL: https://issues.apache.org/jira/browse/HIVE-814 Project: Hadoop Hive Issue Type: Bug Reporter: Min Zhou As far as I know, time types can only be used in partitions; normal columns are not allowed to be set to those types. But it turns out that a non-time-typed column can be altered to date/datetime/timestamp, and exceptions will be thrown when describing it. hive> create table pokes(foo int, bar string); OK Time taken: 0.894 seconds hive> alter table pokes replace columns(foo date, bar string); OK Time taken: 0.266 seconds hive> describe pokes; FAILED: Error in metadata: MetaException(message:java.lang.IllegalArgumentException Error: type expected at the position 0 of 'date:string' but 'date' is found.) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-607) Create statistical UDFs.
[ https://issues.apache.org/jira/browse/HIVE-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12736473#action_12736473 ] Min Zhou commented on HIVE-607: --- @Namit I implemented group_cat() in a rush, and found some things difficult to solve: 1. function group_cat() has an internal order by clause; currently, we can't do such an aggregation in hive. 2. when the string to be group-concatenated is too large, in other words, when data skew appears, there is often not enough memory to store such a big string. Create statistical UDFs. Key: HIVE-607 URL: https://issues.apache.org/jira/browse/HIVE-607 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: S. Alex Smith Assignee: Emil Ibrishimov Priority: Minor Fix For: 0.4.0 Attachments: HIVE-607.1.patch, UDAFStddev.java Create UDFs replicating: STD() Return the population standard deviation STDDEV_POP()(v5.0.3) Return the population standard deviation STDDEV_SAMP()(v5.0.3) Return the sample standard deviation STDDEV() Return the population standard deviation SUM() Return the sum VAR_POP()(v5.0.3) Return the population standard variance VAR_SAMP()(v5.0.3)Return the sample variance VARIANCE()(v4.1) Return the population standard variance as found at http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-607) Create statistical UDFs.
[ https://issues.apache.org/jira/browse/HIVE-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12736475#action_12736475 ] Min Zhou commented on HIVE-607: --- sorry, some typos @Namit I've implemented group_cat() in a rush, and found some things difficult to solve: 1. function group_cat() has an internal order by clause; currently, we can't implement such an aggregation in hive. 2. when the strings to be group-concatenated are too large, in other words, if data skew appears, there is often not enough memory to store such a big result. Create statistical UDFs. Key: HIVE-607 URL: https://issues.apache.org/jira/browse/HIVE-607 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: S. Alex Smith Assignee: Emil Ibrishimov Priority: Minor Fix For: 0.4.0 Attachments: HIVE-607.1.patch, UDAFStddev.java Create UDFs replicating: STD() Return the population standard deviation STDDEV_POP()(v5.0.3) Return the population standard deviation STDDEV_SAMP()(v5.0.3) Return the sample standard deviation STDDEV() Return the population standard deviation SUM() Return the sum VAR_POP()(v5.0.3) Return the population standard variance VAR_SAMP()(v5.0.3)Return the sample variance VARIANCE()(v4.1) Return the population standard variance as found at http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-702) DROP TEMPORARY FUNCTION should not drop builtin functions
[ https://issues.apache.org/jira/browse/HIVE-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-702: -- Attachment: HIVE-702.1.patch patch DROP TEMPORARY FUNCTION should not drop builtin functions - Key: HIVE-702 URL: https://issues.apache.org/jira/browse/HIVE-702 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Min Zhou Attachments: HIVE-702.1.patch Only temporary functions should be dropped. It should error out if the user tries to drop built-in functions. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-702) DROP TEMPORARY FUNCTION should not drop builtin functions
[ https://issues.apache.org/jira/browse/HIVE-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12736762#action_12736762 ] Min Zhou commented on HIVE-702: --- pls wait a moment, I haven't deal with the conflict you mentioned. DROP TEMPORARY FUNCTION should not drop builtin functions - Key: HIVE-702 URL: https://issues.apache.org/jira/browse/HIVE-702 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Min Zhou Attachments: HIVE-702.1.patch Only temporary functions should be dropped. It should error out if the user tries to drop built-in functions. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-702) DROP TEMPORARY FUNCTION should not drop builtin functions
[ https://issues.apache.org/jira/browse/HIVE-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-702: -- Attachment: HIVE-702.2.patch done DROP TEMPORARY FUNCTION should not drop builtin functions - Key: HIVE-702 URL: https://issues.apache.org/jira/browse/HIVE-702 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Min Zhou Attachments: HIVE-702.1.patch, HIVE-702.2.patch Only temporary functions should be dropped. It should error out if the user tries to drop built-in functions. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-702) DROP TEMPORARY FUNCTION should not drop builtin functions
[ https://issues.apache.org/jira/browse/HIVE-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12736767#action_12736767 ] Min Zhou commented on HIVE-702: --- that patch hasn't been tested, because I'm at home and cannot connect to the company's vpn. DROP TEMPORARY FUNCTION should not drop builtin functions - Key: HIVE-702 URL: https://issues.apache.org/jira/browse/HIVE-702 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Min Zhou Attachments: HIVE-702.1.patch, HIVE-702.2.patch Only temporary functions should be dropped. It should error out if the user tries to drop built-in functions. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-700) Fix test error by adding DROP FUNCTION
[ https://issues.apache.org/jira/browse/HIVE-700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou reassigned HIVE-700: - Assignee: Min Zhou Fix test error by adding DROP FUNCTION Key: HIVE-700 URL: https://issues.apache.org/jira/browse/HIVE-700 Project: Hadoop Hive Issue Type: Bug Reporter: Zheng Shao Assignee: Min Zhou Since we added Show Functions in HIVE-580, test results will depend on what temporary functions are added to the system. We should add the capability of DROP FUNCTION, and do that at the end of those create function tests to make sure the show functions results are deterministic. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-649) [UDF] now() for getting current time
[ https://issues.apache.org/jira/browse/HIVE-649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-649: -- Attachment: HIVE-649.patch patch [UDF] now() for getting current time Key: HIVE-649 URL: https://issues.apache.org/jira/browse/HIVE-649 Project: Hadoop Hive Issue Type: New Feature Reporter: Min Zhou Attachments: HIVE-649.patch http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_now -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-700) Fix test error by adding DROP FUNCTION
[ https://issues.apache.org/jira/browse/HIVE-700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-700: -- Attachment: HIVE-700.1.patch usage: drop function function_name Fix test error by adding DROP FUNCTION Key: HIVE-700 URL: https://issues.apache.org/jira/browse/HIVE-700 Project: Hadoop Hive Issue Type: Bug Reporter: Zheng Shao Assignee: Min Zhou Attachments: HIVE-700.1.patch Since we added Show Functions in HIVE-580, test results will depend on what temporary functions are added to the system. We should add the capability of DROP FUNCTION, and do that at the end of those create function tests to make sure the show functions results are deterministic. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-700) Fix test error by adding DROP FUNCTION
[ https://issues.apache.org/jira/browse/HIVE-700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12736435#action_12736435 ] Min Zhou commented on HIVE-700: --- Sorry for being late; we have a training today. I will upload a new patch for the hive-700 related jiras. Fix test error by adding DROP FUNCTION Key: HIVE-700 URL: https://issues.apache.org/jira/browse/HIVE-700 Project: Hadoop Hive Issue Type: Bug Reporter: Zheng Shao Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-700.1.patch, hive.700.2.patch Since we added Show Functions in HIVE-580, test results will depend on what temporary functions are added to the system. We should add the capability of DROP FUNCTION, and do that at the end of those create function tests to make sure the show functions results are deterministic. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-702) DROP TEMPORARY FUNCTION should not drop builtin functions
[ https://issues.apache.org/jira/browse/HIVE-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou reassigned HIVE-702: - Assignee: Min Zhou DROP TEMPORARY FUNCTION should not drop builtin functions - Key: HIVE-702 URL: https://issues.apache.org/jira/browse/HIVE-702 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Min Zhou Only temporary functions should be dropped. It should error out if the user tries to drop built-in functions. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-642) udf equivalent to string split
[ https://issues.apache.org/jira/browse/HIVE-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12733641#action_12733641 ] Min Zhou commented on HIVE-642: --- It's very useful for us. Some comments: # Can you implement it directly with Text? Avoiding string decoding and encoding would be faster. Of course that trick may lead to another problem, as String.split uses a regular expression for splitting. # getDisplayString() always returns a string in lowercase. udf equivalent to string split -- Key: HIVE-642 URL: https://issues.apache.org/jira/browse/HIVE-642 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Emil Ibrishimov Fix For: 0.4.0 Attachments: HIVE-642.1.patch, HIVE-642.2.patch It would be very useful to have a function equivalent to string split in java -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
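The first suggestion above, splitting on raw bytes without decoding to String, could look roughly like this. It is a sketch only: byte[] stands in for org.apache.hadoop.io.Text so the example needs no hadoop dependency, and it handles only a single-byte delimiter, unlike the regular expressions String.split supports.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: split UTF-8 bytes on a single-byte delimiter without decoding
// to String first. byte[] stands in for org.apache.hadoop.io.Text here.
public class ByteSplit {
    public static List<byte[]> split(byte[] data, byte delim) {
        List<byte[]> parts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == delim) {
                parts.add(Arrays.copyOfRange(data, start, i));
                start = i + 1;
            }
        }
        // Emit the final field (possibly empty), unlike String.split,
        // which drops trailing empty strings by default.
        parts.add(Arrays.copyOfRange(data, start, data.length));
        return parts;
    }
}
```

Note the behavioral difference flagged in the comments: keeping trailing empty fields and restricting the delimiter to one byte are both deliberate simplifications of what String.split does.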
[jira] Commented: (HIVE-599) Embedded Hive SQL into Python
[ https://issues.apache.org/jira/browse/HIVE-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12733429#action_12733429 ] Min Zhou commented on HIVE-599: --- I agree with Namit and Yongqiang. I was thinking about creating functions with a format like the one below: {noformat} create function function_name (arguments list ) as python { python udf code } create function function_name (arguments list ) as java { java udf code } {noformat} We can dynamically compile those kinds of code above, using jython and com.sun.tools.javac respectively. It's better to store the python or java udf byte code in the persistent metastore (typically mysql) after creation. Then we can call that function again w/o a second function creation. Embedded Hive SQL into Python - Key: HIVE-599 URL: https://issues.apache.org/jira/browse/HIVE-599 Project: Hadoop Hive Issue Type: New Feature Reporter: Ashish Thusoo Assignee: Ashish Thusoo While Hive does SQL it would be very powerful to be able to embed that SQL in languages like python in such a way that the hive query is also able to invoke python functions seamlessly. One possibility is to explore integration with Dumbo. Another is to see if the internal map_reduce.py tool can be open sourced as a Hive contrib. Other thoughts? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
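The java side of the dynamic-compilation idea above can be sketched with the JDK's standard javax.tools API, which fronts the same compiler as com.sun.tools.javac. This is an illustrative sketch, not the proposed Hive code; the class name and generated source are made up, and it assumes a full JDK (not a bare JRE) is available at runtime.

```java
import java.io.File;
import java.io.FileWriter;
import java.nio.file.Files;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Sketch: compile a generated udf source string at runtime using the
// JDK's built-in compiler. The resulting .class file could then be
// loaded (and, per the comment, its bytes stored in the metastore).
public class DynamicCompile {
    public static boolean compile(String className, String source) throws Exception {
        File dir = Files.createTempDirectory("udf-compile").toFile();
        File src = new File(dir, className + ".java");
        try (FileWriter w = new FileWriter(src)) {
            w.write(source);
        }
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        if (javac == null) {
            // Only a JRE is present; no compiler available.
            throw new IllegalStateException("JDK compiler not available");
        }
        // run() returns 0 on success; the .class file lands next to the source.
        return javac.run(null, null, null, src.getPath()) == 0;
    }
}
```

Loading the compiled class back would take a URLClassLoader pointed at the temp directory; storing the .class bytes in the metastore, as the comment suggests, would let later sessions skip this compile step entirely.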
[jira] Commented: (HIVE-512) [GenericUDF] new string function ELT(N,str1,str2,str3,...)
[ https://issues.apache.org/jira/browse/HIVE-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12733015#action_12733015 ] Min Zhou commented on HIVE-512: --- can you answer me about these queries? [GenericUDF] new string function ELT(N,str1,str2,str3,...) --- Key: HIVE-512 URL: https://issues.apache.org/jira/browse/HIVE-512 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-512.2.patch, HIVE-512.patch ELT(N,str1,str2,str3,...) Returns str1 if N = 1, str2 if N = 2, and so on. Returns NULL if N is less than 1 or greater than the number of arguments. ELT() is the complement of FIELD(). {noformat} mysql> SELECT ELT(1, 'ej', 'Heja', 'hej', 'foo'); -> 'ej' mysql> SELECT ELT(4, 'ej', 'Heja', 'hej', 'foo'); -> 'foo' {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-512) [GenericUDF] new string function ELT(N,str1,str2,str3,...)
[ https://issues.apache.org/jira/browse/HIVE-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12733016#action_12733016 ] Min Zhou commented on HIVE-512: --- select(1, '2', 3) select(2, '2', 3) select(1, true, 3) select(2, 2.0, cast(3 as double)) If we don't uniformly return strings, it would be confusing for users to determine which type will be returned. [GenericUDF] new string function ELT(N,str1,str2,str3,...) --- Key: HIVE-512 URL: https://issues.apache.org/jira/browse/HIVE-512 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-512.2.patch, HIVE-512.patch ELT(N,str1,str2,str3,...) Returns str1 if N = 1, str2 if N = 2, and so on. Returns NULL if N is less than 1 or greater than the number of arguments. ELT() is the complement of FIELD(). {noformat} mysql> SELECT ELT(1, 'ej', 'Heja', 'hej', 'foo'); -> 'ej' mysql> SELECT ELT(4, 'ej', 'Heja', 'hej', 'foo'); -> 'foo' {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-512) [GenericUDF] new string function ELT(N,str1,str2,str3,...)
[ https://issues.apache.org/jira/browse/HIVE-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12733103#action_12733103 ] Min Zhou commented on HIVE-512: --- If you inspect the implementation of CASE, you will see that it does not accept arguments of different types. See: GenericUDFCase.java, GenericUDFWhen.java {code} hive> select case when true then '2' else 3 end from pokes limit 1; FAILED: Error in semantic analysis: line 1:36 Argument Type Mismatch 3: The expression after ELSE should have the same type as those after THEN: string is expected but int is found {code} ELT is a string function; confusion will be caused if we casually change its behavior. There is no need to make things more complex. [GenericUDF] new string function ELT(N,str1,str2,str3,...) --- Key: HIVE-512 URL: https://issues.apache.org/jira/browse/HIVE-512 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-512.2.patch, HIVE-512.patch ELT(N,str1,str2,str3,...) Returns str1 if N = 1, str2 if N = 2, and so on. Returns NULL if N is less than 1 or greater than the number of arguments. ELT() is the complement of FIELD(). {noformat} mysql> SELECT ELT(1, 'ej', 'Heja', 'hej', 'foo'); -> 'ej' mysql> SELECT ELT(4, 'ej', 'Heja', 'hej', 'foo'); -> 'foo' {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-512) [GenericUDF] new string function ELT(N,str1,str2,str3,...)
[ https://issues.apache.org/jira/browse/HIVE-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12732911#action_12732911 ] Min Zhou commented on HIVE-512: --- Here is the definition of ELT: return the string at the given index number. It's essentially a string function. select elt(1, 2, 3) will return a varbinary in MySQL, rather than an int. I still insist that returning string is better. Even if we do it as you said, what result types would be returned for queries like the ones below? select(1, '2', 3) select(2, '2', 3) select(1, true, 3) select(2, 2.0, cast(3 as double)) [GenericUDF] new string function ELT(N,str1,str2,str3,...) --- Key: HIVE-512 URL: https://issues.apache.org/jira/browse/HIVE-512 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-512.2.patch, HIVE-512.patch ELT(N,str1,str2,str3,...) Returns str1 if N = 1, str2 if N = 2, and so on. Returns NULL if N is less than 1 or greater than the number of arguments. ELT() is the complement of FIELD(). {noformat} mysql> SELECT ELT(1, 'ej', 'Heja', 'hej', 'foo'); -> 'ej' mysql> SELECT ELT(4, 'ej', 'Heja', 'hej', 'foo'); -> 'foo' {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-512) [GenericUDF] new string function ELT(N,str1,str2,str3,...)
[ https://issues.apache.org/jira/browse/HIVE-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12732828#action_12732828 ] Min Zhou commented on HIVE-512: --- Actually, ELT returns only two result types in MySQL: varbinary and varchar. varchar is returned if all arguments are varchars; otherwise varbinary is returned. mysql> create table t3 as select elt(1, 'a', 3); Query OK, 1 row affected (0.01 sec) Records: 1 Duplicates: 0 Warnings: 0 mysql> describe t3; | Field | Type | Null | Key | Default | Extra | | elt(1, 'a', 3) | varbinary(1) | YES | | NULL | | 1 row in set (0.00 sec) mysql> create table t4 as select elt(1, true, false); Query OK, 1 row affected (0.00 sec) Records: 1 Duplicates: 0 Warnings: 0 mysql> describe t4; | Field | Type | Null | Key | Default | Extra | | elt(1, true, false) | varbinary(1) | YES | | NULL | | 1 row in set (0.00 sec) mysql> create table t5 as select elt(1, 2.0, false); Query OK, 1 row affected (0.01 sec) Records: 1 Duplicates: 0 Warnings: 0 mysql> describe t5; | Field | Type | Null | Key | Default | Extra | | elt(1, 2.0, false) | varbinary(4) | YES | | NULL | | 1 row in set (0.00 sec) Based on the above, I think it is better to return string, as binary is not commonly used in Hive. [GenericUDF] new string function ELT(N,str1,str2,str3,...) --- Key: HIVE-512 URL: https://issues.apache.org/jira/browse/HIVE-512 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-512.2.patch, HIVE-512.patch ELT(N,str1,str2,str3,...) Returns str1 if N = 1, str2 if N = 2, and so on. Returns NULL if N is less than 1 or greater than the number of arguments. ELT() is the complement of FIELD(). 
{noformat} mysql> SELECT ELT(1, 'ej', 'Heja', 'hej', 'foo'); -> 'ej' mysql> SELECT ELT(4, 'ej', 'Heja', 'hej', 'foo'); -> 'foo' {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
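As an aside, the ELT semantics under discussion (1-based index, uniform string return type, NULL when N is out of range) can be sketched in plain Java. This is an illustration only, not Hive's actual GenericUDF implementation:

```java
// Minimal, Hive-independent sketch of ELT(N, str1, str2, ...):
// 1-based index, uniform String return type, null when N is out of range.
public class EltSketch {
    public static String elt(int n, String... args) {
        if (n < 1 || n > args.length) {
            return null; // mirrors MySQL: NULL when N is out of range
        }
        return args[n - 1];
    }

    public static void main(String[] argv) {
        System.out.println(elt(1, "ej", "Heja", "hej", "foo")); // ej
        System.out.println(elt(4, "ej", "Heja", "hej", "foo")); // foo
    }
}
```

Returning a uniform String sidesteps the mixed-argument type questions raised in the comments, since every argument would be converted to string before lookup.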
[jira] Commented: (HIVE-541) Implement UDFs: INSTR and LOCATE
[ https://issues.apache.org/jira/browse/HIVE-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12731858#action_12731858 ] Min Zhou commented on HIVE-541: --- all test cases passed on my side, how's yours? Implement UDFs: INSTR and LOCATE Key: HIVE-541 URL: https://issues.apache.org/jira/browse/HIVE-541 Project: Hadoop Hive Issue Type: New Feature Affects Versions: 0.4.0 Reporter: Zheng Shao Assignee: Min Zhou Attachments: HIVE-541.1.patch, HIVE-541.2.patch http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_instr http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_locate These functions can be directly implemented with Text (instead of String). This will make the test of whether one string contains another string much faster. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HIVE-515) [UDF] new string function INSTR(str,substr)
[ https://issues.apache.org/jira/browse/HIVE-515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou resolved HIVE-515. --- Resolution: Duplicate duplicates [#HIVE-541] [UDF] new string function INSTR(str,substr) --- Key: HIVE-515 URL: https://issues.apache.org/jira/browse/HIVE-515 Project: Hadoop Hive Issue Type: New Feature Reporter: Min Zhou Assignee: Min Zhou Attachments: HIVE-515-2.patch, HIVE-515.patch UDF for string function INSTR(str,substr) This extends the function from MySQL http://dev.mysql.com/doc/refman/5.1/en/string-functions.html#function_instr usage: INSTR(str, substr) INSTR(str, substr, start) example: {code:sql} select instr('abcd', 'abc') from pokes; // all results are '1' select instr('abcabc', 'ccc') from pokes; // all results are '0' select instr('abcabc', 'abc', 2) from pokes; // all results are '4' {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
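The INSTR semantics shown in the examples above (1-based result position, 0 when the substring is not found, optional 1-based start offset) can be sketched in plain Java. This is an illustration of the described behavior, not the attached patch:

```java
// Sketch of INSTR(str, substr[, start]) as described in the examples:
// returns the 1-based position of the first match at or after 'start',
// or 0 when substr does not occur.
public class InstrSketch {
    public static int instr(String str, String substr) {
        return instr(str, substr, 1);
    }

    public static int instr(String str, String substr, int start) {
        // String.indexOf is 0-based; convert to and from 1-based positions.
        int idx = str.indexOf(substr, start - 1);
        return idx < 0 ? 0 : idx + 1;
    }
}
```

The three queries in the example map directly onto this sketch: instr("abcd", "abc") is 1, instr("abcabc", "ccc") is 0, and instr("abcabc", "abc", 2) is 4.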
[jira] Created: (HIVE-649) [UDF] now() for getting current time
[UDF] now() for getting current time Key: HIVE-649 URL: https://issues.apache.org/jira/browse/HIVE-649 Project: Hadoop Hive Issue Type: New Feature Reporter: Min Zhou http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_now -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
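MySQL's NOW(), linked above, returns the current date and time formatted as 'YYYY-MM-DD HH:MM:SS'. A minimal sketch of that formatting in plain Java (the class name and approach are hypothetical, not the eventual UDF):

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch of NOW()-style output: format the current time the
// way MySQL's NOW() does, as 'YYYY-MM-DD HH:MM:SS'.
public class NowSketch {
    private static final DateTimeFormatter FMT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    public static String now() {
        return LocalDateTime.now().format(FMT);
    }
}
```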
[jira] Commented: (HIVE-541) Implement UDFs: INSTR and LOCATE
[ https://issues.apache.org/jira/browse/HIVE-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12731764#action_12731764 ] Min Zhou commented on HIVE-541: --- Hmm, it may be a good way. I will try it soon. Implement UDFs: INSTR and LOCATE Key: HIVE-541 URL: https://issues.apache.org/jira/browse/HIVE-541 Project: Hadoop Hive Issue Type: New Feature Affects Versions: 0.4.0 Reporter: Zheng Shao Assignee: Min Zhou Attachments: HIVE-541.1.patch http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_instr http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_locate These functions can be directly implemented with Text (instead of String). This will make the test of whether one string contains another string much faster. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-541) Implement UDFs: INSTR and LOCATE
[ https://issues.apache.org/jira/browse/HIVE-541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-541: -- Attachment: HIVE-541.2.patch Added GenericUDFUtils.findText(), which avoids string encoding and decoding for faster execution. Implement UDFs: INSTR and LOCATE Key: HIVE-541 URL: https://issues.apache.org/jira/browse/HIVE-541 Project: Hadoop Hive Issue Type: New Feature Affects Versions: 0.4.0 Reporter: Zheng Shao Assignee: Min Zhou Attachments: HIVE-541.1.patch, HIVE-541.2.patch http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_instr http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_locate These functions can be directly implemented with Text (instead of String). This will make the test of whether one string contains another string much faster. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
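The decode-avoiding search this update describes can be sketched as a naive scan over UTF-8 bytes: since Hadoop's Text stores UTF-8 bytes, matching raw bytes finds a substring without materializing Strings. This is an illustrative sketch of the idea, not the actual findText() in the patch:

```java
import java.nio.charset.StandardCharsets;

// Sketch of byte-level substring search, the idea behind implementing
// INSTR/LOCATE over Text: compare the raw UTF-8 bytes directly instead of
// decoding them into Strings first. Naive O(n*m) scan for illustration.
public class ByteSearch {
    /** Returns the byte offset of the first match at or after 'from', or -1. */
    public static int find(byte[] haystack, byte[] needle, int from) {
        outer:
        for (int i = from; i + needle.length <= haystack.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) continue outer;
            }
            return i;
        }
        return -1;
    }

    public static int find(String haystack, String needle) {
        return find(haystack.getBytes(StandardCharsets.UTF_8),
                    needle.getBytes(StandardCharsets.UTF_8), 0);
    }
}
```

Note that for multi-byte characters the result is a byte offset, not a character offset, so a UDF returning character positions would still need to count codepoints up to the match.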
[jira] Assigned: (HIVE-329) start and stop hive thrift server in daemon mode
[ https://issues.apache.org/jira/browse/HIVE-329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou reassigned HIVE-329: - Assignee: Min Zhou start and stop hive thrift server in daemon mode - Key: HIVE-329 URL: https://issues.apache.org/jira/browse/HIVE-329 Project: Hadoop Hive Issue Type: New Feature Components: Server Infrastructure Affects Versions: 0.3.0 Reporter: Min Zhou Assignee: Min Zhou Attachments: daemon.patch I wrote two shell scripts to start and stop the Hive Thrift server more conveniently. usage: bin/hive --service start-hive [HIVE_PORT] bin/hive --service stop-hive -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-555) create temporary function support not only udf, but also udaf, genericudf, etc.
[ https://issues.apache.org/jira/browse/HIVE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-555: -- Attachment: HIVE-555-4.patch Added a copy of UDAF to avoid [HIVE-620|http://issues.apache.org/jira/browse/HIVE-620], so that all test cases pass. create temporary function support not only udf, but also udaf, genericudf, etc. Key: HIVE-555 URL: https://issues.apache.org/jira/browse/HIVE-555 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-555-1.patch, HIVE-555-2.patch, HIVE-555-3.patch, HIVE-555-4.patch Right now, the command 'create temporary function' only supports UDFs. We can also let users write their own UDAFs and generic UDFs, and generic UDAFs in the future. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-618) More human-readable error prompt of FunctionTask
More human-readable error prompt of FunctionTask Key: HIVE-618 URL: https://issues.apache.org/jira/browse/HIVE-618 Project: Hadoop Hive Issue Type: Improvement Reporter: Min Zhou current prompt: {noformat} FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask {noformat} Zheng suggested that somethings like below would be better {noformat} Class not found Class does not implement UDF, GenericUDF, or UDAF {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Work started: (HIVE-512) [GenericUDF] new string function ELT(N,str1,str2,str3,...)
[ https://issues.apache.org/jira/browse/HIVE-512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-512 started by Min Zhou. [GenericUDF] new string function ELT(N,str1,str2,str3,...) --- Key: HIVE-512 URL: https://issues.apache.org/jira/browse/HIVE-512 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-512.patch ELT(N,str1,str2,str3,...) Returns str1 if N = 1, str2 if N = 2, and so on. Returns NULL if N is less than 1 or greater than the number of arguments. ELT() is the complement of FIELD(). {noformat} mysql> SELECT ELT(1, 'ej', 'Heja', 'hej', 'foo'); -> 'ej' mysql> SELECT ELT(4, 'ej', 'Heja', 'hej', 'foo'); -> 'foo' {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-555) create temporary function support not only udf, but also udaf, genericudf, etc.
[ https://issues.apache.org/jira/browse/HIVE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-555: -- Attachment: HIVE-555-2.patch with unit tests. create temporary function support not only udf, but also udaf, genericudf, etc. Key: HIVE-555 URL: https://issues.apache.org/jira/browse/HIVE-555 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-555-1.patch, HIVE-555-2.patch Right now, command 'create temporary function' only support udf. we can also let user write their udaf, generic udf, and write generic udaf in the future. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-555) create temporary function support not only udf, but also udaf, genericudf, etc.
[ https://issues.apache.org/jira/browse/HIVE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12728989#action_12728989 ] Min Zhou commented on HIVE-555: --- 1. I thought it would be a common function for generic UDF error prompts. 2. Is that required for an existing generic UDF? Regardless, I'll do it. create temporary function support not only udf, but also udaf, genericudf, etc. Key: HIVE-555 URL: https://issues.apache.org/jira/browse/HIVE-555 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-555-1.patch, HIVE-555-2.patch Right now, the command 'create temporary function' only supports UDFs. We can also let users write their own UDAFs and generic UDFs, and generic UDAFs in the future. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-555) create temporary function support not only udf, but also udaf, genericudf, etc.
[ https://issues.apache.org/jira/browse/HIVE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-555: -- Attachment: HIVE-555-3.patch patch followed namit's comments. create temporary function support not only udf, but also udaf, genericudf, etc. Key: HIVE-555 URL: https://issues.apache.org/jira/browse/HIVE-555 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-555-1.patch, HIVE-555-2.patch, HIVE-555-3.patch Right now, command 'create temporary function' only support udf. we can also let user write their udaf, generic udf, and write generic udaf in the future. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-555) create temporary function support not only udf, but also udaf, genericudf, etc.
[ https://issues.apache.org/jira/browse/HIVE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12729009#action_12729009 ] Min Zhou commented on HIVE-555: --- @Zheng It would involve some logic outside of FunctionTask. Actually, the execute methods of all Task classes are defined to return an integer status code. So it is better to create another JIRA for that issue. Agree? create temporary function support not only udf, but also udaf, genericudf, etc. Key: HIVE-555 URL: https://issues.apache.org/jira/browse/HIVE-555 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-555-1.patch, HIVE-555-2.patch, HIVE-555-3.patch Right now, the command 'create temporary function' only supports UDFs. We can also let users write their own UDAFs and generic UDFs, and generic UDAFs in the future. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-329) start and stop hive thrift server in daemon mode
[ https://issues.apache.org/jira/browse/HIVE-329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12729012#action_12729012 ] Min Zhou commented on HIVE-329: --- start needs a port number, but stop doesn't. start and stop hive thrift server in daemon mode - Key: HIVE-329 URL: https://issues.apache.org/jira/browse/HIVE-329 Project: Hadoop Hive Issue Type: New Feature Components: Server Infrastructure Affects Versions: 0.3.0 Reporter: Min Zhou Attachments: daemon.patch I wrote two shell scripts to start and stop the Hive Thrift server more conveniently. usage: bin/hive --service start-hive [HIVE_PORT] bin/hive --service stop-hive -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-537) Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map)
[ https://issues.apache.org/jira/browse/HIVE-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12727999#action_12727999 ] Min Zhou commented on HIVE-537: --- Zheng, how would you get a field value from an object without an ordinal? Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map) --- Key: HIVE-537 URL: https://issues.apache.org/jira/browse/HIVE-537 Project: Hadoop Hive Issue Type: New Feature Reporter: Zheng Shao Assignee: Min Zhou Attachments: HIVE-537.1.patch There are already some cases inside the code where we use heterogeneous data: JoinOperator and UnionOperator (in the sense that different parents can pass in records with different ObjectInspectors). We currently use the Operator's parentID to distinguish them. However, that approach does not extend to more complex plans that might be needed in the future. We will support the union type like this: {code} TypeDefinition: type: primitivetype | structtype | arraytype | maptype | uniontype uniontype: union<tag : type (, tag : type)*> Example: union<0:int,1:double,2:array<string>,3:struct<a:int,b:string>> Example of serialized data format: We will first store the tag byte before we serialize the object. On deserialization, we will first read out the tag byte; then we know the current type of the following object, so we can deserialize it successfully. Interface for ObjectInspector: interface UnionObjectInspector { /** Returns the array of OIs that are for each of the tags */ ObjectInspector[] getObjectInspectors(); /** Return the tag of the object. */ byte getTag(Object o); /** Return the field based on the tag value associated with the Object. */ Object getField(Object o); }; An example serialization format (using delimited format, with ' ' as first-level delimiter and '=' as second-level delimiter) userid:int,log:union<0:struct<touserid:int,message:string>,1:string> 123 1=login 123 0=243=helloworld 123 1=logout {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
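The tag-byte scheme described above (write the tag byte first, then the serialized value; on read, the tag tells you which type follows) can be sketched for a two-branch union. The types and stream layout here are illustrative assumptions, not Hive's actual serialized format:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Sketch of tag-byte union serialization for a hypothetical union<0:int,1:string>:
// the tag byte precedes the value, so the reader knows which branch to decode.
public class TaggedUnion {
    public static byte[] write(byte tag, Object value) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            out.writeByte(tag);                       // tag byte goes first
            if (tag == 0) out.writeInt((Integer) value);
            else out.writeUTF((String) value);
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static Object read(byte[] bytes) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
            byte tag = in.readByte();                 // read tag first to pick the type
            return tag == 0 ? (Object) in.readInt() : in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```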
[jira] Updated: (HIVE-537) Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map)
[ https://issues.apache.org/jira/browse/HIVE-537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-537: -- Attachment: HIVE-537.1.patch Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map) --- Key: HIVE-537 URL: https://issues.apache.org/jira/browse/HIVE-537 Project: Hadoop Hive Issue Type: New Feature Reporter: Zheng Shao Assignee: Zheng Shao Attachments: HIVE-537.1.patch There are already some cases inside the code where we use heterogeneous data: JoinOperator and UnionOperator (in the sense that different parents can pass in records with different ObjectInspectors). We currently use the Operator's parentID to distinguish them. However, that approach does not extend to more complex plans that might be needed in the future. We will support the union type like this: {code} TypeDefinition: type: primitivetype | structtype | arraytype | maptype | uniontype uniontype: union<tag : type (, tag : type)*> Example: union<0:int,1:double,2:array<string>,3:struct<a:int,b:string>> Example of serialized data format: We will first store the tag byte before we serialize the object. On deserialization, we will first read out the tag byte; then we know the current type of the following object, so we can deserialize it successfully. Interface for ObjectInspector: interface UnionObjectInspector { /** Returns the array of OIs that are for each of the tags */ ObjectInspector[] getObjectInspectors(); /** Return the tag of the object. */ byte getTag(Object o); /** Return the field based on the tag value associated with the Object. */ Object getField(Object o); }; An example serialization format (using delimited format, with ' ' as first-level delimiter and '=' as second-level delimiter) userid:int,log:union<0:struct<touserid:int,message:string>,1:string> 123 1=login 123 0=243=helloworld 123 1=logout {code} -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-537) Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map)
[ https://issues.apache.org/jira/browse/HIVE-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12725532#action_12725532 ] Min Zhou commented on HIVE-537: --- Even if UnionObjectInspector is implemented, DynamicSerDe does not seem to support a schema with a union type, which Thrift can't recognize. We must find a way to solve this; any suggestions? Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map) --- Key: HIVE-537 URL: https://issues.apache.org/jira/browse/HIVE-537 Project: Hadoop Hive Issue Type: New Feature Reporter: Zheng Shao Assignee: Zheng Shao There are already some cases inside the code where we use heterogeneous data: JoinOperator and UnionOperator (in the sense that different parents can pass in records with different ObjectInspectors). We currently use the Operator's parentID to distinguish them. However, that approach does not extend to more complex plans that might be needed in the future. We will support the union type like this: {code} TypeDefinition: type: primitivetype | structtype | arraytype | maptype | uniontype uniontype: union<tag : type (, tag : type)*> Example: union<0:int,1:double,2:array<string>,3:struct<a:int,b:string>> Example of serialized data format: We will first store the tag byte before we serialize the object. On deserialization, we will first read out the tag byte; then we know the current type of the following object, so we can deserialize it successfully. Interface for ObjectInspector: interface UnionObjectInspector { /** Returns the array of OIs that are for each of the tags */ ObjectInspector[] getObjectInspectors(); /** Return the tag of the object. */ byte getTag(Object o); /** Return the field based on the tag value associated with the Object. */ Object getField(Object o); }; An example serialization format (using delimited format, with ' ' as first-level delimiter and '=' as second-level delimiter) userid:int,log:union<0:struct<touserid:int,message:string>,1:string> 123 1=login 123 0=243=helloworld 123 1=logout {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-577) return correct comment of a column from ThriftHiveMetastore.Iface.get_fields
[ https://issues.apache.org/jira/browse/HIVE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12725564#action_12725564 ] Min Zhou commented on HIVE-577: --- Passed all test cases on Hadoop 0.17.0 - 0.19.1. return correct comment of a column from ThriftHiveMetastore.Iface.get_fields Key: HIVE-577 URL: https://issues.apache.org/jira/browse/HIVE-577 Project: Hadoop Hive Issue Type: Sub-task Reporter: Min Zhou Assignee: Min Zhou Attachments: HIVE-577.1.patch, HIVE-577.2.patch The comment of each column is not retrieved correctly right now; FieldSchema.getComment() returns a string from the deserializer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-577) return correct comment of a column from ThriftHiveMetastore.Iface.get_fields
[ https://issues.apache.org/jira/browse/HIVE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-577: -- Attachment: HIVE-577.1.patch can retrieve all columns' comments now. return correct comment of a column from ThriftHiveMetastore.Iface.get_fields Key: HIVE-577 URL: https://issues.apache.org/jira/browse/HIVE-577 Project: Hadoop Hive Issue Type: Sub-task Reporter: Min Zhou Assignee: Min Zhou Attachments: HIVE-577.1.patch comment of each column hasnot been retrieved correct right now , FieldSchema.getComment() will return a string from derserializer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-577) return correct comment of a column from ThriftHiveMetastore.Iface.get_fields
[ https://issues.apache.org/jira/browse/HIVE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-577: -- Attachment: HIVE-577.2.patch @Prasad I considered the case you mentioned before uploading that patch; I just didn't know the meaning of the code. This patch should address the issue. return correct comment of a column from ThriftHiveMetastore.Iface.get_fields Key: HIVE-577 URL: https://issues.apache.org/jira/browse/HIVE-577 Project: Hadoop Hive Issue Type: Sub-task Reporter: Min Zhou Assignee: Min Zhou Attachments: HIVE-577.1.patch, HIVE-577.2.patch The comment of each column is not retrieved correctly right now; FieldSchema.getComment() returns a string from the deserializer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-577) return correct comment of a column from ThriftHiveMetastore.Iface.get_fields
[ https://issues.apache.org/jira/browse/HIVE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12725450#action_12725450 ] Min Zhou commented on HIVE-577: --- I guess it's cumbersome to deal with custom tables through the API currently provided by Hive. The DDL for the schema should be changed from struct{ type1 col1, type2 col2} to some format like struct{ struct{type1 col1, string comment1}, struct{type2 col2, string comment2}}; however, MetaStoreUtils.getDDLFromFieldSchema(structName, fieldSchemas) is not used only for getSchema(table). return correct comment of a column from ThriftHiveMetastore.Iface.get_fields Key: HIVE-577 URL: https://issues.apache.org/jira/browse/HIVE-577 Project: Hadoop Hive Issue Type: Sub-task Reporter: Min Zhou Assignee: Min Zhou Attachments: HIVE-577.1.patch, HIVE-577.2.patch The comment of each column is not retrieved correctly right now; FieldSchema.getComment() returns a string from the deserializer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (HIVE-577) return correct comment of a column from ThriftHiveMetastore.Iface.get_fields
[ https://issues.apache.org/jira/browse/HIVE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12725450#action_12725450 ] Min Zhou edited comment on HIVE-577 at 6/29/09 8:15 PM: I guessed it's cumbersome to deal with custom tables from the api provided by hive currently. DDL for table schema should changed from struct{ type1 col1, type2 col2} to some format like struct{ struct{type1 col1, string comment1}, struct{type2 col2, string comment2}} however, MetaStoreUtils.getDDLFromFieldSchema(structName, fieldSchemas) is not only for getSchema(table). was (Author: coderplay): I guessed it's cumbersome to deal with custom tables from current api provided by hive currently. ddl for schema should changed from struct{ type1 col1, type2 col2} to some format like struct{ struct{type1 col1, string comment1}, struct{type2 col2, string comment2}} however, MetaStoreUtils.getDDLFromFieldSchema(structName, fieldSchemas) is not only for getSchema(table). return correct comment of a column from ThriftHiveMetastore.Iface.get_fields Key: HIVE-577 URL: https://issues.apache.org/jira/browse/HIVE-577 Project: Hadoop Hive Issue Type: Sub-task Reporter: Min Zhou Assignee: Min Zhou Attachments: HIVE-577.1.patch, HIVE-577.2.patch comment of each column hasnot been retrieved correct right now , FieldSchema.getComment() will return a string from derserializer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-577) return correct comment of a column from ThriftHiveMetastore.Iface.get_fields
[ https://issues.apache.org/jira/browse/HIVE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12725473#action_12725473 ] Min Zhou commented on HIVE-577: --- Any suggestions on this or accepting the 2nd patch, Prasad? return correct comment of a column from ThriftHiveMetastore.Iface.get_fields Key: HIVE-577 URL: https://issues.apache.org/jira/browse/HIVE-577 Project: Hadoop Hive Issue Type: Sub-task Reporter: Min Zhou Assignee: Min Zhou Attachments: HIVE-577.1.patch, HIVE-577.2.patch comment of each column hasnot been retrieved correct right now , FieldSchema.getComment() will return a string from derserializer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-537) Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map)
[ https://issues.apache.org/jira/browse/HIVE-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12724916#action_12724916 ] Min Zhou commented on HIVE-537: --- We've done a test on this issue. Dataset: 700M records. With the first approach, each distinct count takes 119 seconds, which means 10 distinct counts need at least 1190 seconds. With the second approach, where distinct keys are distinguished by a tag, 10 distinct counts take 148 seconds. Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map) --- Key: HIVE-537 URL: https://issues.apache.org/jira/browse/HIVE-537 Project: Hadoop Hive Issue Type: New Feature Reporter: Zheng Shao Assignee: Zheng Shao There are already some cases inside the code where we use heterogeneous data: JoinOperator and UnionOperator (in the sense that different parents can pass in records with different ObjectInspectors). We currently use the Operator's parentID to distinguish them. However, that approach does not extend to more complex plans that might be needed in the future. We will support the union type like this: {code} TypeDefinition: type: primitivetype | structtype | arraytype | maptype | uniontype uniontype: union<tag : type (, tag : type)*> Example: union<0:int,1:double,2:array<string>,3:struct<a:int,b:string>> Example of serialized data format: We will first store the tag byte before we serialize the object. On deserialization, we will first read out the tag byte; then we know the current type of the following object, so we can deserialize it successfully. Interface for ObjectInspector: interface UnionObjectInspector { /** Returns the array of OIs that are for each of the tags */ ObjectInspector[] getObjectInspectors(); /** Return the tag of the object. */ byte getTag(Object o); /** Return the field based on the tag value associated with the Object. */ Object getField(Object o); }; {code} -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
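The tag-byte serialization scheme described in HIVE-537 (store a tag byte first, then the value for that alternative, so the reader knows which type follows) can be sketched as follows. This is an illustrative example only, not Hive's actual serialization code; the class and method names are hypothetical, and it models a two-alternative `union<0:int,1:double>`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch of the tagged-union format: a tag byte, then the
// payload for that alternative. Models union<0:int,1:double>.
public class UnionSketch {

    // Serialize: tag byte first, then the value for that tag.
    static byte[] write(byte tag, Object value) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeByte(tag);
        if (tag == 0) {
            out.writeInt((Integer) value);
        } else if (tag == 1) {
            out.writeDouble((Double) value);
        } else {
            throw new IOException("unknown tag " + tag);
        }
        return buf.toByteArray();
    }

    // Deserialize: read the tag first, then dispatch on it to know
    // which type follows, exactly as the issue describes.
    static Object read(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        byte tag = in.readByte();
        switch (tag) {
            case 0: return in.readInt();
            case 1: return in.readDouble();
            default: throw new IOException("unknown tag " + tag);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(read(write((byte) 0, 42)));   // 42
        System.out.println(read(write((byte) 1, 2.5)));  // 2.5
    }
}
```

The tag byte costs one extra byte per record but lets a single stream carry heterogeneous records, which is what makes the second (tagged) approach in the benchmark above so much cheaper than running one job per distinct count.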
[jira] Commented: (HIVE-576) complete jdbc driver
[ https://issues.apache.org/jira/browse/HIVE-576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12724060#action_12724060 ] Min Zhou commented on HIVE-576: --- Done:
# removed all useless comments auto-generated by Eclipse
# added APL statements to each file
# fixed a bug where SemanticAnalyzer.getSchema() fails after select-all queries on tables that have partitions, i.e. queries like select * from tbl where partition_name=value
# implemented HiveResultSetMetadata, HiveDatabaseMetadata
# HiveResultSet now supports getXXX(columnName)
# removed JdbcSessionState, which has not been used
# supported SQL Explorer for manipulating Hive data through a GUI
To do: implement HivePreparedStatement, HiveCallableStatement
complete jdbc driver Key: HIVE-576 URL: https://issues.apache.org/jira/browse/HIVE-576 Project: Hadoop Hive Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-576.1.patch, HIVE-576.2.patch, sqlexplorer.jpg Hive only supports a few of the JDBC interfaces; let's complete it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
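Much of an implementation like the HiveResultSetMetadata mentioned above comes down to translating Hive type names into `java.sql.Types` constants for methods such as `getColumnType()`. A minimal sketch of that mapping, under the assumption that this is roughly what the driver needs to do (the class and method names here are hypothetical, not the actual driver code):

```java
import java.sql.Types;

// Hypothetical sketch: map Hive type names onto JDBC java.sql.Types
// constants, the kind of translation a ResultSetMetaData implementation
// needs for getColumnType(). Not the actual Hive driver code.
public class HiveTypeToJdbc {
    static int toSqlType(String hiveType) {
        switch (hiveType.toLowerCase()) {
            case "tinyint":  return Types.TINYINT;
            case "smallint": return Types.SMALLINT;
            case "int":      return Types.INTEGER;
            case "bigint":   return Types.BIGINT;
            case "float":    return Types.FLOAT;
            case "double":   return Types.DOUBLE;
            case "boolean":  return Types.BOOLEAN;
            case "string":   return Types.VARCHAR;
            // Complex types (array, map, struct) have no direct JDBC
            // equivalent; OTHER is one reasonable fallback.
            default:         return Types.OTHER;
        }
    }

    public static void main(String[] args) {
        System.out.println(toSqlType("int") == Types.INTEGER);    // true
        System.out.println(toSqlType("string") == Types.VARCHAR); // true
    }
}
```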
[jira] Updated: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-567: -- Attachment: (was: tables.jpg) jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Raghotham Murthy Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz Instead of trying to get a complete implementation of jdbc, its probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-567: -- Attachment: (was: sqlexplorer.jpg) jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Raghotham Murthy Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz Instead of trying to get a complete implementation of jdbc, its probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-576) complete jdbc driver
complete jdbc driver Key: HIVE-576 URL: https://issues.apache.org/jira/browse/HIVE-576 Project: Hadoop Hive Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Min Zhou Fix For: 0.4.0 Hive only supports a few of the JDBC interfaces; let's complete it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12723526#action_12723526 ] Min Zhou commented on HIVE-567: --- It's not elegant to get the schema from HiveServer by means of adding a function getFullDDLFromFieldSchema. jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Raghotham Murthy Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz Instead of trying to get a complete implementation of jdbc, it's probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-577) return correct comment of a column from ThriftHiveMetastore.Iface.get_fields
return correct comment of a column from ThriftHiveMetastore.Iface.get_fields Key: HIVE-577 URL: https://issues.apache.org/jira/browse/HIVE-577 Project: Hadoop Hive Issue Type: Sub-task Reporter: Min Zhou The comment of each column is not retrieved correctly right now; FieldSchema.getComment() will return a string from the deserializer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-573) TestHiveServer broken
[ https://issues.apache.org/jira/browse/HIVE-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12723841#action_12723841 ] Min Zhou commented on HIVE-573: --- Using JSON through Avro is a good way here, but it makes things more complex: SerDe (although it is not an RPC), Thrift, and Avro would be three duplications of the same work. TestHiveServer broken - Key: HIVE-573 URL: https://issues.apache.org/jira/browse/HIVE-573 Project: Hadoop Hive Issue Type: Bug Components: Server Infrastructure Reporter: Raghotham Murthy Assignee: Raghotham Murthy Fix For: 0.4.0 Attachments: hive-573.1.patch This was after the change to HIVE-567 was committed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Work started: (HIVE-577) return correct comment of a column from ThriftHiveMetastore.Iface.get_fields
[ https://issues.apache.org/jira/browse/HIVE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-577 started by Min Zhou. return correct comment of a column from ThriftHiveMetastore.Iface.get_fields Key: HIVE-577 URL: https://issues.apache.org/jira/browse/HIVE-577 Project: Hadoop Hive Issue Type: Sub-task Reporter: Min Zhou Assignee: Min Zhou The comment of each column is not retrieved correctly right now; FieldSchema.getComment() will return a string from the deserializer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-574) Hive should use ClassLoader from hadoop Configuration
[ https://issues.apache.org/jira/browse/HIVE-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12723874#action_12723874 ] Min Zhou commented on HIVE-574: --- +1 for Zheng, thanks. It worked fine here, nothing abnormal. Hive should use ClassLoader from hadoop Configuration - Key: HIVE-574 URL: https://issues.apache.org/jira/browse/HIVE-574 Project: Hadoop Hive Issue Type: Bug Affects Versions: 0.3.0, 0.3.1 Reporter: Zheng Shao Assignee: Zheng Shao Attachments: HIVE-574.1.patch, HIVE-574.2.patch, HIVE-574.3.patch See HIVE-338. Hive should always use the getClassByName method from hadoop Configuration, so that we choose the correct ClassLoader. Examples include all plug-in interfaces, including UDF/GenericUDF/UDAF, SerDe, and FileFormats. Basically the following code snippet shows the idea:
{code}
package org.apache.hadoop.conf;

public class Configuration implements Iterable<Map.Entry<String,String>> {
  ...
  /**
   * Load a class by name.
   *
   * @param name the class name.
   * @return the class object.
   * @throws ClassNotFoundException if the class is not found.
   */
  public Class<?> getClassByName(String name) throws ClassNotFoundException {
    return Class.forName(name, true, classLoader);
  }
{code}
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
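The idea behind `getClassByName` can be illustrated with plain JDK calls: load the class reflectively through an explicitly chosen ClassLoader instead of whichever loader happens to load the caller. A minimal standalone sketch (not Hadoop's actual code; the class name is made up for illustration):

```java
// Minimal illustration of loading a class through an explicitly chosen
// ClassLoader, the idea behind Configuration.getClassByName(). Preferring
// the thread context ClassLoader (with a fallback to this class's own
// loader) means classes from jars added at runtime to that context loader
// remain visible.
public class LoadByName {
    static Class<?> getClassByName(String name) throws ClassNotFoundException {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = LoadByName.class.getClassLoader();
        }
        // Class.forName with an explicit loader, instead of the caller's
        // defining loader that the one-argument overload would use.
        return Class.forName(name, true, loader);
    }

    public static void main(String[] args) throws ClassNotFoundException {
        System.out.println(getClassByName("java.lang.String").getName()); // java.lang.String
    }
}
```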
[jira] Updated: (HIVE-559) Support JDBC ResultSetMetadata
[ https://issues.apache.org/jira/browse/HIVE-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-559: -- Issue Type: Sub-task (was: New Feature) Parent: HIVE-576 Support JDBC ResultSetMetadata -- Key: HIVE-559 URL: https://issues.apache.org/jira/browse/HIVE-559 Project: Hadoop Hive Issue Type: Sub-task Components: Clients Reporter: Bill Graham Assignee: Min Zhou Support ResultSetMetadata for JDBC ResultSets. The getColumn* methods would be particularly useful, I'd expect: http://java.sun.com/javase/6/docs/api/java/sql/ResultSetMetaData.html The challenge, as I see it though, is that the JDBC client only has access to the raw query string and the result data when running in standalone mode. Therefore, it will need to get the column metadata one of two ways: 1. By parsing the query to determine the tables/columns involved and then making a request to the metastore to get the metadata for the columns. This certainly feels like duplicate work, since the query of course gets properly parsed on the server. 2. By returning the column metadata from the server. My thrift knowledge is limited, but I suspect adding this to the response would present other challenges. Any thoughts or suggestions? Option #1 feels clunkier, yet safer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-567: -- Attachment: sqlexplorer.jpg jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Raghotham Murthy Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz, result.jpg, sqlexplorer.jpg, tables.jpg Instead of trying to get a complete implementation of jdbc, its probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-338) Executing cli commands into thrift server
[ https://issues.apache.org/jira/browse/HIVE-338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12723400#action_12723400 ] Min Zhou commented on HIVE-338: --- Can you explain why you made the change to FunctionTask.java? It caused a java.lang.ClassNotFoundException when I executed my UDF. The ClassLoader did not work. Executing cli commands into thrift server - Key: HIVE-338 URL: https://issues.apache.org/jira/browse/HIVE-338 Project: Hadoop Hive Issue Type: Improvement Components: Server Infrastructure Affects Versions: 0.3.0 Reporter: Min Zhou Assignee: Zheng Shao Fix For: 0.4.0 Attachments: hive-338.final.patch, HIVE-338.postfix.1.patch, hiveserver-v1.patch, hiveserver-v2.patch, hiveserver-v3.patch Let the Thrift server support set, add/delete file/jar, and normal HSQL queries. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (HIVE-338) Executing cli commands into thrift server
[ https://issues.apache.org/jira/browse/HIVE-338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12723400#action_12723400 ] Min Zhou edited comment on HIVE-338 at 6/23/09 7:28 PM: Can you explain why you made the change to FunctionTask.java? It caused a java.lang.ClassNotFoundException when I executed my UDF, where the MR jobs were submitted by the Hive CLI. The ClassLoader did not work. was (Author: coderplay): Can you explain why you made the change to FunctionTask.java? It caused a java.lang.ClassNotFoundException when I executed my UDF. The ClassLoader did not work. Executing cli commands into thrift server - Key: HIVE-338 URL: https://issues.apache.org/jira/browse/HIVE-338 Project: Hadoop Hive Issue Type: Improvement Components: Server Infrastructure Affects Versions: 0.3.0 Reporter: Min Zhou Assignee: Zheng Shao Fix For: 0.4.0 Attachments: hive-338.final.patch, HIVE-338.postfix.1.patch, hiveserver-v1.patch, hiveserver-v2.patch, hiveserver-v3.patch Let the Thrift server support set, add/delete file/jar, and normal HSQL queries. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-567: -- Attachment: tables.jpg jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Raghotham Murthy Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz, tables.jpg Instead of trying to get a complete implementation of jdbc, its probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou reassigned HIVE-567: - Assignee: Min Zhou (was: Raghotham Murthy) jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Min Zhou Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz, result.jpg, tables.jpg Instead of trying to get a complete implementation of jdbc, its probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-567: -- Attachment: (was: result.jpg) jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Min Zhou Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz, tables.jpg Instead of trying to get a complete implementation of jdbc, its probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-567: -- Attachment: result.jpg jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Min Zhou Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz, result.jpg, tables.jpg Instead of trying to get a complete implementation of jdbc, its probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou reassigned HIVE-567: - Assignee: Raghotham Murthy (was: Min Zhou) incorrect manipulation jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Raghotham Murthy Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz, result.jpg, tables.jpg Instead of trying to get a complete implementation of jdbc, its probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-567: -- Comment: was deleted (was: incorrect manipulation) jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Raghotham Murthy Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz, result.jpg, tables.jpg Instead of trying to get a complete implementation of jdbc, its probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-521) Move size, if, isnull, isnotnull to GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12721169#action_12721169 ] Min Zhou commented on HIVE-521: --- I don't think all tests would pass, due to the missing class BinaryComparable. The reason for the failure has nothing to do with this JIRA. You can check out the trunk and run ant -Dhadoop.version=0.17.0 test -Doverwrite=true, then the error message will be displayed.
{noformat}
...
    [junit] Exception: org/apache/hadoop/io/BinaryComparable
    [junit] java.lang.NoClassDefFoundError: org/apache/hadoop/io/BinaryComparable
    [junit] at java.lang.Class.getDeclaredConstructors0(Native Method)
    [junit] at java.lang.Class.privateGetDeclaredConstructors(Class.java:2389)
    [junit] at java.lang.Class.getConstructor0(Class.java:2699)
    [junit] at java.lang.Class.newInstance0(Class.java:326)
    [junit] at java.lang.Class.newInstance(Class.java:308)
    [junit] at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getUDFMethod(FunctionRegistry.java:309)
    [junit] at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getFuncExprNodeDesc(TypeCheckProcFactory.java:451)
    [junit] at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:558)
    [junit] at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:653)
    [junit] at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:80)
    [junit] at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:83)
    [junit] at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:116)
    [junit] at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:95)
    [junit] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:3922)
    [junit] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:1000)
    [junit] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:986)
    [junit] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:3163)
    [junit] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:3610)
    [junit] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:3840)
    [junit] at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:76)
    [junit] at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:44)
    [junit] at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:76)
    [junit] at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:177)
    [junit] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:209)
    [junit] at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:176)
    [junit] at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:216)
    [junit] at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:471)
    [junit] at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_case_sensitivity(TestCliDriver.java:726)
    [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    [junit] at java.lang.reflect.Method.invoke(Method.java:597)
    [junit] at junit.framework.TestCase.runTest(TestCase.java:154)
    [junit] at junit.framework.TestCase.runBare(TestCase.java:127)
    [junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
    [junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
    [junit] at junit.framework.TestResult.run(TestResult.java:109)
    [junit] at junit.framework.TestCase.run(TestCase.java:118)
    [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
    [junit] at junit.framework.TestSuite.run(TestSuite.java:203)
    [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
    [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
    [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
    [junit] Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.io.BinaryComparable
    [junit] at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    [junit] at
[jira] Commented: (HIVE-521) Move size, if, isnull, isnotnull to GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12721231#action_12721231 ] Min Zhou commented on HIVE-521: --- @ HIVE-521-all-v7.patch
# {code:java}
boolean conditionTypeIsOk = (arguments[0].getCategory() == ObjectInspector.Category.PRIMITIVE);
if (conditionTypeIsOk) {
  PrimitiveObjectInspector poi = ((PrimitiveObjectInspector) arguments[0]);
  conditionTypeIsOk = (poi.getPrimitiveCategory() == PrimitiveObjectInspector.PrimitiveCategory.BOOLEAN
      || poi.getPrimitiveCategory() == PrimitiveObjectInspector.PrimitiveCategory.VOID);
}
if (!conditionTypeIsOk) {
  throw new UDFArgumentTypeException(0,
      "The first argument of function IF should be \"" + Constants.BOOLEAN_TYPE_NAME
      + "\", but \"" + arguments[0].getTypeName() + "\" is found");
}
{code}
# {code:java}
String typeName = arguments[0].getTypeName();
if (!typeName.equals(Constants.BOOLEAN_TYPE_NAME)
    && !typeName.equals(Constants.VOID_TYPE_NAME)) {
  throw new UDFArgumentTypeException(0,
      "The first expression of function IF is expected to \"" + Constants.BOOLEAN_TYPE_NAME
      + "\", but \"" + arguments[0].getTypeName() + "\" is found");
}
{code}
I thought the 2nd approach was more concise, do you think so? Move size, if, isnull, isnotnull to GenericUDF -- Key: HIVE-521 URL: https://issues.apache.org/jira/browse/HIVE-521 Project: Hadoop Hive Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Zheng Shao Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-521-all-v1.patch, HIVE-521-all-v2.patch, HIVE-521-all-v3.patch, HIVE-521-all-v4.patch, HIVE-521-all-v5.patch, HIVE-521-all-v6.patch, HIVE-521-all-v7.patch, HIVE-521-IF-2.patch, HIVE-521-IF-3.patch, HIVE-521-IF-4.patch, HIVE-521-IF-5.patch, HIVE-521-IF.patch See HIVE-511 for an example of the move. size, if, isnull, isnotnull are all implemented with UDF but they are actually working on variable types of objects. We should move them to GenericUDF for better type handling. This also helps to clean up the hack in doing type matching/type conversion in UDF. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
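One detail worth flagging in the second, string-based type check: the boolean connective matters. Written with `||`, the reject condition `!isBoolean || !isVoid` is true for every type name, since a single name can never equal both, so the exception would always be thrown; the accept-either-type semantics requires `&&`. A standalone sketch demonstrating the difference (the class and method names are made up for illustration):

```java
// Demonstrates why a "reject unless boolean or void" check must combine
// the negated equality tests with "&&", not "||": with "||" the condition
// is true for every input, because no single name equals both "boolean"
// and "void".
public class IfTypeCheck {
    static boolean rejectsWithOr(String typeName) {
        return !typeName.equals("boolean") || !typeName.equals("void");
    }

    static boolean rejectsWithAnd(String typeName) {
        return !typeName.equals("boolean") && !typeName.equals("void");
    }

    public static void main(String[] args) {
        System.out.println(rejectsWithOr("boolean"));  // true: even boolean is rejected
        System.out.println(rejectsWithAnd("boolean")); // false: boolean is accepted
        System.out.println(rejectsWithAnd("void"));    // false: void is accepted
        System.out.println(rejectsWithAnd("int"));     // true: int is rejected
    }
}
```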
[jira] Commented: (HIVE-521) Move size, if, isnull, isnotnull to GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12721595#action_12721595 ] Min Zhou commented on HIVE-521: --- OK, we are splitting hairs. All tests passed here; let's commit it. +1 Move size, if, isnull, isnotnull to GenericUDF -- Key: HIVE-521 URL: https://issues.apache.org/jira/browse/HIVE-521 Project: Hadoop Hive Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Zheng Shao Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-521-all-v1.patch, HIVE-521-all-v2.patch, HIVE-521-all-v3.patch, HIVE-521-all-v4.patch, HIVE-521-all-v5.patch, HIVE-521-all-v6.patch, HIVE-521-all-v7.patch, HIVE-521-IF-2.patch, HIVE-521-IF-3.patch, HIVE-521-IF-4.patch, HIVE-521-IF-5.patch, HIVE-521-IF.patch See HIVE-511 for an example of the move. size, if, isnull, isnotnull are all implemented with UDF but they are actually working on variable types of objects. We should move them to GenericUDF for better type handling. This also helps to clean up the hack in doing type matching/type conversion in UDF. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-521) Move size, if, isnull, isnotnull to GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-521: -- Attachment: HIVE-521-all-v5.patch Move size, if, isnull, isnotnull to GenericUDF -- Key: HIVE-521 URL: https://issues.apache.org/jira/browse/HIVE-521 Project: Hadoop Hive Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Zheng Shao Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-521-all-v1.patch, HIVE-521-all-v2.patch, HIVE-521-all-v3.patch, HIVE-521-all-v4.patch, HIVE-521-all-v5.patch, HIVE-521-IF-2.patch, HIVE-521-IF-3.patch, HIVE-521-IF-4.patch, HIVE-521-IF-5.patch, HIVE-521-IF.patch See HIVE-511 for an example of the move. size, if, isnull, isnotnull are all implemented with UDF but they are actually working on variable types of objects. We should move them to GenericUDF for better type handling. This also helps to clean up the hack in doing type matching/type conversion in UDF. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-521) Move size, if, isnull, isnotnull to GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-521: -- Attachment: HIVE-521-all-v6.patch Move size, if, isnull, isnotnull to GenericUDF -- Key: HIVE-521 URL: https://issues.apache.org/jira/browse/HIVE-521 Project: Hadoop Hive Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Zheng Shao Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-521-all-v1.patch, HIVE-521-all-v2.patch, HIVE-521-all-v3.patch, HIVE-521-all-v4.patch, HIVE-521-all-v5.patch, HIVE-521-all-v6.patch, HIVE-521-IF-2.patch, HIVE-521-IF-3.patch, HIVE-521-IF-4.patch, HIVE-521-IF-5.patch, HIVE-521-IF.patch See HIVE-511 for an example of the move. size, if, isnull, isnotnull are all implemented with UDF but they are actually working on variable types of objects. We should move them to GenericUDF for better type handling. This also helps to clean up the hack in doing type matching/type conversion in UDF. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-564) sweep the non-open source elements from hive
sweep the non-open source elements from hive Key: HIVE-564 URL: https://issues.apache.org/jira/browse/HIVE-564 Project: Hadoop Hive Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Min Zhou Fix For: 0.4.0 There are some non-open-source artifacts from Facebook in the current version of Hive. We should replace them with open-source versions of fb303.jar, libthrift.jar, etc., so the open-source community is better able to amend the relevant code. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-521) Move size, if, isnull, isnotnull to GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-521: -- Attachment: HIVE-521-all-v4.patch passed tests on hadoop version 0.17.0. Move size, if, isnull, isnotnull to GenericUDF -- Key: HIVE-521 URL: https://issues.apache.org/jira/browse/HIVE-521 Project: Hadoop Hive Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Zheng Shao Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-521-all-v1.patch, HIVE-521-all-v2.patch, HIVE-521-all-v3.patch, HIVE-521-all-v4.patch, HIVE-521-IF-2.patch, HIVE-521-IF-3.patch, HIVE-521-IF-4.patch, HIVE-521-IF-5.patch, HIVE-521-IF.patch See HIVE-511 for an example of the move. size, if, isnull, isnotnull are all implemented with UDF but they are actually working on variable types of objects. We should move them to GenericUDF for better type handling. This also helps to clean up the hack in doing type matching/type conversion in UDF. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-338) Executing cli commands into thrift server
[ https://issues.apache.org/jira/browse/HIVE-338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12720462#action_12720462 ] Min Zhou commented on HIVE-338: --- I think you should take a look at these lines of org.apache.hadoop.conf.Configuration:
{code:java}
private ClassLoader classLoader;
{
  classLoader = Thread.currentThread().getContextClassLoader();
  if (classLoader == null) {
    classLoader = Configuration.class.getClassLoader();
  }
}
...
public Class<?> getClassByName(String name) throws ClassNotFoundException {
  return Class.forName(name, true, classLoader);
}
{code}
The ClassLoader of the current thread changed when jars were added to the classpath, but conf has not synchronously picked up that change. Executing cli commands into thrift server - Key: HIVE-338 URL: https://issues.apache.org/jira/browse/HIVE-338 Project: Hadoop Hive Issue Type: Improvement Components: Server Infrastructure Affects Versions: 0.3.0 Reporter: Min Zhou Assignee: Min Zhou Attachments: hiveserver-v1.patch, hiveserver-v2.patch, hiveserver-v3.patch Let the Thrift server support set, add/delete file/jar, and normal HSQL queries. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
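The staleness described in that comment comes from the instance initializer: it captures the thread context ClassLoader once, at construction time, so a later swap of the thread's context loader (e.g. after an "add jar") is not seen by the already-constructed object. A self-contained sketch of that behavior, with a hypothetical class standing in for Configuration:

```java
// Sketch of the staleness described above: a field initialized from the
// thread context ClassLoader at construction time does not track later
// changes to that context loader.
public class StaleLoader {
    final ClassLoader captured;
    { // instance initializer, as in org.apache.hadoop.conf.Configuration
        captured = Thread.currentThread().getContextClassLoader();
    }

    public static void main(String[] args) {
        Thread t = Thread.currentThread();
        ClassLoader original = t.getContextClassLoader();
        StaleLoader conf = new StaleLoader(); // captures "original"

        // Simulate "add jar": install a new loader on the thread.
        ClassLoader swapped = new ClassLoader(original) {};
        t.setContextClassLoader(swapped);

        System.out.println(conf.captured == original); // true: still the old loader
        System.out.println(conf.captured == swapped);  // false: the change is not seen
        t.setContextClassLoader(original);             // restore
    }
}
```

Classes found only by the swapped-in loader would therefore still be unresolvable through `conf.captured`, which matches the ClassNotFoundException reported in the comment.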
[jira] Commented: (HIVE-556) let hive support theta join
[ https://issues.apache.org/jira/browse/HIVE-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12719518#action_12719518 ] Min Zhou commented on HIVE-556: --- I didn't see any filter there; hive will put all fields of my small table into the HTree. {noformat} hive> explain select /*+ MAPJOIN(a) */ a.url_pattern, w.url from application a join web_log w where w.logdate='20090611' and w.url rlike a.url_pattern and a.dt='20090609'; Common Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {bussiness_id} {subclass_id} {class_id} {note} {name} {url_pattern} {dt} 1 {noformat} We only put a.url_pattern into a HashMap in our raw map-reduce implementation. let hive support theta join --- Key: HIVE-556 URL: https://issues.apache.org/jira/browse/HIVE-556 Project: Hadoop Hive Issue Type: New Feature Affects Versions: 0.4.0 Reporter: Min Zhou Fix For: 0.4.0 Right now, hive only supports equal joins. Sometimes it's not enough; we must consider implementing theta joins like {code:sql} SELECT a.subid, a.id, t.url FROM tbl t JOIN aux_tbl a ON t.url rlike a.url_pattern WHERE t.dt='20090609' AND a.dt='20090609'; {code} any condition expression following 'ON' is appropriate. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-559) Support JDBC ResultSetMetadata
[ https://issues.apache.org/jira/browse/HIVE-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou reassigned HIVE-559: - Assignee: Min Zhou Support JDBC ResultSetMetadata -- Key: HIVE-559 URL: https://issues.apache.org/jira/browse/HIVE-559 Project: Hadoop Hive Issue Type: New Feature Components: Clients Reporter: Bill Graham Assignee: Min Zhou Support ResultSetMetadata for JDBC ResultSets. The getColumn* methods would be particularly useful, I'd expect: http://java.sun.com/javase/6/docs/api/java/sql/ResultSetMetaData.html The challenge, as I see it though, is that the JDBC client only has access to the raw query string and the result data when running in standalone mode. Therefore, it will need to get the column metadata one of two ways: 1. By parsing the query to determine the tables/columns involved and then making a request to the metastore to get the metadata for the columns. This certainly feels like duplicate work, since the query of course gets properly parsed on the server. 2. By returning the column metadata from the server. My thrift knowledge is limited, but I suspect adding this to the response would present other challenges. Any thoughts or suggestions? Option #1 feels clunkier, yet safer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
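For reference, the getColumn* calls a client would make are the standard java.sql.ResultSetMetaData ones. The sketch below only shows the client-side API shape; the proxy-built metadata and the column names/types are placeholders for illustration, not Hive's driver (a real client would call rs.getMetaData() on a live ResultSet):

```java
import java.lang.reflect.Proxy;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;

public class ShowColumns {
    /** Print "name : type" for every column; JDBC column indexes are 1-based. */
    public static void printColumns(ResultSetMetaData md) throws SQLException {
        for (int i = 1; i <= md.getColumnCount(); i++) {
            System.out.println(md.getColumnName(i) + " : " + md.getColumnTypeName(i));
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in metadata built with a dynamic proxy, since no live server
        // is assumed here.
        final String[] names = {"subid", "url"};
        final String[] types = {"int", "string"};
        ResultSetMetaData md = (ResultSetMetaData) Proxy.newProxyInstance(
                ShowColumns.class.getClassLoader(),
                new Class<?>[]{ResultSetMetaData.class},
                (proxy, method, a) -> {
                    if (method.getName().equals("getColumnCount")) return names.length;
                    if (method.getName().equals("getColumnName")) return names[(Integer) a[0] - 1];
                    if (method.getName().equals("getColumnTypeName")) return types[(Integer) a[0] - 1];
                    return null;
                });
        printColumns(md);
    }
}
```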
[jira] Issue Comment Edited: (HIVE-474) Support for distinct selection on two or more columns
[ https://issues.apache.org/jira/browse/HIVE-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12719368#action_12719368 ] Min Zhou edited comment on HIVE-474 at 6/14/09 7:02 PM: I think there is another special case here. If the query has multiple distinct operations on the same column, we can push down the evaluation of those expressions into reducers. {code} Query: select a, count(distinct if(condition, b, null)) as col1, count(distinct if(!condition, null, b)) as col2, count(distinct b) as col3 Plan: Job : Map side: Emit: distribution_key: a, sort_key: a, b, value: nothing Reduce side: Group By a, count col1, col2, col3 by evaluating their expressions {code} Support for distinct selection on two or more columns - Key: HIVE-474 URL: https://issues.apache.org/jira/browse/HIVE-474 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Reporter: Alexis Rondeau The ability to select distinct on several individual columns, for example: select count(distinct user), count(distinct session) from actions; Currently returns the following failure: FAILED: Error in semantic analysis: line 2:7 DISTINCT on Different Columns not Supported user -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-474) Support for distinct selection on two or more columns
[ https://issues.apache.org/jira/browse/HIVE-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12719368#action_12719368 ] Min Zhou commented on HIVE-474: --- I think there is another special case here. If the query has multiple distinct operations on the same column, we can push down the evaluation of those expressions into reducers. Query: select a, count(distinct if(condition, b, null)) as col1, count(distinct if(!condition, null, b)) as col2, count(distinct b) as col3 Plan: Job : Map side: Emit: distribution_key: a, sort_key: a, b, value: nothing Reduce side: Group By a, count col1, col2, col3 by evaluating their expressions Support for distinct selection on two or more columns - Key: HIVE-474 URL: https://issues.apache.org/jira/browse/HIVE-474 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Reporter: Alexis Rondeau The ability to select distinct on several individual columns, for example: select count(distinct user), count(distinct session) from actions; Currently returns the following failure: FAILED: Error in semantic analysis: line 2:7 DISTINCT on Different Columns not Supported user -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
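The push-down proposed in that comment can be sketched as a single reduce-side pass: with rows sorted by (a, b), each new b value inside a group of a is a distinct value, so every distinct-count expression over b can be evaluated at once. This is a stand-alone illustration of the idea, not Hive's plan code; for simplicity it handles a single group of a:

```java
import java.util.List;
import java.util.function.IntPredicate;

public class MultiDistinct {
    /** Count distinct b, and distinct b under the if() expressions, in one pass. */
    public static int[] counts(List<int[]> sortedRows, IntPredicate cond) {
        int col1 = 0, col2 = 0, col3 = 0;
        Integer prevB = null;
        for (int[] row : sortedRows) {                  // row = {a, b}, sorted by b
            int b = row[1];
            if (prevB != null && prevB == b) continue;  // duplicate b, not distinct
            prevB = b;
            col3++;                                     // count(distinct b)
            if (cond.test(b)) {
                // if(condition, b, null) and if(!condition, null, b) both
                // evaluate to b exactly when the condition holds, so the two
                // expressions in the example query count the same values.
                col1++;
                col2++;
            }
        }
        return new int[]{col1, col2, col3};
    }

    public static void main(String[] args) {
        List<int[]> rows = List.of(new int[]{1, 2}, new int[]{1, 2},
                                   new int[]{1, 3}, new int[]{1, 5});
        int[] c = counts(rows, b -> b % 2 == 0);        // distinct b: 2, 3, 5
        System.out.println(c[0] + "," + c[1] + "," + c[2]);
    }
}
```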
[jira] Issue Comment Edited: (HIVE-338) Executing cli commands into thrift server
[ https://issues.apache.org/jira/browse/HIVE-338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12717651#action_12717651 ] Min Zhou edited comment on HIVE-338 at 6/10/09 11:37 PM: - * exec/FunctionTask.java: is it necessary to specify the loader in the Class.forName call? I thought that the current thread context loader was always the first loader to be tried anyway during name resolution. Yes, of course. The class loader held by HiveConf is older than that of the current thread. This patch supports dfs, add/delete file/jar, and set now. btw, Joydeep, would you do me a favor writing some test code? I am not familiar with it; you know, 'add jar' needs a separate jar, and I am not quite sure how to organize them. Executing cli commands into thrift server - Key: HIVE-338 URL: https://issues.apache.org/jira/browse/HIVE-338 Project: Hadoop Hive Issue Type: Improvement Components: Server Infrastructure Affects Versions: 0.3.0 Reporter: Min Zhou Attachments: hiveserver-v1.patch, hiveserver-v2.patch Let thrift server support set, add/delete file/jar and normal HSQL query. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-537) Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map)
[ https://issues.apache.org/jira/browse/HIVE-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12718373#action_12718373 ] Min Zhou commented on HIVE-537: --- first approach: O(mN/p) + O(m(N/p log (N/p))) + O(mN/r) + O(m) I don't agree with you about this O(m); it would indeed be a very large cost. And meanwhile, you should add the cost of joining all results into one at the end. For the second approach, I think it should be O(N/p) + O(mN/p log (mN/p)) + O(mN/r) Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map) --- Key: HIVE-537 URL: https://issues.apache.org/jira/browse/HIVE-537 Project: Hadoop Hive Issue Type: New Feature Reporter: Zheng Shao Assignee: Zheng Shao There are already some cases inside the code where we use heterogeneous data: JoinOperator, and UnionOperator (in the sense that different parents can pass in records with different ObjectInspectors). We currently use Operator's parentID to distinguish that. However that approach does not extend to more complex plans that might be needed in the future. We will support the union type like this: {code} TypeDefinition: type: primitivetype | structtype | arraytype | maptype | uniontype uniontype: union<tag : type (, tag : type)*> Example: union<0:int,1:double,2:array<string>,3:struct<a:int,b:string>> Example of serialized data format: We will first store the tag byte before we serialize the object. On deserialization, we will first read out the tag byte; then we know the current type of the following object, so we can deserialize it successfully. Interface for ObjectInspector: interface UnionObjectInspector { /** Returns the array of OIs that are for each of the tags */ ObjectInspector[] getObjectInspectors(); /** Return the tag of the object. */ byte getTag(Object o); /** Return the field based on the tag value associated with the Object. */ Object getField(Object o); }; {code} -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
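The tagged-union representation proposed in HIVE-537 can be sketched stand-alone: a tag byte selects which alternative the value holds, and readers dispatch on the tag, mirroring the proposed getTag/getField methods of UnionObjectInspector. UnionValue below is a toy stand-in for illustration, not a Hive class:

```java
public class UnionSketch {
    /** Standalone tagged-union value: the tag byte selects the alternative held. */
    static final class UnionValue {
        final byte tag;
        final Object field;
        UnionValue(byte tag, Object field) { this.tag = tag; this.field = field; }
    }

    // Mirrors the proposed UnionObjectInspector accessors for this toy object.
    static byte getTag(UnionValue o) { return o.tag; }
    static Object getField(UnionValue o) { return o.field; }

    public static void main(String[] args) {
        // A value of type union<0:int, 1:double> currently holding the double.
        UnionValue u = new UnionValue((byte) 1, 3.14);
        if (getTag(u) != 1) throw new AssertionError("wrong tag");
        if (!(getField(u) instanceof Double)) throw new AssertionError("wrong alternative");
        System.out.println("tag=" + getTag(u) + " field=" + getField(u));
    }
}
```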
[jira] Commented: (HIVE-556) let hive support theta join
[ https://issues.apache.org/jira/browse/HIVE-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12718687#action_12718687 ] Min Zhou commented on HIVE-556: --- it's very common for us, and has blocked us badly. We often have one or more aux tables with about 10k records, which the major table would do theta joins on. I don't think the current solution by means of a cartesian product is a good way; it would bring terrible sorting and I/O overhead to us. let hive support theta join --- Key: HIVE-556 URL: https://issues.apache.org/jira/browse/HIVE-556 Project: Hadoop Hive Issue Type: New Feature Affects Versions: 0.4.0 Reporter: Min Zhou Fix For: 0.4.0 Right now, hive only supports equal joins. Sometimes it's not enough; we must consider implementing theta joins like {code:sql} SELECT a.subid, a.id, t.url FROM tbl t JOIN aux_tbl a ON t.url rlike a.url_pattern WHERE t.dt='20090609' AND a.dt='20090609'; {code} any condition expression following 'ON' is appropriate. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-556) let hive support theta join
[ https://issues.apache.org/jira/browse/HIVE-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12718699#action_12718699 ] Min Zhou commented on HIVE-556: --- @Ashish I agree with you, map-side joins are okay. However, they do not support theta joins right now. We used to load aux tables into the memory of each map node, scan the major tables, and do our joins. let hive support theta join --- Key: HIVE-556 URL: https://issues.apache.org/jira/browse/HIVE-556 Project: Hadoop Hive Issue Type: New Feature Affects Versions: 0.4.0 Reporter: Min Zhou Fix For: 0.4.0 Right now, hive only supports equal joins. Sometimes it's not enough; we must consider implementing theta joins like {code:sql} SELECT a.subid, a.id, t.url FROM tbl t JOIN aux_tbl a ON t.url rlike a.url_pattern WHERE t.dt='20090609' AND a.dt='20090609'; {code} any condition expression following 'ON' is appropriate. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
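The strategy described in that comment (load the small aux table into each mapper's memory, then scan the major table and match) can be sketched as follows. This is an illustrative stand-alone sketch with made-up data, not Hive's map-join code; find() is used because RLIKE is an unanchored regex match:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class MapSideThetaJoin {
    /** Join each url against every pattern: emulates t.url RLIKE a.url_pattern. */
    public static List<String> join(List<String> urls, List<String> patterns) {
        // Load the small aux table once per mapper and compile its patterns.
        List<Pattern> compiled = new ArrayList<>();
        for (String p : patterns) {
            compiled.add(Pattern.compile(p));
        }
        List<String> out = new ArrayList<>();
        for (String url : urls) {                       // stream the big table's rows
            for (int i = 0; i < compiled.size(); i++) {
                // Unanchored match, like RLIKE; emit the joined row on a hit.
                if (compiled.get(i).matcher(url).find()) {
                    out.add(url + "\t" + patterns.get(i));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> urls = List.of("http://example.com/shop/item1",
                                    "http://example.com/news");
        List<String> patterns = List.of("/shop/", "/mail/");
        for (String row : join(urls, patterns)) {
            System.out.println(row);
        }
    }
}
```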
[jira] Created: (HIVE-555) create temporary function support not only udf, but also udaf, genericudf, etc.
create temporary function support not only udf, but also udaf, genericudf, etc. Key: HIVE-555 URL: https://issues.apache.org/jira/browse/HIVE-555 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Right now, the command 'create temporary function' only supports udf. We can also let users write their udaf, generic udf, and generic udaf in the future. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
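The dispatch such a change would need can be sketched as: inspect the class loaded by 'create temporary function' and route it to the right registration path depending on which function kind it is. The marker interfaces below are local stand-ins for illustration, not Hive's real UDF/UDAF/GenericUDF classes:

```java
public class RegisterFunction {
    // Stand-in marker interfaces for the three function kinds.
    interface UDF {}
    interface UDAF {}
    interface GenericUDF {}

    /** Decide which registration path a loaded function class should take. */
    static String kindOf(Class<?> c) {
        if (GenericUDF.class.isAssignableFrom(c)) return "generic udf";
        if (UDAF.class.isAssignableFrom(c)) return "udaf";
        if (UDF.class.isAssignableFrom(c)) return "udf";
        throw new IllegalArgumentException(c + " is not a supported function class");
    }

    // Hypothetical user functions for the demo.
    static class MyLower implements UDF {}
    static class MyAvg implements UDAF {}

    public static void main(String[] args) {
        System.out.println(kindOf(MyLower.class));
        System.out.println(kindOf(MyAvg.class));
    }
}
```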
[jira] Updated: (HIVE-555) create temporary function support not only udf, but also udaf, genericudf, etc.
[ https://issues.apache.org/jira/browse/HIVE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-555: -- Attachment: HIVE-555-1.patch patch w/o testcase create temporary function support not only udf, but also udaf, genericudf, etc. Key: HIVE-555 URL: https://issues.apache.org/jira/browse/HIVE-555 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-555-1.patch Right now, the command 'create temporary function' only supports udf. We can also let users write their udaf, generic udf, and generic udaf in the future. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-521) Move size, if, isnull, isnotnull to GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-521: -- Attachment: HIVE-521-all-v2.patch fixed issues commented by Zheng, UDFArgumentException and UDFArgumentLengthException added. Move size, if, isnull, isnotnull to GenericUDF -- Key: HIVE-521 URL: https://issues.apache.org/jira/browse/HIVE-521 Project: Hadoop Hive Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Zheng Shao Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-521-all-v1.patch, HIVE-521-all-v2.patch, HIVE-521-IF-2.patch, HIVE-521-IF-3.patch, HIVE-521-IF-4.patch, HIVE-521-IF-5.patch, HIVE-521-IF.patch See HIVE-511 for an example of the move. size, if, isnull, isnotnull are all implemented with UDF but they are actually working on variable types of objects. We should move them to GenericUDF for better type handling. This also helps to clean up the hack in doing type matching/type conversion in UDF. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-521) Move size, if, isnull, isnotnull to GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-521: -- Attachment: HIVE-521-all-v3.patch catch UDFArgumentLengthException. Move size, if, isnull, isnotnull to GenericUDF -- Key: HIVE-521 URL: https://issues.apache.org/jira/browse/HIVE-521 Project: Hadoop Hive Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Zheng Shao Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-521-all-v1.patch, HIVE-521-all-v2.patch, HIVE-521-all-v3.patch, HIVE-521-IF-2.patch, HIVE-521-IF-3.patch, HIVE-521-IF-4.patch, HIVE-521-IF-5.patch, HIVE-521-IF.patch See HIVE-511 for an example of the move. size, if, isnull, isnotnull are all implemented with UDF but they are actually working on variable types of objects. We should move them to GenericUDF for better type handling. This also helps to clean up the hack in doing type matching/type conversion in UDF. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-556) let hive support theta join
let hive support theta join --- Key: HIVE-556 URL: https://issues.apache.org/jira/browse/HIVE-556 Project: Hadoop Hive Issue Type: New Feature Affects Versions: 0.4.0 Reporter: Min Zhou Fix For: 0.4.0 Right now, hive only supports equal joins. Sometimes it's not enough; we must consider implementing theta joins like {code:sql} SELECT a.subid, a.id, t.url FROM tbl t JOIN aux_tbl a ON t.url rlike a.url_pattern WHERE t.dt='20090609' AND a.dt='20090609'; {code} any condition expression following 'ON' is appropriate. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-338) Executing cli commands into thrift server
[ https://issues.apache.org/jira/browse/HIVE-338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-338: -- Attachment: hiveserver-v2.patch * exec/FunctionTask.java: is it necessary to specify the loader in the Class.forName call? I thought that the current thread context loader was always the first loader to be tried anyway during name resolution. Yes, of course. The class loader held by HiveConf is older than that of the current thread. This patch supports dfs, add/delete file/jar, and set now. btw, Joydeep, would you do me a favor writing some test code? I am not familiar with it; you know, 'add jar' needs a separate jar, and I am not quite sure how to organize them. Executing cli commands into thrift server - Key: HIVE-338 URL: https://issues.apache.org/jira/browse/HIVE-338 Project: Hadoop Hive Issue Type: Improvement Components: Server Infrastructure Affects Versions: 0.3.0 Reporter: Min Zhou Attachments: hiveserver-v1.patch, hiveserver-v2.patch Let thrift server support set, add/delete file/jar and normal HSQL query. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-338) Executing cli commands into thrift server
[ https://issues.apache.org/jira/browse/HIVE-338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-338: -- Attachment: hiveserver-v2.patch oops, made a mistake when uploading the former one. Executing cli commands into thrift server - Key: HIVE-338 URL: https://issues.apache.org/jira/browse/HIVE-338 Project: Hadoop Hive Issue Type: Improvement Components: Server Infrastructure Affects Versions: 0.3.0 Reporter: Min Zhou Attachments: hiveserver-v1.patch, hiveserver-v2.patch Let thrift server support set, add/delete file/jar and normal HSQL query. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-338) Executing cli commands into thrift server
[ https://issues.apache.org/jira/browse/HIVE-338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-338: -- Attachment: (was: hiveserver-v2.patch) Executing cli commands into thrift server - Key: HIVE-338 URL: https://issues.apache.org/jira/browse/HIVE-338 Project: Hadoop Hive Issue Type: Improvement Components: Server Infrastructure Affects Versions: 0.3.0 Reporter: Min Zhou Attachments: hiveserver-v1.patch, hiveserver-v2.patch Let thrift server support set, add/delete file/jar and normal HSQL query. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.