[jira] [Commented] (HIVE-24040) Slightly odd behaviour with CHAR comparisons and string literals
[ https://issues.apache.org/jira/browse/HIVE-24040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209672#comment-17209672 ]

Tim Armstrong commented on HIVE-24040:
--------------------------------------

[~kgyrtkirk] I'd recommend reading http://databasearchitects.blogspot.com/2015/01/fun-with-char.html for an interesting perspective on this (one of its conclusions is that Postgres and other systems do not implement the spec exactly, and that may be a good thing).

> Slightly odd behaviour with CHAR comparisons and string literals
> ----------------------------------------------------------------
>
> Key: HIVE-24040
> URL: https://issues.apache.org/jira/browse/HIVE-24040
> Project: Hive
> Issue Type: Bug
> Reporter: Tim Armstrong
> Priority: Major
>
> If t is a char column, this statement behaves a bit strangely - since the RHS
> is a STRING, I would have expected it to behave consistently with other
> CHAR/STRING comparisons, where the CHAR column has its trailing spaces
> removed and the STRING does not have its trailing spaces removed.
> {noformat}
> select count(*) from ax where t = cast('a ' as string);
> {noformat}
> Instead it seems to be treated the same as if it was a plain literal,
> interpreted as CHAR, i.e.
> {noformat}
> select count(*) from ax where t = 'a ';
> {noformat}
> Here are some more experiments I did based on
> https://github.com/apache/hive/blob/master/ql/src/test/queries/clientpositive/in_typecheck_char.q
> that seem to show some inconsistencies.
> {noformat}
> -- Hive version 3.1.3000.7.2.1.0-287 r4e72e59f1c2a51a64e0ff37b14bd396cd4e97b98
> create table ax(s char(1),t char(10));
> insert into ax values ('a','a'),('a','a '),('b','bb');
> -- varchar literal preserves trailing space
> select count(*) from ax where t = cast('a ' as varchar(50));
> +------+
> | _c0  |
> +------+
> | 0    |
> +------+
> -- explicit cast of literal to string removes trailing space
> select count(*) from ax where t = cast('a ' as string);
> +------+
> | _c0  |
> +------+
> | 2    |
> +------+
> -- other string expressions preserve trailing space
> select count(*) from ax where t = concat('a', ' ');
> +------+
> | _c0  |
> +------+
> | 0    |
> +------+
> -- varchar col preserves trailing space
> create table stringv as select cast('a ' as varchar(50));
> select count(*) from ax, stringv where t = `_c0`;
> +------+
> | _c0  |
> +------+
> | 0    |
> +------+
> -- string col preserves trailing space
> create table stringa as select 'a ';
> select count(*) from ax, stringa where t = `_c0`;
> +------+
> | _c0  |
> +------+
> | 0    |
> +------+
> {noformat}
> [~jcamachorodriguez] [~kgyrtkirk]

--
This message was sent by Atlassian Jira (v8.3.4#803005)
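As a reading aid for the HIVE-24040 report above, the two comparison rules it contrasts can be modelled in a few lines of Python. This is only a toy sketch of the semantics as the report describes them, not Hive's implementation: the helper names (`char_value`, `eq_char_vs_string`, `eq_char_vs_char`, `count_matches`) are hypothetical, and the rows mirror the `t` column of the `ax` table from the report.

```python
# Toy model of the two comparison rules contrasted in HIVE-24040.
# Assumption: this is NOT Hive code; every name below is hypothetical.

def char_value(s: str, length: int) -> str:
    """Model a CHAR(length) value: truncate, then pad with trailing spaces."""
    return s[:length].ljust(length)


def eq_char_vs_string(char_col: str, literal: str) -> bool:
    """Rule the reporter expected: strip trailing spaces from the CHAR side only."""
    return char_col.rstrip(" ") == literal


def eq_char_vs_char(char_col: str, literal: str) -> bool:
    """Rule when the literal is itself treated as CHAR: strip both sides."""
    return char_col.rstrip(" ") == literal.rstrip(" ")


# The t column of ax: char(10) values inserted as 'a', 'a ' and 'bb'.
T_VALUES = [char_value(v, 10) for v in ("a", "a ", "bb")]


def count_matches(eq, literal: str) -> int:
    """Equivalent of: select count(*) from ax where t = <literal>."""
    return sum(1 for t in T_VALUES if eq(t, literal))


if __name__ == "__main__":
    print(count_matches(eq_char_vs_string, "a "))  # 0
    print(count_matches(eq_char_vs_char, "a "))    # 2
```

Under this model, the CHAR/STRING rule gives 0 for the literal `'a '`, which is what the report shows for the varchar, `concat` and column cases, while the CHAR/CHAR rule gives 2, which is what the report shows for `cast('a ' as string)`.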
[jira] [Assigned] (HIVE-9452) Use HBase to store Hive metadata
[ https://issues.apache.org/jira/browse/HIVE-9452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong reassigned HIVE-9452:
-----------------------------------

Assignee: Tim Armstrong (was: Alan Gates)

> Use HBase to store Hive metadata
> --------------------------------
>
> Key: HIVE-9452
> URL: https://issues.apache.org/jira/browse/HIVE-9452
> Project: Hive
> Issue Type: Improvement
> Components: Metastore
> Affects Versions: hbase-metastore-branch
> Reporter: Alan Gates
> Assignee: Tim Armstrong
> Priority: Major
> Attachments: HBaseMetastoreApproach.pdf
>
>
> This is an umbrella JIRA for a project to explore using HBase to store the
> Hive data catalog (i.e. the metastore). This project has several goals:
> # The current metastore implementation is slow when tables have thousands or
> more partitions. With Tez and Spark engines we are pushing Hive to a point
> where queries only take a few seconds to run. But planning the query can
> take as long as running it. Much of this time is spent in metadata
> operations.
> # Due to scale limitations we have never allowed tasks to communicate
> directly with the metastore. However, with the development of LLAP this
> requirement will have to be relaxed. If we can relax this there are other
> use cases that could benefit from this.
> # Eating our own dogfood. Rather than using external systems to store our
> metadata there are benefits to using other components in the Hadoop system.
> The proposal is to create a new branch and work on the prototype there.
[jira] [Assigned] (HIVE-9452) Use HBase to store Hive metadata
[ https://issues.apache.org/jira/browse/HIVE-9452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong reassigned HIVE-9452:
-----------------------------------

Assignee: Alan Gates (was: Tim Armstrong)

> Use HBase to store Hive metadata
> --------------------------------
>
> Key: HIVE-9452
> URL: https://issues.apache.org/jira/browse/HIVE-9452
> Project: Hive
> Issue Type: Improvement
> Components: Metastore
> Affects Versions: hbase-metastore-branch
> Reporter: Alan Gates
> Assignee: Alan Gates
> Priority: Major
> Attachments: HBaseMetastoreApproach.pdf
>
>
> This is an umbrella JIRA for a project to explore using HBase to store the
> Hive data catalog (i.e. the metastore). This project has several goals:
> # The current metastore implementation is slow when tables have thousands or
> more partitions. With Tez and Spark engines we are pushing Hive to a point
> where queries only take a few seconds to run. But planning the query can
> take as long as running it. Much of this time is spent in metadata
> operations.
> # Due to scale limitations we have never allowed tasks to communicate
> directly with the metastore. However, with the development of LLAP this
> requirement will have to be relaxed. If we can relax this there are other
> use cases that could benefit from this.
> # Eating our own dogfood. Rather than using external systems to store our
> metadata there are benefits to using other components in the Hadoop system.
> The proposal is to create a new branch and work on the prototype there.
[jira] [Commented] (HIVE-18863) trunc() calls itself trunk() in an error message
[ https://issues.apache.org/jira/browse/HIVE-18863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386886#comment-16386886 ]

Tim Armstrong commented on HIVE-18863:
--------------------------------------

The name of the function in the error message is wrong. trunc != trunk.

> trunc() calls itself trunk() in an error message
> ------------------------------------------------
>
> Key: HIVE-18863
> URL: https://issues.apache.org/jira/browse/HIVE-18863
> Project: Hive
> Issue Type: Bug
> Components: UDF
> Reporter: Tim Armstrong
> Priority: Minor
> Labels: newbie
>
> {noformat}
> > select trunc('millennium', cast('2001-02-16 20:38:40' as timestamp))
> FAILED: SemanticException Line 0:-1 Argument type mismatch ''2001-02-16
> 20:38:40'': trunk() only takes STRING/CHAR/VARCHAR types as second argument,
> got TIMESTAMP
> {noformat}
> I saw this on a derivative of Hive 1.1.0 (cdh5.15.0), but the string still
> seems to be present on master:
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTrunc.java#L262

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HIVE-18863) trunc() calls itself trunk() in an error message
[ https://issues.apache.org/jira/browse/HIVE-18863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386579#comment-16386579 ]

Tim Armstrong commented on HIVE-18863:
--------------------------------------

The bug is in the JIRA title.

> trunc() calls itself trunk() in an error message
> ------------------------------------------------
>
> Key: HIVE-18863
> URL: https://issues.apache.org/jira/browse/HIVE-18863
> Project: Hive
> Issue Type: Bug
> Components: UDF
> Reporter: Tim Armstrong
> Priority: Minor
> Labels: newbie
>
> {noformat}
> > select trunc('millennium', cast('2001-02-16 20:38:40' as timestamp))
> FAILED: SemanticException Line 0:-1 Argument type mismatch ''2001-02-16
> 20:38:40'': trunk() only takes STRING/CHAR/VARCHAR types as second argument,
> got TIMESTAMP
> {noformat}
> I saw this on a derivative of Hive 1.1.0 (cdh5.15.0), but the string still
> seems to be present on master:
> https://github.com/apache/hive/blob/6d890faf22fd1ede3658a5eed097476eab3c67e9/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFTrunc.java#L262