[
https://issues.apache.org/jira/browse/HIVE-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13793658#comment-13793658
]
N Campbell commented on HIVE-3745:
----------------------------------
The ISO-SQL standard is very clear as to what a vendor may choose to do re
blank padding semantics. Similarly, how operations such as min, max, distinct
etc operate on a variable length character type. Persons simply comparing to
another RDBMS need to compare what ISO states, where it allows 'vendor
implementation' and to see what a given vendor claims.
For example, if you were to use Postgres and other vendors derived from it. You
will find various differences with respect to
length( char (n) ) vs varchar(n)
group by
min
distinct/union ....
To some persons trailing spaces are of no interest and they may assume that one
general string type will 'ignore' spaces. Others may state to their business
application that trailing spaces are significant. That is distinct from what a
given standard states or perhaps what a vendor chooses to implement
irrespective of any given standard.
It would help either way if the Hive QL documentation could be improved to
state intent of a given construct/feature.
> Hive does improper "=" based string comparisons for strings with trailing
> whitespaces
> -------------------------------------------------------------------------------------
>
> Key: HIVE-3745
> URL: https://issues.apache.org/jira/browse/HIVE-3745
> Project: Hive
> Issue Type: Bug
> Components: SQL
> Affects Versions: 0.9.0
> Reporter: Harsh J
> Assignee: Kevin Wilfong
>
> Compared to other systems such as DB2, MySQL, etc., which disregard trailing
> whitespaces in a string used when comparing two strings with the "{{=}}"
> relational operator, Hive does not do this.
> For example, note the following line from the MySQL manual:
> http://dev.mysql.com/doc/refman/5.1/en/char.html
> {quote}
> All MySQL collations are of type PADSPACE. This means that all CHAR and
> VARCHAR values in MySQL are compared without regard to any trailing spaces.
> {quote}
> Hive still is whitespace sensitive and regards trailing spaces of a string as
> worthy elements when comparing. Ideally {{LIKE}} should consider this
> strongly, but {{=}} should not.
> Is there a specific reason behind this difference of implementation in Hive's
> SQL?
--
This message was sent by Atlassian JIRA
(v6.1#6144)