Hi.

I think there are problems with the way Hive currently does LIKE
operator JDO pushdown, and that the code should be removed for both
partitions and tables.
Are there objections to removing LIKE from Filter.g and related areas?
If not, I will file a JIRA and do it.

Details:
There's code in the metastore that can push down LIKE expressions
into JDO for string partition keys, as well as for tables.
The code for tables doesn't appear to be used, and the partition code
definitely doesn't run in Hive proper because the metastore client
doesn't send LIKE expressions to the server. It may be used in e.g.
HCat and other places, but after asking some people here, I found out
it probably isn't.
I was trying to make it run and noticed some problems:
1) For partitions, Hive sends SQL patterns in a filter for LIKE, e.g.
"%foo%", whereas the metastore passes them into the JDOQL matches()
method, which expects a Java regex.
2) Converting the pattern to a Java regex via the UDFLike method, I
found that not all regexes appear to work in DN (DataNucleus). ".*foo"
seems to work, but anything more complex (such as escaping the pattern
using Pattern.quote, which UDFLike does) breaks and no longer matches
properly.
3) I tried to implement the common cases using the JDO methods
startsWith/endsWith/indexOf (I will file a JIRA), but when I run tests
on Derby, they also appear to have problems with some strings. For
example, a partition with a backslash in its name cannot be matched by
LIKE "%\%" (a single backslash in the string) after the pattern is
converted to .indexOf(param) where param is "\". Escaping the backslash
once more doesn't work either, and in any case there's no documented
reason why it shouldn't work as is. Other characters match correctly,
even e.g. "%".
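
To make points 1 and 2 concrete, here is a minimal plain-Java sketch of
the mismatch. It is not the Hive or UDFLike code: the class
LikePatternSketch and the helper likeToRegex are hypothetical names
used only for illustration, and escape sequences such as "\%" are
deliberately not handled. It shows only how java.util.regex behaves,
not how DataNucleus translates matches() or indexOf() for Derby.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not Hive code: contrasts SQL LIKE patterns with Java regexes.
class LikePatternSketch {

    // Convert a SQL LIKE pattern to a Java regex: '%' becomes ".*", '_' becomes ".",
    // and every other run of characters is quoted literally with Pattern.quote.
    // LIKE escape handling is intentionally omitted in this sketch.
    static String likeToRegex(String like) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : like.toCharArray()) {
            if (c == '%' || c == '_') {
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(c == '%' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return regex.toString();
    }

    public static void main(String[] args) {
        String value = "xfoox";
        // A raw LIKE pattern used as a regex: '%' is just a literal character.
        System.out.println(value.matches("%foo%"));              // false
        // The converted pattern matches, but now contains \Q...\E quoting.
        System.out.println(likeToRegex("%foo%"));                // .*\Qfoo\E.*
        System.out.println(value.matches(likeToRegex("%foo%"))); // true
        // In plain Java, a single backslash is found by indexOf without extra escaping.
        System.out.println("a\\b".indexOf("\\") >= 0);           // true
    }
}
```

In plain Java both the quoted regex and the indexOf lookup behave as
expected, which is why the failures described above look like
limitations of the JDOQL translation rather than of the patterns
themselves.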

For tables, there's no SQL LIKE syntax; the code expects a Java regex,
but I am not convinced that all Java regexes are going to work.

So I think that, for the sake of future correctness, it's better to remove this code.
