[
https://issues.apache.org/jira/browse/PHOENIX-3046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478953#comment-15478953
]
ASF GitHub Bot commented on PHOENIX-3046:
-----------------------------------------
Github user kliewkliew commented on a diff in the pull request:
https://github.com/apache/phoenix/pull/208#discussion_r78269568
--- Diff: phoenix-core/src/main/java/org/apache/phoenix/compile/ExpressionCompiler.java ---
@@ -523,7 +523,12 @@ public Expression visitLeave(LikeParseNode node, List<Expression> children) thro
byte[] wildcard = {StringUtil.MULTI_CHAR_LIKE};
StringUtil.fill(nullExpressionString, 0, pattern.length(), wildcard, 0, 1, false);
if (pattern.equals(new String (nullExpressionString))) {
- return IsNullExpression.create(lhs, true, context.getTempPtr());
+ if (node.isNegate()) {
+ return LiteralExpression.newConstant(false, Determinism.ALWAYS);
--- End diff --
The specification is:
```
5) Case:
a) If M and P are character strings whose lengths are variable
and if the lengths of both M and P are 0, then
M LIKE P
is true.
b) The <predicate>
M LIKE P
is true if there exists a partitioning of M into substrings
such that:
i) A substring of M is a sequence of 0 or more contiguous
<character representation>s of M and each <character repre-
sentation> of M is part of exactly one substring.
ii) If the i-th substring specifier of P is an arbitrary char-
acter specifier, the i-th substring of M is any single
<character representation>.
iii) If the i-th substring specifier of P is an arbitrary string
specifier, then the i-th substring of M is any sequence of
0 or more <character representation>s.
iv) If the i-th substring specifier of P is neither an arbi-
trary character specifier nor an arbitrary string speci-
fier, then the i-th substring of M is equal to that sub-
string specifier according to the collating sequence of
the <like predicate>, without the appending of <space>
characters to M, and has the same length as that substring
specifier.
v) The number of substrings of M is equal to the number of
substring specifiers of P.
c) Otherwise,
M LIKE P
is false.
```
Given that `LEN(NULL)` is `NULL`, `WHERE col NOT LIKE '%'` fails cases
`a`, `b.i`, and `b.iii`, so it defaults to case `c` and always returns false.
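To make rule 5 concrete, here is a minimal Java sketch of the partitioning it describes, for non-null strings only. `LikeSketch` and its methods are hypothetical names for illustration, not Phoenix code. It assumes `%` is the arbitrary string specifier and `_` is the arbitrary character specifier, and it ignores escape characters and collation.

```java
// Hypothetical illustration of rule 5 (not Phoenix code).
// Assumes non-null inputs, '%' = arbitrary string, '_' = arbitrary character.
final class LikeSketch {

    // True if m can be partitioned into substrings matching the specifiers of p.
    static boolean like(String m, String p) {
        return match(m, 0, p, 0);
    }

    private static boolean match(String m, int i, String p, int j) {
        if (j == p.length()) {
            // No specifiers left: every character of m must already be consumed.
            // This also covers case a, where both strings are empty.
            return i == m.length();
        }
        char spec = p.charAt(j);
        if (spec == '%') {
            // Rule b.iii: the substring may be any sequence of 0 or more characters.
            for (int k = i; k <= m.length(); k++) {
                if (match(m, k, p, j + 1)) {
                    return true;
                }
            }
            return false; // case c
        }
        if (i == m.length()) {
            return false; // case c: a specifier remains but m is exhausted
        }
        // Rule b.ii ('_' matches any single character) or rule b.iv (literal match).
        if (spec == '_' || spec == m.charAt(i)) {
            return match(m, i + 1, p, j + 1);
        }
        return false; // case c
    }
}
```

Under these assumptions `LikeSketch.like("", "%")` and `LikeSketch.like("abc", "%")` are both true, so a lone `%` only fails to match when the value has no character representation at all, which is the NULL case discussed here.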
However, I looked through the docs again and noticed the following:
```
3) "M NOT LIKE P" is equivalent to "NOT (M LIKE P)".
```
in which case `WHERE col NOT LIKE '%'` should return the inverse
result set of `WHERE col LIKE '%'` (and should compile to `WHERE col IS
NULL`).
I might have misinterpreted something but the specification seems to
contradict itself.
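The two readings can be put side by side in a small Java sketch. Everything here is hypothetical and for illustration only: `NotLikeReadings` is not Phoenix code, a Java `null` stands in for a SQL NULL, and `likePercent` encodes the reading above in which a NULL value falls through to case `c`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical comparison of the two readings (not Phoenix code).
final class NotLikeReadings {

    // col LIKE '%' under the reading above: any non-null value matches,
    // a NULL value falls through to case c and yields false.
    static boolean likePercent(String col) {
        return col != null;
    }

    // Reading 1: NOT LIKE '%' folded to a constant false, as in the diff.
    static boolean notLikeConstantFalse(String col) {
        return false;
    }

    // Reading 2: rule 3, "M NOT LIKE P" is equivalent to "NOT (M LIKE P)".
    static boolean notLikeByRule3(String col) {
        return !likePercent(col);
    }

    // Keeps the rows for which the WHERE predicate is true.
    static List<String> filter(List<String> rows, Predicate<String> where) {
        List<String> kept = new ArrayList<>();
        for (String row : rows) {
            if (where.test(row)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", null, "");
        System.out.println(filter(rows, NotLikeReadings::notLikeConstantFalse));
        System.out.println(filter(rows, NotLikeReadings::notLikeByRule3));
    }
}
```

With these assumptions, reading 1 keeps no rows, while reading 2 keeps exactly the NULL row, which is the same result as `WHERE col IS NULL`.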
> `NOT LIKE '%'` unexpectedly returns results
> -------------------------------------------
>
> Key: PHOENIX-3046
> URL: https://issues.apache.org/jira/browse/PHOENIX-3046
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.7.0
> Reporter: Kevin Liew
> Assignee: Kevin Liew
> Priority: Minor
> Labels: like, like-predicate, phoenix, regex, wildcard, wildcards
> Fix For: 4.9.0, 4.8.1
>
>
> The following returns all rows in the table when it should return no rows:
> {code}select * from emp where first_name not like '%'{code}
> The following returns no rows as expected:
> {code}select * from emp where first_name not like '%%'{code}
> first_name is a VARCHAR column