No, you can translate these expressions, but you have to evaluate the
entire expression. For example:

  "col1 = 'x' and col2 in (1,2)"
    becomes: col1 = 'x' and col2 in (1,2)

  "not(col1 = 'x' and col2 in (1,2))"
    becomes: (col1 != 'x' or col2 not in (1,2))
             and col1 is not null and col2 is not null
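
To make the idea concrete, here is a minimal sketch of the simplest single-column case, not(col = 'x'). It is plain Python rather than the Iceberg API: None stands in for SQL NULL, every helper name is hypothetical, and the null-safe not-equal behavior is an assumption about the "natural" semantics being discussed.

```python
# Hypothetical helpers in plain Python; None stands in for SQL NULL.
# None of these names come from the Iceberg API.

def sql_eq(a, b):
    """SQL '=': the result is unknown (None) if either side is NULL."""
    if a is None or b is None:
        return None
    return a == b


def sql_not(v):
    """SQL NOT: unknown stays unknown."""
    return None if v is None else not v


def sql_filter(v):
    """A WHERE clause keeps a row only when the predicate is TRUE."""
    return v is True


def null_safe_not_equal(a, b):
    """Assumed 'natural' semantics: NULL is just a value that differs from b."""
    return a != b


rows = ["x", "y", None]

# SQL: WHERE not(col = 'x')
sql_kept = [r for r in rows if sql_filter(sql_not(sql_eq(r, "x")))]

# Untranslated null-safe notEqual: the NULL row slips through.
naive_kept = [r for r in rows if null_safe_not_equal(r, "x")]

# Translated form: col != 'x' and col is not null
translated_kept = [
    r for r in rows if null_safe_not_equal(r, "x") and r is not None
]

print(sql_kept, naive_kept, translated_kept)
```

On these three rows, the SQL filter and the translated form both keep only "y", while the untranslated not-equal also keeps the NULL row.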
I'm working on an update to the spec. We've completed the Java library
implementation end-to-end, so now we have working code that will be
released in 0.10.0. The next step is the spec update to document everything,
now that we're confident it works as expected.
Look for a PR in the next few
Hi Yi,
I think Iceberg could work for you without too much trouble.
You might want to look more into the partitioning that Iceberg provides. I
agree that most users want the storage layer to handle partitioning for
them. That's exactly what Iceberg does: it makes data partitioning part of
table
Jungtaek,
I agree with you that we'd ideally get that Spark issue in upstream as soon
as we can. I'm currently porting it to our 2.4 build so we can test it out
with the new sort orders that were added to Iceberg. Once that's done and I
understand the patch a bit better, I'll work on the review
Are you saying that we can't fix this by rewriting expressions to translate
from SQL to more natural semantics?
On Fri, Sep 18, 2020 at 3:28 PM Owen O'Malley wrote:
> In the SQL world, the second point isn't right. It is still the case that
> not(equal("col", "x")) is notEqual("col", "x").
In the SQL world, the second point isn't right. It is still the case that
not(equal("col", "x")) is notEqual("col", "x"). Boolean logic (well,
three-valued logic) in SQL is just strange relative to programming languages:
- null *=* "x" -> null
- null *is distinct from* "x" -> true
-
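
For readers less familiar with three-valued logic, the two comparisons listed above can be sketched in plain Python. This is an illustration only: None stands in for SQL NULL, and all function names are hypothetical.

```python
# Plain-Python sketch; None stands in for SQL NULL (unknown).
# All names are hypothetical and exist only for this illustration.

def sql_eq(a, b):
    """SQL '=': unknown if either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b


def sql_is_distinct_from(a, b):
    """SQL 'IS DISTINCT FROM': never unknown; NULL is treated as comparable."""
    if a is None or b is None:
        # Distinct exactly when one side is NULL and the other is not.
        return (a is None) != (b is None)
    return a != b


def sql_not(v):
    """SQL NOT: NOT unknown is still unknown."""
    return None if v is None else not v


def sql_and(a, b):
    """SQL AND: FALSE wins over unknown; otherwise unknown propagates."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


print(sql_eq(None, "x"))                # None (unknown)
print(sql_is_distinct_from(None, "x"))  # True
print(sql_not(sql_eq(None, "x")))       # None (unknown)
print(sql_and(None, False))             # False
```

The third line is the case that matters for predicate pushdown: negating an unknown comparison does not turn it into TRUE.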
It would be nice to avoid the problem by changing the semantics of
Iceberg’s notNull, but I don’t think that’s a good idea for two main reasons.
First, I think that API users creating expressions directly expect the
current behavior. It would be surprising to a user if a notEqual expression
didn’t
I think that we should follow the SQL semantics to prevent surprises when
SQL engines integrate with Iceberg.
.. Owen
On Thu, Sep 17, 2020 at 9:08 PM Shardul Mahadik wrote:
> Hi all,
>
> I noticed that Iceberg's predicates are not compatible with SQL predicates
> when it comes to handling NULL