snuyanzin merged PR #24526:
URL: https://github.com/apache/flink/pull/24526
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail:
snuyanzin commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1630300887
##
flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/scalar/ArrayIntersectFunction.java:
##
@@ -0,0 +1,105 @@
+/*
+ * Licensed
liuyongvs commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1629113958
##
flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/scalar/ArrayIntersectFunction.java:
##
@@ -0,0 +1,105 @@
+/*
+ * Licensed
liuyongvs commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1629112803
##
flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/scalar/ArrayIntersectFunction.java:
##
@@ -0,0 +1,105 @@
+/*
+ * Licensed
snuyanzin commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1629035418
##
flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/util/ObjectContainer.java:
##
@@ -31,21 +31,25 @@
@Internal
public class
snuyanzin commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1629032823
##
flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/scalar/ArrayIntersectFunction.java:
##
@@ -0,0 +1,105 @@
+/*
+ * Licensed
snuyanzin commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1629030390
##
flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/scalar/ArrayIntersectFunction.java:
##
@@ -0,0 +1,105 @@
+/*
+ * Licensed
liuyongvs commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2151255284
CI passed. @snuyanzin @dawidwys @MartijnVisser, could you take another look?
liuyongvs commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2149197172
@MartijnVisser I have rebased to fix the CI.
MartijnVisser commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2149077614
@liuyongvs I think you need to rebase in order to get the CI to pass
liuyongvs commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2148729746
@snuyanzin I have addressed your review comments, thanks very much.
liuyongvs commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1626831821
##
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/functions/CollectionFunctionsITCase.java:
##
@@ -1723,6 +1724,83 @@ private Stream
liuyongvs commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1626831685
##
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/functions/CollectionFunctionsITCase.java:
##
@@ -1723,6 +1724,83 @@ private Stream
snuyanzin commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1625447988
##
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/functions/CollectionFunctionsITCase.java:
##
@@ -1723,6 +1724,83 @@ private Stream
snuyanzin commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1625448219
##
flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/functions/CollectionFunctionsITCase.java:
##
@@ -1723,6 +1724,83 @@ private Stream
liuyongvs commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1615393266
##
flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/scalar/ArrayIntersectFunction.java:
##
@@ -0,0 +1,101 @@
+/*
+ * Licensed
davidradl commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1613069596
##
flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/scalar/ArrayIntersectFunction.java:
##
@@ -0,0 +1,101 @@
+/*
+ * Licensed
liuyongvs commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1612588115
##
.idea/vcs.xml:
##
@@ -22,4 +22,4 @@
Review Comment:
removed @davidradl
liuyongvs commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1612588115
##
.idea/vcs.xml:
##
@@ -22,4 +22,4 @@
Review Comment:
removed
liuyongvs commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1612587871
##
flink-python/pyflink/table/expression.py:
##
@@ -1618,6 +1618,15 @@ def array_except(self, array) -> 'Expression':
"""
return
liuyongvs commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1612586021
##
flink-python/pyflink/table/expression.py:
##
@@ -1618,6 +1618,15 @@ def array_except(self, array) -> 'Expression':
"""
return
davidradl commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1611987986
##
.idea/vcs.xml:
##
@@ -22,4 +22,4 @@
Review Comment:
Can we remove this file from the PR?
davidradl commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1611983897
##
flink-python/pyflink/table/expression.py:
##
@@ -1618,6 +1618,15 @@ def array_except(self, array) -> 'Expression':
"""
return
davidradl commented on code in PR #24526:
URL: https://github.com/apache/flink/pull/24526#discussion_r1611982523
##
docs/data/sql_functions.yml:
##
@@ -688,6 +688,9 @@ collection:
- sql: ARRAY_EXCEPT(array1, array2)
table: arrayOne.arrayExcept(arrayTwo)
liuyongvs commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2126150641
Fixed the conflicts. @twalthr @dawidwys @snuyanzin, could you help review this
PR?
liuyongvs commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2126150525
Conclusion:
Since there are no objections, we will support it with deduplication
semantics.
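For illustration, deduplication semantics can be sketched as below. This is a minimal sketch and not the code from this PR: the class `ArrayIntersectDedupSketch` and the method `intersectDistinct` are hypothetical names, and both the null handling and keeping the order of the first array are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not Flink's ArrayIntersectFunction.
final class ArrayIntersectDedupSketch {

    // Returns the elements present in both lists, each at most once,
    // in the order of their first occurrence in `left` (an assumption).
    static <T> List<T> intersectDistinct(List<T> left, List<T> right) {
        if (left == null || right == null) {
            return null; // assumed: a null array yields a null result
        }
        Set<T> rightElements = new HashSet<>(right);
        Set<T> result = new LinkedHashSet<>(); // drops duplicates, keeps order
        for (T element : left) {
            if (rightElements.contains(element)) {
                result.add(element);
            }
        }
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        // Duplicates in either input do not appear twice in the output.
        System.out.println(
                intersectDistinct(Arrays.asList(1, 1, 2, 3), Arrays.asList(1, 1, 3, 4)));
    }
}
```

With the inputs `[1, 1, 2, 3]` and `[1, 1, 3, 4]`, this sketch returns `[1, 3]`: the repeated `1` is emitted only once.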
liuyongvs commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2124700300
+1 for deduplication
twalthr commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2124641140
This is indeed a tricky one. I also spent a significant amount of time here.
Every vendor does it differently. I also looked at programming languages such
as PHP and C#. For example, C#
snuyanzin commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2124050180
so far I see this kind of trade-off
### multi-set semantics (with duplicates)
pros
1. It covers more cases as mentioned above
2. there is already implemented
liuyongvs commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2123939160
@snuyanzin Indeed, while this might only result in a difference in function
behavior, it would generally be best to align with the practices of the
majority of other engines. If in the
snuyanzin commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2123898530
>@snuyanzin @MartijnVisser Why do we necessarily have to align our semantics
with Snowflake?
as it was mentioned above the main reason is that multi-set semantics
(Snowflake)
liuyongvs commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2123732921
@snuyanzin @MartijnVisser Why do we necessarily have to align our semantics
with Snowflake?
I found that Spark, Presto, Doris, and MaxCompute all follow the semantics
without duplicates.
snuyanzin commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2122481259
Based on what I mentioned above, how about following multi-set semantics like
Snowflake by default?
That would cover more cases (with and without duplicates).
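To make the multi-set proposal concrete, here is a minimal sketch. It assumes that multi-set semantics means each element is kept as many times as it occurs in both arrays (the smaller of the two counts). The class `ArrayIntersectMultisetSketch` and the methods `intersectAll` and `distinct` are hypothetical names, and this is neither Flink nor Snowflake code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not taken from Flink or Snowflake.
final class ArrayIntersectMultisetSketch {

    // Multi-set intersection: an element occurring n times in `left` and
    // m times in `right` appears min(n, m) times in the result.
    static <T> List<T> intersectAll(List<T> left, List<T> right) {
        Map<T, Integer> remaining = new HashMap<>();
        for (T element : right) {
            remaining.merge(element, 1, Integer::sum); // count occurrences
        }
        List<T> result = new ArrayList<>();
        for (T element : left) {
            Integer count = remaining.get(element);
            if (count != null && count > 0) {
                result.add(element);
                remaining.put(element, count - 1); // consume one occurrence
            }
        }
        return result;
    }

    // Set semantics can be derived afterwards by removing duplicates.
    static <T> List<T> distinct(List<T> values) {
        return new ArrayList<>(new LinkedHashSet<>(values));
    }

    public static void main(String[] args) {
        List<Integer> bag = intersectAll(Arrays.asList(1, 1, 2, 3), Arrays.asList(1, 3, 1, 4));
        System.out.println(bag);           // duplicates preserved
        System.out.println(distinct(bag)); // duplicates removed
    }
}
```

Under these assumptions, a caller who wants the result without duplicates can apply a distinct step to the multi-set result, whereas the reverse derivation is not possible. That is the sense in which the multi-set variant would cover both cases.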
MartijnVisser commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2122443374
> I failed to find standard approach for collections like arrays (I mean in
SQL Standard).
Same. I don't think it's documented anywhere. So how do we come to a
conclusion
snuyanzin commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2105169563
Maybe it is an unpopular opinion,
however I tend to think that `INTERSECT` vs `INTERSECT ALL` and the same for
others for set and bag semantics is defined for rows and hardly could
liuyongvs commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2104468195
hi @dawidwys @snuyanzin WDYT?
MartijnVisser commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2060771945
> Furthermore, array_except was merged in version 1.20, and since 1.20 is
currently only a snapshot version and not officially released, there’s no
concern of causing compatibility
liuyongvs commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2041370638
hi @MartijnVisser
liuyongvs commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2026479264
@MartijnVisser @dawidwys @snuyanzin I agree with you. That is to say, this
semantic alignment with Spark's is clear, with no duplicate elements involved.
Consequently, I believe it
MartijnVisser commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2025213724
@liuyongvs I disagree: I think that we're looking at what the definition of
INTERSECT in general is, not from a functional or implementation perspective,
but more if there's a
liuyongvs commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2024262258
Hi @MartijnVisser, INTERSECT is different from array_intersect. INTERSECT is
a set RelNode, like UNION/UNION ALL, which is indeed part of the SQL standard, while
array_intersect array_union
MartijnVisser commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2022804099
> What is your opinion on how the function should behave?
I've taken a look at how INTERSECT is defined in the SQL standard. Based on
liuyongvs commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2022537218
hi @snuyanzin @dawidwys what is your opinion?
liuyongvs commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2017314639
>
https://clickhouse.com/docs/en/sql-reference/functions/array-functions#arrayintersectarr
From my side, it is not a good idea,
because we can use
snuyanzin commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2014912924
One more idea:
some vendors allow calculating intersections for an arbitrary number of arrays,
e.g.
ClickHouse [1]
[1]
dawidwys commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2014857862
Before I review the code let's settle on the behaviour first.
@MartijnVisser What is your opinion on how the function should behave?
Especially in the context of
liuyongvs commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2007045444
hi @dawidwys will you help review this?
flinkbot commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2006346388
## CI report:
* 55306725ea542f506a4938e2d437e59687dbec13 UNKNOWN
Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot run azure`
liuyongvs commented on PR #24526:
URL: https://github.com/apache/flink/pull/24526#issuecomment-2006330781
after discussion with @dawidwys here
https://github.com/apache/flink/pull/23171#issuecomment-1956501651
liuyongvs opened a new pull request, #24526:
URL: https://github.com/apache/flink/pull/24526
- What is the purpose of the change
This is an implementation of ARRAY_INTERSECT
- Brief change log
ARRAY_INTERSECT for Table API and SQL
```
Returns an array of the