Re: [PR] [FLINK-33449][table]Support array_contains_seq function [flink]

2023-12-13 Thread via GitHub


xuyangzhong commented on PR #23656:
URL: https://github.com/apache/flink/pull/23656#issuecomment-1855089363

   Hi, @leoyy0316 and @MartijnVisser . Let's talk in the jira instead of this 
pr. I shared my view there.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33449][table]Support array_contains_seq function [flink]

2023-11-12 Thread via GitHub


leoyy0316 commented on PR #23656:
URL: https://github.com/apache/flink/pull/23656#issuecomment-1807423595

   > > I have a question, why we always follow hive and spark? if we thinks is 
a feature need to support, we can implement it firstly. i think having rich 
functions is better
   > 
   > Because we want to make sure that we have a consistent set of functions 
available, with function names that are consistent. If we pick random function 
names from different tools, that because more problematic. It might be that 
Calcite or any of the other tech solves the same problem but with a different 
function and/or different function signature. That's why we always try to look 
at what others are also doing. Then again, this discussion should have happened 
before a PR was opened. Either in the Jira, or on the Dev mailing list. See 
https://flink.apache.org/how-to-contribute/contribute-code/#1-create-jira-ticket-and-reach-consensus
   
   yeah, it is good. trino ,clickhouse, starrocks has the same function. if 
flink teams think we not need to support this function now. pls close this jira 
and pr. thanks 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33449][table]Support array_contains_seq function [flink]

2023-11-10 Thread via GitHub


MartijnVisser commented on PR #23656:
URL: https://github.com/apache/flink/pull/23656#issuecomment-1805337202

   > I have a question, why we always follow hive and spark? if we thinks is a 
feature need to support, we can implement it firstly. i think having rich 
functions is better
   
   Because we want to make sure that we have a consistent set of functions 
available, with function names that are consistent. If we pick random function 
names from different tools, that because more problematic. It might be that 
Calcite or any of the other tech solves the same problem but with a different 
function and/or different function signature. That's why we always try to look 
at what others are also doing. Then again, this discussion should have happened 
before a PR was opened. Either in the Jira, or on the Dev mailing list. See 
https://flink.apache.org/how-to-contribute/contribute-code/#1-create-jira-ticket-and-reach-consensus


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33449][table]Support array_contains_seq function [flink]

2023-11-10 Thread via GitHub


leoyy0316 commented on PR #23656:
URL: https://github.com/apache/flink/pull/23656#issuecomment-1805324411

   > > trino support this function, we need use this function, so can we 
suuport this function?
   > 
   > I don't think Trino is the primary engine we compare ourselves against. 
Does the equivalent of this function exist in other databases/tech like 
Postgres, MySQL, Hive or Spark?
   
   I have a question, why we always follow hive and spark? if we thinks is a 
feature need to support, we can implement it firstly. i think having rich 
functions is better 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33449][table]Support array_contains_seq function [flink]

2023-11-10 Thread via GitHub


MartijnVisser commented on PR #23656:
URL: https://github.com/apache/flink/pull/23656#issuecomment-1805286458

   > trino support this function, we need use this function, so can we suuport 
this function?
   
   I don't think Trino is the primary engine we compare ourselves against. Does 
the equivalent of this function exist in other databases/tech like Postgres, 
MySQL, Hive or Spark?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33449][table]Support array_contains_seq function [flink]

2023-11-04 Thread via GitHub


leoyy0316 commented on PR #23656:
URL: https://github.com/apache/flink/pull/23656#issuecomment-1793390742

   > > I'm not sure that we want to support this function (I have no strong 
opinion there, but I don't see this in other technologies like Hive or Spark). 
However, without tests, this PR shouldn't get merged
   > 
   > i will add ut. trino support this function, we need use this function, so 
can we suuport this function?
   
   also add ut, thanks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33449][table]Support array_contains_seq function [flink]

2023-11-03 Thread via GitHub


MartijnVisser commented on PR #23656:
URL: https://github.com/apache/flink/pull/23656#issuecomment-1792398513

   > we need use this function, so can we suuport this function?
   
   That's not something that we should decide over a PR, but before opening a 
Jira ticket and a PR should be brought up on the Dev mailing list


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33449][table]Support array_contains_seq function [flink]

2023-11-03 Thread via GitHub


leoyy0316 commented on PR #23656:
URL: https://github.com/apache/flink/pull/23656#issuecomment-1792384316

   > I'm not sure that we want to support this function (I have no strong 
opinion there, but I don't see this in other technologies like Hive or Spark). 
However, without tests, this PR shouldn't get merged
   
   i will add ut. trino support this function, we need use this function, so 
can we suuport this function?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-33449][table]Support array_contains_seq function [flink]

2023-11-03 Thread via GitHub


flinkbot commented on PR #23656:
URL: https://github.com/apache/flink/pull/23656#issuecomment-1792360510

   
   ## CI report:
   
   * 31d5c1e7a3896f8b5958dc56486f510c9bfdd263 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [FLINK-33449][table]Support array_contains_seq function [flink]

2023-11-03 Thread via GitHub


leoyy0316 opened a new pull request, #23656:
URL: https://github.com/apache/flink/pull/23656

   
   
   ## What is the purpose of the change
   
   *(For example: This pull request makes task deployment go through the blob 
server, rather than through RPC. That way we avoid re-transferring them on each 
deployment (during recovery).)*
   This is an implementation of ARRAY_Contains_seq
   
   ## Brief change log
   ARRAY_CONTAINS_SEQ for Table API and SQL
   
   Syntax:
   ARRAY_CONTAINS_SEQ (arr1, arr2) 
   
   Returns:
   The function returns boolean
   
   Examples:
   *(for example:)*
 - *The TaskInfo is stored in the blob store on job creation time as a 
persistent artifact*
 - *Deployments RPC transmits only the blob storage reference*
 - *TaskManagers retrieve the TaskInfo from the blob cache*
   Flink SQL> select array_contains_seq(array[1,1,2,3],array[1,2]);
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follows the 
conventions defined in our code quality guide: 
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe 
tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
 - *Added integration tests for end-to-end deployment with large payloads 
(100MB)*
 - *Extended integration test for recovery after master (JobManager) 
failure*
 - *Added test that validates that TaskInfo is transferred only once across 
recoveries*
 - *Manually verified the change by running a 4 node cluster with 2 
JobManagers and 4 TaskManagers, a stateful streaming program, and killing one 
JobManager and two TaskManagers during the execution, verifying that recovery 
happens correctly.*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / no)
 - The serializers: (yes / no / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / no / 
don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know)
 - The S3 file system connector: (yes / no / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / no)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org