[GitHub] [flink-kubernetes-operator] morhidi commented on pull request #231: [FLINK-27647] Improve Metrics documentation to include newly added metrics

2022-05-19 Thread GitBox


morhidi commented on PR #231:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/231#issuecomment-1132539384

   cc @mbalassi @wangyang0918 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-27647) Improve Metrics documentation to include newly added metrics

2022-05-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-27647:
---
Labels: pull-request-available  (was: )

> Improve Metrics documentation to include newly added metrics
> 
>
> Key: FLINK-27647
> URL: https://issues.apache.org/jira/browse/FLINK-27647
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Matyas Orhidi
>Priority: Major
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.0.0
>
>
> We now support a few operator-specific metrics out of the box; we should
> improve the metrics documentation to highlight these.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink-kubernetes-operator] morhidi opened a new pull request, #231: [FLINK-27647] Improve Metrics documentation to include newly added metrics

2022-05-19 Thread GitBox


morhidi opened a new pull request, #231:
URL: https://github.com/apache/flink-kubernetes-operator/pull/231

   Added operator-specific metrics description.





[GitHub] [flink] KarmaGYZ commented on pull request #17873: [FLINK-25009][CLI] Output slotSharingGroup as part of JsonGraph

2022-05-19 Thread GitBox


KarmaGYZ commented on PR #17873:
URL: https://github.com/apache/flink/pull/17873#issuecomment-1132533635

   > > @xinbinhuang Thanks for the update. Could you explain why you need to
derive the total task slots required from the JSON graph?
   > 
   > Our team wants to derive the total slots required for the job before
redeploying. This helps avoid the situation where the old job is stopped and
the new job doesn't have enough resources to start. With this change we can
make sure the cluster has enough resources for the new job before stopping the
old one. In general, I think slot sharing groups are useful information for
better understanding the job for debugging and performance tuning. WDYT?
   
   I'm afraid that even with this information, you cannot calculate the exact
number of slots your job requires, because the slot sharing group will not
restrict the scheduling of operators[1]. It is just a scheduling hint and might
not be obeyed in the future. However, I agree that this is indeed helpful
information. @godfreyhe WDYT about the compatibility issue introduced by this
PR?
   
   [1] 
https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/finegrained_resource/#notice
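
   For illustration, a minimal Java sketch (class and group names are mine, not
from the PR) of declaring a slot sharing group per operator; as noted above,
this only hints at co-location and does not strictly bound the slot count:

```java
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SlotSharingHintExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements(1, 2, 3)
                .map(x -> x * 2)
                .returns(Types.INT)
                // Operators in the same group *may* share a slot; this is a
                // scheduling hint, not a hard placement constraint.
                .slotSharingGroup("pipeline-a")
                .print();

        env.execute("slot-sharing-hint-example");
    }
}
```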





[GitHub] [flink] Vancior commented on a diff in pull request #19743: [FLINK-27657][python] Implement remote operator state backend in PyFlink

2022-05-19 Thread GitBox


Vancior commented on code in PR #19743:
URL: https://github.com/apache/flink/pull/19743#discussion_r877785593


##
flink-python/pyflink/datastream/state.py:
##
@@ -281,6 +283,110 @@ def __iter__(self) -> Iterator[K]:
         return iter(self.keys())
 
 
+class ReadOnlyBroadcastState(State, Generic[K, V]):
+    """
+    A read-only view of the :class:`BroadcastState`.
+    Although read-only, the user code should not modify the value returned by the :meth:`get` or
+    the items returned by :meth:`items`, as this can lead to inconsistent states. The reason for
+    this is that we do not create extra copies of the elements for performance reasons.
+    """
+
+    @abstractmethod
+    def get(self, key: K) -> V:
+        """
+        Returns the current value associated with the given key.
+        """
+        pass
+
+    @abstractmethod
+    def contains(self, key: K) -> bool:
+        """
+        Returns whether there exists the given mapping.
+        """
+        pass
+
+    @abstractmethod
+    def items(self) -> Iterable[Tuple[K, V]]:
+        """
+        Returns all the mappings in the state.
+        """
+        pass
+
+    @abstractmethod
+    def keys(self) -> Iterable[K]:

Review Comment:
   Since users already have the `items()` interface, `keys()` and `values()`
just make it more like a `dict` in Python, without much extra code, given that
the underlying state runtime implementation is actually a map state too.
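
   For reference, a hedged Java sketch of the mirrored Java-side API that the
Python ABC above follows (the "rules" state and class names are illustrative);
the non-broadcast side only ever sees the read-only view:

```java
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.state.ReadOnlyBroadcastState;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
import org.apache.flink.util.Collector;

public class RuleMatcher extends BroadcastProcessFunction<String, String, String> {

    // Illustrative descriptor for a broadcast "rules" map.
    static final MapStateDescriptor<String, String> RULES =
            new MapStateDescriptor<>("rules", Types.STRING, Types.STRING);

    @Override
    public void processElement(String value, ReadOnlyContext ctx, Collector<String> out)
            throws Exception {
        // Non-broadcast side: read-only view with get/contains/immutableEntries,
        // matching the Python get/contains/items/keys discussed above.
        ReadOnlyBroadcastState<String, String> rules = ctx.getBroadcastState(RULES);
        if (rules.contains(value)) {
            out.collect(value + " -> " + rules.get(value));
        }
    }

    @Override
    public void processBroadcastElement(String rule, Context ctx, Collector<String> out)
            throws Exception {
        // Broadcast side: writable view.
        ctx.getBroadcastState(RULES).put(rule, rule.toUpperCase());
    }
}
```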






[GitHub] [flink-kubernetes-operator] wangyang0918 commented on a diff in pull request #230: [build] Add doc build for new release branch

2022-05-19 Thread GitBox


wangyang0918 commented on code in PR #230:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/230#discussion_r877783426


##
.github/workflows/docs.yaml:
##
@@ -41,9 +42,11 @@ jobs:
           echo "flink_branch=${currentBranch}"
           echo "flink_branch=${currentBranch}" >> ${GITHUB_ENV}
           if [ "${currentBranch} = "main" ]; then
-            echo "flink_alias=release-1.0" >> ${GITHUB_ENV}
-          elif [ "${currentBranch} = "release-0.1" ]; then
+            echo "flink_alias=release-1.1" >> ${GITHUB_ENV}

Review Comment:
   It seems that the step `upload documentation alias` does not really work.
   
   
https://github.com/apache/flink-kubernetes-operator/runs/6516829906?check_suite_focus=true






[GitHub] [flink] Vancior commented on a diff in pull request #19743: [FLINK-27657][python] Implement remote operator state backend in PyFlink

2022-05-19 Thread GitBox


Vancior commented on code in PR #19743:
URL: https://github.com/apache/flink/pull/19743#discussion_r83328


##
flink-python/pyflink/datastream/__init__.py:
##
@@ -141,6 +141,11 @@
 - :class:`state.AggregatingState`:
   Interface for aggregating state, based on an :class:`AggregateFunction`. Elements that are
   added to this type of state will be eagerly pre-aggregated using a given AggregateFunction.
+- :class:`state.BroadcastState`:
+  A type of state that can be created to store the state of a :class:`BroadcastStream`. This
+  state assumes that the same elements are sent to all instances of an operator.
+- :class:`state.ReadOnlyBroadcastState`:

Review Comment:
   Might be convenient for finding what operations can be done?






[GitHub] [flink-kubernetes-operator] wangyang0918 commented on a diff in pull request #229: [hotfix] Change version of documentation in update_branch_version.sh

2022-05-19 Thread GitBox


wangyang0918 commented on code in PR #229:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/229#discussion_r89395


##
tools/releasing/update_branch_version.sh:
##
@@ -64,7 +64,11 @@ perl -pi -e "s#^ENV OPERATOR_VERSION=.*#ENV OPERATOR_VERSION=${NEW_VERSION}#" Do
 # change Helm chart version info
 perl -pi -e "s#^version: .*#version: ${NEW_VERSION}#" helm/flink-kubernetes-operator/Chart.yaml
 perl -pi -e "s#^appVersion: .*#appVersion: ${NEW_VERSION}#" helm/flink-kubernetes-operator/Chart.yaml
-
+#change version of documentation
+cd docs
+perl -pi -e "s#^  Version = .*#  Version = \"${NEW_VERSION}\"#" config.toml
+perl -pi -e "s#^  VersionTitle = .*#  VersionTitle = \"${NEW_VERSION}\"#" config.toml
+cd ..

Review Comment:
   I do not mean to update the release branch, but the main branch. It is still 
`1.0-SNAPSHOT`, not `1.1-SNAPSHOT`.
   
   
https://github.com/apache/flink-kubernetes-operator/blob/main/docs/config.toml#L37






[jira] [Closed] (FLINK-27504) State compaction not happening with sliding window and incremental RocksDB backend

2022-05-19 Thread Yun Tang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yun Tang closed FLINK-27504.

Resolution: Information Provided

> State compaction not happening with sliding window and incremental RocksDB 
> backend
> --
>
> Key: FLINK-27504
> URL: https://issues.apache.org/jira/browse/FLINK-27504
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.14.4
> Environment: Local Flink cluster on Arch Linux.
>Reporter: Alexis Sarda-Espinosa
>Priority: Major
> Attachments: duration_trend_52ca77c.png, duration_trend_67c76bb.png, 
> duration_trend_c5dd5d2.png, image-2022-05-06-10-34-35-007.png, 
> size_growth_52ca77c.png, size_growth_67c76bb.png, size_growth_c5dd5d2.png
>
>
> Hello,
> I'm trying to estimate an upper bound for RocksDB's state size in my 
> application. For that purpose, I have created a small job with faster timings 
> whose code you can find on GitHub: 
> [https://github.com/asardaes/flink-rocksdb-ttl-test]. You can see some of the 
> results there, but I summarize here as well:
>  * Approximately 20 events per second, 10 unique keys for partitioning are 
> pre-specified.
>  * Sliding window of 11 seconds with a 1-second slide.
>  * Allowed lateness of 11 seconds.
>  * State TTL configured to 1 minute and compaction after 1000 entries.
>  * Both window-specific and window-global state used.
>  * Checkpoints every 2 seconds.
>  * Parallelism of 4 in stateful tasks.
> The goal is to let the job run and analyze state compaction behavior with 
> RocksDB. I should note that global state is cleaned manually inside the 
> functions, TTL for those is in case some keys are no longer seen in the 
> actual production environment.
> I have been running the job on a local cluster (outside IDE), the 
> configuration YAML is also available in the repository. After running for 
> approximately 1.6 days, state size is currently 2.3 GiB (see attachments). I 
> understand state can retain expired data for a while, but since TTL is 1 
> minute, this seems excessive to me.





[jira] [Commented] (FLINK-27504) State compaction not happening with sliding window and incremental RocksDB backend

2022-05-19 Thread Yun Tang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539944#comment-17539944
 ] 

Yun Tang commented on FLINK-27504:
--

[~asardaes], one specific round of compaction is just for one specific column
family (it would not involve other column families). Thus, different column
families could have different compaction settings.
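
To make the per-column-family scoping concrete, here is a hedged Java sketch of
the TTL / compaction-filter setup described in the report (the state name is
illustrative); each descriptor configured this way affects only its own column
family:

{code:java}
import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.time.Time;

public class TtlPerColumnFamily {

    public static ValueStateDescriptor<Long> descriptor() {
        StateTtlConfig ttlConfig = StateTtlConfig.newBuilder(Time.minutes(1))
                // Run the RocksDB compaction filter's expiry check after every
                // 1000 processed entries, as in the reported setup.
                .cleanupInRocksdbCompactFilter(1000)
                .build();

        ValueStateDescriptor<Long> descriptor =
                new ValueStateDescriptor<>("window-global-counter", Long.class);
        // TTL applies per descriptor, i.e. per RocksDB column family.
        descriptor.enableTimeToLive(ttlConfig);
        return descriptor;
    }
}
{code}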

> State compaction not happening with sliding window and incremental RocksDB 
> backend
> --
>
> Key: FLINK-27504
> URL: https://issues.apache.org/jira/browse/FLINK-27504
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / State Backends
>Affects Versions: 1.14.4
> Environment: Local Flink cluster on Arch Linux.
>Reporter: Alexis Sarda-Espinosa
>Priority: Major
> Attachments: duration_trend_52ca77c.png, duration_trend_67c76bb.png, 
> duration_trend_c5dd5d2.png, image-2022-05-06-10-34-35-007.png, 
> size_growth_52ca77c.png, size_growth_67c76bb.png, size_growth_c5dd5d2.png
>
>
> Hello,
> I'm trying to estimate an upper bound for RocksDB's state size in my 
> application. For that purpose, I have created a small job with faster timings 
> whose code you can find on GitHub: 
> [https://github.com/asardaes/flink-rocksdb-ttl-test]. You can see some of the 
> results there, but I summarize here as well:
>  * Approximately 20 events per second, 10 unique keys for partitioning are 
> pre-specified.
>  * Sliding window of 11 seconds with a 1-second slide.
>  * Allowed lateness of 11 seconds.
>  * State TTL configured to 1 minute and compaction after 1000 entries.
>  * Both window-specific and window-global state used.
>  * Checkpoints every 2 seconds.
>  * Parallelism of 4 in stateful tasks.
> The goal is to let the job run and analyze state compaction behavior with 
> RocksDB. I should note that global state is cleaned manually inside the 
> functions, TTL for those is in case some keys are no longer seen in the 
> actual production environment.
> I have been running the job on a local cluster (outside IDE), the 
> configuration YAML is also available in the repository. After running for 
> approximately 1.6 days, state size is currently 2.3 GiB (see attachments). I 
> understand state can retain expired data for a while, but since TTL is 1 
> minute, this seems excessive to me.





[jira] [Resolved] (FLINK-25950) Delete retry mechanism from ZooKeeperUtils.deleteZNode

2022-05-19 Thread Matthias Pohl (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-25950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias Pohl resolved FLINK-25950.
---
Resolution: Fixed

master: c57b84921447bb0ade5e1ff77a05ebd8bbbe71b7

> Delete retry mechanism from ZooKeeperUtils.deleteZNode
> --
>
> Key: FLINK-25950
> URL: https://issues.apache.org/jira/browse/FLINK-25950
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Runtime / Coordination
>Affects Versions: 1.15.0
>Reporter: Matthias Pohl
>Assignee: jackwangcs
>Priority: Major
>  Labels: pull-request-available
>
> {{ZooKeeperUtils.deleteZNode}} implements a retry loop that is not necessary
> for Curator version 4.0.1+. This code can be cleaned up.
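
For context, a minimal Curator sketch (connection string and path are
illustrative) of a ZNode deletion that relies on the client's retry policy
instead of a hand-rolled retry loop:

{code:java}
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class DeleteZNodeExample {
    public static void main(String[] args) throws Exception {
        // The client-level retry policy covers transient failures, which is
        // why a separate retry loop around delete() is unnecessary.
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();
        try {
            client.delete().deletingChildrenIfNeeded().forPath("/example/node");
        } finally {
            client.close();
        }
    }
}
{code}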





[GitHub] [flink] XComp merged pull request #19666: [FLINK-25950][Runtime] Delete retry mechanism from ZooKeeperUtils.deleteZNode

2022-05-19 Thread GitBox


XComp merged PR #19666:
URL: https://github.com/apache/flink/pull/19666





[GitHub] [flink] Vancior commented on a diff in pull request #19743: [FLINK-27657][python] Implement remote operator state backend in PyFlink

2022-05-19 Thread GitBox


Vancior commented on code in PR #19743:
URL: https://github.com/apache/flink/pull/19743#discussion_r83328


##
flink-python/pyflink/datastream/__init__.py:
##
@@ -141,6 +141,11 @@
 - :class:`state.AggregatingState`:
   Interface for aggregating state, based on an :class:`AggregateFunction`. Elements that are
   added to this type of state will be eagerly pre-aggregated using a given AggregateFunction.
+- :class:`state.BroadcastState`:
+  A type of state that can be created to store the state of a :class:`BroadcastStream`. This
+  state assumes that the same elements are sent to all instances of an operator.
+- :class:`state.ReadOnlyBroadcastState`:

Review Comment:
   Might be convenient for type hints?






[GitHub] [flink] Vancior commented on a diff in pull request #19743: [FLINK-27657][python] Implement remote operator state backend in PyFlink

2022-05-19 Thread GitBox


Vancior commented on code in PR #19743:
URL: https://github.com/apache/flink/pull/19743#discussion_r82504


##
flink-python/src/main/java/org/apache/flink/streaming/api/runners/python/beam/state/BeamStateHandler.java:
##
@@ -23,6 +23,35 @@
 /** Interface for doing actual operations on Flink state based on {@link BeamFnApi.StateRequest}. */
 public interface BeamStateHandler<S> {
 
-    BeamFnApi.StateResponse.Builder handle(BeamFnApi.StateRequest request, S state)
+    /**
+     * Dispatches {@link BeamFnApi.StateRequest} to different handle functions base on request case.
+     */
+    default BeamFnApi.StateResponse.Builder handle(BeamFnApi.StateRequest request, S state)

Review Comment:
   That's a good point.






[GitHub] [flink] Vancior commented on a diff in pull request #19743: [FLINK-27657][python] Implement remote operator state backend in PyFlink

2022-05-19 Thread GitBox


Vancior commented on code in PR #19743:
URL: https://github.com/apache/flink/pull/19743#discussion_r82151


##
flink-python/pyflink/datastream/functions.py:
##
@@ -119,6 +118,22 @@ def get_aggregating_state(
         pass
 
 
+class OperatorStateStore(ABC):

Review Comment:
   I'm just following where `KeyedStateStore` is; what do you think?






[GitHub] [flink] Vancior commented on a diff in pull request #19743: [FLINK-27657][python] Implement remote operator state backend in PyFlink

2022-05-19 Thread GitBox


Vancior commented on code in PR #19743:
URL: https://github.com/apache/flink/pull/19743#discussion_r80129


##
flink-python/pyflink/datastream/state.py:
##
@@ -281,6 +283,110 @@ def __iter__(self) -> Iterator[K]:
         return iter(self.keys())
 
 
+class ReadOnlyBroadcastState(State, Generic[K, V]):
+    """
+    A read-only view of the :class:`BroadcastState`.
+    Although read-only, the user code should not modify the value returned by the :meth:`get` or
+    the items returned by :meth:`items`, as this can lead to inconsistent states. The reason for
+    this is that we do not create extra copies of the elements for performance reasons.
+    """
+
+    @abstractmethod
+    def get(self, key: K) -> V:
+        """
+        Returns the current value associated with the given key.
+        """
+        pass
+
+    @abstractmethod
+    def contains(self, key: K) -> bool:
+        """
+        Returns whether there exists the given mapping.
+        """
+        pass
+
+    @abstractmethod
+    def items(self) -> Iterable[Tuple[K, V]]:
+        """
+        Returns all the mappings in the state.
+        """
+        pass
+
+    @abstractmethod
+    def keys(self) -> Iterable[K]:

Review Comment:
   We don't have the exactly same interfaces for `MapState`, which might be 
more intuitive for python users, so I just align `BroadcastState` with 
`MapState`.






[jira] [Commented] (FLINK-27060) Extending /jars/:jarid/run API to support setting Flink configs

2022-05-19 Thread ConradJam (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539937#comment-17539937
 ] 

ConradJam commented on FLINK-27060:
---

Hi [~martijnvisser], I would like to move this work forward. It seems that the
author who asked this question is no longer following this issue. Can I take
this ticket?

> Extending /jars/:jarid/run API to support setting Flink configs
> ---
>
> Key: FLINK-27060
> URL: https://issues.apache.org/jira/browse/FLINK-27060
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / REST
>Reporter: Zhanghao Chen
>Priority: Major
>
> *Background*
> Users want to submit jobs via the Flink REST API instead of the Flink CLI,
> which is more heavyweight in certain scenarios, for example in a lightweight
> data processing workflow system that integrates with Flink.
> Currently, the /jars/:jarid/run API
> ([https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jars-jarid-run])
> only supports a few selected Flink config options listed in the doc
> (parallelism, savepoint path, and allow-non-restored state), which is
> insufficient for practical use.
> *Proposed Changes*
> Extend the /jars/:jarid/run API with an additional request body parameter
> "configs", which is a map of Flink configuration option-value pairs set by
> users.
> For backward compatibility, we can retain the existing body parameters like
> "allowNonRestoredState", and when there are conflicting configurations, let
> the values set explicitly with existing body parameters take higher
> precedence over the values set via "configs".
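
A hedged Java sketch of what a request against the proposed API could look
like; the "configs" field is the proposal above and does not exist yet, and the
jar id and option values are illustrative:

{code:java}
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RunJarWithConfigs {
    public static void main(String[] args) throws Exception {
        // Hypothetical body: existing fields like "parallelism" are retained,
        // and a new "configs" map carries arbitrary Flink options.
        String body = "{"
                + "\"parallelism\": 4,"
                + "\"configs\": {"
                + "\"pipeline.name\": \"my-job\","
                + "\"taskmanager.numberOfTaskSlots\": \"4\""
                + "}}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8081/jars/abc123.jar/run"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
{code}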





[jira] [Closed] (FLINK-26895) Improve cancelWithSavepoint timeout error message

2022-05-19 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-26895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-26895.
--
Resolution: Won't Fix

Not relevant with 1.15 anymore

> Improve cancelWithSavepoint timeout error message
> -
>
> Key: FLINK-26895
> URL: https://issues.apache.org/jira/browse/FLINK-26895
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Priority: Major
> Fix For: kubernetes-operator-1.0.0
>
>
> If a cancelWithSavepoint operation times out during a savepoint upgrade step,
> the user gets an error, but the savepoint information is not persisted in the
> status.
> We should add an informative error telling the user how to resolve this
> situation (switch to last-state mode, or look up the savepoint manually and
> recreate the deployment).





[jira] [Closed] (FLINK-27686) Only patch status when the status actually changed

2022-05-19 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-27686.
--
Resolution: Fixed

merged to
main 0e537308975f29a6dc129d047f3db337122cc3ca
release-1.0 8f4cc547b898804852f82b5ee058e46e5c853537

> Only patch status when the status actually changed
> --
>
> Key: FLINK-27686
> URL: https://issues.apache.org/jira/browse/FLINK-27686
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Gyula Fora
>Priority: Critical
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.0.0
>
>
> The StatusHelper class currently always patches the status, regardless of
> whether it changed or not.
> We should use an ObjectMapper, simply compare the ObjectNode
> representations, and only patch if there is any change.
>  
> (I think we cannot directly compare the status objects, because some of the
> content comes from getters and is not part of the equals implementation.)
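
A minimal Jackson sketch of the comparison idea, assuming hypothetical status
POJOs (the real logic lives in StatusHelper):

{code:java}
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

public class StatusDiff {

    private static final ObjectMapper MAPPER = new ObjectMapper();

    /**
     * Returns true when the serialized form of the status changed, i.e. a
     * patch call is actually needed. Comparing ObjectNode representations
     * sidesteps the problem that some fields come from getters and are not
     * part of the equals implementation.
     */
    public static boolean statusChanged(Object previousStatus, Object newStatus) {
        ObjectNode previous = MAPPER.convertValue(previousStatus, ObjectNode.class);
        ObjectNode current = MAPPER.convertValue(newStatus, ObjectNode.class);
        return !current.equals(previous);
    }
}
{code}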





[GitHub] [flink-kubernetes-operator] gyfora merged pull request #228: [FLINK-27686] Only patch status on change

2022-05-19 Thread GitBox


gyfora merged PR #228:
URL: https://github.com/apache/flink-kubernetes-operator/pull/228





[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #230: [build] Add doc build for new release branch

2022-05-19 Thread GitBox


gyfora commented on PR #230:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/230#issuecomment-1132511577

   cc @mbalassi @wangyang0918 





[GitHub] [flink] deadwind4 commented on pull request #19774: [FLINK-27711][python][connector/pulsar] Align setTopicPattern for Pulsar Connector

2022-05-19 Thread GitBox


deadwind4 commented on PR #19774:
URL: https://github.com/apache/flink/pull/19774#issuecomment-1132506576

   I will update the doc in this ticket 
https://issues.apache.org/jira/browse/FLINK-27690





[GitHub] [flink] flinkbot commented on pull request #19774: [FLINK-27711][python][connector/pulsar] Align setTopicPattern for Pulsar Connector

2022-05-19 Thread GitBox


flinkbot commented on PR #19774:
URL: https://github.com/apache/flink/pull/19774#issuecomment-1132505009

   
   ## CI report:
   
   * 9071427ed48c3e005d42d64318945fa6a44f6eb0 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[GitHub] [flink-kubernetes-operator] gyfora merged pull request #229: [hotfix] Change version of documentation in update_branch_version.sh

2022-05-19 Thread GitBox


gyfora merged PR #229:
URL: https://github.com/apache/flink-kubernetes-operator/pull/229





[jira] [Updated] (FLINK-27711) Align setTopicPattern for Pulsar Connector

2022-05-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-27711:
---
Labels: pull-request-available  (was: )

> Align setTopicPattern for Pulsar Connector
> --
>
> Key: FLINK-27711
> URL: https://issues.apache.org/jira/browse/FLINK-27711
> Project: Flink
>  Issue Type: Improvement
>  Components: API / DataStream, API / Python
>Affects Versions: 1.15.0
>Reporter: LuNng Wang
>Priority: Major
>  Labels: pull-request-available
>
> Update set_topics_pattern to set_topic_pattern





[jira] [Updated] (FLINK-27712) Job failed to start due to "Time should be non negative"

2022-05-19 Thread Sharon Xie (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sharon Xie updated FLINK-27712:
---
Description: 
Happened intermittently. A restart made the issue go away.

Stack trace:
{code:java}
Time should be non negative
at org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:138)
at 
org.apache.flink.runtime.throughput.ThroughputEMA.calculateThroughput(ThroughputEMA.java:44)
at 
org.apache.flink.runtime.throughput.ThroughputCalculator.calculateThroughput(ThroughputCalculator.java:80)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.debloat(StreamTask.java:792)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$scheduleBufferDebloater$4(StreamTask.java:784)
at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:338)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:324)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:201)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:809)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:761)
at 
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:937)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575)
{code}
 

JobManager error log is attached.

 

Maybe related to [Flink-25454|https://issues.apache.org/jira/browse/FLINK-25454]

  was:
Happened intermittently. A restart made the issue go away.

Stack trace:
Time should be non negative
at org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:138)
at 
org.apache.flink.runtime.throughput.ThroughputEMA.calculateThroughput(ThroughputEMA.java:44)
at 
org.apache.flink.runtime.throughput.ThroughputCalculator.calculateThroughput(ThroughputCalculator.java:80)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.debloat(StreamTask.java:792)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$scheduleBufferDebloater$4(StreamTask.java:784)
at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:338)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:324)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:201)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:809)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:761)
at 
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:937)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575)


 

JobManager error log is attached.

 

Maybe related to Flink-25465


> Job failed to start due to "Time should be non negative"
> 
>
> Key: FLINK-27712
> URL: https://issues.apache.org/jira/browse/FLINK-27712
> Project: Flink
>  Issue Type: Bug
>  Components: Runtime / Network
>Affects Versions: 1.14.4
>Reporter: Sharon Xie
>Priority: Major
> Attachments: flink_error.log.txt
>
>
> Happened intermittently. A restart made the issue go away.
> Stack trace:
> {code:java}
> Time should be non negative
> at org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:138)
> at 
> org.apache.flink.runtime.throughput.ThroughputEMA.calculateThroughput(ThroughputEMA.java:44)
> at 
> org.apache.flink.runtime.throughput.ThroughputCalculator.calculateThroughput(ThroughputCalculator.java:80)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.debloat(StreamTask.java:792)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$scheduleBufferDebloater$4(StreamTask.java:784)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
> at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
> at 
> 

[GitHub] [flink] deadwind4 opened a new pull request, #19774: [FLINK-27711][python][connector/pulsar] Align setTopicPattern for Pulsar Connector

2022-05-19 Thread GitBox


deadwind4 opened a new pull request, #19774:
URL: https://github.com/apache/flink/pull/19774

   ## What is the purpose of the change
   
   `set_topics_pattern` is a typo, so I updated it to `set_topic_pattern` in
pulsar.py.
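
   For reference, a hedged sketch of the Java-side builder API that the Python
method name is being aligned with (service/admin URLs, topic regex, and
subscription name are placeholders):

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.pulsar.source.PulsarSource;
import org.apache.flink.connector.pulsar.source.enumerator.cursor.StartCursor;
import org.apache.flink.connector.pulsar.source.reader.deserializer.PulsarDeserializationSchema;

public class PulsarTopicPatternExample {

    public static PulsarSource<String> source() {
        // setTopicPattern subscribes to every topic matching the regex; the
        // Python set_topic_pattern is the counterpart of this method.
        return PulsarSource.builder()
                .setServiceUrl("pulsar://localhost:6650")
                .setAdminUrl("http://localhost:8080")
                .setStartCursor(StartCursor.earliest())
                .setTopicPattern("persistent://public/default/topic-.*")
                .setDeserializationSchema(
                        PulsarDeserializationSchema.flinkSchema(new SimpleStringSchema()))
                .setSubscriptionName("my-subscription")
                .build();
    }
}
```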
   
   ## Brief change log
   
 - *Added the set_topic_pattern method in pulsar.py*
 - *Marked set_topics_pattern as deprecated*
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
 - *Added a test that validates set_topic_pattern*
 - *Added a test that validates that set_topics_pattern is deprecated*
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (PyDocs)
   





[jira] [Created] (FLINK-27712) Job failed to start due to "Time should be non negative"

2022-05-19 Thread Sharon Xie (Jira)
Sharon Xie created FLINK-27712:
--

 Summary: Job failed to start due to "Time should be non negative"
 Key: FLINK-27712
 URL: https://issues.apache.org/jira/browse/FLINK-27712
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Network
Affects Versions: 1.14.4
Reporter: Sharon Xie
 Attachments: flink_error.log.txt

Happened intermittently. A restart made the issue go away.

Stack trace:
Time should be non negative
at org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:138)
at 
org.apache.flink.runtime.throughput.ThroughputEMA.calculateThroughput(ThroughputEMA.java:44)
at 
org.apache.flink.runtime.throughput.ThroughputCalculator.calculateThroughput(ThroughputCalculator.java:80)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.debloat(StreamTask.java:792)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$scheduleBufferDebloater$4(StreamTask.java:784)
at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsWhenDefaultActionUnavailable(MailboxProcessor.java:338)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:324)
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:201)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:809)
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:761)
at 
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:937)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575)


 

JobManager error log is attached.

 

Maybe related to Flink-25465





[GitHub] [flink] alpreu commented on pull request #19655: [FLINK-27527] Create a file-based upsert sink for testing internal components

2022-05-19 Thread GitBox


alpreu commented on PR #19655:
URL: https://github.com/apache/flink/pull/19655#issuecomment-1132500442

   @flinkbot run azure





[jira] [Created] (FLINK-27711) Align setTopicPattern for Pulsar Connector

2022-05-19 Thread LuNng Wang (Jira)
LuNng Wang created FLINK-27711:
--

 Summary: Align setTopicPattern for Pulsar Connector
 Key: FLINK-27711
 URL: https://issues.apache.org/jira/browse/FLINK-27711
 Project: Flink
  Issue Type: Improvement
  Components: API / DataStream, API / Python
Affects Versions: 1.15.0
Reporter: LuNng Wang


Update set_topics_pattern to set_topic_pattern





[jira] [Commented] (FLINK-27646) Create Roadmap page for Flink Kubernetes operator

2022-05-19 Thread Gyula Fora (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539930#comment-17539930
 ] 

Gyula Fora commented on FLINK-27646:


I think we have two alternatives here:

1. Include the roadmap as part of the Apache Flink Roadmap Page: 
[https://flink.apache.org/roadmap.html#stateful-functions]
2. Create our own roadmap page in kubernetes operator docs and link it to the 
Flink roadmap page

 

What do you think?

> Create Roadmap page for Flink Kubernetes operator
> -
>
> Key: FLINK-27646
> URL: https://issues.apache.org/jira/browse/FLINK-27646
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: ConradJam
>Priority: Major
> Fix For: kubernetes-operator-1.0.0
>
>
> We should create a dedicated wiki page for the current roadmap of the 
> operator and link it to the overview page in our docs.





[GitHub] [flink] Sxnan commented on a diff in pull request #19653: [FLINK-27523] Runtime supports producing and consuming cached intermediate results

2022-05-19 Thread GitBox


Sxnan commented on code in PR #19653:
URL: https://github.com/apache/flink/pull/19653#discussion_r877747182


##
flink-runtime/src/main/java/org/apache/flink/runtime/shuffle/ShuffleMaster.java:
##
@@ -84,6 +86,33 @@ CompletableFuture<T> registerPartitionWithProducer(
             PartitionDescriptor partitionDescriptor,
             ProducerDescriptor producerDescriptor);
 
+    /**
+     * Returns all the shuffle descriptors for the partitions in the
+     * intermediate data set with the given id.
+     *
+     * @param intermediateDataSetID The id of the intermediate data set.
+     * @return all the shuffle descriptors for the partitions in the
+     *     intermediate data set. Null if not exist.
+     */
+    default Collection<ShuffleDescriptor> getClusterPartitionShuffleDescriptors(
+            IntermediateDataSetID intermediateDataSetID) {
+        return Collections.emptyList();
+    }
+
+    /**
+     * Promote the given partition to cluster partition.
+     *
+     * @param shuffleDescriptor The shuffle descriptors of the partition to promote.
+     */
+    default void promotePartition(ShuffleDescriptor shuffleDescriptor) {}
+
+    /**
+     * Remove the given partition from cluster partition.
+     *
+     * @param shuffleDescriptor The shuffle descriptors of the cluster partition to be removed.
+     */
+    default void removeClusterPartition(ShuffleDescriptor shuffleDescriptor) {}

Review Comment:
   Thanks for the comment. You are right that we can call 
releasePartitionExternally to release the cluster partitions as well. 
   
   The `removeClusterPartition` method is removed. 



##
flink-runtime/src/main/java/org/apache/flink/runtime/shuffle/ShuffleMaster.java:
##
@@ -84,6 +86,33 @@ CompletableFuture<T> registerPartitionWithProducer(
             PartitionDescriptor partitionDescriptor,
             ProducerDescriptor producerDescriptor);
 
+    /**
+     * Returns all the shuffle descriptors for the partitions in the
+     * intermediate data set with the given id.
+     *
+     * @param intermediateDataSetID The id of the intermediate data set.
+     * @return all the shuffle descriptors for the partitions in the
+     *     intermediate data set. Null if not exist.
+     */
+    default Collection<ShuffleDescriptor> getClusterPartitionShuffleDescriptors(
Review Comment:
   By "partition tracker", I assume it means the `JobMasterPartitionTracker`. 
Currently, the `JobMasterPartitionTracker` is only used to keep track of the 
partition during job execution and issue release calls to task executors and 
shuffle masters. It does not keep track of the ShuffleDescriptor of the cluster 
partition. And its lifecycle is bound to a job. Therefore, it is not suitable 
for keeping track of the mapping from IntermediateDataSetID to 
ShuffleDescriptor across job. 
   
   This method is used by the Job that consumes the cluster partitions produced 
by some previous jobs to get the Shuffle Descriptors of the cluster partitions.
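
   A hedged sketch of the consumer-side lookup this enables (the helper and
call site are illustrative; the real call sites would live in the scheduler):

```java
import java.util.Collection;

import org.apache.flink.runtime.jobgraph.IntermediateDataSetID;
import org.apache.flink.runtime.shuffle.ShuffleDescriptor;
import org.apache.flink.runtime.shuffle.ShuffleMaster;

public class ClusterPartitionLookup {

    /**
     * Illustrative consumer-side lookup: a later job asks the ShuffleMaster for
     * the descriptors of cluster partitions cached by a previous job. An empty
     * result means no cached partitions exist, and callers would fall back to
     * recomputing the intermediate result.
     */
    public static Collection<ShuffleDescriptor> lookup(
            ShuffleMaster<?> shuffleMaster, IntermediateDataSetID dataSetId) {
        return shuffleMaster.getClusterPartitionShuffleDescriptors(dataSetId);
    }
}
```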






[GitHub] [flink-kubernetes-operator] gyfora commented on a diff in pull request #229: [hotfix] Change version of documentation in update_branch_version.sh

2022-05-19 Thread GitBox


gyfora commented on code in PR #229:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/229#discussion_r877749413


##
tools/releasing/update_branch_version.sh:
##
@@ -64,7 +64,11 @@ perl -pi -e "s#^ENV OPERATOR_VERSION=.*#ENV OPERATOR_VERSION=${NEW_VERSION}#" Do
 # change Helm chart version info
 perl -pi -e "s#^version: .*#version: ${NEW_VERSION}#" helm/flink-kubernetes-operator/Chart.yaml
 perl -pi -e "s#^appVersion: .*#appVersion: ${NEW_VERSION}#" helm/flink-kubernetes-operator/Chart.yaml
-
+#change version of documentation
+cd docs
+perl -pi -e "s#^  Version = .*#  Version = \"${NEW_VERSION}\"#" config.toml
+perl -pi -e "s#^  VersionTitle = .*#  VersionTitle = \"${NEW_VERSION}\"#" config.toml
+cd ..

Review Comment:
   On second thought, it wouldn't really hurt (it's currently also out of date
on the main branch). For RC branches it should not change anything, if the
release branch is already updated correctly.






[GitHub] [flink-kubernetes-operator] gyfora commented on a diff in pull request #229: [hotfix] Change version of documentation in update_branch_version.sh

2022-05-19 Thread GitBox


gyfora commented on code in PR #229:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/229#discussion_r877748770


##
tools/releasing/update_branch_version.sh:
##
@@ -64,7 +64,11 @@ perl -pi -e "s#^ENV OPERATOR_VERSION=.*#ENV OPERATOR_VERSION=${NEW_VERSION}#" Do
 # change Helm chart version info
 perl -pi -e "s#^version: .*#version: ${NEW_VERSION}#" helm/flink-kubernetes-operator/Chart.yaml
 perl -pi -e "s#^appVersion: .*#appVersion: ${NEW_VERSION}#" helm/flink-kubernetes-operator/Chart.yaml
-
+#change version of documentation
+cd docs
+perl -pi -e "s#^  Version = .*#  Version = \"${NEW_VERSION}\"#" config.toml
+perl -pi -e "s#^  VersionTitle = .*#  VersionTitle = \"${NEW_VERSION}\"#" config.toml
+cd ..

Review Comment:
   I don't think we should do this here. The docs are built from the release
branch (which otherwise has the snapshot version). I think we need to update
the config manually when cutting the release branch, like you did.






[GitHub] [flink-kubernetes-operator] gyfora commented on a diff in pull request #228: [FLINK-27686] Only patch status on change

2022-05-19 Thread GitBox


gyfora commented on code in PR #228:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/228#discussion_r877733469


##
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/utils/StatusHelper.java:
##
@@ -64,6 +67,14 @@ public > void patchAndCacheStatus(T resource
         // in the meantime
         resource.getMetadata().setResourceVersion(null);
 
+        ObjectNode newStatus = objectMapper.convertValue(resource.getStatus(), ObjectNode.class);
+        ObjectNode previousStatus = statusCache.put(getKey(resource), newStatus);
+
+        if (previousStatus == null || newStatus.equals(previousStatus)) {

Review Comment:
   Very good catch you are right @Aitozi 






[jira] [Comment Edited] (FLINK-17295) Refactor the ExecutionAttemptID to consist of ExecutionVertexID and attemptNumber

2022-05-19 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539914#comment-17539914
 ] 

Zhu Zhu edited comment on FLINK-17295 at 5/20/22 4:31 AM:
--

I was not concerned about it, because PartitionRequests are sent and processed
by TMs in a distributed way, unlike TDD shipping, which relies on the master
node.

To further eliminate concerns, I ran an experiment to compare the job execution
time with and without the proposed change.

The testing job is a word count job with no data, parallelism=1000, run 3 times
for each case. The measured execution time is from the start of deployment to
job finish (to exclude container allocation time and stabilize the result).

The average execution time with the change is 7.4s (7.6, 7.5, 7.0). The average
execution time without the change is 7.9s (8.7, 6.9, 8.0). Therefore, I think
the change is acceptable.


was (Author: zhuzh):
I was not concerning about it because PartitionRequests are sent and processed 
by TMs in a distributed way, unlike TDD shipping which relies on the master 
node.

To further eliminating concerns, I did an experiment to see the job execution 
time with/without the proposed change. The testing job is a word count job of 
no data, parallelism=1000, run 3 times for each case. The average execution 
time with the change is 7.4s (7.6,7.5,7.0). The average execution time without 
the change is 7.9s(8.7,6.9,8.0). Therefore, I think the change is acceptable.

> Refactor the ExecutionAttemptID to consist of ExecutionVertexID and 
> attemptNumber
> -
>
> Key: FLINK-17295
> URL: https://issues.apache.org/jira/browse/FLINK-17295
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Yangze Guo
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
>
> Make the ExecutionAttemptID being composed of (ExecutionVertexID, 
> attemptNumber).





[GitHub] [flink] flinkbot commented on pull request #19773: [FLINK-24735][sql-client] LocalExecutor should catch Throwable rathe…

2022-05-19 Thread GitBox


flinkbot commented on PR #19773:
URL: https://github.com/apache/flink/pull/19773#issuecomment-1132447231

   
   ## CI report:
   
   * 32a8b3baba8200dd3b3f5b91cf9c35288e7383fd UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   





[GitHub] [flink] zhuzhurk commented on a diff in pull request #19747: [FLINK-17295][runtime] Refactor the ExecutionAttemptID to consist of ExecutionGraphID, ExecutionVertexID and attemptNumber

2022-05-19 Thread GitBox


zhuzhurk commented on code in PR #19747:
URL: https://github.com/apache/flink/pull/19747#discussion_r877719574


##
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/DefaultExecutionGraph.java:
##
@@ -364,6 +373,8 @@ public DefaultExecutionGraph(
         this.resultPartitionsById = new HashMap<>();
 
         this.isDynamic = isDynamic;
+
+        LOG.info("Created execution graph {}.", executionGraphId);

Review Comment:
   done.






[jira] [Updated] (FLINK-24735) SQL client crashes with `Cannot add expression of different type to set`

2022-05-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-24735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-24735:
---
Labels: pull-request-available  (was: )

> SQL client crashes with `Cannot add expression of different type to set`
> 
>
> Key: FLINK-24735
> URL: https://issues.apache.org/jira/browse/FLINK-24735
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Client
>Affects Versions: 1.14.0
>Reporter: Martijn Visser
>Assignee: Shengkai Fang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.16.0, 1.14.5
>
>
> Reproductions steps:
> 1. Download airports.csv from https://www.kaggle.com/usdot/flight-delays
> 2. Start Flink SQL client and create table
> {code:sql}
> CREATE TABLE `airports` (
>   `IATA_CODE` CHAR(3),
>   `AIRPORT` STRING,
>   `CITY` STRING,
>   `STATE` CHAR(2),
>   `COUNTRY` CHAR(3),
>   `LATITUDE` DOUBLE NULL,
>   `LONGITUDE` DOUBLE NULL,
>   PRIMARY KEY (`IATA_CODE`) NOT ENFORCED
> ) WITH (
>   'connector' = 'filesystem',
>   'path' = 
> 'file:///flink-sql-cookbook/other-builtin-functions/04_override_table_options/airports.csv',
>   'format' = 'csv'
> );
> {code}
> 3. Run the following SQL statement:
> {code:sql}
> SELECT * FROM `airports` /*+ OPTIONS('csv.ignore-parse-errors'='true') */ 
> WHERE COALESCE(`IATA_CODE`, `AIRPORT`) IS NULL;
> {code}
> Stacktrace:
> {code:bash}
> Exception in thread "main" org.apache.flink.table.client.SqlClientException: 
> Unexpected exception. This is a bug. Please consider filing an issue.
>   at 
> org.apache.flink.table.client.SqlClient.startClient(SqlClient.java:201)
>   at org.apache.flink.table.client.SqlClient.main(SqlClient.java:161)
> Caused by: java.lang.AssertionError: Cannot add expression of different type 
> to set:
> set type is RecordType(CHAR(3) CHARACTER SET "UTF-16LE" NOT NULL IATA_CODE, 
> VARCHAR(2147483647) CHARACTER SET "UTF-16LE" AIRPORT, VARCHAR(2147483647) 
> CHARACTER SET "UTF-16LE" CITY, CHAR(2) CHARACTER SET "UTF-16LE" STATE, 
> CHAR(3) CHARACTER SET "UTF-16LE" COUNTRY, DOUBLE LATITUDE, DOUBLE LONGITUDE) 
> NOT NULL
> expression type is RecordType(CHAR(3) CHARACTER SET "UTF-16LE" IATA_CODE, 
> VARCHAR(2147483647) CHARACTER SET "UTF-16LE" AIRPORT, VARCHAR(2147483647) 
> CHARACTER SET "UTF-16LE" CITY, CHAR(2) CHARACTER SET "UTF-16LE" STATE, 
> CHAR(3) CHARACTER SET "UTF-16LE" COUNTRY, DOUBLE LATITUDE, DOUBLE LONGITUDE) 
> NOT NULL
> set is rel#426:LogicalProject.NONE.any.None: 
> 0.[NONE].[NONE](input=HepRelVertex#425,inputs=0..6)
> expression is LogicalProject(IATA_CODE=[null:CHAR(3) CHARACTER SET 
> "UTF-16LE"], AIRPORT=[$1], CITY=[$2], STATE=[$3], COUNTRY=[$4], 
> LATITUDE=[$5], LONGITUDE=[$6])
>   LogicalFilter(condition=[IS NULL(CAST($0):VARCHAR(2147483647) CHARACTER SET 
> "UTF-16LE")])
> LogicalTableScan(table=[[default_catalog, default_database, airports]], 
> hints=[[[OPTIONS inheritPath:[] options:{csv.ignore-parse-errors=true}]]])
>   at 
> org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:381)
>   at 
> org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:58)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:268)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:283)
>   at 
> org.apache.calcite.rel.rules.ReduceExpressionsRule$ProjectReduceExpressionsRule.onMatch(ReduceExpressionsRule.java:310)
>   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:333)
>   at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:542)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:407)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:243)
>   at 
> org.apache.calcite.plan.hep.HepInstruction$RuleInstance.execute(HepInstruction.java:127)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:202)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:189)
>   at 
> org.apache.flink.table.planner.plan.optimize.program.FlinkHepProgram.optimize(FlinkHepProgram.scala:69)
>   at 
> org.apache.flink.table.planner.plan.optimize.program.FlinkHepRuleSetProgram.optimize(FlinkHepRuleSetProgram.scala:87)
>   at 
> org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram.$anonfun$optimize$1(FlinkChainedProgram.scala:62)
>   at 
> scala.collection.TraversableOnce.$anonfun$foldLeft$1(TraversableOnce.scala:156)
>   at 
> scala.collection.TraversableOnce.$anonfun$foldLeft$1$adapted(TraversableOnce.scala:156)
>   at scala.collection.Iterator.foreach(Iterator.scala:937)
>   at scala.collection.

[GitHub] [flink] fsk119 opened a new pull request, #19773: [FLINK-24735][sql-client] LocalExecutor should catch Throwable rathe…

2022-05-19 Thread GitBox


fsk119 opened a new pull request, #19773:
URL: https://github.com/apache/flink/pull/19773

   …r than Exception
   
   
   
   ## What is the purpose of the change
   
   *Catch Throwable in the SQL Client to avoid the crash.*
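
   A minimal sketch of the catch-Throwable pattern the title describes (the
wrapper below is hypothetical, not the actual LocalExecutor code); the
motivating failure is an AssertionError, which `catch (Exception)` would not
intercept:

```java
import java.util.concurrent.Callable;

public class ClientCallWrapper {

    /**
     * Hypothetical wrapper: planner failures such as AssertionError are
     * Errors, not Exceptions, so catching only Exception would let them
     * escape and crash the whole SQL client process.
     */
    public static <T> T runSafely(Callable<T> operation, String errorMessage) {
        try {
            return operation.call();
        } catch (Throwable t) {
            // Convert to a runtime exception that the client can display.
            throw new RuntimeException(errorMessage, t);
        }
    }
}
```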
   
   ## Verifying this change
   
   Please make sure both new and modified tests in this PR follow the
conventions defined in our code quality guide:
https://flink.apache.org/contributing/code-style-and-quality-common.html#testing
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't 
know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (not applicable / docs / JavaDocs 
/ **not documented**)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] zhuzhurk commented on a diff in pull request #19747: [FLINK-17295][runtime] Refactor the ExecutionAttemptID to consist of ExecutionGraphID, ExecutionVertexID and attemptNumber

2022-05-19 Thread GitBox


zhuzhurk commented on code in PR #19747:
URL: https://github.com/apache/flink/pull/19747#discussion_r877718674


##
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/DefaultExecutionGraph.java:
##
@@ -364,6 +373,8 @@ public DefaultExecutionGraph(
 this.resultPartitionsById = new HashMap<>();
 
 this.isDynamic = isDynamic;
+
+LOG.info("Created execution graph {}.", executionGraphId);

Review Comment:
   Agreed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] zhuzhurk commented on a diff in pull request #19747: [FLINK-17295][runtime] Refactor the ExecutionAttemptID to consist of ExecutionGraphID, ExecutionVertexID and attemptNumber

2022-05-19 Thread GitBox


zhuzhurk commented on code in PR #19747:
URL: https://github.com/apache/flink/pull/19747#discussion_r877718479


##
flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/ResultPartitionID.java:
##
@@ -44,7 +47,12 @@ public final class ResultPartitionID implements Serializable 
{
 
 @VisibleForTesting
 public ResultPartitionID() {
-this(new IntermediateResultPartitionID(), new ExecutionAttemptID());
+this(
+new IntermediateResultPartitionID(),
+new ExecutionAttemptID(
+new ExecutionGraphID(),
+new ExecutionVertexID(new JobVertexID(0, 0), 0),
+0));

Review Comment:
   Ok. I've added an `ExecutionAttemptID#randomId()`.
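   
   A minimal sketch of what such a factory could look like, based on the composite constructor quoted above (the actual method added in the PR may differ):
   
   // Sketch only, derived from the constructor shown above; the real
   // randomId() may differ. Handy for tests that need an arbitrary id.
   public static ExecutionAttemptID randomId() {
       return new ExecutionAttemptID(
               new ExecutionGraphID(),                      // random graph id
               new ExecutionVertexID(new JobVertexID(), 0), // random vertex, subtask 0
               0);                                          // attempt number 0
   }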



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] zhuzhurk commented on a diff in pull request #19747: [FLINK-17295][runtime] Refactor the ExecutionAttemptID to consist of ExecutionGraphID, ExecutionVertexID and attemptNumber

2022-05-19 Thread GitBox


zhuzhurk commented on code in PR #19747:
URL: https://github.com/apache/flink/pull/19747#discussion_r877716691


##
flink-runtime/src/main/java/org/apache/flink/runtime/executiongraph/ExecutionAttemptID.java:
##
@@ -68,19 +117,32 @@ public boolean equals(Object obj) {
 return true;
 } else if (obj != null && obj.getClass() == getClass()) {
 ExecutionAttemptID that = (ExecutionAttemptID) obj;
-return that.executionAttemptId.equals(this.executionAttemptId);
+return that.executionGraphId.equals(this.executionGraphId)
+&& that.executionVertexId.equals(this.executionVertexId)
+&& that.attemptNumber == this.attemptNumber;
 } else {
 return false;
 }
 }
 
 @Override
 public int hashCode() {
-return executionAttemptId.hashCode();
+return Objects.hash(executionGraphId, executionVertexId, 
attemptNumber);
 }
 
 @Override
 public String toString() {
-return executionAttemptId.toString();
+return String.format(
+"%s_%s_%d", executionGraphId.toString(), executionVertexId, 
attemptNumber);
+}
+
+public String getLogString() {
+if (DefaultExecutionGraph.LOG.isDebugEnabled()) {
+return toString();
+} else {
+return String.format(
+"%s_%s_%d",
+executionGraphId.toString().substring(0, 4), 
executionVertexId, attemptNumber);

Review Comment:
   FLINK-27710 is created to refine the representation of Execution.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Assigned] (FLINK-27710) Improve logs to better display Execution

2022-05-19 Thread Zhu Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-27710:
---

Assignee: Zhu Zhu

> Improve logs to better display Execution
> 
>
> Key: FLINK-27710
> URL: https://issues.apache.org/jira/browse/FLINK-27710
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination, Runtime / Task
>Affects Versions: 1.16.0
>Reporter: Zhu Zhu
>Assignee: Zhu Zhu
>Priority: Major
> Fix For: 1.16.0
>
>
> Currently, an execution is usually represented as "{{{}job vertex name{}}} 
> ({{{}subtaskIndex+1{}}}/{{{}vertex parallelism{}}}) ({{{}attemptId{}}})" in 
> logs. With the change of FLINK-17295, this representation becomes redundant, 
> e.g. the subtask index is displayed twice.
> Therefore, I'm proposing to change the format to "{{{}job vertex name{}}} 
> ({{{}short ExecutionGraphID{}}}:{{{}JobVertexID{}}}) 
> ({{{}subtaskIndex+1{}}}/{{{}vertex parallelism{}}}) ({{{}#attemptNumber{}}})" 
> and to avoid directly displaying the {{{}ExecutionAttemptID{}}}. This can 
> improve log readability.
> Besides that, the displayed {{JobVertexID}} can also help to distinguish job 
> vertices of the same name, which is common in DataStream jobs (e.g. multiple 
> {{{}Map{}}}).



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-27710) Improve logs to better display Execution

2022-05-19 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-27710:
---

 Summary: Improve logs to better display Execution
 Key: FLINK-27710
 URL: https://issues.apache.org/jira/browse/FLINK-27710
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination, Runtime / Task
Affects Versions: 1.16.0
Reporter: Zhu Zhu
 Fix For: 1.16.0


Currently, an execution is usually represented as "{{{}job vertex name{}}} 
({{{}subtaskIndex+1{}}}/{{{}vertex parallelism{}}}) ({{{}attemptId{}}})" in 
logs. With the change of FLINK-17295, this representation becomes redundant, 
e.g. the subtask index is displayed twice.

Therefore, I'm proposing to change the format to "{{{}job vertex name{}}} 
({{{}short ExecutionGraphID{}}}:{{{}JobVertexID{}}}) 
({{{}subtaskIndex+1{}}}/{{{}vertex parallelism{}}}) ({{{}#attemptNumber{}}})" 
and to avoid directly displaying the {{{}ExecutionAttemptID{}}}. This can 
improve log readability.

Besides that, the displayed {{JobVertexID}} can also help to distinguish job 
vertices of the same name, which is common in DataStream jobs (e.g. multiple 
{{{}Map{}}}).
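
For illustration, a sketch of how the proposed representation could be built (all names and the signature below are assumptions, not from a patch):

{code:java}
// Sketch only; names and signature are assumptions, not from a patch.
// Example output: "Map (a1b2:bc764cd8ddf7a0cff126f51c16239658) (3/100) (#1)"
static String displayName(
        String jobVertexName,
        String shortExecutionGraphId, // e.g. first 4 chars of the ExecutionGraphID
        String jobVertexId,
        int subtaskIndex,
        int vertexParallelism,
        int attemptNumber) {
    return String.format(
            "%s (%s:%s) (%d/%d) (#%d)",
            jobVertexName,
            shortExecutionGraphId,
            jobVertexId,
            subtaskIndex + 1,
            vertexParallelism,
            attemptNumber);
}
{code}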



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-17295) Refactor the ExecutionAttemptID to consist of ExecutionVertexID and attemptNumber

2022-05-19 Thread Zhu Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-17295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539914#comment-17539914
 ] 

Zhu Zhu commented on FLINK-17295:
-

I was not concerned about it because PartitionRequests are sent and processed 
by TMs in a distributed way, unlike TDD shipping, which relies on the master 
node.

To further eliminate concerns, I ran an experiment to compare the job execution 
time with and without the proposed change. The test job is a word count job with 
no data, parallelism=1000, run 3 times for each case. The average execution 
time with the change is 7.4s (7.6, 7.5, 7.0); without the change it is 
7.9s (8.7, 6.9, 8.0). Therefore, I think the change is acceptable.

> Refactor the ExecutionAttemptID to consist of ExecutionVertexID and 
> attemptNumber
> -
>
> Key: FLINK-17295
> URL: https://issues.apache.org/jira/browse/FLINK-17295
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / Coordination
>Reporter: Yangze Guo
>Assignee: Zhu Zhu
>Priority: Major
>  Labels: pull-request-available
>
> Make the ExecutionAttemptID being composed of (ExecutionVertexID, 
> attemptNumber).



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink-table-store] tsreaper commented on a diff in pull request #121: [FLINK-27626] Introduce AggregationMergeFunction

2022-05-19 Thread GitBox


tsreaper commented on code in PR #121:
URL: https://github.com/apache/flink-table-store/pull/121#discussion_r877715408


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/FileStoreImpl.java:
##
@@ -201,17 +204,28 @@ public static FileStoreImpl createWithPrimaryKey(
 .collect(Collectors.toList()));
 
 MergeFunction mergeFunction;
+Map rightConfMap =
+options.getFilterConf(e -> 
e.getKey().endsWith(".aggregate-function"));

Review Comment:
   Not sure what you mean. Could you provide an example? What is an `extended 
aggregate function`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] tsreaper commented on a diff in pull request #121: [FLINK-27626] Introduce AggregationMergeFunction

2022-05-19 Thread GitBox


tsreaper commented on code in PR #121:
URL: https://github.com/apache/flink-table-store/pull/121#discussion_r877715115


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/mergetree/compact/AggregateFunction.java:
##
@@ -0,0 +1,159 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.mergetree.compact;
+
+import java.io.Serializable;
+
+/**
+ * Custom abstraction for column aggregation.
+ *
+ * @param <T>
+ */
+public interface AggregateFunction<T> extends Serializable {
+// T aggregator;
+
+T getResult();
+
+default void init() {
+reset();
+}
+
+void reset();
+
+default void aggregate(Object value) {
+aggregate(value, true);
+}
+
+void aggregate(Object value, boolean add);
+
+void reset(Object value);

Review Comment:
   Oops, I meant to comment under the `init` method.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] wangyang0918 commented on pull request #229: [hotfix] Change version of documentation in update_branch_version.sh

2022-05-19 Thread GitBox


wangyang0918 commented on PR #229:
URL: 
https://github.com/apache/flink-kubernetes-operator/pull/229#issuecomment-1132438195

   cc @gyfora 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-kubernetes-operator] wangyang0918 opened a new pull request, #229: [hotfix] Change version of documentation in update_branch_version.sh

2022-05-19 Thread GitBox


wangyang0918 opened a new pull request, #229:
URL: https://github.com/apache/flink-kubernetes-operator/pull/229

   We also need to change the version of the documentation in 
`update_branch_version.sh`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] luoyuxia commented on a diff in pull request #19182: [FLINK-26771][Hive] Hive dialect supports to compare boolean type with numeric/string type

2022-05-19 Thread GitBox


luoyuxia commented on code in PR #19182:
URL: https://github.com/apache/flink/pull/19182#discussion_r877711250


##
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/copy/HiveParserSqlFunctionConverter.java:
##
@@ -415,24 +425,73 @@ public static SqlOperator getCalciteFn(
 OperandTypes.PLUS_OPERATOR);
 break;
 default:
-calciteOp = HIVE_TO_CALCITE.get(hiveUdfName);
-if (null == calciteOp) {
-calciteOp =
-new CalciteSqlFn(
-uInf.udfName,
-uInf.identifier,
-SqlKind.OTHER_FUNCTION,
-uInf.returnTypeInference,
-uInf.operandTypeInference,
-uInf.operandTypeChecker,
-SqlFunctionCategory.USER_DEFINED_FUNCTION,
-deterministic);
+// some functions should be handled as Hive UDF for
+// the Hive specific logic
+if (shouldHandleAsHiveUDF(uInf) && canHandledAsHiveUDF(uInf, 
functionConverter)) {
+calciteOp = asCalciteSqlFn(uInf, deterministic);
+} else {
+calciteOp = HIVE_TO_CALCITE.get(hiveUdfName);

Review Comment:
   But about the comments, I think it's fine not to change them, since some 
comments in this file start with a capital letter and some do not.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-27709) Add comment to schema

2022-05-19 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-27709:
-
Description: We can add a comment to the schema for a table. See 
`CatalogBaseTable.getComment`.  (was: We can add comment to schema for table.)

> Add comment to schema
> -
>
> Key: FLINK-27709
> URL: https://issues.apache.org/jira/browse/FLINK-27709
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.2.0
>
>
> We can add a comment to the schema for a table. See `CatalogBaseTable.getComment`.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink-table-store] JingsongLi opened a new pull request, #130: [FLINK-27709] Add comment to schema

2022-05-19 Thread GitBox


JingsongLi opened a new pull request, #130:
URL: https://github.com/apache/flink-table-store/pull/130

   We can add a comment to the schema for a table.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-27709) Add comment to schema

2022-05-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-27709:
---
Labels: pull-request-available  (was: )

> Add comment to schema
> -
>
> Key: FLINK-27709
> URL: https://issues.apache.org/jira/browse/FLINK-27709
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Jingsong Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.2.0
>
>
> We can add a comment to the schema for a table.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-27709) Add comment to schema

2022-05-19 Thread Jingsong Lee (Jira)
Jingsong Lee created FLINK-27709:


 Summary: Add comment to schema
 Key: FLINK-27709
 URL: https://issues.apache.org/jira/browse/FLINK-27709
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Reporter: Jingsong Lee
Assignee: Jingsong Lee
 Fix For: table-store-0.2.0


We can add a comment to the schema for a table.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink-table-store] openinx commented on pull request #99: [FLINK-27307] Flink table store support append-only ingestion without primary keys.

2022-05-19 Thread GitBox


openinx commented on PR #99:
URL: https://github.com/apache/flink-table-store/pull/99#issuecomment-1132426379

   Closing this PR now because we've merged most of the changes. Thanks all for 
reviewing and checking.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] openinx closed pull request #99: [FLINK-27307] Flink table store support append-only ingestion without primary keys.

2022-05-19 Thread GitBox


openinx closed pull request #99: [FLINK-27307] Flink table store support 
append-only ingestion without primary keys. 
URL: https://github.com/apache/flink-table-store/pull/99


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] deadwind4 closed pull request #18454: [hotfix][connector/pulsar] Fix typo in JavaDocs example.

2022-05-19 Thread GitBox


deadwind4 closed pull request #18454: [hotfix][connector/pulsar] Fix typo in 
JavaDocs example.
URL: https://github.com/apache/flink/pull/18454


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] luoyuxia commented on a diff in pull request #19562: [FLINK-27304][hive] Calcite's varbinary type should be converted to Hive's binary type.

2022-05-19 Thread GitBox


luoyuxia commented on code in PR #19562:
URL: https://github.com/apache/flink/pull/19562#discussion_r877701927


##
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/planner/delegation/hive/copy/HiveParserTypeConverter.java:
##
@@ -289,6 +289,7 @@ private static TypeInfo convertPrimitiveType(RelDataType 
rType) {
 case INTERVAL_SECOND:
 return hiveShim.getIntervalDayTimeTypeInfo();
 case BINARY:
+case VARBINARY:
 return TypeInfoFactory.binaryTypeInfo;

Review Comment:
   Thanks for your reminder. 
   Yes, we should also handle all of them. And we also have a jira for 
TIME_WITH_LOCAL_TIME_ZONE, 
[FLINK-23224](https://issues.apache.org/jira/browse/FLINK-23224).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-27708) Add background compaction task for append-only table when ingesting.

2022-05-19 Thread Zheng Hu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Hu updated FLINK-27708:
-
Fix Version/s: table-store-0.2.0

> Add background compaction task for append-only table when ingesting.
> 
>
> Key: FLINK-27708
> URL: https://issues.apache.org/jira/browse/FLINK-27708
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Zheng Hu
>Priority: Major
> Fix For: table-store-0.2.0
>
>
> We could still execute a compaction task to merge small files in the background 
> for an append-only table, although it won't reduce any delete markers 
> compared to the merge tree table.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-27708) Add background compaction task for append-only table when ingesting.

2022-05-19 Thread Zheng Hu (Jira)
Zheng Hu created FLINK-27708:


 Summary: Add background compaction task for append-only table when 
ingesting.
 Key: FLINK-27708
 URL: https://issues.apache.org/jira/browse/FLINK-27708
 Project: Flink
  Issue Type: Sub-task
Reporter: Zheng Hu


We could still execute a compaction task to merge small files in the background 
for an append-only table, although it won't reduce any delete markers 
compared to the merge tree table.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink-table-store] ajian2002 commented on a diff in pull request #121: [FLINK-27626] Introduce AggregationMergeFunction

2022-05-19 Thread GitBox


ajian2002 commented on code in PR #121:
URL: https://github.com/apache/flink-table-store/pull/121#discussion_r877701752


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/mergetree/compact/AggregateFunction.java:
##
@@ -0,0 +1,159 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.mergetree.compact;
+
+import java.io.Serializable;
+
+/**
+ * Custom abstraction for column aggregation.
+ *
+ * @param <T>
+ */
+public interface AggregateFunction<T> extends Serializable {
+// T aggregator;
+
+T getResult();
+
+default void init() {
+reset();
+}
+
+void reset();
+
+default void aggregate(Object value) {
+aggregate(value, true);
+}
+
+void aggregate(Object value, boolean add);
+
+void reset(Object value);

Review Comment:
   We should consider keeping some common methods to prepare for future 
features.
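   
   For illustration, a hypothetical implementation of the interface quoted above could look like this (a sketch only; the class name and retract semantics are assumptions, not part of the PR):
   
   // Hypothetical sum aggregator for the AggregateFunction<T> interface above;
   // not part of the PR. 'add == false' is treated as a retraction.
   public class DoubleSumAggregator implements AggregateFunction<Double> {
   
       private double sum;
   
       @Override
       public Double getResult() {
           return sum;
       }
   
       @Override
       public void reset() {
           sum = 0.0;
       }
   
       @Override
       public void aggregate(Object value, boolean add) {
           if (value == null) {
               return;
           }
           double d = ((Number) value).doubleValue();
           sum += add ? d : -d; // retract by subtracting
       }
   
       @Override
       public void reset(Object value) {
           reset();
           aggregate(value);
       }
   }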



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] ajian2002 commented on a diff in pull request #121: [FLINK-27626] Introduce AggregationMergeFunction

2022-05-19 Thread GitBox


ajian2002 commented on code in PR #121:
URL: https://github.com/apache/flink-table-store/pull/121#discussion_r877701051


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/FileStoreImpl.java:
##
@@ -201,17 +204,28 @@ public static FileStoreImpl createWithPrimaryKey(
 .collect(Collectors.toList()));
 
 MergeFunction mergeFunction;
+Map rightConfMap =
+options.getFilterConf(e -> 
e.getKey().endsWith(".aggregate-function"));

Review Comment:
   Should I also consider extended aggregate functions that may 
need to use methods to obtain the same ConfMap?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] lsyldliu commented on pull request #19410: [FLINK-15635][table] Now EnvironmentSettings accepts the user ClassLoader

2022-05-19 Thread GitBox


lsyldliu commented on PR #19410:
URL: https://github.com/apache/flink/pull/19410#issuecomment-1132420587

   > 
   
   ping @matriv Can you help to complete this work as soon as possible? 
[FLIP-214](https://cwiki.apache.org/confluence/display/FLINK/FLIP-214+Support+Advanced+Function+DDL)
 is blocked by this issue now, thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Closed] (FLINK-27638) failed to join with table function

2022-05-19 Thread Shengkai Fang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shengkai Fang closed FLINK-27638.
-
Resolution: Not A Problem

> failed to join with table function
> --
>
> Key: FLINK-27638
> URL: https://issues.apache.org/jira/browse/FLINK-27638
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.14.3
>Reporter: Spongebob
>Priority: Major
> Attachments: image-2022-05-16-19-40-23-179.png
>
>
> # register two table functions named `FUNC_A` and `FUNC_B`
>  # left join with FUNC_A
>  # inner join with FUNC_B
>  # schedule the DML
> after these steps, I found that the task of FUNC_A kept running but the task of 
> FUNC_B finished in several seconds. And I am not sure whether the 
> abnormal task led to the empty output of the DML.
> !image-2022-05-16-19-40-23-179.png!



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-27638) failed to join with table function

2022-05-19 Thread Shengkai Fang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539901#comment-17539901
 ] 

Shengkai Fang commented on FLINK-27638:
---

Is the source in the graph a Values source? If so, the source output all of its 
data (13 B sent) and the task finished. I think it's not a bug.
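
For illustration, a minimal sketch of this behavior (an assumed job, not the reporter's code):

{code:java}
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Minimal sketch: a bounded Values-style source finishes once all of its
// elements are emitted, so the task switching to FINISHED is expected.
public class ValuesSourceDemo {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromElements(1, 2, 3) // emits three records, then finishes
                .print();
        env.execute("values-source-demo");
    }
}
{code}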

> failed to join with table function
> --
>
> Key: FLINK-27638
> URL: https://issues.apache.org/jira/browse/FLINK-27638
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.14.3
>Reporter: Spongebob
>Priority: Major
> Attachments: image-2022-05-16-19-40-23-179.png
>
>
> # register two table functions named `FUNC_A` and `FUNC_B`
>  # left join with FUNC_A
>  # inner join with FUNC_B
>  # schedule the DML
> after these steps, I found that the task of FUNC_A kept running but the task of 
> FUNC_B finished in several seconds. And I am not sure whether the 
> abnormal task led to the empty output of the DML.
> !image-2022-05-16-19-40-23-179.png!



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (FLINK-27706) Refactor all subclasses of FileStoreTableITCase to use the batchSql.

2022-05-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-27706:
---
Labels: pull-request-available  (was: )

> Refactor all subclasses of FileStoreTableITCase to use the batchSql.
> 
>
> Key: FLINK-27706
> URL: https://issues.apache.org/jira/browse/FLINK-27706
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Zheng Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.2.0
>
>
> Since we've introduced a batchSql helper to execute batch queries for Flink in 
> FileStoreTableITCase, all the subclasses can just use it to 
> submit their Flink SQL.
> It's a minor issue.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink-table-store] openinx opened a new pull request, #129: [FLINK-27706] Refactor all subclasses of FileStoreTableITCase to use the batchSql

2022-05-19 Thread GitBox


openinx opened a new pull request, #129:
URL: https://github.com/apache/flink-table-store/pull/129

   A minor refactor for the unit tests to reuse the `batchSql`.
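   
   For context, a minimal sketch of what such a `batchSql` helper might look like (an assumption; the actual helper in `FileStoreTableITCase` may differ):
   
   import java.util.List;
   
   import org.apache.flink.table.api.EnvironmentSettings;
   import org.apache.flink.table.api.TableEnvironment;
   import org.apache.flink.types.Row;
   import org.apache.flink.util.CloseableIterator;
   import org.apache.flink.util.CollectionUtil;
   
   // Sketch of a batchSql test helper; the real method may differ.
   public class BatchSqlHelper {
       public static List<Row> batchSql(String query, Object... args) {
           TableEnvironment tEnv =
                   TableEnvironment.create(EnvironmentSettings.inBatchMode());
           try (CloseableIterator<Row> iter =
                   tEnv.executeSql(String.format(query, args)).collect()) {
               return CollectionUtil.iteratorToList(iter);
           } catch (Exception e) {
               throw new RuntimeException("Failed to collect the query result", e);
           }
       }
   }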


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Updated] (FLINK-27707) Implement TableStoreFactory#onCompactTable

2022-05-19 Thread Jane Chan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan updated FLINK-27707:
--
Description: Perform the latest scan and pick data files to compact.  (was: 
Perform scan and pick data files to compact)

> Implement TableStoreFactory#onCompactTable
> --
>
> Key: FLINK-27707
> URL: https://issues.apache.org/jira/browse/FLINK-27707
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Affects Versions: table-store-0.2.0
>Reporter: Jane Chan
>Priority: Major
> Fix For: table-store-0.2.0
>
>
> Perform the latest scan and pick data files to compact.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-27707) Implement TableStoreFactory#onCompactTable

2022-05-19 Thread Jane Chan (Jira)
Jane Chan created FLINK-27707:
-

 Summary: Implement TableStoreFactory#onCompactTable
 Key: FLINK-27707
 URL: https://issues.apache.org/jira/browse/FLINK-27707
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Affects Versions: table-store-0.2.0
Reporter: Jane Chan
 Fix For: table-store-0.2.0


Perform scan and pick data files to compact



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-27639) Flink JOIN uses the now() function when inserting data, resulting in data that cannot be deleted

2022-05-19 Thread Shengkai Fang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539899#comment-17539899
 ] 

Shengkai Fang commented on FLINK-27639:
---

Hi. Could you share the test data with us so that we can also test in our 
local environment?

> Flink JOIN uses the now() function when inserting data, resulting in data 
> that cannot be deleted
> 
>
> Key: FLINK-27639
> URL: https://issues.apache.org/jira/browse/FLINK-27639
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / API
>Affects Versions: 1.14.4
>Reporter: lvycc
>Priority: Major
>
> I use the now() function as a field value when inserting data with SQL, but 
> I can't delete the inserted data. Here is my SQL:
> {code:java}
> // code placeholder
> CREATE TABLE t_order (
>     order_id INT,
>     order_name STRING,
>     product_id INT,
>     user_id INT,
>     PRIMARY KEY(order_id) NOT ENFORCED
> ) WITH (
>     'connector' = 'mysql-cdc',
>     'hostname' = 'localhost',
>     'port' = '3306',
>     'username' = 'root',
>     'password' = 'ycc123',
>     'database-name' = 'wby_test',
>     'table-name' = 't_order'
> );
> CREATE TABLE t_logistics (
>     logistics_id INT,
>     logistics_target STRING,
>     logistics_source STRING,
>     logistics_time TIMESTAMP(0),
>     order_id INT,
>     PRIMARY KEY(logistics_id) NOT ENFORCED
> ) WITH (
>     'connector' = 'mysql-cdc',
>     'hostname' = 'localhost',
>     'port' = '3306',
>     'username' = 'root',
>     'password' = 'ycc123',
>     'database-name' = 'wby_test',
>     'table-name' = 't_logistics'
> );
> CREATE TABLE t_join_sink (
>     order_id INT,
>     order_name STRING,
>     logistics_id INT,
>     logistics_target STRING,
>     logistics_source STRING,
>     logistics_time timestamp,
>     PRIMARY KEY(order_id) NOT ENFORCED
> ) WITH (
>     'connector' = 'jdbc',
>     'url' = 
> 'jdbc:mysql://localhost:3306/wby_test?characterEncoding=utf8&useUnicode=true&useSSL=false&serverTimezone=Asia/Shanghai',
>     'table-name' = 't_join_sink',
>     'username' = 'root',
>     'password' = 'ycc123'
> );
> INSERT INTO t_join_sink
> SELECT ord.order_id,
> ord.order_name,
> logistics.logistics_id,
> logistics.logistics_target,
> logistics.logistics_source,
> now()
> FROM t_order AS ord
> LEFT JOIN t_logistics AS logistics ON ord.order_id=logistics.order_id; {code}
> Debugging shows that SinkUpsertMaterializer causes the problem: the result of 
> the now() function changes when I delete the data, therefore the delete 
> operation is ignored.
> But what can I do to avoid this problem?



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (FLINK-27557) Let FileStoreWrite accept pre-planned manifest entries

2022-05-19 Thread Jane Chan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539352#comment-17539352
 ] 

Jane Chan edited comment on FLINK-27557 at 5/20/22 3:18 AM:


{quote}Can we just use overwrite?
 * overwrite specific manifest entries.
 * overwrite whole table or some whole partition. (eg: when rescale in 
compaction){quote}
Overwrite means the writer cannot accept the specified manifest entries as 
restored files, so how to pass them to FileStoreCommit as "compact before" (mark 
as delete) is a new question. Beyond that, overwrite corresponds to the commit 
kind "OVERWRITE", while it would be more suitable to use "COMPACT" in this 
situation.

Rethinking it, maybe we don't need to generate the new files for the 
non-rescale compaction.
 * The simplest way is to build the levels from the restored files, not sink 
records to the MemTable, and submit the compaction when precommit is invoked.
 * In the future, maybe we can introduce a new dedicated compaction source to 
restore the LSM, perform the compaction and commit, which avoids the shuffle cost.

 

 

 


was (Author: qingyue):
{quote}Can we just use overwrite?
 * overwrite specific manifest entries.
 * overwrite whole table or some whole partition. (eg: when rescale in 
compaction){quote}
Overwrite means the writer cannot accept the specified manifest entries as 
restored files, so how to pass them to FileStoreCommit as compact before(mark 
as delete) is a new question. Beyond that, overwrite corresponds to the commit 
kind "OVERWRITE", but it should be more suitable to use "COMPACT" in this 
situation.

I rethink it and maybe we don't need to generate the new files for the normal 
compaction.
 * The simplest way is to build the levels from the restored files, and don't 
sink record to MemTable, and submit compaction when precommit is invoked.
 * In the future, maybe we can introduce a new dedicated compaction source to 
restore LSM, perform compaction and commit, which avoids the shuffle cost.

 

 

 

> Let FileStoreWrite accept pre-planned manifest entries
> --
>
> Key: FLINK-27557
> URL: https://issues.apache.org/jira/browse/FLINK-27557
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Affects Versions: table-store-0.2.0
>Reporter: Jane Chan
>Priority: Major
> Fix For: table-store-0.2.0
>
> Attachments: normal-compaction.png, optimized-normal-compaction.png, 
> rescale-bucket-compaction.png
>
>
> Currently, FileStoreWrite will scan and plan files when creating a non-empty 
> writer. We should also create a non-empty writer for non-rescale compaction 
> cases but use the pre-planned manifest entries instead.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (FLINK-27557) Let FileStoreWrite accept pre-planned manifest entries

2022-05-19 Thread Jane Chan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan updated FLINK-27557:
--
Description: Currently, FileStoreWrite will scan and plan files when 
creating a non-empty writer. We should also create a non-empty writer for 
non-rescale compaction cases but use the pre-planned manifest entries instead.  
(was: Currently, FileStoreWrite will scan and plan files when creating a 
non-empty writer. We should also create a non-empty writer for normal 
compaction cases but use the pre-planned manifest entries instead.)

> Let FileStoreWrite accept pre-planned manifest entries
> --
>
> Key: FLINK-27557
> URL: https://issues.apache.org/jira/browse/FLINK-27557
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Affects Versions: table-store-0.2.0
>Reporter: Jane Chan
>Priority: Major
> Fix For: table-store-0.2.0
>
> Attachments: normal-compaction.png, optimized-normal-compaction.png, 
> rescale-bucket-compaction.png
>
>
> Currently, FileStoreWrite will scan and plan files when creating a non-empty 
> writer. We should also create a non-empty writer for non-rescale compaction 
> cases but use the pre-planned manifest entries instead.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (FLINK-27706) Refactor all subclasses of FileStoreTableITCase to use the batchSql.

2022-05-19 Thread Zheng Hu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Hu updated FLINK-27706:
-
Fix Version/s: table-store-0.2.0

> Refactor all subclasses of FileStoreTableITCase to use the batchSql.
> 
>
> Key: FLINK-27706
> URL: https://issues.apache.org/jira/browse/FLINK-27706
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Zheng Hu
>Priority: Major
> Fix For: table-store-0.2.0
>
>
> Since we've introduced a batchSql helper to execute batch queries for Flink in 
> FileStoreTableITCase, all the subclasses can just use it to 
> submit their Flink SQL.
> It's a minor issue.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (FLINK-27706) Refactor all subclasses of FileStoreTableITCase to use the batchSql.

2022-05-19 Thread Zheng Hu (Jira)
Zheng Hu created FLINK-27706:


 Summary: Refactor all subclasses of FileStoreTableITCase to use 
the batchSql.
 Key: FLINK-27706
 URL: https://issues.apache.org/jira/browse/FLINK-27706
 Project: Flink
  Issue Type: Sub-task
Reporter: Zheng Hu


Since we've introduced a batchSql helper to execute batch queries for Flink in 
FileStoreTableITCase, all the subclasses can just use it to submit 
their Flink SQL.

It's a minor issue.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (FLINK-27557) Let FileStoreWrite accept pre-planned manifest entries

2022-05-19 Thread Jane Chan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan updated FLINK-27557:
--
Description: Currently, FileStoreWrite will scan and plan files when 
creating a non-empty writer. We should also create a non-empty writer for 
normal compaction cases but use the pre-planned manifest entries instead.  
(was: Currently, FileStoreWrite will scan and plan files when creating a 
non-empty writer. For normal compaction cases, we should also create a 
non-empty writer but use the pre-planned manifest entries instead.)

> Let FileStoreWrite accept pre-planned manifest entries
> --
>
> Key: FLINK-27557
> URL: https://issues.apache.org/jira/browse/FLINK-27557
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Affects Versions: table-store-0.2.0
>Reporter: Jane Chan
>Priority: Major
> Fix For: table-store-0.2.0
>
> Attachments: normal-compaction.png, optimized-normal-compaction.png, 
> rescale-bucket-compaction.png
>
>
> Currently, FileStoreWrite will scan and plan files when creating a non-empty 
> writer. We should also create a non-empty writer for normal compaction cases 
> but use the pre-planned manifest entries instead.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (FLINK-27557) Let FileStoreWrite accept pre-planned manifest entries

2022-05-19 Thread Jane Chan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan updated FLINK-27557:
--
Description: Currently, FileStoreWrite will scan and plan files when 
creating a non-empty writer. For normal compaction cases, we should also create 
a non-empty writer but use the pre-planned manifest entries instead.  (was: 
Currently, FileStoreWrite will scan and plan files when creating a non-empty 
writer. For normal compaction case, we should also create a non-empty writer, 
but use the pre-planned manifest entries instead.)

> Let FileStoreWrite accept pre-planned manifest entries
> --
>
> Key: FLINK-27557
> URL: https://issues.apache.org/jira/browse/FLINK-27557
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Affects Versions: table-store-0.2.0
>Reporter: Jane Chan
>Priority: Major
> Fix For: table-store-0.2.0
>
> Attachments: normal-compaction.png, optimized-normal-compaction.png, 
> rescale-bucket-compaction.png
>
>
> Currently, FileStoreWrite will scan and plan files when creating a non-empty 
> writer. For normal compaction cases, we should also create a non-empty writer 
> but use the pre-planned manifest entries instead.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (FLINK-27557) Let FileStoreWrite accept pre-planned manifest entries

2022-05-19 Thread Jane Chan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan updated FLINK-27557:
--
Description: Currently, FileStoreWrite will scan and plan files when 
creating a non-empty writer. For normal compaction case, we should also create 
a non-empty writer, but use the pre-planned manifest entries instead.  (was: 
Currently, FileStoreWrite only creates an empty writer for the \{{INSERT 
OVERWRITE}} clause. We should also create the empty writer for manual 
compaction.)

> Let FileStoreWrite accept pre-planned manifest entries
> --
>
> Key: FLINK-27557
> URL: https://issues.apache.org/jira/browse/FLINK-27557
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Affects Versions: table-store-0.2.0
>Reporter: Jane Chan
>Priority: Major
> Fix For: table-store-0.2.0
>
> Attachments: normal-compaction.png, optimized-normal-compaction.png, 
> rescale-bucket-compaction.png
>
>
> Currently, FileStoreWrite will scan and plan files when creating a non-empty 
> writer. For normal compaction case, we should also create a non-empty writer, 
> but use the pre-planned manifest entries instead.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (FLINK-27679) Support append-only table for log store.

2022-05-19 Thread Zheng Hu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Hu updated FLINK-27679:
-
Fix Version/s: table-store-0.2.0

> Support append-only table for log store.
> 
>
> Key: FLINK-27679
> URL: https://issues.apache.org/jira/browse/FLINK-27679
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Zheng Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.2.0
>
>
> Will publish a separate PR to support the append-only table for the log store.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-27683) Insert into (column1, column2) Values(.....) can't work with sql Hints

2022-05-19 Thread Shengkai Fang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539894#comment-17539894
 ] 

Shengkai Fang commented on FLINK-27683:
---

Yes. I think we can fix this. Are you interested in this issue?

> Insert into (column1, column2) Values(.) can't work with sql Hints
> --
>
> Key: FLINK-27683
> URL: https://issues.apache.org/jira/browse/FLINK-27683
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.14.0, 1.16.0, 1.15.1
>Reporter: Xin Yang
>Priority: Major
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> When I try to use the statement `Insert into (column1, column2) Values(.)` 
> with SQL hints, it throws an exception, which is certainly a bug.
>  
>  * Sql 1
> {code:java}
> INSERT INTO `tidb`.`%s`.`%s` /*+ OPTIONS('tidb.sink.update-columns'='c2, 
> c13')*/   (c2, c13) values(1, 12.12) {code}
>  * 
>  ** result 1
> !screenshot-1.png!
>  * Sql 2
> {code:java}
> INSERT INTO `tidb`.`%s`.`%s` (c2, c13) /*+ 
> OPTIONS('tidb.sink.update-columns'='c2, c13')*/  values(1, 12.12)
> {code}
>  * 
>  ** result 2
> !screenshot-2.png!
>  * Sql 3
> {code:java}
> INSERT INTO `tidb`.`%s`.`%s` (c2, c13)  values(1, 12.12)
> {code}
>  * 
>  ** result3 : success



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-27654) Older jackson-databind found in /flink-kubernetes-shaded-1.0-SNAPSHOT.jar

2022-05-19 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539893#comment-17539893
 ] 

Yang Wang commented on FLINK-27654:
---

I am marking this ticket as a blocker now since we should fix the known 
vulnerability before releasing.

> Older jackson-databind found in /flink-kubernetes-shaded-1.0-SNAPSHOT.jar
> -
>
> Key: FLINK-27654
> URL: https://issues.apache.org/jira/browse/FLINK-27654
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-0.1.0
>Reporter: James Busche
>Priority: Major
> Fix For: kubernetes-operator-1.0.0
>
>
> A twistlock security scan of the latest kubernetes flink operator is showing 
> an older version of jackson-databind in the 
> /flink-kubernetes-shaded-1.0-SNAPSHOT.jar file.  I don't know how to 
> control/update the contents of this snapshot file.  
> I see this in the report (Otherwise, everything else looks good!):
> ==
> severity: High
> cvss: 7.5 
> riskFactors: Attack complexity: low,Attack vector: network,DoS,Has fix,High 
> severity
> cve: CVE-2020-36518
> Link: [https://nvd.nist.gov/vuln/detail/CVE-2020-36518]
> packageName: com.fasterxml.jackson.core_jackson-databind
> packagePath: /flink-kubernetes-operator-1.0-SNAPSHOT-shaded.jar
> description: jackson-databind before 2.13.0 allows a Java StackOverflow 
> exception and denial of service via a large depth of nested objects.
> =
> I'd be glad to try to fix it; I'm just not sure how the jackson-databind 
> versions are controlled in this 
> /flink-kubernetes-operator-1.0-SNAPSHOT-shaded.jar 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (FLINK-27654) Older jackson-databind found in /flink-kubernetes-shaded-1.0-SNAPSHOT.jar

2022-05-19 Thread Yang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Wang updated FLINK-27654:
--
Priority: Blocker  (was: Major)

> Older jackson-databind found in /flink-kubernetes-shaded-1.0-SNAPSHOT.jar
> -
>
> Key: FLINK-27654
> URL: https://issues.apache.org/jira/browse/FLINK-27654
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-0.1.0
>Reporter: James Busche
>Priority: Blocker
> Fix For: kubernetes-operator-1.0.0
>
>
> A twistlock security scan of the latest kubernetes flink operator is showing 
> an older version of jackson-databind in the 
> /flink-kubernetes-shaded-1.0-SNAPSHOT.jar file.  I don't know how to 
> control/update the contents of this snapshot file.  
> I see this in the report (Otherwise, everything else looks good!):
> ==
> severity: High
> cvss: 7.5 
> riskFactors: Attack complexity: low,Attack vector: network,DoS,Has fix,High 
> severity
> cve: CVE-2020-36518
> Link: [https://nvd.nist.gov/vuln/detail/CVE-2020-36518]
> packageName: com.fasterxml.jackson.core_jackson-databind
> packagePath: /flink-kubernetes-operator-1.0-SNAPSHOT-shaded.jar
> description: jackson-databind before 2.13.0 allows a Java StackOverflow 
> exception and denial of service via a large depth of nested objects.
> =
> I'd be glad to try to fix it; I'm just not sure how the jackson-databind 
> versions are controlled in this 
> /flink-kubernetes-operator-1.0-SNAPSHOT-shaded.jar 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-27654) Older jackson-databind found in /flink-kubernetes-shaded-1.0-SNAPSHOT.jar

2022-05-19 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539892#comment-17539892
 ] 

Yang Wang commented on FLINK-27654:
---

Of course, we could also bump the kubernetes-client in 
flink-kubernetes-operator/pom.xml from 5.12.1 to 5.12.2[1]. But it still does 
not fix the vulnerability and we still need the above dependencyManagement 
solution.

[1]. https://mvnrepository.com/artifact/io.fabric8/kubernetes-client/5.12.2

> Older jackson-databind found in /flink-kubernetes-shaded-1.0-SNAPSHOT.jar
> -
>
> Key: FLINK-27654
> URL: https://issues.apache.org/jira/browse/FLINK-27654
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-0.1.0
>Reporter: James Busche
>Priority: Major
> Fix For: kubernetes-operator-1.0.0
>
>
> A twistlock security scan of the latest kubernetes flink operator is showing 
> an older version of jackson-databind in the 
> /flink-kubernetes-shaded-1.0-SNAPSHOT.jar file.  I don't know how to 
> control/update the contents of this snapshot file.  
> I see this in the report (Otherwise, everything else looks good!):
> ==
> severity: High
> cvss: 7.5 
> riskFactors: Attack complexity: low,Attack vector: network,DoS,Has fix,High 
> severity
> cve: CVE-2020-36518
> Link: [https://nvd.nist.gov/vuln/detail/CVE-2020-36518]
> packageName: com.fasterxml.jackson.core_jackson-databind
> packagePath: /flink-kubernetes-operator-1.0-SNAPSHOT-shaded.jar
> description: jackson-databind before 2.13.0 allows a Java StackOverflow 
> exception and denial of service via a large depth of nested objects.
> =
> I'd be glad to try to fix it; I'm just not sure how the jackson-databind 
> versions are controlled in this 
> /flink-kubernetes-operator-1.0-SNAPSHOT-shaded.jar 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (FLINK-27557) Let FileStoreWrite accept pre-planned manifest entries

2022-05-19 Thread Jane Chan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan updated FLINK-27557:
--
Summary: Let FileStoreWrite accept pre-planned manifest entries  (was: 
Create the empty writer for 'ALTER TABLE ... COMPACT')

> Let FileStoreWrite accept pre-planned manifest entries
> --
>
> Key: FLINK-27557
> URL: https://issues.apache.org/jira/browse/FLINK-27557
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Affects Versions: table-store-0.2.0
>Reporter: Jane Chan
>Priority: Major
> Fix For: table-store-0.2.0
>
> Attachments: normal-compaction.png, optimized-normal-compaction.png, 
> rescale-bucket-compaction.png
>
>
> Currently, FileStoreWrite only creates an empty writer for the \{{INSERT 
> OVERWRITE}} clause. We should also create the empty writer for manual 
> compaction.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-27683) Insert into (column1, column2) Values(.....) can't work with sql Hints

2022-05-19 Thread Xin Yang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539891#comment-17539891
 ] 

Xin Yang commented on FLINK-27683:
--

[~fsk119] It's a bug, right?

> Insert into (column1, column2) Values(.) can't work with sql Hints
> --
>
> Key: FLINK-27683
> URL: https://issues.apache.org/jira/browse/FLINK-27683
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.14.0, 1.16.0, 1.15.1
>Reporter: Xin Yang
>Priority: Major
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> When I try to use the statement `Insert into (column1, column2) Values(.)` 
> with SQL hints, it throws an exception, which is certainly a bug.
>  
>  * Sql 1
> {code:java}
> INSERT INTO `tidb`.`%s`.`%s` /*+ OPTIONS('tidb.sink.update-columns'='c2, 
> c13')*/   (c2, c13) values(1, 12.12) {code}
>  * 
>  ** result 1
> !screenshot-1.png!
>  * Sql 2
> {code:java}
> INSERT INTO `tidb`.`%s`.`%s` (c2, c13) /*+ 
> OPTIONS('tidb.sink.update-columns'='c2, c13')*/  values(1, 12.12)
> {code}
>  * 
>  ** result 2
> !screenshot-2.png!
>  * Sql 3
> {code:java}
> INSERT INTO `tidb`.`%s`.`%s` (c2, c13)  values(1, 12.12)
> {code}
>  * 
>  ** result3 : success



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (FLINK-27557) Create the empty writer for 'ALTER TABLE ... COMPACT'

2022-05-19 Thread Jane Chan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539352#comment-17539352
 ] 

Jane Chan edited comment on FLINK-27557 at 5/20/22 2:59 AM:


{quote}Can we just use overwrite?
 * overwrite specific manifest entries.
 * overwrite whole table or some whole partition. (eg: when rescale in 
compaction){quote}
Overwrite means the writer cannot accept the specified manifest entries as 
restored files, so how to pass them to FileStoreCommit as compact before(mark 
as delete) is a new question. Beyond that, overwrite corresponds to the commit 
kind "OVERWRITE", but it should be more suitable to use "COMPACT" in this 
situation.

I rethink it and maybe we don't need to generate the new files for the normal 
compaction.
 * The simplest way is to build the levels from the restored files, and don't 
sink record to MemTable, and submit compaction when precommit is invoked.
 * In the future, maybe we can introduce a new dedicated compaction source to 
restore LSM, perform compaction and commit, which avoids the shuffle cost.

 

 

 


was (Author: qingyue):
{quote}Can we just use overwrite?
 * overwrite specific manifest entries.
 * overwrite whole table or some whole partition. (eg: when rescale in 
compaction){quote}

Overwrite means the writer cannot accept the specified manifest entries as 
restored files, so how to pass them to FileStoreCommit as compact before(mark 
as delete) is a new question. Beyond that, overwrite corresponds to the commit 
kind "OVERWRITE", but it should be more suitable to use "COMPACT" in this 
situation.

On second thought, maybe we don't need to generate new files for normal 
compaction. 

* The simplest way is to build the levels from the restored files, sink no 
records to the MemTable, and submit a compaction when precommit is invoked.
* In the future, maybe we can introduce a dedicated compaction source that 
restores the LSM, performs the compaction, and commits, which avoids the 
shuffle cost.

 !normal-compaction.png! 

 !rescale-bucket-compaction.png! 

 !optimized-normal-compaction.png! 

> Create the empty writer for 'ALTER TABLE ... COMPACT'
> -
>
> Key: FLINK-27557
> URL: https://issues.apache.org/jira/browse/FLINK-27557
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Affects Versions: table-store-0.2.0
>Reporter: Jane Chan
>Priority: Major
> Fix For: table-store-0.2.0
>
> Attachments: normal-compaction.png, optimized-normal-compaction.png, 
> rescale-bucket-compaction.png
>
>
> Currently, FileStoreWrite only creates an empty writer for the {{INSERT 
> OVERWRITE}} clause. We should also create the empty writer for manual 
> compaction.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (FLINK-27654) Older jackson-databind found in /flink-kubernetes-shaded-1.0-SNAPSHOT.jar

2022-05-19 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539886#comment-17539886
 ] 

Yang Wang edited comment on FLINK-27654 at 5/20/22 2:54 AM:


[~jbusche] You are right. The kubernetes-client in the Flink project is a 
little old and I am not against bumping the version. The reason we have been 
slow to update the kubernetes-client is that Flink only depends on a few core 
features (e.g. creating deployment/pod/configmap/service, leader election, 
watch/informer) and they are stable enough now. Currently, these 
functionalities are already covered by the e2e tests in the Flink project[1]. 
It is not a burden to bump the version. If you want to do this, we could 
create a new dedicated ticket.

I have to clarify one more thing. In the Flink project, we do not need to bump 
the kubernetes-client version to update jackson-databind. Actually, the 
version is managed by the parent pom[2] via maven dependencyManagement.

This ticket also inspired me to verify the bundled jackson-databind in the 
flink-kubernetes-operator module. The version is 
"com.fasterxml.jackson.core:jackson-databind:jar:2.13.1:compile". It is 
introduced by "io.fabric8:kubernetes-client:jar:5.12.1:compile". From the 
maven repository, 2.13.1 has one known vulnerability[3].

Would you like to create a PR to fix this? I believe it is simple since we 
could use dependencyManagement in the parent pom to pin the jackson version, 
just like the Flink project.


[1]. 
https://github.com/apache/flink/blob/release-1.15/flink-end-to-end-tests/test-scripts
[2]. https://github.com/apache/flink/blob/release-1.15/pom.xml#L563
[3]. 
https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-databind


was (Author: fly_in_gis):
[~jbusche] You are right. The kubernetes-client in the Flink project is a 
little old and I am not against bumping the version. The reason we have been 
slow to update the kubernetes-client is that Flink only depends on a few core 
features (e.g. creating deployment/pod/configmap/service, leader election, 
watch/informer) and they are stable enough now. Currently, these 
functionalities are already covered by the e2e tests in the Flink project[1]. 
It is not a burden to bump the version. If you want to do this, we could 
create a new dedicated ticket.

I have to clarify one more thing. In the Flink project, we do not need to bump 
the kubernetes-client version to update jackson-databind. Actually, the 
version is managed by the parent pom[2] via maven dependencyManagement.

This ticket also inspired me to verify the bundled jackson-databind in the 
flink-kubernetes-operator module. The version is 
"com.fasterxml.jackson.core:jackson-databind:jar:2.13.1:compile". It is 
introduced by "io.javaoperatorsdk:operator-framework:jar:2.1.4:compile" -> 
"io.fabric8:kubernetes-client:jar:5.12.1:compile". From the maven repository, 
2.13.1 has one known vulnerability[3].

Would you like to create a PR to fix this? I believe it is simple since we 
could use dependencyManagement in the parent pom to pin the jackson version, 
just like the Flink project.


[1]. 
https://github.com/apache/flink/blob/release-1.15/flink-end-to-end-tests/test-scripts
[2]. https://github.com/apache/flink/blob/release-1.15/pom.xml#L563
[3]. 
https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-databind

> Older jackson-databind found in /flink-kubernetes-shaded-1.0-SNAPSHOT.jar
> -
>
> Key: FLINK-27654
> URL: https://issues.apache.org/jira/browse/FLINK-27654
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-0.1.0
>Reporter: James Busche
>Priority: Major
> Fix For: kubernetes-operator-1.0.0
>
>
> A twistlock security scan of the latest kubernetes flink operator is showing 
> an older version of jackson-databind in the 
> /flink-kubernetes-shaded-1.0-SNAPSHOT.jar file.  I don't know how to 
> control/update the contents of this snapshot file.  
> I see this in the report (Otherwise, everything else looks good!):
> ==
> severity: High
> cvss: 7.5 
> riskFactors: Attack complexity: low,Attack vector: network,DoS,Has fix,High 
> severity
> cve: CVE-2020-36518
> Link: [https://nvd.nist.gov/vuln/detail/CVE-2020-36518]
> packageName: com.fasterxml.jackson.core_jackson-databind
> packagePath: /flink-kubernetes-operator-1.0-SNAPSHOT-shaded.jar
> description: jackson-databind before 2.13.0 allows a Java StackOverflow 
> exception and denial of service via a large depth of nested objects.
> =
> I'd be glad to try to fix it, I'm just not sure how the jackson-databind 
> versions are controlled in this 
> /flink-kubernetes-operator-1.0-SNAPSHOT-shaded.jar 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

[GitHub] [flink] KarmaGYZ commented on a diff in pull request #17873: [FLINK-25009][CLI] Output slotSharingGroup as part of JsonGraph

2022-05-19 Thread GitBox


KarmaGYZ commented on code in PR #17873:
URL: https://github.com/apache/flink/pull/17873#discussion_r877686196


##
flink-streaming-java/src/main/java/org/apache/flink/streaming/api/graph/JSONGenerator.java:
##
@@ -184,6 +185,8 @@ private void decorateNode(Integer vertexID, ObjectNode 
node) {
 
 node.put(CONTENTS, vertex.getOperatorDescription());
 
+node.put(SLOT_SHARING_GROUP, vertex.getSlotSharingGroup());

Review Comment:
   Then, we can add a check as a safety net here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-27705) num-sorted-run.compaction-trigger should not interfere the num-levels

2022-05-19 Thread Jane Chan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539887#comment-17539887
 ] 

Jane Chan commented on FLINK-27705:
---

The reason is that the previous write job had triggered compaction and 
generated a new data file at level 5.

Since I didn't set the number of levels explicitly, when the trigger was 
reduced to 2, the number of levels was reduced to 3, so the new write job 
failed.
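
A hypothetical illustration of the coupling. The option names follow 
MergeTreeOptions, but the exact derivation is an assumption inferred from the 
behavior above (trigger = 2 yielding 3 levels):
{code:java}
// If num-levels is not set explicitly, derive it from the compaction
// trigger. Shrinking the trigger then shrinks the maximum level, so files
// previously compacted to level 5 no longer fit and the write job fails.
int numLevels =
        conf.getOptional(MergeTreeOptions.NUM_LEVELS)
                .orElse(conf.get(MergeTreeOptions.NUM_SORTED_RUNS_COMPACTION_TRIGGER) + 1);
{code}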

> num-sorted-run.compaction-trigger should not interfere the num-levels 
> --
>
> Key: FLINK-27705
> URL: https://issues.apache.org/jira/browse/FLINK-27705
> Project: Flink
>  Issue Type: Improvement
>  Components: Table Store
>Affects Versions: table-store-0.2.0
>Reporter: Jane Chan
>Priority: Major
> Fix For: table-store-0.2.0
>
>
> h3. Issue Description
> The default value for MergeTreeOptions.NUM_LEVELS is not a constant; it 
> varies with MergeTreeOptions.NUM_SORTED_RUNS_COMPACTION_TRIGGER. If users do 
> not specify MergeTreeOptions.NUM_LEVELS at first, then once 
> MergeTreeOptions.NUM_SORTED_RUNS_COMPACTION_TRIGGER is changed, a subsequent 
> write task may fail (specifically, when the compaction trigger shrinks and 
> previous writes had already triggered compaction).
> h3. How to Reproduce
> add a test under ForceCompactionITCase
> {code:java}
> @Override
> protected List ddl() {
> return Arrays.asList(
> "CREATE TABLE IF NOT EXISTS T (\n"
> + "  f0 INT\n, "
> + "  f1 STRING\n, "
> + "  f2 STRING\n"
> + ") PARTITIONED BY (f1)",
> "CREATE TABLE IF NOT EXISTS T1 (\n"
> + "  f0 INT\n, "
> + "  f1 STRING\n, "
> + "  f2 STRING\n"
> + ")");
> }
> @Test
> public void test() throws Exception {
> bEnv.executeSql("ALTER TABLE T1 SET ('commit.force-compact' = 'true')");
> bEnv.executeSql(
> "INSERT INTO T1 VALUES(1, 'Winter', 'Winter is Coming'),"
> + "(2, 'Winter', 'The First Snowflake'), "
> + "(2, 'Spring', 'The First Rose in Spring'), "
> + "(7, 'Summer', 'Summertime Sadness')")
> .await();
> bEnv.executeSql("INSERT INTO T1 VALUES(12, 'Winter', 'Last 
> Christmas')").await();
> bEnv.executeSql("INSERT INTO T1 VALUES(11, 'Winter', 'Winter is 
> Coming')").await();
> bEnv.executeSql("INSERT INTO T1 VALUES(10, 'Autumn', 'Refrain')").await();
> bEnv.executeSql(
> "INSERT INTO T1 VALUES(6, 'Summer', 'Watermelon Sugar'), "
> + "(4, 'Spring', 'Spring Water')")
> .await();
> bEnv.executeSql(
> "INSERT INTO T1 VALUES(66, 'Summer', 'Summer Vibe'),"
> + " (9, 'Autumn', 'Wake Me Up When September 
> Ends')")
> .await();
> bEnv.executeSql(
> "INSERT INTO T1 VALUES(666, 'Summer', 'Summer Vibe'),"
> + " (9, 'Autumn', 'Wake Me Up When September 
> Ends')")
> .await();
> bEnv.executeSql("ALTER TABLE T1 SET ('num-sorted-run.compaction-trigger' 
> = '2')");
> bEnv.executeSql(
> "INSERT INTO T1 VALUES(666, 'Summer', 'Summer Vibe'),"
> + " (9, 'Autumn', 'Wake Me Up When September 
> Ends')")
> .await();
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-27654) Older jackson-databind found in /flink-kubernetes-shaded-1.0-SNAPSHOT.jar

2022-05-19 Thread Yang Wang (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539886#comment-17539886
 ] 

Yang Wang commented on FLINK-27654:
---

[~jbusche] You are right. The kubernetes-client in the Flink project is a 
little old and I am not against bumping the version. The reason we have been 
slow to update the kubernetes-client is that Flink only depends on a few core 
features (e.g. creating deployment/pod/configmap/service, leader election, 
watch/informer) and they are stable enough now. Currently, these 
functionalities are already covered by the e2e tests in the Flink project[1]. 
It is not a burden to bump the version. If you want to do this, we could 
create a new dedicated ticket.

I have to clarify one more thing. In the Flink project, we do not need to bump 
the kubernetes-client version to update jackson-databind. Actually, the 
version is managed by the parent pom[2] via maven dependencyManagement.

This ticket also inspired me to verify the bundled jackson-databind in the 
flink-kubernetes-operator module. The version is 
"com.fasterxml.jackson.core:jackson-databind:jar:2.13.1:compile". It is 
introduced by "io.javaoperatorsdk:operator-framework:jar:2.1.4:compile" -> 
"io.fabric8:kubernetes-client:jar:5.12.1:compile". From the maven repository, 
2.13.1 has one known vulnerability[3].

Would you like to create a PR to fix this? I believe it is simple since we 
could use dependencyManagement in the parent pom to pin the jackson version, 
just like the Flink project.


[1]. 
https://github.com/apache/flink/blob/release-1.15/flink-end-to-end-tests/test-scripts
[2]. https://github.com/apache/flink/blob/release-1.15/pom.xml#L563
[3]. 
https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-databind

> Older jackson-databind found in /flink-kubernetes-shaded-1.0-SNAPSHOT.jar
> -
>
> Key: FLINK-27654
> URL: https://issues.apache.org/jira/browse/FLINK-27654
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-0.1.0
>Reporter: James Busche
>Priority: Major
> Fix For: kubernetes-operator-1.0.0
>
>
> A twistlock security scan of the latest kubernetes flink operator is showing 
> an older version of jackson-databind in the 
> /flink-kubernetes-shaded-1.0-SNAPSHOT.jar file.  I don't know how to 
> control/update the contents of this snapshot file.  
> I see this in the report (Otherwise, everything else looks good!):
> ==
> severity: High
> cvss: 7.5 
> riskFactors: Attack complexity: low,Attack vector: network,DoS,Has fix,High 
> severity
> cve: CVE-2020-36518
> Link: [https://nvd.nist.gov/vuln/detail/CVE-2020-36518]
> packageName: com.fasterxml.jackson.core_jackson-databind
> packagePath: /flink-kubernetes-operator-1.0-SNAPSHOT-shaded.jar
> description: jackson-databind before 2.13.0 allows a Java StackOverflow 
> exception and denial of service via a large depth of nested objects.
> =
> I'd be glad to try to fix it, I'm just not sure how the jackson-databind 
> versions are controlled in this 
> /flink-kubernetes-operator-1.0-SNAPSHOT-shaded.jar 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (FLINK-27654) Older jackson-databind found in /flink-kubernetes-shaded-1.0-SNAPSHOT.jar

2022-05-19 Thread Yang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Wang updated FLINK-27654:
--
Summary: Older jackson-databind found in 
/flink-kubernetes-shaded-1.0-SNAPSHOT.jar  (was: Older jackson-databind found 
in flink-kubernetes-operator-1.0-SNAPSHOT-shaded.jar)

> Older jackson-databind found in /flink-kubernetes-shaded-1.0-SNAPSHOT.jar
> -
>
> Key: FLINK-27654
> URL: https://issues.apache.org/jira/browse/FLINK-27654
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-0.1.0
>Reporter: James Busche
>Priority: Major
> Fix For: kubernetes-operator-1.0.0
>
>
> A twistlock security scan of the latest kubernetes flink operator is showing 
> an older version of jackson-databind in the 
> /flink-kubernetes-shaded-1.0-SNAPSHOT.jar file.  I don't know how to 
> control/update the contents of this snapshot file.  
> I see this in the report (Otherwise, everything else looks good!):
> ==
> severity: High
> cvss: 7.5 
> riskFactors: Attack complexity: low,Attack vector: network,DoS,Has fix,High 
> severity
> cve: CVE-2020-36518
> Link: [https://nvd.nist.gov/vuln/detail/CVE-2020-36518]
> packageName: com.fasterxml.jackson.core_jackson-databind
> packagePath: /flink-kubernetes-operator-1.0-SNAPSHOT-shaded.jar
> description: jackson-databind before 2.13.0 allows a Java StackOverflow 
> exception and denial of service via a large depth of nested objects.
> =
> I'd be glad to try to fix it, I'm just not sure how the jackson-databind 
> versions are controlled in this 
> /flink-kubernetes-operator-1.0-SNAPSHOT-shaded.jar 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (FLINK-27705) num-sorted-run.compaction-trigger should not interfere the num-levels

2022-05-19 Thread Jane Chan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539885#comment-17539885
 ] 

Jane Chan commented on FLINK-27705:
---

Stacktrace
{code:java}
java.util.concurrent.ExecutionException: 
org.apache.flink.table.api.TableException: Failed to wait job finish    at 
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
    at 
org.apache.flink.table.api.internal.TableResultImpl.awaitInternal(TableResultImpl.java:118)
    at 
org.apache.flink.table.api.internal.TableResultImpl.await(TableResultImpl.java:81)
    at 
org.apache.flink.table.store.connector.ForceCompactionITCase.test(ForceCompactionITCase.java:76)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
    at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
    at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at 
org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
    at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
    at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
    at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
    at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
    at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
    at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
    at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
    at 
com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
    at 
com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235)
    at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:54)
Caused by: org.apache.flink.table.api.TableException: Failed to wait job finish
    at 
org.apache.flink.table.api.internal.InsertResultProvider.hasNext(InsertResultProvider.java:85)
    at 
org.apache.flink.table.api.internal.InsertResultProvider.isFirstRowReady(InsertResultProvider.java:71)
    at 
org.apache.flink.table.api.internal.TableResultImpl.lambda$awaitInternal$1(TableResultImpl.java:105)
    at 
java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: 
org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
    at 
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
    at 
org.apache.flink.table.api.internal.InsertResultProvider.hasNext(InsertResultProvider.java:83)
    ... 6 more
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution 
failed.
    at 
org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144)
    at 
org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$3(MiniClusterJobClient.java:141)
    at 
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
    at 
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
    at 
java.util.conc

[GitHub] [flink-table-store] tsreaper commented on a diff in pull request #121: [FLINK-27626] Introduce AggregatuibMergeFunction

2022-05-19 Thread GitBox


tsreaper commented on code in PR #121:
URL: https://github.com/apache/flink-table-store/pull/121#discussion_r876537585


##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/FileStoreImpl.java:
##
@@ -201,17 +204,28 @@ public static FileStoreImpl createWithPrimaryKey(
 .collect(Collectors.toList()));
 
 MergeFunction mergeFunction;
+Map<String, String> rightConfMap =
+options.getFilterConf(e -> 
e.getKey().endsWith(".aggregate-function"));

Review Comment:
   Move this to `AGGREGATION` branch.



##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/mergetree/compact/AggregateFunction.java:
##
@@ -0,0 +1,159 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.mergetree.compact;
+
+import java.io.Serializable;
+
+/**
+ * Abstract class for custom column aggregation.

Review Comment:
   Flink Table Store is currently a sub-project of Flink. English comments only.



##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/mergetree/compact/AggregateFunction.java:
##
@@ -0,0 +1,159 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.mergetree.compact;
+
+import java.io.Serializable;
+
+/**
+ * Abstract class for custom column aggregation.
+ *
+ * @param <T>
+ */
+public interface AggregateFunction<T> extends Serializable {
+// T aggregator;
+
+T getResult();
+
+default void init() {
+reset();
+}
+
+void reset();
+
+default void aggregate(Object value) {
+aggregate(value, true);
+}
+
+void aggregate(Object value, boolean add);
+
+void reset(Object value);

Review Comment:
   What's the usage of this method?
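
   For context, here is a hypothetical implementation of the proposed 
interface. The signatures follow the diff above, but the class itself is 
illustrative and not part of this PR:
```java
// Assumed example: a per-column sum aggregate over the proposed interface.
public class DoubleSumAggregator implements AggregateFunction<Double> {

    private double sum;

    @Override
    public Double getResult() {
        return sum;
    }

    @Override
    public void reset() {
        sum = 0d;
    }

    @Override
    public void aggregate(Object value, boolean add) {
        if (value == null) {
            return;
        }
        double v = ((Number) value).doubleValue();
        // add == false is interpreted here as a retraction
        sum = add ? sum + v : sum - v;
    }

    @Override
    public void reset(Object value) {
        // replace the accumulator with the given value
        sum = value == null ? 0d : ((Number) value).doubleValue();
    }
}
```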



##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/mergetree/compact/AggregateFunction.java:
##
@@ -0,0 +1,159 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.mergetree.compact;
+
+import java.io.Serializable;
+
+/**
+ * Abstract class for custom column aggregation.
+ *
+ * @param <T>
+ */
+public interface AggregateFunction<T> extends Serializable {
+// T aggregator;

Review Comment:
   Remove useless code.



##
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/mergetree/compact/AggregationMergeFunction.java:
##
@@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache Licen

[jira] [Created] (FLINK-27705) num-sorted-run.compaction-trigger should not interfere the num-levels

2022-05-19 Thread Jane Chan (Jira)
Jane Chan created FLINK-27705:
-

 Summary: num-sorted-run.compaction-trigger should not interfere 
the num-levels 
 Key: FLINK-27705
 URL: https://issues.apache.org/jira/browse/FLINK-27705
 Project: Flink
  Issue Type: Improvement
  Components: Table Store
Affects Versions: table-store-0.2.0
Reporter: Jane Chan
 Fix For: table-store-0.2.0


h3. Issue Description
The default value for MergeTreeOptions.NUM_LEVELS is not a constant; it varies 
with MergeTreeOptions.NUM_SORTED_RUNS_COMPACTION_TRIGGER. If users do not 
specify MergeTreeOptions.NUM_LEVELS at first, then once 
MergeTreeOptions.NUM_SORTED_RUNS_COMPACTION_TRIGGER is changed, a subsequent 
write task may fail (specifically, when the compaction trigger shrinks and 
previous writes had already triggered compaction).


h3. How to Reproduce

add a test under ForceCompactionITCase
{code:java}
@Override
protected List<String> ddl() {
return Arrays.asList(
"CREATE TABLE IF NOT EXISTS T (\n"
+ "  f0 INT\n, "
+ "  f1 STRING\n, "
+ "  f2 STRING\n"
+ ") PARTITIONED BY (f1)",
"CREATE TABLE IF NOT EXISTS T1 (\n"
+ "  f0 INT\n, "
+ "  f1 STRING\n, "
+ "  f2 STRING\n"
+ ")");
}

@Test
public void test() throws Exception {
bEnv.executeSql("ALTER TABLE T1 SET ('commit.force-compact' = 'true')");
bEnv.executeSql(
"INSERT INTO T1 VALUES(1, 'Winter', 'Winter is Coming'),"
+ "(2, 'Winter', 'The First Snowflake'), "
+ "(2, 'Spring', 'The First Rose in Spring'), "
+ "(7, 'Summer', 'Summertime Sadness')")
.await();
bEnv.executeSql("INSERT INTO T1 VALUES(12, 'Winter', 'Last 
Christmas')").await();
bEnv.executeSql("INSERT INTO T1 VALUES(11, 'Winter', 'Winter is 
Coming')").await();
bEnv.executeSql("INSERT INTO T1 VALUES(10, 'Autumn', 'Refrain')").await();
bEnv.executeSql(
"INSERT INTO T1 VALUES(6, 'Summer', 'Watermelon Sugar'), "
+ "(4, 'Spring', 'Spring Water')")
.await();
bEnv.executeSql(
"INSERT INTO T1 VALUES(66, 'Summer', 'Summer Vibe'),"
+ " (9, 'Autumn', 'Wake Me Up When September 
Ends')")
.await();
bEnv.executeSql(
"INSERT INTO T1 VALUES(666, 'Summer', 'Summer Vibe'),"
+ " (9, 'Autumn', 'Wake Me Up When September 
Ends')")
.await();
bEnv.executeSql("ALTER TABLE T1 SET ('num-sorted-run.compaction-trigger' = 
'2')");
bEnv.executeSql(
"INSERT INTO T1 VALUES(666, 'Summer', 'Summer Vibe'),"
+ " (9, 'Autumn', 'Wake Me Up When September 
Ends')")
.await();
} {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (FLINK-27654) Older jackson-databind found in flink-kubernetes-operator-1.0-SNAPSHOT-shaded.jar

2022-05-19 Thread Yang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Wang updated FLINK-27654:
--
Summary: Older jackson-databind found in 
flink-kubernetes-operator-1.0-SNAPSHOT-shaded.jar  (was: Older jackson-databind 
found in /flink-kubernetes-shaded-1.0-SNAPSHOT.jar)

> Older jackson-databind found in 
> flink-kubernetes-operator-1.0-SNAPSHOT-shaded.jar
> -
>
> Key: FLINK-27654
> URL: https://issues.apache.org/jira/browse/FLINK-27654
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Affects Versions: kubernetes-operator-0.1.0
>Reporter: James Busche
>Priority: Major
> Fix For: kubernetes-operator-1.0.0
>
>
> A twistlock security scan of the latest kubernetes flink operator is showing 
> an older version of jackson-databind in the 
> /flink-kubernetes-shaded-1.0-SNAPSHOT.jar file.  I don't know how to 
> control/update the contents of this snapshot file.  
> I see this in the report (Otherwise, everything else looks good!):
> ==
> severity: High
> cvss: 7.5 
> riskFactors: Attack complexity: low,Attack vector: network,DoS,Has fix,High 
> severity
> cve: CVE-2020-36518
> Link: [https://nvd.nist.gov/vuln/detail/CVE-2020-36518]
> packageName: com.fasterxml.jackson.core_jackson-databind
> packagePath: /flink-kubernetes-operator-1.0-SNAPSHOT-shaded.jar
> description: jackson-databind before 2.13.0 allows a Java StackOverflow 
> exception and denial of service via a large depth of nested objects.
> =
> I'd be glad to try to fix it, I'm just not sure how the jackson-databind 
> versions are controlled in this 
> /flink-kubernetes-operator-1.0-SNAPSHOT-shaded.jar 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Closed] (FLINK-27678) Support append-only table for file store.

2022-05-19 Thread Jingsong Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-27678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee closed FLINK-27678.

Fix Version/s: table-store-0.2.0
 Assignee: Zheng Hu
   Resolution: Fixed

master: 59282f71c84887e3f16af81890558f9593bdaa79

> Support append-only table for file store.
> -
>
> Key: FLINK-27678
> URL: https://issues.apache.org/jira/browse/FLINK-27678
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: table-store-0.2.0
>
>
> Let me publish a separate PR for supporting append-only tables in flink 
> table store's file store.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[GitHub] [flink-table-store] JingsongLi merged pull request #126: [FLINK-27678] Support append-only table for file store.

2022-05-19 Thread GitBox


JingsongLi merged PR #126:
URL: https://github.com/apache/flink-table-store/pull/126


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink] HuangXingBo commented on a diff in pull request #19743: [FLINK-27657][python] Implement remote operator state backend in PyFlink

2022-05-19 Thread GitBox


HuangXingBo commented on code in PR #19743:
URL: https://github.com/apache/flink/pull/19743#discussion_r876562559


##
flink-python/pyflink/datastream/functions.py:
##
@@ -119,6 +118,22 @@ def get_aggregating_state(
 pass
 
 
+class OperatorStateStore(ABC):

Review Comment:
   What about moving this class to `state.py`?



##
flink-python/src/main/java/org/apache/flink/streaming/api/runners/python/beam/state/BeamStateHandler.java:
##
@@ -23,6 +23,35 @@
 /** Interface for doing actual operations on Flink state based on {@link 
BeamFnApi.StateRequest}. */
public interface BeamStateHandler<S> {
 
-BeamFnApi.StateResponse.Builder handle(BeamFnApi.StateRequest request, S 
state)
+/**
+ * Dispatches {@link BeamFnApi.StateRequest} to different handle functions 
based on the request case.
+ */
+default BeamFnApi.StateResponse.Builder handle(BeamFnApi.StateRequest 
request, S state)

Review Comment:
   I wonder if it would be clearer if we introduced an abstract class 
`AbstractBeamStateHandler`. Personally, I don't like the form of default 
interface methods.
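
   A rough sketch of that suggestion. The class name and the per-case handler 
methods are assumptions for illustration, not code from this PR:
```java
// Assumed example: dispatch on the protobuf request case in a base class
// instead of a default interface method.
public abstract class AbstractBeamStateHandler<S> implements BeamStateHandler<S> {

    @Override
    public BeamFnApi.StateResponse.Builder handle(BeamFnApi.StateRequest request, S state)
            throws Exception {
        switch (request.getRequestCase()) {
            case GET:
                return handleGet(request, state);
            case APPEND:
                return handleAppend(request, state);
            case CLEAR:
                return handleClear(request, state);
            default:
                throw new UnsupportedOperationException(
                        "Unsupported request type: " + request.getRequestCase());
        }
    }

    protected abstract BeamFnApi.StateResponse.Builder handleGet(
            BeamFnApi.StateRequest request, S state) throws Exception;

    protected abstract BeamFnApi.StateResponse.Builder handleAppend(
            BeamFnApi.StateRequest request, S state) throws Exception;

    protected abstract BeamFnApi.StateResponse.Builder handleClear(
            BeamFnApi.StateRequest request, S state) throws Exception;
}
```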
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [flink-table-store] ajian2002 commented on pull request #121: [FLINK-27626] Introduce AggregatuibMergeFunction

2022-05-19 Thread GitBox


ajian2002 commented on PR #121:
URL: 
https://github.com/apache/flink-table-store/pull/121#issuecomment-1132373009

   So the problem is solved?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (FLINK-27646) Create Roadmap page for Flink Kubernetes operator

2022-05-19 Thread ConradJam (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539873#comment-17539873
 ] 

ConradJam commented on FLINK-27646:
---

OK, I would like to ask: will this page be built on Flink's Confluence, or do 
we build it via a new path? [~gyfora] 

> Create Roadmap page for Flink Kubernetes operator
> -
>
> Key: FLINK-27646
> URL: https://issues.apache.org/jira/browse/FLINK-27646
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: ConradJam
>Priority: Major
> Fix For: kubernetes-operator-1.0.0
>
>
> We should create a dedicated wiki page for the current roadmap of the 
> operator and link it to the overview page in our docs.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

