[jira] [Commented] (FLINK-27314) Support reactive mode for native Kubernetes integration in Flink Kubernetes Operator
[ https://issues.apache.org/jira/browse/FLINK-27314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528549#comment-17528549 ] Gyula Fora commented on FLINK-27314: I think this work (while certainly very interesting and potentially extremely useful for managed services) will need a larger design discussion to figure out how it could best be included in the current architecture. + a FLIP > Support reactive mode for native Kubernetes integration in Flink Kubernetes > Operator > > > Key: FLINK-27314 > URL: https://issues.apache.org/jira/browse/FLINK-27314 > Project: Flink > Issue Type: New Feature > Components: Kubernetes Operator >Reporter: Fuyao Li >Priority: Major > > Generally, this is a low-priority task for now. > Flink exposes some system-level metrics; the Flink Kubernetes operator could > detect these metrics and rescale automatically based on checkpoints (similar to > standalone reactive mode) and a rescale policy configured by users. > The rescale behavior can be based on CPU utilization or memory utilization. > # Before rescaling, the Flink operator should check whether the cluster has > enough resources; if not, the rescaling will be aborted. > # We can add a new field to support this feature. The fields below > are just a rough suggestion. > {code:yaml} > reactiveScaling: > enabled: boolean > scaleMetric: enum ["CPU", "MEM"] > scaleDownThreshold: > scaleUpThreshold: > minimumLimit: > maximumLimit: > increasePolicy: > {code} > -- This message was sent by Atlassian Jira (v8.20.7#820007)
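As a concrete illustration, the rough field set suggested in the issue might be filled in like this. All concrete values and the `add-one` policy name are invented for illustration; only the field names come from the issue.

```yaml
reactiveScaling:
  enabled: true
  scaleMetric: CPU          # one of CPU, MEM
  scaleDownThreshold: 0.3   # scale down below 30% average utilization
  scaleUpThreshold: 0.8     # scale up above 80% average utilization
  minimumLimit: 2           # never fewer than 2 TaskManagers
  maximumLimit: 10          # never more than 10 TaskManagers
  increasePolicy: add-one   # hypothetical policy name
```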
[jira] [Updated] (FLINK-27328) Could not resolve ResourceManager address
[ https://issues.apache.org/jira/browse/FLINK-27328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora updated FLINK-27328: --- Component/s: Deployment / Kubernetes (was: Kubernetes Operator) > Could not resolve ResourceManager address > - > > Key: FLINK-27328 > URL: https://issues.apache.org/jira/browse/FLINK-27328 > Project: Flink > Issue Type: Bug > Components: Deployment / Kubernetes >Affects Versions: 1.14.4 > Environment: h3. JobManager > {{apiVersion: v1 > kind: Service > metadata: > name: jobmanager-cs > spec: > type: NodePort > ports: > - name: ui > port: 8081 > selector: > app: flink > component: jobmanager > --- > apiVersion: v1 > kind: Service > metadata: > name: jobmanager-hs > spec: > type: ClusterIP > ports: > - port: 6123 > name: rpc > - port: 6124 > name: blob-server > - port: 6125 > name: query > selector: > app: flink > component: jobmanager > --- > apiVersion: apps/v1 > kind: Deployment > metadata: > name: flink-jobmanager > spec: > selector: > matchLabels: > app: flink > template: > metadata: > labels: > app: flink > component: jobmanager > spec: > restartPolicy: Always > containers: > - name: jobmanager > image: flink:1.13.1-scala_2.12 > command: [bash,"-ec",bin/jobmanager.sh start-foreground cluster] > resources: > limits: > memory: "2024Mi" > cpu: "500m" > env: > - name: JOB_MANAGER_ID > valueFrom: > fieldRef: > apiVersion: v1 > fieldPath: status.podIP > - name: POD_IP > valueFrom: > fieldRef: > apiVersion: v1 > fieldPath: status.podIP > # The following args overwrite the value of jobmanager.rpc.address > configured in the configuration config map to POD_IP. 
> args: ["standalone-job", "--host", "$POD_IP", "--job-classname", > "org.apache.flink.application.Main"] #, , arguments>] optional arguments: ["--job-id", "", "--fromSavepoint", > "/path/to/savepoint", "--allowNonRestoredState"] > ports: > - containerPort: 6123 > name: rpc > - containerPort: 6124 > name: blob-server > - containerPort: 6125 > name: query > - containerPort: 8081 > name: webui > volumeMounts: > - name: flink-config-volume > mountPath: /opt/flink/conf > - name: job-artifacts-volume > mountPath: /opt/flink/usrlib > securityContext: > runAsUser: > volumes: > - name: flink-config-volume > configMap: > name: flink-config > items: > - key: flink-conf.yaml > path: flink-conf.yaml > - key: log4j-console.properties > path: log4j-console.properties > - name: job-artifacts-volume > hostPath: > path: /config/flink}} > h3. Task Manager > {{apiVersion: apps/v1 > kind: Deployment > metadata: > name: flink-taskmanager > spec: > replicas: 2 > selector: > matchLabels: > app: flink > component: taskmanager > template: > metadata: > labels: > app: flink > component: taskmanager > spec: > containers: > - name: taskmanager > image: flink:1.13.1-scala_2.12 > env: > - name: K8S_POD_IP > valueFrom: > fieldRef: > fieldPath: status.podIP > command: ["/bin/sh", "-ec", "sleep 1000"] > resources: > limits: > memory: "800Mi" > cpu: "2000m" > args: > ["taskmanager","start-foreground","-Dtaskmanager.host=$K8S_POD_IP"] > ports: > - containerPort: 6122 > name: rpc > - containerPort: 6125 > name: query-state > volumeMounts: > - name: flink-config-volume > mountPath: /opt/flink/conf/ > - name: job-artifacts-volume > mountPath: /opt/flink/usrlib > securityContext: > runAsUser: > volumes: > - name: flink-config-volume > configMap: > name: flink-config > items: > - key: flink-conf.yaml > path: flink-conf.yaml > - key: log4j-console.properties > path: log4j-console.properties > - name: job-artifacts-volume > hostPath: > path: /config/flink}} > h3. 
ConfigMap > {{apiVersion: v1 > kind: ConfigMap > metadata: > name: flink-config > labels:
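The ConfigMap in the report is cut off above. As context for the "Could not resolve ResourceManager address" error, a flink-conf.yaml for this kind of standalone setup typically points TaskManagers at the JobManager's headless service; a sketch follows (the `flink-config` and `jobmanager-hs` names are taken from the manifests above, everything else is an assumption, not the reporter's actual file):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: flink-config
  labels:
    app: flink
data:
  flink-conf.yaml: |
    # TaskManagers resolve the ResourceManager through this address; a
    # mismatch between this value and the address the JobManager actually
    # binds to is a common cause of the reported error.
    jobmanager.rpc.address: jobmanager-hs
    jobmanager.rpc.port: 6123
    taskmanager.numberOfTaskSlots: 2
  log4j-console.properties: |
    rootLogger.level = INFO
```

Note that the JobManager args in the report overwrite `jobmanager.rpc.address` with the pod IP, so the TaskManagers' configured address must still resolve to that same pod for RPC registration to succeed.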
[jira] [Assigned] (FLINK-27416) FLIP-225: Implement standalone mode support in the kubernetes operator
[ https://issues.apache.org/jira/browse/FLINK-27416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora reassigned FLINK-27416: -- Assignee: Usamah Jassat > FLIP-225: Implement standalone mode support in the kubernetes operator > > > Key: FLINK-27416 > URL: https://issues.apache.org/jira/browse/FLINK-27416 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Reporter: Usamah Jassat >Assignee: Usamah Jassat >Priority: Major > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-225%3A+Implement+standalone+mode+support+in+the+kubernetes+operator
[jira] [Updated] (FLINK-27422) Do not create temporary pod template files for JobManager and TaskManager if not configured explicitly
[ https://issues.apache.org/jira/browse/FLINK-27422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gyula Fora updated FLINK-27422: --- Labels: Starter (was: ) > Do not create temporary pod template files for JobManager and TaskManager if > not configured explicitly > -- > > Key: FLINK-27422 > URL: https://issues.apache.org/jira/browse/FLINK-27422 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Reporter: Yang Wang >Priority: Major > Labels: Starter > Fix For: kubernetes-operator-1.0.0 > > > We do not need to create temporary pod template files for JobManager and > TaskManager if it is not configured explicitly via > {{.spec.JobManagerSpec.podTemplate}} or > {{{}.spec.TaskManagerSpec.podTemplate{}}}. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (FLINK-27418) Flink SQL TopN result is wrong
[ https://issues.apache.org/jira/browse/FLINK-27418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528540#comment-17528540 ] zhangbin commented on FLINK-27418: -- Thank you for your reply, the useBlinkPlanner method still exists in Flink 1.14 and the result is sometimes wrong in Flink 1.14. You will get different results if you execute it several times > Flink SQL TopN result is wrong > -- > > Key: FLINK-27418 > URL: https://issues.apache.org/jira/browse/FLINK-27418 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.12.2, 1.14.3 > Environment: Flink 1.12.2 and Flink 1.14.3 test results are sometimes > wrong >Reporter: zhangbin >Priority: Major > > Flink SQL TopN is executed multiple times with different results, sometimes > with correct results and sometimes with incorrect results. > Example: > {code:java} > @Test > public void flinkSqlJoinRetract() { > EnvironmentSettings settings = EnvironmentSettings.newInstance() > .useBlinkPlanner() > .inStreamingMode() > .build(); > StreamExecutionEnvironment streamEnv = > StreamExecutionEnvironment.getExecutionEnvironment(); > streamEnv.setParallelism(1); > StreamTableEnvironment tableEnv = > StreamTableEnvironment.create(streamEnv, settings); > tableEnv.getConfig().setIdleStateRetention(Duration.ofSeconds(1)); > RowTypeInfo waybillTableTypeInfo = buildWaybillTableTypeInfo(); > RowTypeInfo itemTableTypeInfo = buildItemTableTypeInfo(); > SourceFunction waybillSourceFunction = > buildWaybillStreamSource(waybillTableTypeInfo); > SourceFunction itemSourceFunction = > buildItemStreamSource(itemTableTypeInfo); > String waybillTable = "waybill"; > String itemTable = "item"; > DataStreamSource waybillStream = streamEnv.addSource( > waybillSourceFunction, > waybillTable, > waybillTableTypeInfo); > DataStreamSource itemStream = streamEnv.addSource( > itemSourceFunction, > itemTable, > itemTableTypeInfo); > Expression[] waybillFields = ExpressionParser > 
.parseExpressionList(String.join(",", > waybillTableTypeInfo.getFieldNames()) > + ",proctime.proctime").toArray(new Expression[0]); > Expression[] itemFields = ExpressionParser > .parseExpressionList( > String.join(",", itemTableTypeInfo.getFieldNames()) + > ",proctime.proctime") > .toArray(new Expression[0]); > tableEnv.createTemporaryView(waybillTable, waybillStream, > waybillFields); > tableEnv.createTemporaryView(itemTable, itemStream, itemFields); > String sql = > "select \n" > + "city_id, \n" > + "count(*) as cnt\n" > + "from (\n" > + "select id,city_id\n" > + "from (\n" > + "select \n" > + "id,\n" > + "city_id,\n" > + "row_number() over(partition by id order by > utime desc ) as rno \n" > + "from (\n" > + "select \n" > + "waybill.id as id,\n" > + "coalesce(item.city_id, waybill.city_id) as > city_id,\n" > + "waybill.utime as utime \n" > + "from waybill left join item \n" > + "on waybill.id = item.id \n" > + ") \n" > + ")\n" > + "where rno =1\n" > + ")\n" > + "group by city_id"; > StatementSet statementSet = tableEnv.createStatementSet(); > Table table = tableEnv.sqlQuery(sql); > DataStream> rowDataStream = > tableEnv.toRetractStream(table, Row.class); > rowDataStream.printToErr(); > try { > streamEnv.execute(); > } catch (Exception e) { > e.printStackTrace(); > } > } > private static RowTypeInfo buildWaybillTableTypeInfo() { > TypeInformation[] types = new TypeInformation[]{Types.INT(), > Types.STRING(), Types.LONG(), Types.LONG()}; > String[] fields = new String[]{"id", "city_id", "rider_id", "utime"}; > return new RowTypeInfo(types, fields); > } > private static RowTypeInfo buildItemTableTypeInfo() { > TypeInformation[] types = new TypeInformation[]{Types.INT(), > Types.STRING(), Types.LONG()}; > String[] fields = new String[]{"id", "city_id",
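The intended semantics of the query in the report — keep only the latest row per waybill `id` (the `row_number() = 1` branch, ordered by `utime` desc), then count rows per `city_id`, retracting the old row's contribution on every update — can be modeled in plain Java, independent of Flink. The class and field names below are illustrative, not part of the test above:

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of the TopN/deduplication semantics: for each id keep the
// row with the largest utime, and maintain a per-city count that retracts the
// previous row's city before accumulating the new one.
public class TopNSemantics {
    public static final Map<Integer, Long> latestUtime = new HashMap<>();
    public static final Map<Integer, String> latestCity = new HashMap<>();
    public static final Map<String, Long> countPerCity = new HashMap<>();

    public static void process(int id, String cityId, long utime) {
        Long prev = latestUtime.get(id);
        if (prev != null && prev >= utime) {
            return; // not the latest row for this id -> rno != 1, ignored
        }
        String prevCity = latestCity.get(id);
        if (prevCity != null) {
            countPerCity.merge(prevCity, -1L, Long::sum); // retract old row
        }
        latestUtime.put(id, utime);
        latestCity.put(id, cityId);
        countPerCity.merge(cityId, 1L, Long::sum); // accumulate new row
    }
}
```

With deterministic inputs this model always produces the same counts, which is the baseline the flaky Flink results are being compared against.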
[GitHub] [flink] flinkbot commented on pull request #19590: [FLINK-27352][tests] [JUnit5 Migration] Module: flink-json
flinkbot commented on PR #19590: URL: https://github.com/apache/flink/pull/19590#issuecomment-1110544425 ## CI report: * bd1c9f50f4efa4fc07afc15be8391266cf87e70b UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-27303) Flink Operator will create a large amount of temp log config files
[ https://issues.apache.org/jira/browse/FLINK-27303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528537#comment-17528537 ] Gyula Fora edited comment on FLINK-27303 at 4/27/22 5:01 AM: - [~wangyang0918] in the current design temp files are deleted after 30 minutes (we delete on config cache expiration) are they still there for more than 30 mins? was (Author: gyfora): [~wangyang0918] in the current design temp files are deleted after 30 minutes (we delete on config cache expiration) are they still there for more then 30 mins? > Flink Operator will create a large amount of temp log config files > -- > > Key: FLINK-27303 > URL: https://issues.apache.org/jira/browse/FLINK-27303 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.0.0 >Reporter: Gyula Fora >Assignee: Gyula Fora >Priority: Critical > Labels: pull-request-available > Fix For: kubernetes-operator-1.0.0 > > > Now we use the config builder in multiple different places to generate the > effective config, including the observer, reconciler, validator etc. > The effective config generation logic also creates temporary log config > files (if spec logConfiguration is set), which leads to 3-4 files being > generated in every reconcile loop for a given job. These files are not > cleaned up until the operator restarts, leading to a large number of files. > I believe we should change the config generation logic and only apply the > log config generation right before Flink cluster submission, as that is > the only thing affected by it.
[jira] [Commented] (FLINK-27303) Flink Operator will create a large amount of temp log config files
[ https://issues.apache.org/jira/browse/FLINK-27303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528537#comment-17528537 ] Gyula Fora commented on FLINK-27303: [~wangyang0918] in the current design temp files are deleted after 30 minutes (we delete on config cache expiration) are they still there for more than 30 mins? > Flink Operator will create a large amount of temp log config files > -- > > Key: FLINK-27303 > URL: https://issues.apache.org/jira/browse/FLINK-27303 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.0.0 >Reporter: Gyula Fora >Assignee: Gyula Fora >Priority: Critical > Labels: pull-request-available > Fix For: kubernetes-operator-1.0.0 > > > Now we use the config builder in multiple different places to generate the > effective config, including the observer, reconciler, validator etc. > The effective config generation logic also creates temporary log config > files (if spec logConfiguration is set), which leads to 3-4 files being > generated in every reconcile loop for a given job. These files are not > cleaned up until the operator restarts, leading to a large number of files. > I believe we should change the config generation logic and only apply the > log config generation right before Flink cluster submission, as that is > the only thing affected by it.
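The cleanup scheme under discussion — temp log-config files deleted when their config cache entry expires after 30 minutes — can be sketched in plain Java. All class and method names here are illustrative; they are not the operator's actual classes:

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Sketch of a config cache whose entries own a temp file: the file is created
// once per deployment, reused while the entry is fresh, and deleted from disk
// when the entry expires.
public class TempConfCache {
    public static final long TTL_MS = 30 * 60 * 1000L; // 30 minutes, as in the discussion

    public static final Map<String, File> files = new HashMap<>();
    public static final Map<String, Long> createdAt = new HashMap<>();

    // Create (or reuse) the temp log-config file for a deployment.
    public static File logConfFor(String deployment, String contents) {
        evictExpired(System.currentTimeMillis());
        File f = files.get(deployment);
        if (f == null) {
            try {
                f = File.createTempFile("log-conf-" + deployment + "-", ".properties");
                Files.write(f.toPath(), contents.getBytes());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            files.put(deployment, f);
            createdAt.put(deployment, System.currentTimeMillis());
        }
        return f;
    }

    // On cache expiration, delete the backing temp file as well.
    public static void evictExpired(long now) {
        Iterator<Map.Entry<String, Long>> it = createdAt.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            if (now - e.getValue() >= TTL_MS) {
                files.remove(e.getKey()).delete(); // entry expired -> remove file from disk
                it.remove();
            }
        }
    }
}
```

Under this scheme at most one file exists per deployment at a time, and each disappears within the TTL, which is why files lingering beyond 30 minutes would indicate a bug.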
[GitHub] [flink] BiGsuw opened a new pull request, #19590: [FLINK-27352][tests] [JUnit5 Migration] Module: flink-json
BiGsuw opened a new pull request, #19590: URL: https://github.com/apache/flink/pull/19590 What is the purpose of the change Update the flink-table/flink-table-common and flink-formats/flink-json module to AssertJ and JUnit 5 following the [JUnit 5 Migration Guide](https://docs.google.com/document/d/1514Wa_aNB9bJUen4xm5uiuXOooOJTtXqS_Jqk9KJitU/edit) Brief change log Use JUnit5 and AssertJ in tests instead of JUnit4 and Hamcrest Ignore deprecated class test case upgrades(JsonRowSerializationSchemaTest.class,JsonRowDeserializationSchemaTest.class) Verifying this change This change is a code cleanup without any test coverage. Does this pull request potentially affect one of the following parts: Dependencies (does it add or upgrade a dependency): ( no) The public API, i.e., is any changed class annotated with @public(Evolving): ( no) The serializers: (no) The runtime per-record code paths (performance sensitive): (no) Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no) The S3 file system connector: ( no) Documentation Does this pull request introduce a new feature? ( no) If yes, how is the feature documented? (not applicable) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] BiGsuw closed pull request #19564: [FLINK-27352][tests] [JUnit5 Migration] Module: flink-json
BiGsuw closed pull request #19564: [FLINK-27352][tests] [JUnit5 Migration] Module: flink-json URL: https://github.com/apache/flink/pull/19564
[GitHub] [flink] BiGsuw commented on pull request #19564: [FLINK-27352][tests] [JUnit5 Migration] Module: flink-json
BiGsuw commented on PR #19564: URL: https://github.com/apache/flink/pull/19564#issuecomment-1110541540 > ## CI report: > * [bd1c9f5](https://github.com/apache/flink/commit/bd1c9f50f4efa4fc07afc15be8391266cf87e70b) Azure: [CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=35158) > > Bot commands
[GitHub] [flink] flinkbot commented on pull request #19589: [FLINK-27423][Connectors/Hive] Upgrade Hive 3.1 connector from 3.1.2 to 3.1.3
flinkbot commented on PR #19589: URL: https://github.com/apache/flink/pull/19589#issuecomment-1110541455 ## CI report: * 1c53b182ca674a43da34c623155e1d30748fd646 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[jira] [Commented] (FLINK-27423) Upgrade Hive 3.1 connector from 3.1.2 to 3.1.3
[ https://issues.apache.org/jira/browse/FLINK-27423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528534#comment-17528534 ] Jeff Yang commented on FLINK-27423: --- [~gaoyunhaii] [~jark] Hi, please take a look in your free time, thank you. > Upgrade Hive 3.1 connector from 3.1.2 to 3.1.3 > -- > > Key: FLINK-27423 > URL: https://issues.apache.org/jira/browse/FLINK-27423 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / Hive >Affects Versions: 1.15.0, 1.16.0 >Reporter: Jeff Yang >Priority: Major > Labels: pull-request-available > > 3.1.3 is the latest release of the Hive 3.1.* line.
[GitHub] [flink] BiGsuw commented on pull request #19564: [FLINK-27352][tests] [JUnit5 Migration] Module: flink-json
BiGsuw commented on PR #19564: URL: https://github.com/apache/flink/pull/19564#issuecomment-1110541072 @flinkbot run azure
[jira] [Updated] (FLINK-27423) Upgrade Hive 3.1 connector from 3.1.2 to 3.1.3
[ https://issues.apache.org/jira/browse/FLINK-27423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-27423: --- Labels: pull-request-available (was: ) > Upgrade Hive 3.1 connector from 3.1.2 to 3.1.3 > -- > > Key: FLINK-27423 > URL: https://issues.apache.org/jira/browse/FLINK-27423 > Project: Flink > Issue Type: Technical Debt > Components: Connectors / Hive >Affects Versions: 1.15.0, 1.16.0 >Reporter: Jeff Yang >Priority: Major > Labels: pull-request-available > > 3.1.3 is the latest release of the Hive 3.1.* line.
[GitHub] [flink] yangjf2019 opened a new pull request, #19589: [FLINK-27423][Connectors/Hive] Upgrade Hive 3.1 connector from 3.1.2 to 3.1.3
yangjf2019 opened a new pull request, #19589: URL: https://github.com/apache/flink/pull/19589 ## What is the purpose of the change *Upgrade the Hive 3.1 connector to the latest available version, 3.1.3* ## Brief change log - *Renamed connector from Hive 3.1.2 to 3.1.3.* - *Updated documentation and build profiles.* - *Updated dependencies.* ## Verifying this change I ran the corresponding test class and didn't see any failures. This change is a trivial rework / code cleanup without any test coverage. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no) - The serializers: (no) - The runtime per-record code paths (performance sensitive): (no) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no) - The S3 file system connector: (no) ## Documentation - Does this pull request introduce a new feature? (no) - If yes, how is the feature documented? (docs) Hi, @gaoyunhaii @wuchong Please take a look in your free time, thank you.
[jira] [Created] (FLINK-27423) Upgrade Hive 3.1 connector from 3.1.2 to 3.1.3
Jeff Yang created FLINK-27423: - Summary: Upgrade Hive 3.1 connector from 3.1.2 to 3.1.3 Key: FLINK-27423 URL: https://issues.apache.org/jira/browse/FLINK-27423 Project: Flink Issue Type: Technical Debt Components: Connectors / Hive Affects Versions: 1.15.0, 1.16.0 Reporter: Jeff Yang 3.1.3 is the latest release of the Hive 3.1.* line.
[jira] [Comment Edited] (FLINK-26457) Introduce Join Accumulator for Wide table
[ https://issues.apache.org/jira/browse/FLINK-26457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526152#comment-17526152 ] Jingsong Lee edited comment on FLINK-26457 at 4/27/22 4:22 AM: --- master: 41338cc0d13810becdd2390ab809e5cee9b59712 release-0.1: b7ae760200c500f2277e266e17b2cb271dcec805 was (Author: lzljs3620320): master: 41338cc0d13810becdd2390ab809e5cee9b59712 > Introduce Join Accumulator for Wide table > - > > Key: FLINK-26457 > URL: https://issues.apache.org/jira/browse/FLINK-26457 > Project: Flink > Issue Type: New Feature > Components: Table Store >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Major > Labels: pull-request-available > Fix For: table-store-0.1.0 > > > Consider a Join Accumulator, It will merge two records, completing the > not-null fields. > It is very useful for wide tables, where two source tables join together to > form a wide table. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (FLINK-26457) Introduce Join Accumulator for Wide table
[ https://issues.apache.org/jira/browse/FLINK-26457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee updated FLINK-26457: - Fix Version/s: table-store-0.1.0 (was: table-store-0.2.0) > Introduce Join Accumulator for Wide table > - > > Key: FLINK-26457 > URL: https://issues.apache.org/jira/browse/FLINK-26457 > Project: Flink > Issue Type: New Feature > Components: Table Store >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Major > Labels: pull-request-available > Fix For: table-store-0.1.0 > > > Consider a Join Accumulator, It will merge two records, completing the > not-null fields. > It is very useful for wide tables, where two source tables join together to > form a wide table. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Closed] (FLINK-27335) Optimize async compaction in MergeTreeWriter
[ https://issues.apache.org/jira/browse/FLINK-27335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingsong Lee closed FLINK-27335. Fix Version/s: table-store-0.1.0 (was: table-store-0.2.0) Resolution: Fixed master: 72af2638b978963959bdc72fb2394f43f89161bb release-0.1: 1d0965a8622e53f4da46d267f6e13b1c0cc22213 > Optimize async compaction in MergeTreeWriter > > > Key: FLINK-27335 > URL: https://issues.apache.org/jira/browse/FLINK-27335 > Project: Flink > Issue Type: Improvement > Components: Table Store >Reporter: Jingsong Lee >Assignee: Jingsong Lee >Priority: Major > Labels: pull-request-available > Fix For: table-store-0.1.0 > > > Currently Full Compaction may cause the writer to be blocked, which has an > impact on LogStore latency. > We need to decouple compaction from writing and make compaction completely > asynchronous. > But too many files will lead to unstable reads; when there are too many files, > compaction cannot keep up with the writer, and we need to > back-pressure the writer. > Stop parameter: num-sorted-run.stop-trigger, default 10
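The back-pressure scheme described above — block the writer once the number of sorted runs reaches num-sorted-run.stop-trigger (default 10) — can be sketched as follows. Class and method names are illustrative, not the Table Store's actual API:

```java
// Sketch of writer back-pressure driven by the number of pending sorted runs:
// the writer blocks on flush while runs >= STOP_TRIGGER; asynchronous
// compaction merges runs and wakes the writer up.
public class CompactionBackpressure {
    public static final int STOP_TRIGGER = 10; // num-sorted-run.stop-trigger default

    private int sortedRuns = 0;

    // Writer produces a new sorted run; blocks while too many runs are pending.
    public synchronized void onFlush() {
        while (sortedRuns >= STOP_TRIGGER) {
            try {
                wait(); // back-pressure: writer waits until compaction catches up
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while back-pressured", e);
            }
        }
        sortedRuns++;
    }

    // Asynchronous compaction merged `merged` runs into a single run.
    public synchronized void onCompactionFinished(int merged) {
        sortedRuns -= (merged - 1);
        notifyAll(); // wake any blocked writer
    }

    public synchronized int sortedRuns() {
        return sortedRuns;
    }
}
```

This keeps compaction fully asynchronous in the common case while bounding the number of files a reader can observe.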
[GitHub] [flink-table-store] JingsongLi merged pull request #97: [FLINK-27335] Optimize async compaction in MergeTreeWriter
JingsongLi merged PR #97: URL: https://github.com/apache/flink-table-store/pull/97
[GitHub] [flink] Myasuka commented on a diff in pull request #19550: [FLINK-25511][state/changelog] Discard pre-emptively uploaded state changes not included into any checkpoint
Myasuka commented on code in PR #19550: URL: https://github.com/apache/flink/pull/19550#discussion_r859362305 ## flink-dstl/flink-dstl-dfs/src/main/java/org/apache/flink/changelog/fs/ChangelogRegistryImpl.java: ## @@ -0,0 +1,91 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.flink.changelog.fs; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.runtime.state.PhysicalStateHandleID; +import org.apache.flink.runtime.state.StreamStateHandle; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import javax.annotation.concurrent.ThreadSafe; + +import java.util.Map; +import java.util.Set; +import java.util.UUID; +import java.util.concurrent.ConcurrentHashMap; +import java.util.concurrent.CopyOnWriteArraySet; +import java.util.concurrent.Executor; + +@Internal +@ThreadSafe +class ChangelogRegistryImpl implements ChangelogRegistry { +private static final Logger LOG = LoggerFactory.getLogger(ChangelogRegistryImpl.class); + +private final Map> entries = new ConcurrentHashMap<>(); +private final Executor executor; + +public ChangelogRegistryImpl(Executor executor) { +this.executor = executor; +} + +@Override +public void startTracking(StreamStateHandle handle, Set backendIDs) { +LOG.debug( +"start tracking state, key: {}, state: {}", +handle.getStreamStateHandleID(), +handle); +entries.put(handle.getStreamStateHandleID(), new CopyOnWriteArraySet<>(backendIDs)); +} + +@Override +public void stopTracking(StreamStateHandle handle) { +LOG.debug( +"stop tracking state, key: {}, state: {}", handle.getStreamStateHandleID(), handle); +entries.remove(handle.getStreamStateHandleID()); +} + +@Override +public void notUsed(StreamStateHandle handle, UUID backendId) { Review Comment: If current sequense number is up to `100`, we then call `changelogTruncateHelper.materialized(upTo)` after materialization finished. Thus, we would discard change logs with `upTo` as `100`, however, if the `stopTracking` is missed before, can we say the discarding is safe here? Considering the next checkpoints are not successful and thus the materialized parts are not be used. -- This is an automated message from the Apache Git Service. 
[GitHub] [flink] Myasuka commented on a diff in pull request #19550: [FLINK-25511][state/changelog] Discard pre-emptively uploaded state changes not included into any checkpoint
Myasuka commented on code in PR #19550: URL: https://github.com/apache/flink/pull/19550#discussion_r859355932 ## flink-dstl/flink-dstl-dfs/src/main/java/org/apache/flink/changelog/fs/ChangelogRegistry.java: ## @@ -0,0 +1,68 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.changelog.fs; + +import org.apache.flink.annotation.Internal; +import org.apache.flink.runtime.state.StreamStateHandle; + +import java.util.Set; +import java.util.UUID; +import java.util.concurrent.Executor; +import java.util.concurrent.Executors; + +/** + * Registry of changelog segments uploaded by {@link + * org.apache.flink.runtime.state.changelog.StateChangelogWriter StateChangelogWriters} of a {@link + * org.apache.flink.runtime.state.changelog.StateChangelogStorage StateChangelogStorage}. + */ +@Internal +public interface ChangelogRegistry { Review Comment: I think `TaskChangelogRegistry`, and we should give explicit descriptions that the ownership is **neither** on the JM **nor** on the TM. ## flink-dstl/flink-dstl-dfs/src/main/java/org/apache/flink/changelog/fs/ChangelogRegistry.java: ## Review Comment: I think `TaskChangelogRegistry` is also fine, and we should give explicit descriptions that the ownership is **neither** on the JM **nor** on the TM. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
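Purely as an illustrative sketch of what a task-owned registry of this kind could track (the class name, method names, and semantics below are assumptions for illustration, not the actual Flink API): reference counts per uploaded segment, so that pre-emptively uploaded state changes not claimed by any checkpoint can be detected and discarded.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch only: the real interface lives in
// org.apache.flink.changelog.fs and its shape may differ.
class TaskChangelogRegistrySketch {
    private final Map<UUID, Integer> refCounts = new HashMap<>();

    /** Start tracking an uploaded segment used by the given number of backends. */
    void startTracking(UUID handle, int backendCount) {
        refCounts.put(handle, backendCount);
    }

    /**
     * Called when one user of the segment no longer needs it. Returns true
     * once the segment is unused by any checkpoint and may be discarded.
     */
    boolean stopTracking(UUID handle) {
        Integer remaining = refCounts.computeIfPresent(handle, (k, v) -> v - 1);
        if (remaining != null && remaining <= 0) {
            refCounts.remove(handle);
            return true;
        }
        return false;
    }
}
```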
[jira] [Commented] (FLINK-27153) Allow optional last-state fallback for savepoint upgrade mode
[ https://issues.apache.org/jira/browse/FLINK-27153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528521#comment-17528521 ] Yang Wang commented on FLINK-27153: --- I want to say this is also what was on my mind from the very beginning. If users want to restore from a savepoint when using the last-state upgrade mode, they could simply use {{savepointTriggerNonce}} and trigger the upgrade after that. > Allow optional last-state fallback for savepoint upgrade mode > - > > Key: FLINK-27153 > URL: https://issues.apache.org/jira/browse/FLINK-27153 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Reporter: Gyula Fora >Priority: Major > Fix For: kubernetes-operator-1.0.0 > > > In many cases users would prefer to take a savepoint if the job is healthy > before performing an upgrade but still allow checkpoint-based (last-state) > recovery in case the savepoint fails or the job is generally in a bad state. > We should add a configuration flag for this that the user can set in the > flinkConfiguration: > `kubernetes.operator.job.upgrade.last-state-fallback` -- This message was sent by Atlassian Jira (v8.20.7#820007)
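As a hedged sketch, only the flag name `kubernetes.operator.job.upgrade.last-state-fallback` and `savepointTriggerNonce` come from the ticket; the surrounding FlinkDeployment structure below is an assumed minimal example, not a verified spec:

```yaml
# Assumed FlinkDeployment fragment; only the flag and nonce names are from the ticket.
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: example-job
spec:
  flinkConfiguration:
    # Proposed flag: fall back to last-state (checkpoint) recovery if the savepoint fails.
    kubernetes.operator.job.upgrade.last-state-fallback: "true"
  job:
    upgradeMode: savepoint
    # Workaround from the comment: with last-state upgrade mode, bump the
    # nonce to take a savepoint first, then trigger the upgrade.
    savepointTriggerNonce: 1
```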
[jira] [Closed] (FLINK-27409) Cleanup stale slot allocation record when the resource requirement of a job is empty
[ https://issues.apache.org/jira/browse/FLINK-27409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yangze Guo closed FLINK-27409. -- Resolution: Fixed > Cleanup stale slot allocation record when the resource requirement of a job > is empty > > > Key: FLINK-27409 > URL: https://issues.apache.org/jira/browse/FLINK-27409 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.0, 1.14.4 >Reporter: Yangze Guo >Assignee: Yangze Guo >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0, 1.14.5, 1.15.1 > > > We need to clean up the stale slot allocation record when the resource > requirement of a job is empty, in case the pending slot of this job is > incorrectly allocated when registered. This only affects the > `FineGrainedSlotManager`. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (FLINK-27409) Cleanup stale slot allocation record when the resource requirement of a job is empty
[ https://issues.apache.org/jira/browse/FLINK-27409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528517#comment-17528517 ] Yangze Guo commented on FLINK-27409: master: fd4e52ba4d6292c02c8b5192a8679c1bb666a218 1.15: 39377d6c5c64734f043851087849b4278715e45c 1.14: b5d77c0cdb1519163acad92084a4f3f79b24b012 > Cleanup stale slot allocation record when the resource requirement of a job > is empty > > > Key: FLINK-27409 > URL: https://issues.apache.org/jira/browse/FLINK-27409 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.15.0, 1.14.4 >Reporter: Yangze Guo >Assignee: Yangze Guo >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0, 1.14.5, 1.15.1 > > > We need to clean up the stale slot allocation record when the resource > requirement of a job is empty, in case the pending slot of this job is > incorrectly allocated when registered. This only affects the > `FineGrainedSlotManager`. -- This message was sent by Atlassian Jira (v8.20.7#820007)
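As a rough illustration of the fix described in this ticket (all names below are invented for the sketch; the actual change is inside `FineGrainedSlotManager`), the idea is to drop a job's pending slot allocation records once its resource requirement becomes empty, so a newly registered slot cannot be matched against a stale record:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative only; not the real FineGrainedSlotManager code.
class SlotTrackerSketch {
    private final Map<String, Set<String>> pendingAllocationsByJob = new HashMap<>();

    void recordPendingAllocation(String jobId, String allocationId) {
        pendingAllocationsByJob
                .computeIfAbsent(jobId, k -> new HashSet<>())
                .add(allocationId);
    }

    void notifyResourceRequirements(String jobId, int requiredSlots) {
        if (requiredSlots == 0) {
            // Clean up stale records so they are not incorrectly matched
            // against a slot that registers later.
            pendingAllocationsByJob.remove(jobId);
        }
    }

    int pendingCount(String jobId) {
        return pendingAllocationsByJob
                .getOrDefault(jobId, Collections.emptySet())
                .size();
    }
}
```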
[GitHub] [flink] BiGsuw commented on pull request #19564: [FLINK-27352][tests] [JUnit5 Migration] Module: flink-json
BiGsuw commented on PR #19564: URL: https://github.com/apache/flink/pull/19564#issuecomment-1110503797 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] KarmaGYZ commented on pull request #19582: [BP-1.14][FLINK-27409][runtime] Cleanup stale slot allocation record when the …
KarmaGYZ commented on PR #19582: URL: https://github.com/apache/flink/pull/19582#issuecomment-1110501423 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] KarmaGYZ closed pull request #19581: [BP-1.15][FLINK-27409][runtime] Cleanup stale slot allocation record when the …
KarmaGYZ closed pull request #19581: [BP-1.15][FLINK-27409][runtime] Cleanup stale slot allocation record when the … URL: https://github.com/apache/flink/pull/19581 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] Myasuka commented on a diff in pull request #19550: [FLINK-25511][state/changelog] Discard pre-emptively uploaded state changes not included into any checkpoint
Myasuka commented on code in PR #19550: URL: https://github.com/apache/flink/pull/19550#discussion_r859349613 ## flink-runtime/src/main/java/org/apache/flink/runtime/state/filesystem/FileStateHandle.java: ## @@ -73,6 +74,11 @@ public Optional asBytesIfInMemory() { return Optional.empty(); } +@Override +public PhysicalStateHandleID getStreamStateHandleID() { +return new PhysicalStateHandleID(filePath.toUri().toString()); Review Comment: In general, when we add a new field to a Serializable class, we need to consider NPEs when deserializing previously serialized objects. That's why I suggested making it `null` by default. I checked the place where `FileStateHandle` is deserialized; it should be fine here to initialize it directly. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
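The concern in this review, that a field added to a `Serializable` class deserializes as `null` when reading back objects serialized by an older version, can be sketched generically like this (an invented stand-in, not the actual `FileStateHandle` code):

```java
import java.io.Serializable;

// Illustrative only: stands in for a state handle gaining a new ID field.
class HandleSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String filePath;
    // Field added in a later version; deserializing an old object leaves it null.
    private String physicalId;

    HandleSketch(String filePath) {
        this.filePath = filePath;
        this.physicalId = filePath; // eager init is safe for newly created objects
    }

    String getStreamStateHandleId() {
        if (physicalId == null) {
            // Defensive fallback for objects serialized before the field existed.
            physicalId = filePath;
        }
        return physicalId;
    }
}
```

Whether the lazy null-check is needed depends on whether old serialized instances can actually reach the new code path, which is exactly what the reviewer verified above.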
[GitHub] [flink-ml] zhipeng93 merged pull request #92: [hotfix] Update .asf.yaml file to disable the merge button and update collaborators etc.
zhipeng93 merged PR #92: URL: https://github.com/apache/flink-ml/pull/92 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-27412) Allow flinkVersion v1_13 in flink-kubernetes-operator
[ https://issues.apache.org/jira/browse/FLINK-27412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528513#comment-17528513 ] Yang Wang commented on FLINK-27412: --- I hope we can support as many Flink versions as possible. In practice, we can only support the latest two released versions, but the set of supported Flink versions could be enriched once they are verified. > Allow flinkVersion v1_13 in flink-kubernetes-operator > - > > Key: FLINK-27412 > URL: https://issues.apache.org/jira/browse/FLINK-27412 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Reporter: Yang Wang >Priority: Major > Labels: starter > Fix For: kubernetes-operator-1.0.0 > > > The core k8s related features: > * native k8s integration for session cluster, 1.10 > * native k8s integration for application cluster, 1.11 > * Flink K8s HA, 1.12 > * pod template, 1.13 > So we could set the required minimum version to 1.13. This will allow more > users to try out flink-kubernetes-operator. > > BTW, we need to update the e2e tests to cover all the supported versions. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[GitHub] [flink] KarmaGYZ closed pull request #19580: [FLINK-27409][runtime] Cleanup stale slot allocation record when the …
KarmaGYZ closed pull request #19580: [FLINK-27409][runtime] Cleanup stale slot allocation record when the … URL: https://github.com/apache/flink/pull/19580 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-27422) Do not create temporary pod template files for JobManager and TaskManager if not configured explicitly
Yang Wang created FLINK-27422: - Summary: Do not create temporary pod template files for JobManager and TaskManager if not configured explicitly Key: FLINK-27422 URL: https://issues.apache.org/jira/browse/FLINK-27422 Project: Flink Issue Type: Improvement Components: Kubernetes Operator Reporter: Yang Wang Fix For: kubernetes-operator-1.0.0 We do not need to create temporary pod template files for JobManager and TaskManager if it is not configured explicitly via {{.spec.JobManagerSpec.podTemplate}} or {{{}.spec.TaskManagerSpec.podTemplate{}}}. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (FLINK-25645) UnsupportedOperationException would thrown out when hash shuffle by a field with array type
[ https://issues.apache.org/jira/browse/FLINK-25645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528510#comment-17528510 ] dalongliu commented on FLINK-25645: --- [~jingzhang] Thanks for reporting it; I will take over this ticket. > UnsupportedOperationException would thrown out when hash shuffle by a field > with array type > --- > > Key: FLINK-25645 > URL: https://issues.apache.org/jira/browse/FLINK-25645 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / Planner >Reporter: Jing Zhang >Priority: Major > Attachments: image-2022-01-13-19-12-40-756.png, > image-2022-01-13-19-15-28-395.png > > > Currently, the array type is not supported as a hash shuffle key because CodeGen > does not support it yet. > !image-2022-01-13-19-15-28-395.png! > An UnsupportedOperationException would be thrown when hash shuffling by a > field with array type: > !image-2022-01-13-19-12-40-756.png! -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (FLINK-27417) Flink JDBC SQL Connector:SELECT * FROM table WHERE co > 100; mysql will execute SELECT * FROM table to scan the whole table
[ https://issues.apache.org/jira/browse/FLINK-27417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528505#comment-17528505 ] haojiawei commented on FLINK-27417: --- [~martijnvisser] Looks like FLINK-16024 is the problem, is there any workaround? The amount of data in our original table is relatively large, and we need to filter and query. > Flink JDBC SQL Connector:SELECT * FROM table WHERE co > 100; mysql will > execute SELECT * FROM table to scan the whole table > > > Key: FLINK-27417 > URL: https://issues.apache.org/jira/browse/FLINK-27417 > Project: Flink > Issue Type: Bug > Components: Connectors / JDBC >Affects Versions: 1.14.0 >Reporter: haojiawei >Priority: Major > > Use flink cli to create a mysql mapping table, and execute the query SELECT * > FROM table WHERE co > 100;Mysql will execute SELECT * FROM table to scan the > whole table. > > show mysql execute sql: select * from information_schema.`PROCESSLIST` where > info is not null; > > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (FLINK-27303) Flink Operator will create a large amount of temp log config files
[ https://issues.apache.org/jira/browse/FLINK-27303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528501#comment-17528501 ] Yang Wang commented on FLINK-27303: --- It seems the {{FlinkConfigManager}} is not working as expected. I could still find some residual pod template files. Please note that only one FlinkDeployment is running; what I have done is trigger an upgrade. BTW, the pod template files could be deleted as soon as the application/session cluster has started successfully. {code:java}
flink@flink-kubernetes-operator-85465d98b9-gqpzf:/$ ls -lrt /tmp/
total 44
drwxr-xr-x 2 root root 4096 Mar 29 23:12 hsperfdata_root
drwxr-xr-x 2 flink flink 4096 Apr 26 11:53 hsperfdata_flink
-rw-r--r-- 1 flink flink 936 Apr 26 11:56 flink_op_generated_podTemplate_4426150248786094723.yaml
-rw-r--r-- 1 flink flink 936 Apr 26 11:56 flink_op_generated_podTemplate_15676526370248802485.yaml
-rw-r--r-- 1 flink flink 936 Apr 26 11:56 flink_op_generated_podTemplate_1417732719016118695.yaml
-rw-r--r-- 1 flink flink 936 Apr 27 02:04 flink_op_generated_podTemplate_4906388424446930059.yaml
-rw-r--r-- 1 flink flink 936 Apr 27 02:04 flink_op_generated_podTemplate_1327670729741500815.yaml
-rw-r--r-- 1 flink flink 936 Apr 27 02:04 flink_op_generated_podTemplate_11405044963950044885.yaml
-rw-r--r-- 1 flink flink 936 Apr 27 02:18 flink_op_generated_podTemplate_15361701471858080798.yaml
-rw-r--r-- 1 flink flink 936 Apr 27 02:18 flink_op_generated_podTemplate_9205802344494485904.yaml
-rw-r--r-- 1 flink flink 936 Apr 27 02:18 flink_op_generated_podTemplate_8105630178263734625.yaml
flink@flink-kubernetes-operator-85465d98b9-gqpzf:/$ date
Wed Apr 27 03:12:33 UTC 2022
{code} > Flink Operator will create a large amount of temp log config files > -- > > Key: FLINK-27303 > URL: https://issues.apache.org/jira/browse/FLINK-27303 > Project: Flink > Issue Type: Improvement >
Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.0.0 >Reporter: Gyula Fora >Assignee: Gyula Fora >Priority: Critical > Labels: pull-request-available > Fix For: kubernetes-operator-1.0.0 > > > Now we use the configbuilder in multiple different places to generate the > effective config including observer, reconciler, validator etc. > The effective config gerenration logic also creates temporary log config > files (if spec logConfiguration is set) which would lead to 3-4 files > generated in every reconcile loop for a given job. These files are not > cleaned up until the operator restarts leading to a large amount of files. > I believe we should change the config generation logic and only apply the > logconfig generation logic right before flink cluster submission as that is > the only thing affected by it. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Closed] (FLINK-27370) Add a new SessionJobState - Failed
[ https://issues.apache.org/jira/browse/FLINK-27370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xin Hao closed FLINK-27370. --- Resolution: Not A Problem > Add a new SessionJobState - Failed > -- > > Key: FLINK-27370 > URL: https://issues.apache.org/jira/browse/FLINK-27370 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Reporter: Xin Hao >Priority: Minor > > It will be nice if we can add a new SessionJobState Failed to indicate there > is an error for the session job. > {code:java} > status: > error: 'The error message' > jobStatus: > savepointInfo: {} > reconciliationStatus: > reconciliationTimestamp: 0 > state: DEPLOYED {code} > Reason: > 1. It will be easier for monitoring > 2. I have a personal controller to submit session jobs, it will be cleaner to > check the state by a single field and get the details by the error field. > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (FLINK-27370) Add a new SessionJobState - Failed
[ https://issues.apache.org/jira/browse/FLINK-27370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528500#comment-17528500 ] Xin Hao commented on FLINK-27370: - k, got it~ > Add a new SessionJobState - Failed > -- > > Key: FLINK-27370 > URL: https://issues.apache.org/jira/browse/FLINK-27370 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Reporter: Xin Hao >Priority: Minor > > It will be nice if we can add a new SessionJobState Failed to indicate there > is an error for the session job. > {code:java} > status: > error: 'The error message' > jobStatus: > savepointInfo: {} > reconciliationStatus: > reconciliationTimestamp: 0 > state: DEPLOYED {code} > Reason: > 1. It will be easier for monitoring > 2. I have a personal controller to submit session jobs, it will be cleaner to > check the state by a single field and get the details by the error field. > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (FLINK-27303) Flink Operator will create a large amount of temp log config files
[ https://issues.apache.org/jira/browse/FLINK-27303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Wang updated FLINK-27303: -- Attachment: image-2022-04-27-11-14-00-964.png > Flink Operator will create a large amount of temp log config files > -- > > Key: FLINK-27303 > URL: https://issues.apache.org/jira/browse/FLINK-27303 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.0.0 >Reporter: Gyula Fora >Assignee: Gyula Fora >Priority: Critical > Labels: pull-request-available > Fix For: kubernetes-operator-1.0.0 > > > Now we use the configbuilder in multiple different places to generate the > effective config including observer, reconciler, validator etc. > The effective config gerenration logic also creates temporary log config > files (if spec logConfiguration is set) which would lead to 3-4 files > generated in every reconcile loop for a given job. These files are not > cleaned up until the operator restarts leading to a large amount of files. > I believe we should change the config generation logic and only apply the > logconfig generation logic right before flink cluster submission as that is > the only thing affected by it. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (FLINK-27303) Flink Operator will create a large amount of temp log config files
[ https://issues.apache.org/jira/browse/FLINK-27303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Wang updated FLINK-27303: -- Attachment: (was: image-2022-04-27-11-14-00-964.png) > Flink Operator will create a large amount of temp log config files > -- > > Key: FLINK-27303 > URL: https://issues.apache.org/jira/browse/FLINK-27303 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.0.0 >Reporter: Gyula Fora >Assignee: Gyula Fora >Priority: Critical > Labels: pull-request-available > Fix For: kubernetes-operator-1.0.0 > > > Now we use the configbuilder in multiple different places to generate the > effective config including observer, reconciler, validator etc. > The effective config gerenration logic also creates temporary log config > files (if spec logConfiguration is set) which would lead to 3-4 files > generated in every reconcile loop for a given job. These files are not > cleaned up until the operator restarts leading to a large amount of files. > I believe we should change the config generation logic and only apply the > logconfig generation logic right before flink cluster submission as that is > the only thing affected by it. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (FLINK-27420) Suspended SlotManager fail to reregister metrics when started again
[ https://issues.apache.org/jira/browse/FLINK-27420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528495#comment-17528495 ] Xintong Song commented on FLINK-27420: -- Thanks for reporting this, [~baugarten]. This is indeed a valid issue. I'd like to add a bit more clarification. * For 1.13, I don't think it is supported for the JM process to live through multiple leader sessions, i.e. being revoked and re-granted leadership without failing the process. I know some of the code looks like it is supported, but unfortunately it never really worked until FLINK-23240, which is fixed in 1.14.4. * For 1.14 & 1.15, yes, the issue still exists. Since 1.14, we create a new ResourceManager instance for each leader session. However, some of the components and services are preserved in {{ResourceManagerProcessContext}} and are reused across multiple RM instances. If these components / services are closed, they need to be restarted properly. I've checked the current implementation, and it seems the only things affected are {{resourceManagerMetricGroup}} and {{slotManagerMetricGroup}}. I think the easiest way to fix this is probably to store {{metricRegistry}} rather than the metric groups in {{ResourceManagerProcessContext}}, so that we can create new metric group instances for each leader session. WDYT? > Suspended SlotManager fail to reregister metrics when started again > --- > > Key: FLINK-27420 > URL: https://issues.apache.org/jira/browse/FLINK-27420 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination, Runtime / Metrics >Affects Versions: 1.13.5 >Reporter: Ben Augarten >Priority: Major > > The symptom is that SlotManager metrics are missing (taskslotsavailable and taskslotstotal) when a SlotManager is suspended and then restarted. We > noticed this issue when running 1.13.5, but I believe this impacts 1.14.x, > 1.15.x, and master.
> > When a SlotManager is suspended, the [metrics group is > closed|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DeclarativeSlotManager.java#L214]. > When the SlotManager is [started > again|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DeclarativeSlotManager.java#L181], > it makes an attempt to [reregister > metrics|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DeclarativeSlotManager.java#L199-L202], > but that fails because the underlying metrics group [is still > closed|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/metrics/groups/AbstractMetricGroup.java#L393]. > > > I was able to trace through this issue by restarting zookeeper nodes in a > staging environment and watching the JM with a debugger. > > A concise test, which currently fails, shows the expected behavior – > [https://github.com/apache/flink/compare/master...baugarten:baugarten/slot-manager-missing-metrics?expand=1] > > I am happy to provide a PR to fix this issue, but first would like to verify > that this is not intended. -- This message was sent by Atlassian Jira (v8.20.7#820007)
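The fix direction suggested in the comment above, keeping the long-lived registry rather than the closable metric groups and minting a fresh group per leader session, can be modeled with a toy sketch (every name here is invented for illustration; the real classes are Flink's metric registry and metric groups):

```java
// Invented minimal model of the closed-group problem and its fix.
class GroupSketch {
    private boolean closed;
    void close() { closed = true; }
    // A closed group silently drops registrations, as described in the ticket.
    boolean register(String metricName) { return !closed; }
}

class RegistrySketch {
    GroupSketch createSlotManagerGroup() { return new GroupSketch(); }
}

class RmContextSketch {
    // Store the registry, not a group instance, across leader sessions.
    private final RegistrySketch registry = new RegistrySketch();
    private GroupSketch currentGroup;

    boolean startLeaderSession() {
        currentGroup = registry.createSlotManagerGroup(); // fresh group each session
        return currentGroup.register("taskSlotsAvailable");
    }

    void stopLeaderSession() {
        currentGroup.close();
    }
}
```

Storing the original group instead of the registry would make the second `startLeaderSession()` register against a closed group and lose the metrics, which is the reported symptom.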
[GitHub] [flink-table-store] JingsongLi commented on a diff in pull request #104: [FLINK-27336] Avoid merging when there is only one record
JingsongLi commented on code in PR #104: URL: https://github.com/apache/flink-table-store/pull/104#discussion_r859320053 ## flink-table-store-core/src/main/java/org/apache/flink/table/store/file/mergetree/SortBufferMemTable.java: ## @@ -166,19 +169,31 @@ private void advanceIfNeeded() { if (previousRow == null) { return; } -mergeFunction.reset(); -mergeFunction.add(previous.getReusedKv().value()); +boolean hasOne = true; while (readOnce()) { if (keyComparator.compare( previous.getReusedKv().key(), current.getReusedKv().key()) != 0) { break; } +// avoid merging when there is only one record +if (hasOne) { +mergeFunction.reset(); +mergeFunction.add(previous.getReusedKv().value()); +hasOne = false; +} mergeFunction.add(current.getReusedKv().value()); swapSerializers(); } -result = mergeFunction.getValue(); +RowData previousValue = previous.getReusedKv().value(); +result = +hasOne +? mergeFunction instanceof ValueCountMergeFunction Review Comment: I think we can ignore that count is zero for `ValueCountMergeFunction`. Because if there is no merge, the count should never be zero. ## flink-table-store-core/src/main/java/org/apache/flink/table/store/file/mergetree/SortBufferMemTable.java: ## @@ -166,19 +169,31 @@ private void advanceIfNeeded() { if (previousRow == null) { return; } -mergeFunction.reset(); -mergeFunction.add(previous.getReusedKv().value()); +boolean hasOne = true; Review Comment: `hasOne = true` -> `mergeFuncInitialized = false`? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
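The review thread above is about lazily initializing the merge function so that a single record for a key skips merging entirely. As a standalone illustration of that pattern (a simplified count-style merge, not the actual table-store code):

```java
import java.util.List;

// Illustrative only; ValueCountMergeFunction and SortBufferMemTable differ in detail.
class SingleRecordMergeSketch {
    static long mergeRun(List<Long> valuesForOneKey) {
        long previous = valuesForOneKey.get(0);
        boolean merged = false;
        long acc = 0;
        for (int i = 1; i < valuesForOneKey.size(); i++) {
            if (!merged) {
                // Lazy init: only touch the merge state on the first real merge.
                acc = previous;
                merged = true;
            }
            acc += valuesForOneKey.get(i);
        }
        // A single record is returned untouched, avoiding the merge cost entirely.
        return merged ? acc : previous;
    }
}
```

This also shows why the reviewer's naming suggestion matters: a flag like `mergeFuncInitialized` describes the lazy-init state more directly than `hasOne`.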
[GitHub] [flink-docker] gaoyunhaii opened a new pull request, #113: Update Dockerfiles for 1.15.0 release
gaoyunhaii opened a new pull request, #113: URL: https://github.com/apache/flink-docker/pull/113 This PR updates the dockerfiles for the 1.15.0 release -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-22816) Investigate feasibility of supporting multiple RM leader sessions within JM process
[ https://issues.apache.org/jira/browse/FLINK-22816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528486#comment-17528486 ] Xintong Song edited comment on FLINK-22816 at 4/27/22 2:33 AM: --- Subsumed by FLINK-23240 was (Author: xintongsong): Subsumed by FLINK-21667 > Investigate feasibility of supporting multiple RM leader sessions within JM > process > --- > > Key: FLINK-22816 > URL: https://issues.apache.org/jira/browse/FLINK-22816 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Reporter: Xintong Song >Priority: Not a Priority > Labels: auto-deprioritized-major, auto-deprioritized-minor > > In FLINK-21667, we decoupled RM leadership and lifecycle managements. RM is > not started after obtaining leadership, and stopped on losing leadership. > Ideally, we may start and stop multiple RMs, as the process obtains and loses > leadership. However, as discussed in the > [PR|https://github.com/apache/flink/pull/15524#pullrequestreview-663987547], > having a process to start multiple RMs may cause problems in some deployment > modes. E.g., repeated AM registration is not allowed on Yarn. > We need to investigate for all deployments that: > - Whether having multiple leader sessions causes problems. > - If it does, what can we do to solve the problem. > For information, multi-leader-session support for RM has been implemented in > FLINK-21667, but is disabled by default. To enable, add the system property > "flink.tests.enable-rm-multi-leader-session". -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Closed] (FLINK-22816) Investigate feasibility of supporting multiple RM leader sessions within JM process
[ https://issues.apache.org/jira/browse/FLINK-22816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xintong Song closed FLINK-22816. Resolution: Done Subsumed by FLINK-21667 > Investigate feasibility of supporting multiple RM leader sessions within JM > process > --- > > Key: FLINK-22816 > URL: https://issues.apache.org/jira/browse/FLINK-22816 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Reporter: Xintong Song >Priority: Not a Priority > Labels: auto-deprioritized-major, auto-deprioritized-minor > > In FLINK-21667, we decoupled RM leadership and lifecycle managements. RM is > not started after obtaining leadership, and stopped on losing leadership. > Ideally, we may start and stop multiple RMs, as the process obtains and loses > leadership. However, as discussed in the > [PR|https://github.com/apache/flink/pull/15524#pullrequestreview-663987547], > having a process to start multiple RMs may cause problems in some deployment > modes. E.g., repeated AM registration is not allowed on Yarn. > We need to investigate for all deployments that: > - Whether having multiple leader sessions causes problems. > - If it does, what can we do to solve the problem. > For information, multi-leader-session support for RM has been implemented in > FLINK-21667, but is disabled by default. To enable, add the system property > "flink.tests.enable-rm-multi-leader-session". -- This message was sent by Atlassian Jira (v8.20.7#820007)
[GitHub] [flink-ml] lindong28 commented on a diff in pull request #92: [hotfix] Update .asf.yaml file to disable the merge button and update collaborators
lindong28 commented on code in PR #92: URL: https://github.com/apache/flink-ml/pull/92#discussion_r859316310 ## .asf.yaml: ## @@ -1,3 +1,17 @@ +github: + enabled_merge_buttons: +squash: true +merge: false +rebase: true + labels: +- flink +- big-data +- java +- scala +- python +- sql Review Comment: Thanks for the suggestion. Good catch. I have updated the PR to remove the label additions. The reason is that apache/kafka and apache/flink-statefun do not have `labels` in their `.asf.yaml`. And I also could not find any concrete usage of these labels in apache/flink's pull requests. The commit message has been updated as appropriate. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] ruanwenjun commented on pull request #19559: [hotfix][core] Add private constructor of FatalExitExceptionHandler
ruanwenjun commented on PR #19559: URL: https://github.com/apache/flink/pull/19559#issuecomment-1110456738 @autophagy Hi, this PR aims to remove the duplicate setting of `JobManagerOptions.ADDRESS` and `JobManagerOptions.ADDRESS`, since we have already set these two properties in the `initializeServices` method. Please take a look, thanks. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] luoyuxia commented on pull request #19573: [FLINK-27384] solve the problem that the latest data cannot be read under the creat…
luoyuxia commented on PR #19573: URL: https://github.com/apache/flink/pull/19573#issuecomment-1110455604 @empcl The compilation failed. You can run `mvn spotless:apply` to fix the failure.
[GitHub] [flink] luoyuxia commented on pull request #19573: [FLINK-27384] solve the problem that the latest data cannot be read under the creat…
luoyuxia commented on PR #19573: URL: https://github.com/apache/flink/pull/19573#issuecomment-1110455056 cc @leonardBang
[GitHub] [flink] luoyuxia commented on a diff in pull request #19573: [FLINK-27384] solve the problem that the latest data cannot be read under the creat…
luoyuxia commented on code in PR #19573: URL: https://github.com/apache/flink/pull/19573#discussion_r859313505

## flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/connectors/hive/read/HivePartitionFetcherContextBase.java: ##

@@ -139,16 +138,9 @@ public List<ComparablePartitionValue> getComparablePartitionValueList() throws Exception {
                 tablePath.getDatabaseName(), tablePath.getObjectName(), Short.MAX_VALUE);
-        List<String> newNames =
-                partitionNames.stream()
-                        .filter(
-                                n ->
-                                        !partValuesToCreateTime.containsKey(
-                                                extractPartitionValues(n)))
-                        .collect(Collectors.toList());
         List<Partition> newPartitions =
                 metaStoreClient.getPartitionsByNames(
-                        tablePath.getDatabaseName(), tablePath.getObjectName(), newNames);
+                        tablePath.getDatabaseName(), tablePath.getObjectName(), partitionNames);

Review Comment:
   Then it will always contain all the partitions and finally return all partitions with their create time, right? I prefer to only return the partitions that have changed. Maybe you can compare the modification time of all partitions to decide which partitions were updated and should be returned.
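The reviewer's suggestion — return only the partitions whose timestamps changed since the last fetch — could look roughly like this. The `Partition` type and its time field here are simplified stand-ins, not the actual Hive metastore classes:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: keep a map of partition name -> last seen
// modification time, and only return partitions that are new or whose
// time changed since the previous fetch.
class ChangedPartitionFilter {

    /** Simplified stand-in for a Hive metastore partition. */
    static final class Partition {
        final String name;
        final long modificationTime;

        Partition(String name, long modificationTime) {
            this.name = name;
            this.modificationTime = modificationTime;
        }
    }

    private final Map<String, Long> lastSeenTimes = new HashMap<>();

    /** Returns only the partitions that are new or whose time changed. */
    List<Partition> filterChanged(List<Partition> fetched) {
        List<Partition> changed = new ArrayList<>();
        for (Partition p : fetched) {
            Long previous = lastSeenTimes.get(p.name);
            if (previous == null || previous.longValue() != p.modificationTime) {
                changed.add(p);
                lastSeenTimes.put(p.name, p.modificationTime);
            }
        }
        return changed;
    }
}
```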
[GitHub] [flink-ml] zhipeng93 commented on a diff in pull request #92: [hotfix] Update .asf.yaml file to disable the merge button etc.
zhipeng93 commented on code in PR #92: URL: https://github.com/apache/flink-ml/pull/92#discussion_r859307022

## .asf.yaml: ##

@@ -1,3 +1,17 @@
+github:
+  enabled_merge_buttons:
+    squash: true
+    merge: false
+    rebase: true
+  labels:
+    - flink
+    - big-data
+    - java
+    - scala
+    - python
+    - sql

Review Comment:
   Shall we update the labels here? For example, add `machine learning`, remove `scala` and `sql`?
[jira] [Commented] (FLINK-27420) Suspended SlotManager fail to reregister metrics when started again
[ https://issues.apache.org/jira/browse/FLINK-27420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528478#comment-17528478 ] Guowei Ma commented on FLINK-27420: --- Thanks [~baugarten] for reporting this. cc [~xtsong] > Suspended SlotManager fail to reregister metrics when started again > --- > > Key: FLINK-27420 > URL: https://issues.apache.org/jira/browse/FLINK-27420 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination, Runtime / Metrics >Affects Versions: 1.13.5 >Reporter: Ben Augarten >Priority: Major > > The symptom is that SlotManager metrics are missing (taskslotsavailable and > taskslotstotal) when a SlotManager is suspended and then restarted. We > noticed this issue when running 1.13.5, but I believe this impacts 1.14.x, > 1.15.x, and master. > > When a SlotManager is suspended, the [metrics group is > closed|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DeclarativeSlotManager.java#L214]. > When the SlotManager is [started > again|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DeclarativeSlotManager.java#L181], > it makes an attempt to [reregister > metrics|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DeclarativeSlotManager.java#L199-L202], > but that fails because the underlying metrics group [is still > closed|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/metrics/groups/AbstractMetricGroup.java#L393] > > > I was able to trace through this issue by restarting ZooKeeper nodes in a > staging environment and watching the JM with a debugger. 
> > A concise test, which currently fails, shows the expected behavior – > [https://github.com/apache/flink/compare/master...baugarten:baugarten/slot-manager-missing-metrics?expand=1] > > I am happy to provide a PR to fix this issue, but first would like to verify > that this behavior is not intended.
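The failure mode described above can be reduced to a minimal mock. This imitates the closed-group behavior linked in the report; it is not the real AbstractMetricGroup:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal mock of a metric group that, like the closed-group behavior
// described in the issue, silently ignores registrations once closed.
// A component that is suspended (group closed) and then restarted while
// reusing the same group therefore loses its metrics without any error.
class ClosableMetricGroup {
    private final Map<String, Object> metrics = new HashMap<>();
    private boolean closed;

    void close() {
        closed = true;
    }

    void gauge(String name, Object gauge) {
        if (closed) {
            return; // registration is dropped, no error is raised
        }
        metrics.put(name, gauge);
    }

    boolean contains(String name) {
        return metrics.containsKey(name);
    }
}
```

The silent no-op is what makes the bug easy to miss: re-registration after restart appears to succeed but the gauges never reach the reporter.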
[GitHub] [flink-ml] lindong28 commented on a diff in pull request #86: [FLINK-27294] Add Transformer for BinaryClassificationEvaluator
lindong28 commented on code in PR #86: URL: https://github.com/apache/flink-ml/pull/86#discussion_r859293793

## flink-ml-core/src/test/java/org/apache/flink/ml/api/StageTest.java: ##

@@ -463,5 +463,10 @@ public void testValidators() {
         Assert.assertTrue(nonEmptyArray.validate(new String[] {"1"}));
         Assert.assertFalse(nonEmptyArray.validate(null));
         Assert.assertFalse(nonEmptyArray.validate(new String[0]));
+
+        ParamValidator<String[]> isSubArray = ParamValidators.isSubArray("a", "b", "c");
+        Assert.assertFalse(isSubArray.validate(new String[] {"c", "v"}));

Review Comment:
   nits: could we also check the case where the input value is null, to be consistent with the above tests? `Assert.assertFalse(isSubArray.validate(null))`

## flink-ml-lib/src/main/java/org/apache/flink/ml/evaluation/binaryeval/BinaryClassificationEvaluatorParams.java: ##

@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.evaluation.binaryeval;
+
+import org.apache.flink.ml.common.param.HasLabelCol;
+import org.apache.flink.ml.common.param.HasRawPredictionCol;
+import org.apache.flink.ml.common.param.HasWeightCol;
+import org.apache.flink.ml.param.Param;
+import org.apache.flink.ml.param.ParamValidators;
+import org.apache.flink.ml.param.StringArrayParam;
+
+/**
+ * Params of BinaryClassificationEvaluator.
+ *
+ * @param <T> The class type of this instance.
+ */
+public interface BinaryClassificationEvaluatorParams<T>
+        extends HasLabelCol<T>, HasRawPredictionCol<T>, HasWeightCol<T> {
+    /**
+     * param for metric names in evaluation (supports 'areaUnderROC', 'areaUnderPR', 'KS' and
+     * 'areaUnderLorenz').
+     *
+     * <p>areaUnderROC: the area under the receiver operating characteristic (ROC) curve.

Review Comment:
   nits: could we update the Java doc here to follow the format used in `HasHandleInvalid.java`? It seems that the format in `HasHandleInvalid.java` is more readable. Currently the docs for `KS` and `areaUnderLorenz` are on the same lines.

## flink-ml-lib/src/test/java/org/apache/flink/ml/evaluation/binaryeval/BinaryClassificationEvaluatorTest.java: ##

@@ -0,0 +1,216 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.ml.evaluation.binaryeval;
+
+import org.apache.flink.api.common.restartstrategy.RestartStrategies;
+import org.apache.flink.configuration.Configuration;
+import org.apache.flink.ml.linalg.Vectors;
+import org.apache.flink.ml.util.StageTestUtils;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.ExecutionCheckpointingOptions;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.types.Row;
+
+import org.apache.commons.collections.IteratorUtils;
+import org.junit.Before;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.rules.TemporaryFolder;
+
+import java.util.Arrays;
+import java.util.List;
+
+import static org.junit.Assert.assertArrayEquals;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNull;
+
+/** Tests {@link BinaryClassificationEvaluator}. */
+public class BinaryClassificationEvaluatorTest {
+    @Rule public final TemporaryFolder
[jira] [Updated] (FLINK-27420) Suspended SlotManager fail to reregister metrics when started again
[ https://issues.apache.org/jira/browse/FLINK-27420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guowei Ma updated FLINK-27420: -- Component/s: Runtime / Coordination > Suspended SlotManager fail to reregister metrics when started again > --- > > Key: FLINK-27420 > URL: https://issues.apache.org/jira/browse/FLINK-27420 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination, Runtime / Metrics >Affects Versions: 1.13.5 >Reporter: Ben Augarten >Priority: Major > > The symptom is that SlotManager metrics are missing (taskslotsavailable and > taskslotstotal) when a SlotManager is suspended and then restarted. We > noticed this issue when running 1.13.5, but I believe this impacts 1.14.x, > 1.15.x, and master. > > When a SlotManager is suspended, the [metrics group is > closed|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DeclarativeSlotManager.java#L214]. > When the SlotManager is [started > again|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DeclarativeSlotManager.java#L181], > it makes an attempt to [reregister > metrics|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DeclarativeSlotManager.java#L199-L202], > but that fails because the underlying metrics group [is still > closed|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/metrics/groups/AbstractMetricGroup.java#L393] > > > I was able to trace through this issue by restarting ZooKeeper nodes in a > staging environment and watching the JM with a debugger. 
> > A concise test, which currently fails, shows the expected behavior – > [https://github.com/apache/flink/compare/master...baugarten:baugarten/slot-manager-missing-metrics?expand=1] > > I am happy to provide a PR to fix this issue, but first would like to verify > that this behavior is not intended.
[GitHub] [flink] swuferhong commented on pull request #19579: [FLINK-25097][table-planner] Fix bug in inner join when the filter co…
swuferhong commented on PR #19579: URL: https://github.com/apache/flink/pull/19579#issuecomment-1110438637 @flinkbot run azure
[jira] [Commented] (FLINK-27387) Support Insert Multi-Table
[ https://issues.apache.org/jira/browse/FLINK-27387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528476#comment-17528476 ] luoyuxia commented on FLINK-27387: -- [~tartarus] Thanks for reporting it. I would like to take it. > Support Insert Multi-Table > -- > > Key: FLINK-27387 > URL: https://issues.apache.org/jira/browse/FLINK-27387 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Hive >Affects Versions: 1.13.1, 1.15.0 >Reporter: tartarus >Priority: Major > > We can reproduce through a UT > Add test case in HiveDialectITCase > {code:java} > @Test > public void testInsertMultiTable() { > tableEnv.executeSql("create table t1 (id bigint, name string)"); > tableEnv.executeSql("create table t2 (id bigint, name string)"); > tableEnv.executeSql("create table t3 (id bigint, name string, age int)"); > tableEnv.executeSql("from (select id, name, age from t3) t " > + "insert overwrite table t1 select id, name where age < 20 " > + "insert overwrite table t2 select id, name where age > 20 "); > } {code} > This is a very common case for batch.
[jira] [Commented] (FLINK-27412) Allow flinkVersion v1_13 in flink-kubernetes-operator
[ https://issues.apache.org/jira/browse/FLINK-27412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528471#comment-17528471 ] Nicholas Jiang commented on FLINK-27412: [~wangyang0918], how many versions would be supported in flink-kubernetes-operator? IMO, the usual practice is to support the latest three versions. > Allow flinkVersion v1_13 in flink-kubernetes-operator > - > > Key: FLINK-27412 > URL: https://issues.apache.org/jira/browse/FLINK-27412 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Reporter: Yang Wang >Priority: Major > Labels: starter > Fix For: kubernetes-operator-1.0.0 > > > The core k8s related features: > * native k8s integration for session cluster, 1.10 > * native k8s integration for application cluster, 1.11 > * Flink K8s HA, 1.12 > * pod template, 1.13 > So we could set the required minimum version to 1.13. This will allow more > users to have a try on flink-kubernetes-operator. > > BTW, we need to update the e2e tests to cover all the supported versions.
[jira] [Created] (FLINK-27421) Bundle test utility classes into the PyFlink package to make users write test cases easily
Dian Fu created FLINK-27421: --- Summary: Bundle test utility classes into the PyFlink package to make users write test cases easily Key: FLINK-27421 URL: https://issues.apache.org/jira/browse/FLINK-27421 Project: Flink Issue Type: Improvement Components: API / Python Reporter: Dian Fu Fix For: 1.16.0 Currently, the test utility classes are not bundled in the PyFlink package. This doesn't affect the functionalities. However, when users are trying out PyFlink and developing jobs with PyFlink, they may want to write some unit tests to ensure the functionalities work as expected. There are already some test utility classes in PyFlink; bundling them in the PyFlink package could help users a lot and allow them to write unit tests more easily. See https://lists.apache.org/thread/9z468o1hmg4bm7b2vz2o3lkmoqhxnxg1 for more details.
[GitHub] [flink-ml] lindong28 commented on pull request #92: [hotfix] Update .asf.yaml file to disable the merge button etc.
lindong28 commented on PR #92: URL: https://github.com/apache/flink-ml/pull/92#issuecomment-1110414993 @zhipeng93 could you review this PR?
[GitHub] [flink-ml] lindong28 opened a new pull request, #92: [hotfix] Update .asf.yaml file to disable the merge button etc.
lindong28 opened a new pull request, #92: URL: https://github.com/apache/flink-ml/pull/92 This PR updates the `.asf.yaml` file to disable the merge button. It also updates collaborators and labels to be consistent with apache/flink's `.asf.yaml` file. See https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features for an explanation of the `.asf.yaml` file and its entries' semantics.
[jira] (FLINK-26945) Add DATE_SUB supported in SQL & Table API
[ https://issues.apache.org/jira/browse/FLINK-26945 ] Lichuanliang deleted comment on FLINK-26945: -- was (Author: artiship): I'm interested in doing this > Add DATE_SUB supported in SQL & Table API > - > > Key: FLINK-26945 > URL: https://issues.apache.org/jira/browse/FLINK-26945 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: dalongliu >Priority: Major > Fix For: 1.16.0 > > > Returns the date {{numDays}} before {{startDate}}. > Syntax: > {code:java} > date_sub(startDate, numDays) {code} > Arguments: > * {{startDate}}: A DATE expression. > * {{numDays}}: An INTEGER expression. > Returns: > A DATE. > If {{numDays}} is negative, abs(numDays) days are added to {{startDate}}. > If the result date overflows the date range, the function raises an error. > Examples: > {code:java} > > SELECT date_sub('2016-07-30', 1); > 2016-07-29 {code} > See more: > * > [Spark|https://spark.apache.org/docs/latest/sql-ref-functions-builtin.html#date-and-timestamp-functions] > * [Hive|https://cwiki.apache.org/confluence/display/hive/languagemanual+udf]
[jira] [Commented] (FLINK-26945) Add DATE_SUB supported in SQL & Table API
[ https://issues.apache.org/jira/browse/FLINK-26945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528460#comment-17528460 ] Lichuanliang commented on FLINK-26945: -- I'm interested in doing this > Add DATE_SUB supported in SQL & Table API > - > > Key: FLINK-26945 > URL: https://issues.apache.org/jira/browse/FLINK-26945 > Project: Flink > Issue Type: Sub-task > Components: Table SQL / API >Reporter: dalongliu >Priority: Major > Fix For: 1.16.0 > > > Returns the date {{numDays}} before {{startDate}}. > Syntax: > {code:java} > date_sub(startDate, numDays) {code} > Arguments: > * {{startDate}}: A DATE expression. > * {{numDays}}: An INTEGER expression. > Returns: > A DATE. > If {{numDays}} is negative, abs(numDays) days are added to {{startDate}}. > If the result date overflows the date range, the function raises an error. > Examples: > {code:java} > > SELECT date_sub('2016-07-30', 1); > 2016-07-29 {code} > See more: > * > [Spark|https://spark.apache.org/docs/latest/sql-ref-functions-builtin.html#date-and-timestamp-functions] > * [Hive|https://cwiki.apache.org/confluence/display/hive/languagemanual+udf]
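For reference, the semantics described above (including the negative-`numDays` case, where days are added) map directly onto `java.time`. This is only a sketch of the proposed behavior, not the Flink implementation:

```java
import java.time.LocalDate;

// Sketch of the proposed DATE_SUB semantics using java.time:
// subtract numDays from startDate; a negative numDays adds days.
class DateSubExample {
    static LocalDate dateSub(LocalDate startDate, int numDays) {
        // LocalDate.minusDays throws DateTimeException if the result
        // overflows the supported date range, matching the "raises an
        // error" behavior described above.
        return startDate.minusDays(numDays);
    }
}
```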
[GitHub] [flink] flinkbot commented on pull request #19588: [FLINK-XXXX] [core] Add Generator source
flinkbot commented on PR #19588: URL: https://github.com/apache/flink/pull/19588#issuecomment-1110359299 ## CI report: * e72d31992721dc5753b3e6ccb53aba7a948e2893 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build
[GitHub] [flink] afedulov opened a new pull request, #19588: [FLINK-XXXX] [core] Add Generator source
afedulov opened a new pull request, #19588: URL: https://github.com/apache/flink/pull/19588 Draft PR for early feedback.
[jira] [Commented] (FLINK-22761) Cannot remove POJO fields
[ https://issues.apache.org/jira/browse/FLINK-22761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528451#comment-17528451 ] Scott Kidder commented on FLINK-22761: -- This is a duplicate of FLINK-21752 > Cannot remove POJO fields > - > > Key: FLINK-22761 > URL: https://issues.apache.org/jira/browse/FLINK-22761 > Project: Flink > Issue Type: Bug > Components: API / Type Serialization System >Affects Versions: 1.12.1 >Reporter: Ygor Allan de Fraga >Priority: Not a Priority > Labels: auto-deprioritized-major, auto-deprioritized-minor > > I tested schema evolution in a state using a POJO and no problem was found > when trying to add a new field; it executed just fine. This same field > was removed from the POJO, as it was just a test, but the application could > not restore the state due to an error. > > Here is the error: > {code:java} > 2021-05-24 13:05:31,958 WARN org.apache.flink.runtime.taskmanager.Task > [] - Co-Flat Map -> Map (3/3)#464 > (e0e6d41a18214eab0a1d3c089d8672de) switched from RUNNING to FAILED.2021-05-24 > 13:05:31,958 WARN org.apache.flink.runtime.taskmanager.Task > [] - Co-Flat Map -> Map (3/3)#464 (e0e6d41a18214eab0a1d3c089d8672de) > switched from RUNNING to FAILED.java.lang.Exception: Exception while creating > StreamOperatorStateContext. 
at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:254) > ~[zdata-flink-streams.jar:0.1] at > org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:272) > ~[zdata-flink-streams.jar:0.1] at > org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:425) > ~[zdata-flink-streams.jar:0.1] at > org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$2(StreamTask.java:535) > ~[zdata-flink-streams.jar:0.1] at > org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50) > ~[zdata-flink-streams.jar:0.1] at > org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:525) > ~[zdata-flink-streams.jar:0.1] at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:565) > ~[zdata-flink-streams.jar:0.1] at > org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:755) > [zdata-flink-streams.jar:0.1] at > org.apache.flink.runtime.taskmanager.Task.run(Task.java:570) > [zdata-flink-streams.jar:0.1] at java.lang.Thread.run(Thread.java:748) > [?:1.8.0_282] > Caused by: org.apache.flink.util.FlinkException: Could not restore keyed > state backend for CoStreamFlatMap_b101f370952ea85c2104e98dd54bf7f9_(3/3) from > any of the 1 provided restore options. at > org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:160) > ~[zdata-flink-streams.jar:0.1] at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:345) > ~[zdata-flink-streams.jar:0.1] at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:163) > ~[zdata-flink-streams.jar:0.1] ... 
9 more > Caused by: org.apache.flink.runtime.state.BackendBuildingException: Caught > unexpected exception. at > org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:361) > ~[zdata-flink-streams.jar:0.1] at > org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:587) > ~[zdata-flink-streams.jar:0.1] at > org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createKeyedStateBackend(RocksDBStateBackend.java:93) > ~[zdata-flink-streams.jar:0.1] at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:328) > ~[zdata-flink-streams.jar:0.1] at > org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:168) > ~[zdata-flink-streams.jar:0.1] at > org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135) > ~[zdata-flink-streams.jar:0.1] at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:345) > ~[zdata-flink-streams.jar:0.1] at > org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:163) > ~[zdata-flink-streams.jar:0.1] ... 9 more > Caused by:
[jira] [Comment Edited] (FLINK-27405) Refactor SourceCoordinator to abstract BaseCoordinator implementation
[ https://issues.apache.org/jira/browse/FLINK-27405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528443#comment-17528443 ] Steven Zhen Wu edited comment on FLINK-27405 at 4/26/22 10:42 PM: -- cc [~arvid] [~pnowojski] [~dwysakowicz] [~thw] please share your thoughts on extracting a `CoordinatorBase` abstract class from `SourceCoordinator` to promote code reuse. if this is ok with you, [~gang ye] can create a PR later. was (Author: stevenz3wu): cc [~pnowojski] [~dwysakowicz] [~thw] please share your thoughts on extracting a `CoordinatorBase` abstract class from `SourceCoordinator` to promote code reuse. if this is ok with you, [~gang ye] can create a PR later. > Refactor SourceCoordinator to abstract BaseCoordinator implementation > - > > Key: FLINK-27405 > URL: https://issues.apache.org/jira/browse/FLINK-27405 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Reporter: gang ye >Priority: Major > > To solve the small-files issue caused by data skewness, Flink Iceberg data > shuffling was proposed (design doc > [https://docs.google.com/document/d/13N8cMqPi-ZPSKbkXGOBMPOzbv2Fua59j8bIjjtxLWqo/edit]#). > The basic idea is to use a statistics operator to collect local statistics for > traffic distribution at taskmanagers (workers). Local statistics are > periodically sent to the statistics coordinator (running in the jobmanager). Once > globally aggregated statistics are ready, the statistics coordinator > broadcasts them to all operator instances. And then a customized partitioner > uses the global statistics, which are passed down from the statistics operator, to > distribute data to Iceberg writers. 
> In the process of implementing Flink Iceberg data shuffling, we found that > StatisticsCoordinator can share functionality with > SourceCoordinator#runInEventLoop, StatisticsCoordinatorContext needs similar > functionality to SourceCoordinatorContext#callInCoordinatorThread, and the > StatisticsCoordinatorProvider ExecutorThreadFactory logic is almost the same as > SourceCoordinatorProvider#CoordinatorExecutorThreadFactory. So we want > to refactor the source coordinator classes to extract a general coordinator > implementation and reduce the duplicated code when adding other coordinators.
[jira] [Commented] (FLINK-27405) Refactor SourceCoordinator to abstract BaseCoordinator implementation
[ https://issues.apache.org/jira/browse/FLINK-27405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528443#comment-17528443 ] Steven Zhen Wu commented on FLINK-27405: cc [~pnowojski] [~dwysakowicz] [~thw] please share your thoughts on extracting a `CoordinatorBase` abstract class from `SourceCoordinator` to promote code reuse. if this is ok with you, [~gang ye] can create a PR later. > Refactor SourceCoordinator to abstract BaseCoordinator implementation > - > > Key: FLINK-27405 > URL: https://issues.apache.org/jira/browse/FLINK-27405 > Project: Flink > Issue Type: Improvement > Components: Runtime / Coordination >Reporter: gang ye >Priority: Major > > To solve the small-files issue caused by data skewness, Flink Iceberg data > shuffling was proposed (design doc > [https://docs.google.com/document/d/13N8cMqPi-ZPSKbkXGOBMPOzbv2Fua59j8bIjjtxLWqo/edit]#). > The basic idea is to use a statistics operator to collect local statistics for > traffic distribution at taskmanagers (workers). Local statistics are > periodically sent to the statistics coordinator (running in the jobmanager). Once > globally aggregated statistics are ready, the statistics coordinator > broadcasts them to all operator instances. And then a customized partitioner > uses the global statistics, which are passed down from the statistics operator, to > distribute data to Iceberg writers. > In the process of implementing Flink Iceberg data shuffling, we found that > StatisticsCoordinator can share functionality with > SourceCoordinator#runInEventLoop, StatisticsCoordinatorContext needs similar > functionality to SourceCoordinatorContext#callInCoordinatorThread, and the > StatisticsCoordinatorProvider ExecutorThreadFactory logic is almost the same as > SourceCoordinatorProvider#CoordinatorExecutorThreadFactory. So we want > to refactor the source coordinator classes to extract a general coordinator > implementation and reduce the duplicated code when adding other coordinators. 
[jira] [Updated] (FLINK-27308) Update the Hadoop implementation for filesystems to 3.3.2
[ https://issues.apache.org/jira/browse/FLINK-27308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn Visser updated FLINK-27308: --- Fix Version/s: 1.16.0 1.15.1 > Update the Hadoop implementation for filesystems to 3.3.2 > - > > Key: FLINK-27308 > URL: https://issues.apache.org/jira/browse/FLINK-27308 > Project: Flink > Issue Type: Technical Debt > Components: FileSystems >Reporter: Martijn Visser >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0, 1.15.1 > > > Flink currently uses Hadoop version 3.2.2 for the Flink filesystem > implementations. Upgrading this to version 3.3.2 would provide users the > features listed in HADOOP-17566
[jira] [Assigned] (FLINK-27308) Update the Hadoop implementation for filesystems to 3.3.2
[ https://issues.apache.org/jira/browse/FLINK-27308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn Visser reassigned FLINK-27308: -- Assignee: Martijn Visser > Update the Hadoop implementation for filesystems to 3.3.2 > - > > Key: FLINK-27308 > URL: https://issues.apache.org/jira/browse/FLINK-27308 > Project: Flink > Issue Type: Technical Debt > Components: FileSystems >Reporter: Martijn Visser >Assignee: Martijn Visser >Priority: Major > Labels: pull-request-available > Fix For: 1.16.0, 1.15.1 > > > Flink currently uses Hadoop version 3.2.2 for the Flink filesystem > implementations. Upgrading this to version 3.3.2 would provide users the > features listed in HADOOP-17566 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[GitHub] [flink] flinkbot commented on pull request #19587: [FLINK-27308][BP 1.15][Filesystem][S3] Update the Hadoop implementation for filesystems to 3.3.2
flinkbot commented on PR #19587: URL: https://github.com/apache/flink/pull/19587#issuecomment-1110286727 ## CI report: * e0377e79a332e27d7ab210cb757dd799f0109828 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] MartijnVisser opened a new pull request, #19587: [FLINK-27308][BP 1.15][Filesystem][S3] Update the Hadoop implementation for filesystems to 3.3.2
MartijnVisser opened a new pull request, #19587: URL: https://github.com/apache/flink/pull/19587 Unchanged backport of https://github.com/apache/flink/pull/19514 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] MartijnVisser merged pull request #19514: [FLINK-27308][Filesystem][S3] Update the Hadoop implementation for filesystems to 3.3.2
MartijnVisser merged PR #19514: URL: https://github.com/apache/flink/pull/19514 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] MartijnVisser commented on pull request #19514: [FLINK-27308][Filesystem][S3] Update the Hadoop implementation for filesystems to 3.3.2
MartijnVisser commented on PR #19514: URL: https://github.com/apache/flink/pull/19514#issuecomment-1110280345 S3 E2E tests also passed (ignoring the post step of caching the Maven repo), per https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=35145&view=results Merging this now -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] JingGe commented on a diff in pull request #19286: [FLINK-25931] Add projection pushdown support for CsvFormatFactory
JingGe commented on code in PR #19286: URL: https://github.com/apache/flink/pull/19286#discussion_r859142503 ## flink-formats/flink-csv/src/main/java/org/apache/flink/formats/csv/CsvRowDataDeserializationSchema.java: ## @@ -81,19 +82,34 @@ private CsvRowDataDeserializationSchema( @Internal public static class Builder { -private final RowType rowType; +private final RowType rowResultType; private final TypeInformation resultTypeInfo; private CsvSchema csvSchema; private boolean ignoreParseErrors; +/** + * Creates a CSV deserialization schema for the given {@link TypeInformation} with optional + * parameters. + */ +public Builder( +RowType rowReadType, +RowType rowResultType, Review Comment: The naming used here as `rowReadType` and `rowResultType` is not clear enough. Javadoc is required. Why not use `originalRowType` `projectedRowType`? ## flink-formats/flink-csv/src/main/java/org/apache/flink/formats/csv/CsvRowDataDeserializationSchema.java: ## @@ -81,19 +82,34 @@ private CsvRowDataDeserializationSchema( @Internal public static class Builder { -private final RowType rowType; +private final RowType rowResultType; private final TypeInformation resultTypeInfo; private CsvSchema csvSchema; private boolean ignoreParseErrors; +/** + * Creates a CSV deserialization schema for the given {@link TypeInformation} with optional + * parameters. + */ +public Builder( +RowType rowReadType, +RowType rowResultType, +TypeInformation resultTypeInfo) { +Preconditions.checkNotNull(rowReadType, "RowType must not be null."); +Preconditions.checkNotNull(rowResultType, "RowType must not be null."); +Preconditions.checkNotNull(resultTypeInfo, "Result type information must not be null."); +this.rowResultType = rowResultType; +this.resultTypeInfo = resultTypeInfo; +this.csvSchema = CsvRowSchemaConverter.convert(rowReadType); Review Comment: Could we use the `rowResultType ` to create the `csvSchema`? -- This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-27420) Suspended SlotManagers fail to reregister metrics when started again
Ben Augarten created FLINK-27420: Summary: Suspended SlotManagers fail to reregister metrics when started again Key: FLINK-27420 URL: https://issues.apache.org/jira/browse/FLINK-27420 Project: Flink Issue Type: Bug Components: Runtime / Metrics Affects Versions: 1.13.5 Reporter: Ben Augarten The symptom is that SlotManager metrics are missing (taskslotsavailable and taskslotstotal) when a SlotManager is suspended and then restarted. We noticed this issue when running 1.13.5, but I believe this impacts 1.14.x, 1.15.x, and master. When a SlotManager is suspended, the [metrics group is closed|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DeclarativeSlotManager.java#L214]. When the SlotManager is [started again|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DeclarativeSlotManager.java#L181], it makes an attempt to [reregister metrics|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DeclarativeSlotManager.java#L199-L202], but that fails because the underlying metrics group [is still closed|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/metrics/groups/AbstractMetricGroup.java#L393] I was able to trace through this issue by restarting zookeeper nodes in a staging environment and watching the JM with a debugger. A concise test, which currently fails, shows the expected behavior – [https://github.com/apache/flink/compare/master...baugarten:baugarten/slot-manager-missing-metrics?expand=1] I am happy to provide a PR to fix this issue, but first would like to verify that this is not intended. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (FLINK-27420) Suspended SlotManager fail to reregister metrics when started again
[ https://issues.apache.org/jira/browse/FLINK-27420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Augarten updated FLINK-27420: - Summary: Suspended SlotManager fail to reregister metrics when started again (was: Suspended SlotManagers fail to reregister metrics when started again) > Suspended SlotManager fail to reregister metrics when started again > --- > > Key: FLINK-27420 > URL: https://issues.apache.org/jira/browse/FLINK-27420 > Project: Flink > Issue Type: Bug > Components: Runtime / Metrics >Affects Versions: 1.13.5 >Reporter: Ben Augarten >Priority: Major > > The symptom is that SlotManager metrics are missing (taskslotsavailable and > taskslotstotal) when a SlotManager is suspended and then restarted. We > noticed this issue when running 1.13.5, but I believe this impacts 1.14.x, > 1.15.x, and master. > > When a SlotManager is suspended, the [metrics group is > closed|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DeclarativeSlotManager.java#L214]. > When the SlotManager is [started > again|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DeclarativeSlotManager.java#L181], > it makes an attempt to [reregister > metrics|[https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/DeclarativeSlotManager.java#L199-L202],] > but that fails because the underlying metrics group [is still > closed|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/metrics/groups/AbstractMetricGroup.java#L393] > > > I was able to trace through this issue by restarting zookeeper nodes in a > staging environment and watching the JM with a debugger. 
> > A concise test, which currently fails, shows the expected behavior – > [https://github.com/apache/flink/compare/master...baugarten:baugarten/slot-manager-missing-metrics?expand=1] > > I am happy to provide a PR to fix this issue, but first would like to verify > that this is not intended. -- This message was sent by Atlassian Jira (v8.20.7#820007)
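The failure mode described in this report can be reproduced in miniature with a toy metric group that drops registrations once closed. The sketch below mirrors the behavior of Flink's `AbstractMetricGroup` only in spirit; the `ToyMetricGroup` class and its methods are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a metric group that, once closed, silently ignores any
// further registrations. This is not Flink's AbstractMetricGroup, but it
// captures why re-registering against a closed group produces no metrics.
class ToyMetricGroup {
    private final Map<String, Long> gauges = new HashMap<>();
    private boolean closed;

    void gauge(String name, long value) {
        if (closed) {
            return; // closed groups drop new registrations
        }
        gauges.put(name, value);
    }

    void close() {
        closed = true;
        gauges.clear();
    }

    boolean contains(String name) {
        return gauges.containsKey(name);
    }
}

public class SlotManagerMetricsSketch {
    // Simulates suspend (close the group) followed by start (re-register):
    // the second registration is lost because the group was never reopened.
    public static boolean metricsSurviveRestart() {
        ToyMetricGroup group = new ToyMetricGroup();
        group.gauge("taskSlotsTotal", 4); // initial start registers metrics

        group.close();                    // SlotManager suspended
        group.gauge("taskSlotsTotal", 4); // restart tries to re-register

        return group.contains("taskSlotsTotal"); // false: registration dropped
    }

    public static void main(String[] args) {
        System.out.println(metricsSurviveRestart()); // prints false
    }
}
```

A fix would presumably either reopen (or recreate) the metric group on restart, or defer closing it until the SlotManager is terminally shut down.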
[jira] [Updated] (FLINK-27336) Avoid merging when there is only one record
[ https://issues.apache.org/jira/browse/FLINK-27336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-27336: --- Labels: pull-request-available (was: ) > Avoid merging when there is only one record > --- > > Key: FLINK-27336 > URL: https://issues.apache.org/jira/browse/FLINK-27336 > Project: Flink > Issue Type: Improvement > Components: Table Store >Reporter: Jingsong Lee >Priority: Major > Labels: pull-request-available > Fix For: table-store-0.2.0 > > > If there is just one record, still use MergeFunction to merge. This is not > necessary, just output directly. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[GitHub] [flink-table-store] SteNicholas opened a new pull request, #104: [FLINK-27336] Avoid merging when there is only one record
SteNicholas opened a new pull request, #104: URL: https://github.com/apache/flink-table-store/pull/104 Currently, even if there is just one record, `MergeFunction` is still used to merge. This is not necessary; the record can be output directly. **The brief change log** - Updates the merge in the `SortBufferMemTable` and `SortMergeReader` to output directly if there is only one record, and use `MergeFunction` to merge in the case of multiple records. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
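The single-record fast path described in this PR can be sketched as follows. The types are simplified stand-ins for the table-store classes (a `List` in place of the sorted buffer, a `BinaryOperator` in place of `MergeFunction`); only the idea of skipping the merge for a lone record comes from the change itself.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

// Illustrative sketch of the FLINK-27336 optimization: when a key has a
// single record, emit it directly instead of invoking the merge function.
public class SingleRecordMergeSketch {
    // mergeFunction plays the role of the table store's MergeFunction.
    public static <T> T merge(List<T> recordsForKey, BinaryOperator<T> mergeFunction) {
        if (recordsForKey.size() == 1) {
            return recordsForKey.get(0); // fast path: no merge needed
        }
        T result = recordsForKey.get(0);
        for (int i = 1; i < recordsForKey.size(); i++) {
            result = mergeFunction.apply(result, recordsForKey.get(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // Single record: returned as-is, the merge function is never invoked.
        System.out.println(merge(Arrays.asList(7), (a, b) -> a + b)); // prints 7
        // Multiple records: merged pairwise, here with a sum.
        System.out.println(merge(Arrays.asList(1, 2, 3), (a, b) -> a + b)); // prints 6
    }
}
```

The benefit is that the (potentially expensive, possibly stateful) merge function is bypassed entirely on the common case of non-duplicated keys.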
[GitHub] [flink] tweise commented on pull request #19514: [FLINK-27308][Filesystem][S3] Update the Hadoop implementation for filesystems to 3.3.2
tweise commented on PR #19514: URL: https://github.com/apache/flink/pull/19514#issuecomment-1110177935 @MartijnVisser generally, changes that modify dependencies are hard to backport, as they can modify the final application in ways that we cannot adequately predict. This case may be a bit different due to the self-contained nature of the plugins, so I would not be opposed to backporting it to 1.15.x at least (I just saw that 1.15.0 is final). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] MartijnVisser commented on pull request #19514: [FLINK-27308][Filesystem][S3] Update the Hadoop implementation for filesystems to 3.3.2
MartijnVisser commented on PR #19514: URL: https://github.com/apache/flink/pull/19514#issuecomment-1110130110 @AHeise @tweise What's your opinion on backporting this to Flink 1.14 and Flink 1.15? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] MartijnVisser commented on pull request #19514: [FLINK-27308][Filesystem][S3] Update the Hadoop implementation for filesystems to 3.3.2
MartijnVisser commented on PR #19514: URL: https://github.com/apache/flink/pull/19514#issuecomment-1110129504 @tweise @AHeise I've completed the verification of the dependency tree. I've also included a notice for the so-called `Apache Hadoop Relocated (Shaded) Third-party Libs` with regards to the Shaded Protobuf and/or Shaded Guava, similar to what we do for `flink-python` (with Beam) and for the HBase connectors. I've also rebased this PR to be sure; I'll let CI verify this once more, and if everything turns green, I'll merge it. Thanks again @chinmayms @tweise @AHeise -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] zentol commented on pull request #19571: [FLINK-27394] Integrate the Flink Elasticsearch documentation in the Flink documentation
zentol commented on PR #19571: URL: https://github.com/apache/flink/pull/19571#issuecomment-1110108757 > Is that so? yes. Building it locally failed for me until I installed go. That means we also have to double-check that hugo is installed in the CI images (which may or may not be related to the current CI failures). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-27419) Endpoint to cancel savepoints
[ https://issues.apache.org/jira/browse/FLINK-27419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ingo Bürk updated FLINK-27419: -- Description: The Flink REST API currently offers no endpoint to cancel a triggered savepoint. This would be useful for clients to make it easier to coordinate the cleanup of resources. I propose to add a new `DELETE /jobs/:jobId/savepoints/:triggerId` endpoint which aborts a triggered and not-yet completed savepoint. If the savepoint is already completed or failed, this should delete the savepoint and associated data. We can probably deprecate the savepoint-disposal endpoint in that case. was: The Flink REST API currently offers no endpoint to cancel a triggered savepoint. This would be useful for clients to make it easier to coordinate the cleanup of resources. I propose to add a new `DELETE /jobs/:jobId/savepoints/:triggerId` endpoint which aborts a triggered and not-yet completed savepoint. If the savepoint is already completed or failed, this should probably return an error as the API is built around "triggering savepoints" as a resource, not the savepoint itself, and thus DELETE would mean canceling the savepoint, not deleting it. > Endpoint to cancel savepoints > - > > Key: FLINK-27419 > URL: https://issues.apache.org/jira/browse/FLINK-27419 > Project: Flink > Issue Type: New Feature > Components: Runtime / REST >Reporter: Ingo Bürk >Priority: Major > > The Flink REST API currently offers no endpoint to cancel a triggered > savepoint. This would be useful for clients to make it easier to coordinate > the cleanup of resources. > I propose to add a new `DELETE /jobs/:jobId/savepoints/:triggerId` endpoint > which aborts a triggered and not-yet completed savepoint. If the savepoint is > already completed or failed, this should delete the savepoint and associated > data. > We can probably deprecate the savepoint-disposal endpoint in that case. -- This message was sent by Atlassian Jira (v8.20.7#820007)
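To make the shape of the proposed URL concrete, here is a minimal sketch of a client-side path builder. The endpoint does not exist in Flink yet, and `buildSavepointCancelPath` is a hypothetical helper for illustration only, not part of any Flink client API.

```java
// Sketch of the path template for the endpoint proposed in FLINK-27419.
// A client would issue: DELETE <rest-address> + this path. Both the helper
// and the endpoint itself are hypothetical at this point.
public class SavepointCancelPathSketch {
    public static String buildSavepointCancelPath(String jobId, String triggerId) {
        return String.format("/jobs/%s/savepoints/%s", jobId, triggerId);
    }

    public static void main(String[] args) {
        System.out.println(buildSavepointCancelPath("j-123", "t-456"));
        // prints /jobs/j-123/savepoints/t-456
    }
}
```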
[jira] [Commented] (FLINK-27418) Flink SQL TopN result is wrong
[ https://issues.apache.org/jira/browse/FLINK-27418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528284#comment-17528284 ] Martijn Visser commented on FLINK-27418: [~zhangbinzaifendou] Are you sure that you've tested this in Flink 1.14? Your code refers to Blink Planner, but that doesn't exist in Flink 1.14. > Flink SQL TopN result is wrong > -- > > Key: FLINK-27418 > URL: https://issues.apache.org/jira/browse/FLINK-27418 > Project: Flink > Issue Type: Bug > Components: Table SQL / API >Affects Versions: 1.12.2, 1.14.3 > Environment: Flink 1.12.2 and Flink 1.14.3 test results are sometimes > wrong >Reporter: zhangbin >Priority: Major > > Flink SQL TopN is executed multiple times with different results, sometimes > with correct results and sometimes with incorrect results. > Example: > {code:java} > @Test > public void flinkSqlJoinRetract() { > EnvironmentSettings settings = EnvironmentSettings.newInstance() > .useBlinkPlanner() > .inStreamingMode() > .build(); > StreamExecutionEnvironment streamEnv = > StreamExecutionEnvironment.getExecutionEnvironment(); > streamEnv.setParallelism(1); > StreamTableEnvironment tableEnv = > StreamTableEnvironment.create(streamEnv, settings); > tableEnv.getConfig().setIdleStateRetention(Duration.ofSeconds(1)); > RowTypeInfo waybillTableTypeInfo = buildWaybillTableTypeInfo(); > RowTypeInfo itemTableTypeInfo = buildItemTableTypeInfo(); > SourceFunction waybillSourceFunction = > buildWaybillStreamSource(waybillTableTypeInfo); > SourceFunction itemSourceFunction = > buildItemStreamSource(itemTableTypeInfo); > String waybillTable = "waybill"; > String itemTable = "item"; > DataStreamSource waybillStream = streamEnv.addSource( > waybillSourceFunction, > waybillTable, > waybillTableTypeInfo); > DataStreamSource itemStream = streamEnv.addSource( > itemSourceFunction, > itemTable, > itemTableTypeInfo); > Expression[] waybillFields = ExpressionParser > .parseExpressionList(String.join(",", > 
waybillTableTypeInfo.getFieldNames()) > + ",proctime.proctime").toArray(new Expression[0]); > Expression[] itemFields = ExpressionParser > .parseExpressionList( > String.join(",", itemTableTypeInfo.getFieldNames()) + > ",proctime.proctime") > .toArray(new Expression[0]); > tableEnv.createTemporaryView(waybillTable, waybillStream, > waybillFields); > tableEnv.createTemporaryView(itemTable, itemStream, itemFields); > String sql = > "select \n" > + "city_id, \n" > + "count(*) as cnt\n" > + "from (\n" > + "select id,city_id\n" > + "from (\n" > + "select \n" > + "id,\n" > + "city_id,\n" > + "row_number() over(partition by id order by > utime desc ) as rno \n" > + "from (\n" > + "select \n" > + "waybill.id as id,\n" > + "coalesce(item.city_id, waybill.city_id) as > city_id,\n" > + "waybill.utime as utime \n" > + "from waybill left join item \n" > + "on waybill.id = item.id \n" > + ") \n" > + ")\n" > + "where rno =1\n" > + ")\n" > + "group by city_id"; > StatementSet statementSet = tableEnv.createStatementSet(); > Table table = tableEnv.sqlQuery(sql); > DataStream> rowDataStream = > tableEnv.toRetractStream(table, Row.class); > rowDataStream.printToErr(); > try { > streamEnv.execute(); > } catch (Exception e) { > e.printStackTrace(); > } > } > private static RowTypeInfo buildWaybillTableTypeInfo() { > TypeInformation[] types = new TypeInformation[]{Types.INT(), > Types.STRING(), Types.LONG(), Types.LONG()}; > String[] fields = new String[]{"id", "city_id", "rider_id", "utime"}; > return new RowTypeInfo(types, fields); > } > private static RowTypeInfo buildItemTableTypeInfo() { > TypeInformation[] types = new TypeInformation[]{Types.INT(), > Types.STRING(), Types.LONG()}; > String[] fields = new String[]{"id", "city_id", "utime"}; > return new
[GitHub] [flink] afedulov commented on pull request #19224: [hotfix][test][formats] Detach TableCsvFormatITCase from JsonPlanTestBase
afedulov commented on PR #19224: URL: https://github.com/apache/flink/pull/19224#issuecomment-1110001152 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-27370) Add a new SessionJobState - Failed
[ https://issues.apache.org/jira/browse/FLINK-27370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528261#comment-17528261 ] Gyula Fora commented on FLINK-27370: I personally would not reintroduce the success flag. If we have an error that should be clearly communicated via events and the error field. Otherwise the deployment was successful :) > Add a new SessionJobState - Failed > -- > > Key: FLINK-27370 > URL: https://issues.apache.org/jira/browse/FLINK-27370 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Reporter: Xin Hao >Priority: Minor > > It will be nice if we can add a new SessionJobState Failed to indicate there > is an error for the session job. > {code:java} > status: > error: 'The error message' > jobStatus: > savepointInfo: {} > reconciliationStatus: > reconciliationTimestamp: 0 > state: DEPLOYED {code} > Reason: > 1. It will be easier for monitoring > 2. I have a personal controller to submit session jobs, it will be cleaner to > check the state by a single field and get the details by the error field. > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[GitHub] [flink] zentol commented on a diff in pull request #19586: [FLINK-26824] Upgrade Flink's supported Cassandra versions to match all Apache Cassandra supported versions
zentol commented on code in PR #19586: URL: https://github.com/apache/flink/pull/19586#discussion_r858837814 ## flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorBaseTest.java: ## @@ -220,7 +225,7 @@ private static Class annotatePojoWithTable(String keyspace, Stri } @NotNull -private static Table createTableAnnotation(String keyspace, String tableName) { +private Table createTableAnnotation(String keyspace, String tableName) { Review Comment: Why is this no longer static? ## flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/CassandraConnectorBaseTest.java: ## @@ -97,29 +94,33 @@ import static org.hamcrest.Matchers.samePropertyValuesAs; import static org.junit.Assert.assertTrue; -/** IT cases for all cassandra sinks. */ +/** + * Base class for IT cases for all Cassandra sinks. This class relies on Cassandra testContainer + * that needs to use a ClassRule. Parametrized tests to not work with ClassRules so the actual + * testCase classes define the tested version and manage the container. + */ @SuppressWarnings("serial") // NoHostAvailableException is raised by Cassandra client under load while connecting to the cluster @RetryOnException(times = 3, exception = NoHostAvailableException.class) -public class CassandraConnectorITCase +public abstract class CassandraConnectorBaseTest Review Comment: This isn't a test, and hence the name shouldn't end in "Test". switch "Base" and "Test". ## flink-connectors/flink-connector-cassandra/pom.xml: ## @@ -61,7 +63,8 @@ under the License. 
true - com.datastax.cassandra:cassandra-driver-core + Review Comment: ```suggestion ``` ## flink-connectors/flink-connector-cassandra/src/main/java/org/apache/flink/streaming/connectors/cassandra/CassandraSinkBase.java: ## @@ -140,7 +152,12 @@ public void invoke(IN value) throws Exception { semaphore.release(); throw e; } -Futures.addCallback(result, callback); +if (synchronousWrites) { +result.get(timeout, TimeUnit.SECONDS); +semaphore.release(); Review Comment: If this times out the semaphore is never released. ## flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/Cassandra30ConnectorITCase.java: ## @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.streaming.connectors.cassandra; + +import org.apache.flink.util.DockerImageVersions; + +import org.junit.AfterClass; +import org.junit.BeforeClass; +import org.junit.ClassRule; +import org.testcontainers.containers.CassandraContainer; + +/** Class for IT cases for all Cassandra sinks tested on Cassandra 3.0.x . 
*/ +public class Cassandra30ConnectorITCase extends CassandraConnectorBaseTest { +private static final String TESTED_VERSION = DockerImageVersions.CASSANDRA_3_0; + +@ClassRule +public static final CassandraContainer CASSANDRA_CONTAINER = Review Comment: If we're going with several ITCases I'd be tempted to only have this in the sub-classes: ``` @BeforeClass public static void setup() { startAndInitializeCassandra(); } ``` and have startAndInitialize also create the container, and manually manage the lifecycle of the cassandra container instead of relying on `ClassRule` ## flink-connectors/flink-connector-cassandra/pom.xml: ## @@ -37,8 +37,10 @@ under the License. - 2.2.5 - 3.0.0 + + 4.0.3 + + 3.11.1 18.0 Review Comment: did you check whether the driver now uses a
[jira] [Commented] (FLINK-27370) Add a new SessionJobState - Failed
[ https://issues.apache.org/jira/browse/FLINK-27370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528259#comment-17528259 ] Gyula Fora commented on FLINK-27370: We removed the success flag as it did not carry extra information on top of the error field. You either have an error -> error != null, or it was successful > Add a new SessionJobState - Failed > -- > > Key: FLINK-27370 > URL: https://issues.apache.org/jira/browse/FLINK-27370 > Project: Flink > Issue Type: Improvement > Components: Kubernetes Operator >Reporter: Xin Hao >Priority: Minor > > It will be nice if we can add a new SessionJobState Failed to indicate there > is an error for the session job. > {code:java} > status: > error: 'The error message' > jobStatus: > savepointInfo: {} > reconciliationStatus: > reconciliationTimestamp: 0 > state: DEPLOYED {code} > Reason: > 1. It will be easier for monitoring > 2. I have a personal controller to submit session jobs, it will be cleaner to > check the state by a single field and get the details by the error field. > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (FLINK-26793) Flink Cassandra connector performance issue
[ https://issues.apache.org/jira/browse/FLINK-26793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528228#comment-17528228 ] Etienne Chauchot commented on FLINK-26793: -- [~arvid] thanks for your comment. I don't think there is anything to do at the Flink level either, but I'll make sure with a deeper second round to see if we can improve something. > Flink Cassandra connector performance issue > > > Key: FLINK-26793 > URL: https://issues.apache.org/jira/browse/FLINK-26793 > Project: Flink > Issue Type: Improvement > Components: Connectors / Cassandra >Affects Versions: 1.14.4 >Reporter: Jay Ghiya >Assignee: Etienne Chauchot >Priority: Major > Attachments: Capture d’écran de 2022-04-14 16-34-59.png, Capture > d’écran de 2022-04-14 16-35-07.png, Capture d’écran de 2022-04-14 > 16-35-30.png, jobmanager_log.txt, taskmanager_127.0.1.1_33251-af56fa_log > > > A warning is observed during long runs of flink job stating “Insertions into > scylla might be suffering. Expect performance problems unless this is > resolved.” > Upon initial analysis - “flink cassandra connector is not keeping instance of > mapping manager that is used to convert a pojo to cassandra row. Ideally the > mapping manager should have the same life time as cluster and session objects > which are also created once when the driver is initialized” > Reference: > https://stackoverflow.com/questions/59203418/cassandra-java-driver-warning -- This message was sent by Atlassian Jira (v8.20.7#820007)
[GitHub] [flink] MartijnVisser commented on a diff in pull request #19571: [FLINK-27394] Integrate the Flink Elasticsearch documentation in the Flink documentation
MartijnVisser commented on code in PR #19571: URL: https://github.com/apache/flink/pull/19571#discussion_r858835514 ## docs/build_docs.sh: ## @@ -24,5 +24,6 @@ then exit 1 fi git submodule update --init --recursive - +source ./setup_docs.sh Review Comment: No, I just searched if `source` was used elsewhere in the Flink codebase and saw it was, so used that. Any preference from your end? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] MartijnVisser commented on pull request #19571: [FLINK-27394] Integrate the Flink Elasticsearch documentation in the Flink documentation
MartijnVisser commented on PR #19571: URL: https://github.com/apache/flink/pull/19571#issuecomment-1109915301 > It's unfortunate that everyone now needs to install `go` to build the docs Is that so? I believe that if you use the Hugo Docker Image, it uses the Hugo installation from there which works. But then again, I've only tested it on my machine which indeed does have `go` installed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] zentol commented on a diff in pull request #19571: [FLINK-27394] Integrate the Flink Elasticsearch documentation in the Flink documentation
zentol commented on code in PR #19571: URL: https://github.com/apache/flink/pull/19571#discussion_r858829347 ## .github/workflows/docs.sh: ## @@ -31,6 +31,14 @@ if ! curl --fail -OL $HUGO_REPO ; then fi tar -zxvf $HUGO_ARTIFACT git submodule update --init --recursive +# retrieve latest documentation from external connector repositories +currentBranch=$(git branch --show-current) +if [[ "$currentBranch" = "master" ]]; then + # The Elasticsearch documentation is currently only available on the main branch + ./setup_docs.sh + ./hugo mod get -u github.com/apache/flink-connector-elasticsearch/docs@main Review Comment: how about moving this into setup_docs.sh as well? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] zentol commented on a diff in pull request #19571: [FLINK-27394] Integrate the Flink Elasticsearch documentation in the Flink documentation
zentol commented on code in PR #19571: URL: https://github.com/apache/flink/pull/19571#discussion_r858824392

## docs/build_docs.sh ##

@@ -24,5 +24,6 @@
 then
   exit 1
 fi
 git submodule update --init --recursive
-
+source ./setup_docs.sh

Review Comment: is there a particular reason why this one uses `source`?
[GitHub] [flink] rkhachatryan commented on a diff in pull request #19550: [FLINK-25511][state/changelog] Discard pre-emptively uploaded state changes not included into any checkpoint
rkhachatryan commented on code in PR #19550: URL: https://github.com/apache/flink/pull/19550#discussion_r858823012

## flink-state-backends/flink-statebackend-changelog/src/main/java/org/apache/flink/state/changelog/ChangelogTruncateHelper.java ##

@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.state.changelog;
+
+import org.apache.flink.runtime.state.changelog.SequenceNumber;
+import org.apache.flink.runtime.state.changelog.StateChangelogWriter;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.NavigableMap;
+import java.util.TreeMap;
+
+class ChangelogTruncateHelper {

Review Comment: Good point, I'll add the javadoc. The reason for adding a separate class was to avoid adding more responsibilities to the `ChangelogKeyedStateBackend`. Indeed, it's the only usage, so I think it doesn't make sense to make the class/API public.
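The truncation bookkeeping this helper encapsulates can be sketched as a small package-private class; the sketch below is a hypothetical simplification (the method names and the use of plain longs instead of `SequenceNumber`/`StateChangelogWriter` are assumptions for illustration, not the actual Flink API):

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch of a package-private truncation helper, kept separate so the
// state backend class does not accumulate more responsibilities.
class TruncateHelper {
    // checkpointId -> highest changelog sequence number included in that checkpoint
    private final NavigableMap<Long, Long> checkpointedUpTo = new TreeMap<>();
    private long truncatedUpTo = 0L;

    // Record which part of the changelog a checkpoint covers.
    void checkpoint(long checkpointId, long upTo) {
        checkpointedUpTo.put(checkpointId, upTo);
    }

    // When a checkpoint is confirmed, older checkpoints are subsumed, so the
    // changelog can be truncated up to what the confirmed checkpoint covered.
    long checkpointConfirmed(long checkpointId) {
        Long upTo = checkpointedUpTo.remove(checkpointId);
        checkpointedUpTo.headMap(checkpointId, false).clear(); // drop subsumed checkpoints
        if (upTo != null) {
            truncatedUpTo = Math.max(truncatedUpTo, upTo);
        }
        return truncatedUpTo; // the writer would discard changes below this point
    }
}
```

Keeping this state out of the backend class makes the subsumption rule (confirming checkpoint N invalidates all tracking for checkpoints below N) testable in isolation.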
[GitHub] [flink] flinkbot commented on pull request #19586: [FLINK-26824] Upgrade Flink's supported Cassandra versions to match all Apache Cassandra supported versions
flinkbot commented on PR #19586: URL: https://github.com/apache/flink/pull/19586#issuecomment-1109903067

## CI report:
* bac3ee51471798a0c3904872e1a347c8b3f64904 UNKNOWN

Bot commands
The @flinkbot bot supports the following commands:
- `@flinkbot run azure` re-run the last Azure build
[jira] [Updated] (FLINK-26824) Upgrade Flink's supported Cassandra versions to match with the Cassandra community supported versions
[ https://issues.apache.org/jira/browse/FLINK-26824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated FLINK-26824:
--- Labels: pull-request-available (was: )

> Upgrade Flink's supported Cassandra versions to match with the Cassandra
> community supported versions
>
> Key: FLINK-26824
> URL: https://issues.apache.org/jira/browse/FLINK-26824
> Project: Flink
> Issue Type: Sub-task
> Components: Connectors / Cassandra
> Reporter: Martijn Visser
> Assignee: Etienne Chauchot
> Priority: Major
> Labels: pull-request-available
>
> Flink's Cassandra connector currently only supports
> com.datastax.cassandra:cassandra-driver-core version 3.0.0.
> The Cassandra community supports 3 versions: one GA (general availability,
> the latest version), one stable, and one older supported release per
> https://cassandra.apache.org/_/download.html.
> These are currently:
> Cassandra 4.0 (GA)
> Cassandra 3.11 (Stable)
> Cassandra 3.0 (Older supported release)
> We should support (and follow) the versions supported by the Cassandra
> community
-- This message was sent by Atlassian Jira (v8.20.7#820007)
[GitHub] [flink] echauchot opened a new pull request, #19586: [FLINK-26824] Upgrade Flink's supported Cassandra versions to match all Apache Cassandra supported versions
echauchot opened a new pull request, #19586: URL: https://github.com/apache/flink/pull/19586

## What is the purpose of the change

Make the current Flink Cassandra connector support all the Cassandra versions supported by Apache Cassandra (3.0.26, 3.11.12, 4.0.3 as of now).

## Brief change log

**Versions**: I was able to address all Cassandra versions with the Cassandra 4.x lib and driver 3.x without prod code changes. There was a big refactor of the driver in 4.x that totally changed the driver API, so I preferred sticking to the latest driver 3.x, which allows calling Cassandra 4.x.

**Bug uncovering**: Migrating to Cassandra 4.x uncovered a race condition in the tests between the asynchronous writes and the JUnit assertions. So I introduced a `sink#setSynchronousWrites()` method defaulting to false for backward compatibility, and I set all the sinks in the tests to write synchronously. That way we are sure that, in the tests, writes are finished before asserting on their result (no more race condition).

**Tests**: I first tried to use parameterized tests, but the parameters are injected after the Testcontainers ClassRule that creates the container is evaluated. As this ClassRule is mandatory for the Cassandra test container, I could not use parameterized tests. So I switched to a hierarchy of classes with an ITCase per Cassandra version: the subclasses only do the container management, and the actual tests remain in CassandraConnectorITCase, which was renamed (and made abstract) because it is no longer the IT entry point.

**Commits history**: I think the history is clean: I isolated support of all Cassandra versions in a single commit. I also isolated synchronous write support of every sink for easy revert. The final commit fixes the race condition in all the tests.
## Verifying this change

This change added tests and can be verified as follows: Cassandra30ConnectorITCase (Cassandra 3.0.26), Cassandra311ConnectorITCase (Cassandra 3.11.12), Cassandra40ConnectorITCase (Cassandra 4.0.3)

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): yes, upgrade
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
- The serializers: (yes / no / don't know): no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no

## Documentation

- Does this pull request introduce a new feature? yes, the ability for synchronous writes with a timeout
- If yes, how is the feature documented? javadoc

R: @MartijnVisser
CC: @zentol
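The race-condition fix described in the PR — letting tests opt into synchronous writes so that assertions run only after the write has completed — can be sketched as follows. Everything except the `setSynchronousWrites` idea is a hypothetical stand-in: the class name is invented and `CompletableFuture` substitutes for the Cassandra driver's async result.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a sink whose writes are asynchronous by default, with an
// opt-in synchronous mode (used in tests) that blocks until the write is acknowledged.
class SketchSink {
    private boolean synchronousWrites = false; // default preserves old async behavior
    private final List<String> store = Collections.synchronizedList(new ArrayList<>());

    void setSynchronousWrites(boolean synchronousWrites) {
        this.synchronousWrites = synchronousWrites;
    }

    void write(String record) {
        // Simulated asynchronous backend write (a real driver would return a future).
        CompletableFuture<Void> pending = CompletableFuture.runAsync(() -> store.add(record));
        if (synchronousWrites) {
            // Block with a timeout so the record is visible once this returns.
            pending.orTimeout(10, TimeUnit.SECONDS).join();
        }
    }

    int writtenCount() {
        return store.size();
    }
}
```

With `synchronousWrites` left at false, an assertion on `writtenCount()` immediately after `write()` may observe the old count — the race the PR fixes; with it set to true, each `write()` establishes a happens-before edge via `join()`.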
[jira] [Commented] (FLINK-27370) Add a new SessionJobState - Failed
[ https://issues.apache.org/jira/browse/FLINK-27370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17528217#comment-17528217 ]

Aitozi commented on FLINK-27370: Got it, I think we can add the {{success}} field here as a direct indication of whether reconciliation succeeded. cc [~gyfora] do you have some inputs for this?

> Add a new SessionJobState - Failed
>
> Key: FLINK-27370
> URL: https://issues.apache.org/jira/browse/FLINK-27370
> Project: Flink
> Issue Type: Improvement
> Components: Kubernetes Operator
> Reporter: Xin Hao
> Priority: Minor
>
> It will be nice if we can add a new SessionJobState Failed to indicate there
> is an error for the session job.
> {code:java}
> status:
>   error: 'The error message'
>   jobStatus:
>     savepointInfo: {}
>   reconciliationStatus:
>     reconciliationTimestamp: 0
>     state: DEPLOYED
> {code}
> Reason:
> 1. It will be easier for monitoring
> 2. I have a personal controller to submit session jobs, it will be cleaner to
> check the state by a single field and get the details by the error field.
[GitHub] [flink] rkhachatryan commented on a diff in pull request #19550: [FLINK-25511][state/changelog] Discard pre-emptively uploaded state changes not included into any checkpoint
rkhachatryan commented on code in PR #19550: URL: https://github.com/apache/flink/pull/19550#discussion_r858803313

## flink-dstl/flink-dstl-dfs/src/main/java/org/apache/flink/changelog/fs/ChangelogRegistry.java ##

@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.changelog.fs;
+
+import org.apache.flink.annotation.Internal;
+import org.apache.flink.runtime.state.StreamStateHandle;
+
+import java.util.Set;
+import java.util.UUID;
+import java.util.concurrent.Executor;
+import java.util.concurrent.Executors;
+
+/**
+ * Registry of changelog segments uploaded by {@link
+ * org.apache.flink.runtime.state.changelog.StateChangelogWriter StateChangelogWriters} of a {@link
+ * org.apache.flink.runtime.state.changelog.StateChangelogStorage StateChangelogStorage}.
+ */
+@Internal
+public interface ChangelogRegistry {

Review Comment: I agree, this info is missing. But maybe even more informative would be `TaskChangelogRegistry`. That would highlight that the ownership is either on JM or on TM. WDYT? (`UnconfirmedChangelogRegistry` is also fine to me)
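The ownership question discussed here — a task-side registry that discards pre-emptively uploaded segments once no writer can still include them in a checkpoint — can be illustrated with a reference-counting sketch. All names and the String-based handle below are assumptions for illustration; the real interface works with `StreamStateHandle`.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a task-owned registry: a segment is tracked with the number
// of writers that may still reference it, and is discarded when that count reaches zero.
class TaskChangelogRegistrySketch {
    private final Map<String, Integer> refCounts = new HashMap<>();
    private final Set<String> discarded = new HashSet<>();

    void startTracking(String handleId, int users) {
        refCounts.put(handleId, users);
    }

    // A writer signals it no longer needs the segment (e.g. its changes were
    // truncated without ever being included in a checkpoint).
    void release(String handleId) {
        Integer remaining = refCounts.computeIfPresent(handleId, (k, v) -> v - 1);
        if (remaining != null && remaining == 0) {
            refCounts.remove(handleId);
            discarded.add(handleId); // stand-in for StreamStateHandle#discardState()
        }
    }

    // Once a checkpoint includes the segment, ownership moves to the JM, so the
    // task-side registry must stop tracking it (and must never discard it).
    void stopTracking(String handleId) {
        refCounts.remove(handleId);
    }

    boolean isDiscarded(String handleId) {
        return discarded.contains(handleId);
    }
}
```

The `stopTracking` path is what the proposed `TaskChangelogRegistry` name is meant to convey: the task only owns a segment until the JM takes over.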