[jira] [Commented] (FLINK-12376) GCS runtime exn: Request payload size exceeds the limit
[ https://issues.apache.org/jira/browse/FLINK-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17359126#comment-17359126 ]

Henrik commented on FLINK-12376:

It's messed up that you let this linger for two years before acting on it and when you act on it, you let a bot do it.

> GCS runtime exn: Request payload size exceeds the limit
> ---
>
> Key: FLINK-12376
> URL: https://issues.apache.org/jira/browse/FLINK-12376
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Google Cloud PubSub
> Affects Versions: 1.7.2
> Environment: FROM flink:1.8.0-scala_2.11
>              ARG version=0.17
>              ADD https://storage.googleapis.com/hadoop-lib/gcs/gcs-connector-latest-hadoop2.jar /opt/flink/lib
>              COPY target/analytics-${version}.jar /opt/flink/lib/analytics.jar
> Reporter: Henrik
> Priority: Major
> Labels: auto-unassigned, stale-major
> Attachments: Screenshot 2019-04-30 at 22.32.34.png, Screenshot 2019-05-08 at 12.41.07.png
>
> I'm trying to use the google cloud storage file system, but it would seem that the FLINK / GCS client libs are creating too-large requests far down in the GCS Java client.
> The Java client is added to the lib folder with this command in the Dockerfile (probably [hadoop2-1.9.16|https://search.maven.org/artifact/com.google.cloud.bigdataoss/gcs-connector/hadoop2-1.9.16/jar] at the time of writing):
> {code:java}
> ADD https://storage.googleapis.com/hadoop-lib/gcs/gcs-connector-latest-hadoop2.jar /opt/flink/lib{code}
> This is the crash output.
> Focus lines:
> {code:java}
> java.lang.RuntimeException: Error while confirming checkpoint{code}
> and
> {code:java}
> Caused by: com.google.api.gax.rpc.InvalidArgumentException: io.grpc.StatusRuntimeException: INVALID_ARGUMENT: Request payload size exceeds the limit: 524288 bytes.{code}
> Full stacktrace:
> {code:java}
> [analytics-867c867ff6-l622h taskmanager] 2019-04-30 20:23:14,532 INFO org.apache.flink.runtime.taskmanager.Task - Source: Custom Source -> Process -> Timestamps/Watermarks -> app_events (1/1) (9a01e96c0271025d5ba73b735847cd4c) switched from RUNNING to FAILED.
> [analytics-867c867ff6-l622h taskmanager] java.lang.RuntimeException: Error while confirming checkpoint
> [analytics-867c867ff6-l622h taskmanager]     at org.apache.flink.runtime.taskmanager.Task$2.run(Task.java:1211)
> [analytics-867c867ff6-l622h taskmanager]     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> [analytics-867c867ff6-l622h taskmanager]     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [analytics-867c867ff6-l622h taskmanager]     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [analytics-867c867ff6-l622h taskmanager]     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [analytics-867c867ff6-l622h taskmanager]     at java.lang.Thread.run(Thread.java:748)
> [analytics-867c867ff6-l622h taskmanager] Caused by: com.google.api.gax.rpc.InvalidArgumentException: io.grpc.StatusRuntimeException: INVALID_ARGUMENT: Request payload size exceeds the limit: 524288 bytes.
> [analytics-867c867ff6-l622h taskmanager]     at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:49)
> [analytics-867c867ff6-l622h taskmanager]     at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:72)
> [analytics-867c867ff6-l622h taskmanager]     at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:60)
> [analytics-867c867ff6-l622h taskmanager]     at com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
> [analytics-867c867ff6-l622h taskmanager]     at com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:68)
> [analytics-867c867ff6-l622h taskmanager]     at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1056)
> [analytics-867c867ff6-l622h taskmanager]     at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
> [analytics-867c867ff6-l622h taskmanager]     at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1138)
> [analytics-867c867ff6-l622h taskmanager]     at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:958)
> [analytics-867c867ff6-l622h taskmanager]     at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748)
> [analytics-867c867ff6-l622h taskmanager]     at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:507)
> [analytics-867c867ff6-l622h taskmanager]     at io.
[jira] [Commented] (FLINK-12376) GCS runtime exn: Request payload size exceeds the limit
[ https://issues.apache.org/jira/browse/FLINK-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333945#comment-17333945 ]

Flink Jira Bot commented on FLINK-12376:

This issue was marked "stale-assigned" and has not received an update in 7 days. It is now automatically unassigned. If you are still working on it, you can assign it to yourself again. Please also give an update about the status of the work.
[jira] [Commented] (FLINK-12376) GCS runtime exn: Request payload size exceeds the limit
[ https://issues.apache.org/jira/browse/FLINK-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17323291#comment-17323291 ]

Flink Jira Bot commented on FLINK-12376:

This issue is assigned but has not received an update in 7 days, so it has been labeled "stale-assigned". If you are still working on the issue, please give an update and remove the label. If you are no longer working on the issue, please unassign it so someone else may work on it. In 7 days the issue will be automatically unassigned.
[jira] [Commented] (FLINK-12376) GCS runtime exn: Request payload size exceeds the limit
[ https://issues.apache.org/jira/browse/FLINK-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835503#comment-16835503 ]

Henrik commented on FLINK-12376:

> Either [~haf] used a version of the connector that did not have this fix (could you confirm [~haf]?)

I'm using the latest version; the one you pinged me about and said you had a fix for some exceptions in (the one you said would work with 1.7.x).

This is the code that was running in this issue:

!Screenshot 2019-05-08 at 12.41.07.png!
[jira] [Commented] (FLINK-12376) GCS runtime exn: Request payload size exceeds the limit
[ https://issues.apache.org/jira/browse/FLINK-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835427#comment-16835427 ]

Richard Deurwaarder commented on FLINK-12376:

[~StephanEwen] PubSub has no concept of ordering. This is by design: [https://cloud.google.com/pubsub/docs/ordering]

What is happening is that gRPC has server-side limits that are not exposed to the clients. I've actually added code to counter the exact issue [~haf] sees: [https://github.com/apache/flink/pull/6594/files#diff-ea875742509cef8c6f26e1b488447130R125]

This code was inspired by / copied from the Go PubSub client here: [https://code-review.googlesource.com/c/gocloud/+/9758/2/pubsub/service.go]

In short: instead of acknowledging all IDs at once, it splits them into chunks of <500 KB and makes multiple requests. So one of two things might have happened:
* Either [~haf] used a version of the connector that did not have this fix (could you confirm, [~haf]?)
* The way the connector splits up acknowledgement IDs is off; adjusting for overhead isn't the easiest in Java :(

I found this issue: [https://github.com/GoogleCloudPlatform/pubsub/pull/194] where they propose making the number of IDs per request configurable, in case PubSub changes this limit on their side.

I could make the connector split into smaller chunks and/or add this as a configuration option, but it's quite a technical and hard-to-tune option. Any thoughts on how to best approach this?
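The chunking described above can be sketched in isolation. This is a minimal, hypothetical sketch, not the connector's actual code: the class name, the 500,000-byte budget, and the per-ID overhead constant are all illustrative assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Sketch: split acknowledgement IDs into batches whose estimated serialized
// size stays under a budget, so each acknowledge request stays below the
// server-side gRPC payload limit (524288 bytes).
public class AckChunker {

    // Illustrative budget, leaving headroom under the 524288-byte limit
    // for the subscription name and other per-request overhead.
    static final long REQUEST_BYTE_BUDGET = 500_000;

    // Assumed per-ID protobuf tag/length overhead; as noted above, getting
    // this adjustment exactly right is the hard part.
    static final long PER_ID_OVERHEAD = 3;

    static List<List<String>> chunkAckIds(List<String> ackIds) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        long currentSize = 0;
        for (String id : ackIds) {
            long idSize = id.getBytes(StandardCharsets.UTF_8).length + PER_ID_OVERHEAD;
            // Flush the current batch if adding this ID would exceed the budget.
            if (!current.isEmpty() && currentSize + idSize > REQUEST_BYTE_BUDGET) {
                batches.add(current);
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(id);
            currentSize += idSize;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

Each batch would then be sent as its own acknowledge request rather than one oversized request.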
[jira] [Commented] (FLINK-12376) GCS runtime exn: Request payload size exceeds the limit
[ https://issues.apache.org/jira/browse/FLINK-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835366#comment-16835366 ]

Stephan Ewen commented on FLINK-12376:

This is actually not a GCS / checkpoint issue; it is an issue in the source (probably the PubSub connector?). The checkpoint completes, then Flink notifies the source that the checkpoint is complete, and the source task acks some IDs back. That ack message is too large for the PubSub client's RPC service.

I think we need to rethink how the PubSub source works. It seems that keeping the IDs and acknowledging a large number of records is not feasible in a stable way.

I am not a PubSub expert, but is there a way to keep something like a sequence number (or a vector of sequence numbers), similar to Kafka's offsets?
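The flow described above (checkpoint completes, Flink notifies the source, the source acks the buffered IDs) can be illustrated with a stripped-down sketch. This is not the connector's actual code; the class and method names are hypothetical, and a real source would also need to respect the request-size limit when sending the acks.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch of a source that buffers the ack IDs of messages emitted between
// checkpoints and only acknowledges a batch once Flink reports the
// corresponding checkpoint as complete. If many records arrive between
// checkpoints, the batch returned here can grow arbitrarily large, which
// is how the oversized ack request arises.
public class CheckpointedAcker {

    private static final class Batch {
        final long checkpointId;
        final List<String> ackIds;

        Batch(long checkpointId, List<String> ackIds) {
            this.checkpointId = checkpointId;
            this.ackIds = ackIds;
        }
    }

    private final List<String> idsSinceLastCheckpoint = new ArrayList<>();
    private final Deque<Batch> pendingBatches = new ArrayDeque<>();

    // Called for every message the source emits downstream.
    public void recordEmitted(String ackId) {
        idsSinceLastCheckpoint.add(ackId);
    }

    // Called when a checkpoint is taken: seal the current batch of IDs.
    public void snapshotState(long checkpointId) {
        pendingBatches.addLast(new Batch(checkpointId, new ArrayList<>(idsSinceLastCheckpoint)));
        idsSinceLastCheckpoint.clear();
    }

    // Called when a checkpoint completes: collect every batch up to and
    // including that checkpoint; these IDs are what gets acked to PubSub.
    public List<String> notifyCheckpointComplete(long checkpointId) {
        List<String> toAck = new ArrayList<>();
        while (!pendingBatches.isEmpty() && pendingBatches.peekFirst().checkpointId <= checkpointId) {
            toAck.addAll(pendingBatches.removeFirst().ackIds);
        }
        return toAck;
    }
}
```

A sequence-number scheme like the one asked about would replace the per-message ID buffer with a single offset per checkpoint, keeping the ack payload constant-size.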