[jira] [Commented] (FLINK-12376) GCS runtime exn: Request payload size exceeds the limit

2021-06-08 Thread Henrik (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17359126#comment-17359126
 ] 

Henrik commented on FLINK-12376:


It's messed up that you let this linger for two years before acting on it, and 
when you do act on it, you let a bot do it.

> GCS runtime exn: Request payload size exceeds the limit
> ---
>
> Key: FLINK-12376
> URL: https://issues.apache.org/jira/browse/FLINK-12376
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Google Cloud PubSub
>Affects Versions: 1.7.2
> Environment: FROM flink:1.8.0-scala_2.11
> ARG version=0.17
> ADD 
> https://storage.googleapis.com/hadoop-lib/gcs/gcs-connector-latest-hadoop2.jar
>  /opt/flink/lib
> COPY target/analytics-${version}.jar /opt/flink/lib/analytics.jar
>Reporter: Henrik
>Priority: Major
>  Labels: auto-unassigned, stale-major
> Attachments: Screenshot 2019-04-30 at 22.32.34.png, Screenshot 
> 2019-05-08 at 12.41.07.png
>
>
> I'm trying to use the Google Cloud Storage file system, but it would seem 
> that the Flink / GCS client libs are creating too-large requests far down in 
> the GCS Java client.
> The Java client is added to the lib folder with this command in the Dockerfile 
> (probably 
> [hadoop2-1.9.16|https://search.maven.org/artifact/com.google.cloud.bigdataoss/gcs-connector/hadoop2-1.9.16/jar]
>  at the time of writing):
>  
> {code:java}
> ADD 
> https://storage.googleapis.com/hadoop-lib/gcs/gcs-connector-latest-hadoop2.jar
>  /opt/flink/lib{code}
> This is the crash output. Focus lines:
> {code:java}
> java.lang.RuntimeException: Error while confirming checkpoint{code}
> and
> {code:java}
>  Caused by: com.google.api.gax.rpc.InvalidArgumentException: 
> io.grpc.StatusRuntimeException: INVALID_ARGUMENT: Request payload size 
> exceeds the limit: 524288 bytes.{code}
> Full stacktrace:
>  
> {code:java}
> [analytics-867c867ff6-l622h taskmanager] 2019-04-30 20:23:14,532 INFO  
> org.apache.flink.runtime.taskmanager.Task - Source: 
> Custom Source -> Process -> Timestamps/Watermarks -> app_events (1/1) 
> (9a01e96c0271025d5ba73b735847cd4c) switched from RUNNING to FAILED.
> [analytics-867c867ff6-l622h taskmanager] java.lang.RuntimeException: Error 
> while confirming checkpoint
> [analytics-867c867ff6-l622h taskmanager]     at 
> org.apache.flink.runtime.taskmanager.Task$2.run(Task.java:1211)
> [analytics-867c867ff6-l622h taskmanager]     at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> [analytics-867c867ff6-l622h taskmanager]     at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [analytics-867c867ff6-l622h taskmanager]     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [analytics-867c867ff6-l622h taskmanager]     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [analytics-867c867ff6-l622h taskmanager]     at 
> java.lang.Thread.run(Thread.java:748)
> [analytics-867c867ff6-l622h taskmanager] Caused by: 
> com.google.api.gax.rpc.InvalidArgumentException: 
> io.grpc.StatusRuntimeException: INVALID_ARGUMENT: Request payload size 
> exceeds the limit: 524288 bytes.
> [analytics-867c867ff6-l622h taskmanager]     at 
> com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:49)
> [analytics-867c867ff6-l622h taskmanager]     at 
> com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:72)
> [analytics-867c867ff6-l622h taskmanager]     at 
> com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:60)
> [analytics-867c867ff6-l622h taskmanager]     at 
> com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
> [analytics-867c867ff6-l622h taskmanager]     at 
> com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:68)
> [analytics-867c867ff6-l622h taskmanager]     at 
> com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1056)
> [analytics-867c867ff6-l622h taskmanager]     at 
> com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
> [analytics-867c867ff6-l622h taskmanager]     at 
> com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1138)
> [analytics-867c867ff6-l622h taskmanager]     at 
> com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:958)
> [analytics-867c867ff6-l622h taskmanager]     at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748)
> [analytics-867c867ff6-l622h taskmanager]     at 
> io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:507)
> [analytics-867c867ff6-l622h taskmanager]     at 
> io.

[jira] [Commented] (FLINK-12376) GCS runtime exn: Request payload size exceeds the limit

2021-04-27 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17333945#comment-17333945
 ] 

Flink Jira Bot commented on FLINK-12376:


This issue was marked "stale-assigned" and has not received an update in 7 
days. It is now automatically unassigned. If you are still working on it, you 
can assign it to yourself again. Please also give an update about the status of 
the work.

[jira] [Commented] (FLINK-12376) GCS runtime exn: Request payload size exceeds the limit

2021-04-16 Thread Flink Jira Bot (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17323291#comment-17323291
 ] 

Flink Jira Bot commented on FLINK-12376:


This issue is assigned but has not received an update in 7 days so it has been 
labeled "stale-assigned". If you are still working on the issue, please give an 
update and remove the label. If you are no longer working on the issue, please 
unassign so someone else may work on it. In 7 days the issue will be 
automatically unassigned.

[jira] [Commented] (FLINK-12376) GCS runtime exn: Request payload size exceeds the limit

2019-05-08 Thread Henrik (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835503#comment-16835503
 ] 

Henrik commented on FLINK-12376:


> Either [~haf] used a version of the connector that did not have this fix 
> (could you confirm [~haf]?)

I'm using the latest version; the one you pinged me about, saying you had a fix 
for some exceptions in it (the one you said would work with 1.7.x).

This is the code that was running in this issue.

!Screenshot 2019-05-08 at 12.41.07.png!

[jira] [Commented] (FLINK-12376) GCS runtime exn: Request payload size exceeds the limit

2019-05-08 Thread Richard Deurwaarder (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835427#comment-16835427
 ] 

Richard Deurwaarder commented on FLINK-12376:

[~StephanEwen] PubSub has no concept of ordering. This is by design: 
[https://cloud.google.com/pubsub/docs/ordering] 

 

So what is happening is that gRPC has server-side limits that are not 
exposed to the clients.

I've actually added code to counter the exact issue [~haf] sees: 
[https://github.com/apache/flink/pull/6594/files#diff-ea875742509cef8c6f26e1b488447130R125]
 This code has been inspired/copied from the Go PubSub client here: 
[https://code-review.googlesource.com/c/gocloud/+/9758/2/pubsub/service.go]

In short: instead of acknowledging all IDs at once, it tries to split them up into 
chunks of <500 KB and does multiple requests. 
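
For illustration, here is roughly what that chunking amounts to. This is a minimal 
sketch, not the actual connector code; the class/method names, the 500 KB constant, 
and the per-ID overhead estimate are all illustrative assumptions:

{code:java}
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: split acknowledgement IDs into chunks whose estimated
// payload stays below the gRPC request limit, so each chunk can be sent as its
// own acknowledge request instead of one huge request.
public class AckIdChunker {

    // Stay safely below the observed 524288-byte limit.
    private static final long MAX_REQUEST_BYTES = 500 * 1024;

    public static List<List<String>> splitAckIds(List<String> ackIds) {
        List<List<String>> chunks = new ArrayList<>();
        List<String> current = new ArrayList<>();
        long currentBytes = 0;

        for (String ackId : ackIds) {
            // Rough per-ID size: the ID bytes plus a small guess for proto overhead.
            long idBytes = ackId.getBytes(StandardCharsets.UTF_8).length + 8;
            if (!current.isEmpty() && currentBytes + idBytes > MAX_REQUEST_BYTES) {
                chunks.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(ackId);
            currentBytes += idBytes;
        }
        if (!current.isEmpty()) {
            chunks.add(current);
        }
        return chunks;
    }
}
{code}

The tricky part is estimating the per-request overhead so that the serialized 
request really stays under the server-side limit.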

 

So one of two things might've happened:
 * Either [~haf] used a version of the connector that did not have this fix 
(could you confirm [~haf]?)
 * The way the connector splits up acknowledgement IDs is off; adjusting for 
overhead isn't the easiest in Java :(

I found this issue: [https://github.com/GoogleCloudPlatform/pubsub/pull/194], 
where they propose making the number of IDs per request configurable in case 
PubSub changes this limit on their side.

I could make the connector split into smaller chunks and/or add it as a 
configuration option, but it's quite a technical and hard-to-tune option.

Any thoughts on how to best approach this?

[jira] [Commented] (FLINK-12376) GCS runtime exn: Request payload size exceeds the limit

2019-05-07 Thread Stephan Ewen (JIRA)


[ 
https://issues.apache.org/jira/browse/FLINK-12376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16835366#comment-16835366
 ] 

Stephan Ewen commented on FLINK-12376:

This is actually not a GCS / checkpoint issue; it is an issue in the source 
(probably the PubSub connector?).

The checkpoint completes, then Flink notifies the source that the checkpoint is 
complete and the source task acks some IDs back. That ack message is too large 
for the PubSub client's RPC service.
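
For context, a minimal sketch of that flow; everything except the 
notifyCheckpointComplete callback from Flink's CheckpointListener interface is an 
illustrative assumption rather than the actual connector code:

{code:java}
import org.apache.flink.runtime.state.CheckpointListener;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: the source buffers message IDs per checkpoint and
// acknowledges them only after Flink reports that checkpoint as complete.
abstract class AckOnCheckpointSource implements CheckpointListener {

    // IDs received since the last state snapshot.
    private final List<String> idsSinceLastCheckpoint = new ArrayList<>();
    // IDs waiting to be acknowledged, keyed by checkpoint id.
    private final Map<Long, List<String>> pendingAcks = new HashMap<>();

    // Called when the source snapshots its state for a checkpoint.
    void onSnapshot(long checkpointId) {
        pendingAcks.put(checkpointId, new ArrayList<>(idsSinceLastCheckpoint));
        idsSinceLastCheckpoint.clear();
    }

    @Override
    public void notifyCheckpointComplete(long checkpointId) throws Exception {
        List<String> ackIds = pendingAcks.remove(checkpointId);
        if (ackIds != null) {
            // Acknowledging all IDs of a checkpoint in one request is what
            // exceeds the 524288-byte limit when the checkpoint covers many messages.
            acknowledge(ackIds);
        }
    }

    // How the acknowledge request is actually sent is connector-specific.
    abstract void acknowledge(List<String> ackIds) throws Exception;
}
{code}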

I think we need to rethink how the PubSub source works. It seems that keeping the 
IDs and acknowledging a large number of records is not feasible in a stable way.

I am not a PubSub expert, but is there a way to keep something like a sequence 
number (or vector of sequence numbers), similar to Kafka's offsets? 
