Hi Kostas, and everyone,

Just an update on my issue. I have tried the following:
 * Changed the S3-related configuration in Hadoop as suggested by the
Hadoop documentation [1]: increased fs.s3a.threads.max from 10 to 100 and
fs.s3a.connection.maximum from 15 to 120 (see the core-site.xml snippet
after this list). For reference, I have only 3 S3 sinks, with parallelisms
of 4, 4, and 1.
 * Followed AWS's documentation [2] to increase the EMRFS maxConnections to
200 (see the EMR configuration snippet after this list). However, I doubt
this makes any difference, because when creating the S3 Parquet bucketing
sink I had to use an "s3a://..." path; "s3://..." does not seem to be
supported by Flink yet.
 * Reduced the parallelism of my S3 continuous file reader.
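
For reference, the Hadoop-side change looks roughly like this in
core-site.xml (a minimal sketch showing only the two S3A properties in
question):

    <!-- core-site.xml: S3A client settings -->
    <configuration>
      <!-- maximum number of threads used by the S3A transfer manager -->
      <property>
        <name>fs.s3a.threads.max</name>
        <value>100</value>
      </property>
      <!-- maximum number of simultaneous connections to S3 -->
      <property>
        <name>fs.s3a.connection.maximum</name>
        <value>120</value>
      </property>
    </configuration>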

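The EMRFS change was applied through an EMR configuration classification,
roughly like the snippet below (assuming the property name
fs.s3.maxConnections from the AWS article; as far as I understand, it only
affects the EMRFS "s3://" filesystem, hence my doubt):

    [
      {
        "Classification": "emrfs-site",
        "Properties": {
          "fs.s3.maxConnections": "200"
        }
      }
    ]
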
However, the problem still occurs randomly (it varies from one job
execution to the next). When it occurs, the only solution is to cancel the
job and restart it from the last successful checkpoint.
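
For context, the sink in question is created roughly as in the sketch
below. This is only an illustration, assuming the StreamingFileSink bulk
Parquet writer; MyRecord, the bucket name, and the parallelism are
placeholders rather than my actual job:

    // Minimal sketch of a job writing Parquet to S3 over an "s3a://..." path.
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.formats.parquet.avro.ParquetAvroWriters;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

    public class S3ParquetSinkSketch {

        // Placeholder record type, written to Parquet via Avro reflection.
        public static class MyRecord {
            public String id;
            public long timestamp;
        }

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            // Placeholder source; the real job uses a continuous S3 file reader.
            DataStream<MyRecord> stream = env.fromElements(new MyRecord());

            // Bulk Parquet writer; note the "s3a://..." scheme
            // ("s3://..." did not work for me).
            StreamingFileSink<MyRecord> sink = StreamingFileSink
                    .forBulkFormat(
                            new Path("s3a://my-bucket/output/"),
                            ParquetAvroWriters.forReflectRecord(MyRecord.class))
                    .build();

            stream.addSink(sink).setParallelism(4);

            env.execute("s3a parquet sink sketch");
        }
    }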

Thanks and regards,
Averell

[1] Hadoop-AWS module: Integration with Amazon Web Services
<https://hadoop.apache.org/docs/stable/hadoop-aws/tools/hadoop-aws/index.html#aTimeout_waiting_for_connection_from_pool_when_writing_to_S3A>

[2] AWS Knowledge Center: emr-timeout-connection-wait
<https://aws.amazon.com/premiumsupport/knowledge-center/emr-timeout-connection-wait/>


