yihua commented on issue #7487:
URL: https://github.com/apache/hudi/issues/7487#issuecomment-1362308145

   Hi @AdarshKadameriTR, thanks for raising this.  Does the read timeout happen in a write job or a query?  Could you ask AWS support to clarify which quota limits are being reached?
   
   I'm not aware of any hard quota limit on reading or writing files on S3.  S3 [charges more](https://aws.amazon.com/s3/pricing/) as the number of requests to a bucket grows.  There is [rate limiting or throttling of requests to S3](https://repost.aws/knowledge-center/http-5xx-errors-s3): for example, you can send 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per prefix in an S3 bucket.  Even if the Spark job hits throttling, retries usually work around it at the cost of a longer job runtime, and the S3 connector's retry settings can be tuned for that (sketch below).
   
   Looping in the AWS EMR folks.  @umehrot2 @rahil-c, have you seen such an issue for Hudi tables on S3?

