[ 
https://issues.apache.org/jira/browse/IMPALA-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe McDonnell resolved IMPALA-9112.
-----------------------------------
    Fix Version/s: Not Applicable
       Resolution: Won't Fix

The consistency aspect of this issue no longer applies. S3 now provides strong 
read-after-write consistency: [https://aws.amazon.com/s3/consistency/]

I'm going to close this. We can consider performance aspects in a separate JIRA.

> Consider removing hdfsExists calls when writing files to S3
> -----------------------------------------------------------
>
>                 Key: IMPALA-9112
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9112
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>             Fix For: Not Applicable
>
>
> There are a few places in the backend where we call {{hdfsExists}} before 
> writing out a file. This can cause issues when writing data to S3, because S3 
> can cache the 404 Not Found response from the existence check, so a later 
> request for the newly written file may still fail. The issue manifests itself 
> with errors such as:
> {code:java}
> ERROR: Error(s) moving partition files. First error (of 1) was: Hdfs op (RENAME s3a://[bucket-name]/[table-name]/_impala_insert_staging/3943ae7ccf00711e_59606d8800000000/.3943ae7ccf00711e-59606d880000000b_562151879_dir/year=2015/3943ae7ccf00711e-59606d880000000b_1994902389_data.0.parq TO s3a://[bucket-name]/[table-name]/3943ae7ccf00711e-59606d880000000b_1994902389_data.0.parq) failed, error was: s3a://[bucket-name]/[table-name]/_impala_insert_staging/3943ae7ccf00711e_59606d8800000000/.3943ae7ccf00711e-59606d880000000b_562151879_dir/year=2015/3943ae7ccf00711e-59606d880000000b_1994902389_data.0.parq
> Error(5): Input/output error
> Root cause: AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: []; S3 Extended Request ID: [])
> {code}
> See HADOOP-13884, HADOOP-13950, and HADOOP-16490. The HDFS clients allow 
> specifying an "overwrite" option when creating a file; this can avoid doing 
> any HEAD requests when opening a file for writing.
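
To illustrate the interaction described in the quoted issue, here is a minimal sketch in Python. It is a toy in-memory model, not Impala's backend code and not S3's actual implementation: the class `ToyObjectStore`, its negative-cache behavior, and the helper functions `write_with_exists_check` and `write_with_overwrite` are all hypothetical names introduced for illustration. The model assumes that a HEAD on a missing key is remembered as a 404 and that a later PUT does not clear it, which is a simplification of the pre-strong-consistency behavior.

```python
class ToyObjectStore:
    """Hypothetical in-memory object store with a negative (404) cache.

    Assumption: a HEAD on a missing key is cached as "not found", and a
    later PUT does not invalidate that cached answer.
    """

    def __init__(self):
        self._objects = {}
        self._cached_404 = set()
        self.head_requests = 0

    def head(self, key):
        """Return True if the key is visible; count the request."""
        self.head_requests += 1
        if key in self._cached_404:
            return False
        if key not in self._objects:
            self._cached_404.add(key)  # remember the 404
            return False
        return True

    def put(self, key, data):
        """Store the object; the cached 404 (if any) is left in place."""
        self._objects[key] = data

    def get(self, key):
        """Return the object, or raise if it is missing or masked by a cached 404."""
        if key in self._cached_404 or key not in self._objects:
            raise FileNotFoundError(key)
        return self._objects[key]


def write_with_exists_check(store, key, data):
    """Probe first, then write (the pattern the issue proposes removing)."""
    if store.head(key):
        raise FileExistsError(key)
    store.put(key, data)


def write_with_overwrite(store, key, data):
    """Write unconditionally, as a create-with-overwrite would: no probe."""
    store.put(key, data)


# Usage: the probe poisons the cache, so the follow-up read fails.
probed = ToyObjectStore()
write_with_exists_check(probed, "table/data.0.parq", b"rows")
try:
    probed.get("table/data.0.parq")
    probe_result = "found"
except FileNotFoundError:
    probe_result = "404"

# Usage: no probe, no cached 404, and the read succeeds.
direct = ToyObjectStore()
write_with_overwrite(direct, "table/data.0.parq", b"rows")
direct_result = direct.get("table/data.0.parq")
```

In this model the overwrite path issues zero HEAD requests, which is the saving the issue refers to. The trade-off is that an unconditional write replaces any existing object instead of failing, so it only fits call sites where overwriting is acceptable.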



