[ https://issues.apache.org/jira/browse/IMPALA-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022503#comment-17022503 ]
Steve Loughran commented on IMPALA-9112:
----------------------------------------

You can use createFile(path, false) to say "overwrite is not allowed".

On HDFS and the native FS this is an atomic create-no-overwrite call; for S3 and ABFS we do a HEAD.

Saying overwrite=false means that the S3A client will do that HEAD (and so may still produce a 404 that S3 caches), but you at least save the overhead of your own round-trip call to the store.

> Consider removing hdfsExists calls when writing files to S3
> -----------------------------------------------------------
>
>                 Key: IMPALA-9112
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9112
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Major
>
> There are a few places in the backend where we call {{hdfsExists}} before
> writing out a file. This can cause issues when writing data to S3, because S3
> can cache 404 Not Found errors. This issue manifests itself with errors such
> as:
> {code:java}
> ERROR: Error(s) moving partition files.
> First error (of 1) was: Hdfs op
> (RENAME
> s3a://[bucket-name]/[table-name]/_impala_insert_staging/3943ae7ccf00711e_59606d8800000000/.3943ae7ccf00711e-59606d880000000b_562151879_dir/year=2015/3943ae7ccf00711e-59606d880000000b_1994902389_data.0.parq
> TO
> s3a://[bucket-name]/[table-name]/3943ae7ccf00711e-59606d880000000b_1994902389_data.0.parq)
> failed, error was:
> s3a://[bucket-name]/[table-name]/_impala_insert_staging/3943ae7ccf00711e_59606d8800000000/.3943ae7ccf00711e-59606d880000000b_562151879_dir/year=2015/3943ae7ccf00711e-59606d880000000b_1994902389_data.0.parq
> Error(5): Input/output error
> Root cause: AmazonS3Exception: Not Found (Service: Amazon S3; Status Code:
> 404; Error Code: 404 Not Found; Request ID: []; S3 Extended Request ID:
> []){code}
> HADOOP-13884, HADOOP-13950, HADOOP-16490 - the HDFS clients allow specifying
> an "overwrite" option when creating a file; this can avoid doing any HEAD
> requests when opening a file.
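
For anyone following along, here is a rough sketch of the create-no-overwrite idea from the comment, contrasted with an exists-then-create sequence. It is only an analogy on the local filesystem using java.nio, not Impala or S3A code: the class name CreateNoOverwrite and the helper writeNew are made up for illustration. In the Hadoop Java API the corresponding call would presumably be FileSystem.create(path, false), which is not exercised here because it needs the Hadoop libraries on the classpath.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of Impala or Hadoop.
class CreateNoOverwrite {

    /**
     * Writes data to a new file and refuses to overwrite an existing one.
     * Returns true if the file was created, false if one was already there.
     */
    static boolean writeNew(Path path, byte[] data) throws IOException {
        // CREATE_NEW asks the filesystem for an atomic create-or-fail, so the
        // caller does not issue its own exists() probe before writing.
        try (OutputStream out = Files.newOutputStream(
                path, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE)) {
            out.write(data);
            return true;
        } catch (FileAlreadyExistsException e) {
            // The existing file is left untouched.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("create-no-overwrite");
        Path file = dir.resolve("data.0.parq");
        byte[] payload = "example".getBytes(StandardCharsets.UTF_8);

        System.out.println("first create succeeded:  " + writeNew(file, payload));
        System.out.println("second create succeeded: " + writeNew(file, payload));
    }
}
```

The point of the pattern is that the "does it exist?" decision is pushed into the create call itself. On a store such as S3 the client may still issue a HEAD internally, as noted in the comment, but the application saves its own separate round trip.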