[ https://issues.apache.org/jira/browse/SPARK-27098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16787792#comment-16787792 ]
Steve Loughran commented on SPARK-27098:
----------------------------------------

It was my suggestion to file this.

* If this were AWS S3, this would "just" be AWS S3's eventual consistency surfacing on renames: the directory listing needed to mimic the rename misses the newly committed files, most likely when the write happens immediately before the rename (v2 task commit, v1 job commit plus the straggler tasks). The solutions would be the standard ones: use a DynamoDB table for listing consistency, or a zero-rename committer which doesn't need consistent listings. (Or: Iceberg.)
* But this is Ceph, which is, AFAIK, consistent.

# Who has played with Ceph as the destination store for queries? Through the S3A libraries?
# What do people think can be enabled/added to the Spark-level committers to detect this problem? The tasks know the files they've actually created and can report them to the job committer; it could do a post-job-commit audit of the output and fail if something is missing.

[~mwlon] is going to be the one trying to debug this. Martin:

# You can get some more logging of what's going on in the S3A code by setting the log level for {{org.apache.hadoop.fs.s3a.S3AFileSystem}} to debug and looking for log entries beginning "Rename path" (see the sketch below this list). At least I think so; that 2.7.x codebase is 3+ years old, frozen for all but security fixes for 12 months, and never going to get another release (related to the AWS SDK, ironically).
# The Hadoop 2.9.x releases do have S3Guard, and while using a remote DynamoDB table to add consistency to a local Ceph store is pretty inefficient, it'd be interesting to see whether enabling it makes this problem go away. In which case, you've just found a bug in Ceph.
# [Ryan's S3 committers|https://github.com/rdblue/s3committer] do work with Hadoop 2.7.x. Try them.
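As a reference for point 1, a minimal sketch of turning that logger up to DEBUG from the driver, assuming the log4j 1.x that ships with Spark 2.4 / Hadoop 2.7 (the equivalent {{conf/log4j.properties}} line would be {{log4j.logger.org.apache.hadoop.fs.s3a.S3AFileSystem=DEBUG}}; executor JVMs need it in their own log4j configuration to capture task-commit renames):

```
import org.apache.log4j.{Level, Logger}

// Raise the S3A filesystem client's log level in this JVM so that entries
// beginning "Rename path" show up in the log during commit.
Logger.getLogger("org.apache.hadoop.fs.s3a.S3AFileSystem").setLevel(Level.DEBUG)
```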
> Flaky missing file parts when writing to Ceph without error
> -----------------------------------------------------------
>
>                 Key: SPARK-27098
>                 URL: https://issues.apache.org/jira/browse/SPARK-27098
>             Project: Spark
>          Issue Type: Bug
>          Components: Input/Output
>    Affects Versions: 2.4.0
>            Reporter: Martin Loncaric
>            Priority: Major
>
> https://stackoverflow.com/questions/54935822/spark-s3a-write-omits-upload-part-without-failure/55031233?noredirect=1#comment96835218_55031233
> Using Spark 2.4.0 with Hadoop 2.7, hadoop-aws 2.7.5, and the Ceph S3 endpoint. Occasionally a file part will be missing; i.e. part 00003 here:
> ```
> > aws s3 ls my-bucket/folder/
> 2019-02-28 13:07:21          0 _SUCCESS
> 2019-02-28 13:06:58   79428651 part-00000-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:06:59   79586172 part-00001-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:00   79561910 part-00002-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:01   79192617 part-00004-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:07   79364413 part-00005-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:08   79623254 part-00006-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:10   79445030 part-00007-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:10   79474923 part-00008-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:11   79477310 part-00009-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:12   79331453 part-00010-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:13   79567600 part-00011-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:13   79388012 part-00012-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:14   79308387 part-00013-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:15   79455483 part-00014-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:17   79512342 part-00015-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:18   79403307 part-00016-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:18   79617769 part-00017-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:19   79333534 part-00018-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:20   79543324 part-00019-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> ```
> However, the write succeeds and leaves a _SUCCESS file.
> This can be caught by additionally checking afterward whether the number of written file parts agrees with the number of partitions, but Spark should at least fail on its own and leave a meaningful stack trace in this case.
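As a reference for the workaround described in the last paragraph, a minimal sketch of such a post-write check, assuming the write produces exactly one part file per partition of the DataFrame (which may not hold for empty partitions); the helper name {{writeAndVerify}} is purely illustrative:

```
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.{DataFrame, SparkSession}

// Illustrative helper: write a DataFrame as parquet, then list the output
// directory and fail loudly if the number of part files does not match the
// number of partitions that were supposed to be written.
def writeAndVerify(spark: SparkSession, df: DataFrame, output: String): Unit = {
  val expectedParts = df.rdd.getNumPartitions // assumes one part file per partition
  df.write.parquet(output)

  val outPath = new Path(output)
  val fs = outPath.getFileSystem(spark.sparkContext.hadoopConfiguration)
  val actualParts = fs.listStatus(outPath)
    .map(_.getPath.getName)
    .count(_.startsWith("part-"))

  if (actualParts != expectedParts) {
    throw new IllegalStateException(
      s"Expected $expectedParts part files under $output but found $actualParts")
  }
}
```

Against the listing above, a check like this would report 19 part files where 20 partitions were written, rather than leaving the _SUCCESS marker as the only signal.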