[ https://issues.apache.org/jira/browse/SPARK-27098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16793082#comment-16793082 ]
Martin Loncaric commented on SPARK-27098:
-----------------------------------------

[~ste...@apache.org] Does this make more sense to you? This seems to suggest a bug in either Spark or Hadoop, but do you have a better idea of where to look?

> Flaky missing file parts when writing to Ceph without error
> -----------------------------------------------------------
>
>                 Key: SPARK-27098
>                 URL: https://issues.apache.org/jira/browse/SPARK-27098
>             Project: Spark
>          Issue Type: Bug
>          Components: Input/Output
>    Affects Versions: 2.4.0
>            Reporter: Martin Loncaric
>            Priority: Major
>         Attachments: sanitized_stdout_00001.txt
>
>
> https://stackoverflow.com/questions/54935822/spark-s3a-write-omits-upload-part-without-failure/55031233?noredirect=1#comment96835218_55031233
> Using Spark 2.4.0 with Hadoop 2.7, hadoop-aws 2.7.5, and the Ceph S3 endpoint,
> occasionally a file part will be missing, e.g. part 00003 here:
> ```
> > aws s3 ls my-bucket/folder/
> 2019-02-28 13:07:21          0 _SUCCESS
> 2019-02-28 13:06:58   79428651 part-00000-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:06:59   79586172 part-00001-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:00   79561910 part-00002-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:01   79192617 part-00004-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:07   79364413 part-00005-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:08   79623254 part-00006-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:10   79445030 part-00007-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:10   79474923 part-00008-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:11   79477310 part-00009-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:12   79331453 part-00010-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:13   79567600 part-00011-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:13   79388012 part-00012-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:14   79308387 part-00013-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:15   79455483 part-00014-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:17   79512342 part-00015-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:18   79403307 part-00016-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:18   79617769 part-00017-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:19   79333534 part-00018-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> 2019-02-28 13:07:20   79543324 part-00019-5789ebf5-b55d-4715-8bb5-dfc5c4e4b999-c000.snappy.parquet
> ```
> However, the write succeeds and leaves a _SUCCESS file.
> This can be caught by additionally checking afterward whether the number of
> written file parts agrees with the number of partitions, but Spark should at
> least fail on its own and leave a meaningful stack trace in this case.
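
The post-write check described in the issue (comparing the written file parts against the number of partitions) could be sketched roughly as follows. This is a minimal illustration, not part of Spark or Hadoop: `missing_parts` is a hypothetical helper, the file-name pattern is inferred from the listing in the issue, and it assumes every partition is non-empty and produces exactly one part file. How the file names are obtained (for example from `aws s3 ls` output) is left to the caller.

```python
import re

# Pattern inferred from the listing in the issue: part-NNNNN-<uuid>-c000.<ext>
# (an assumption; other Spark versions or writers may name files differently).
PART_RE = re.compile(r"^part-(\d{5})-")


def missing_parts(file_names, expected_partitions):
    """Return the sorted partition indices that have no matching part file.

    Hypothetical helper: assumes one part file per partition.
    """
    found = set()
    for name in file_names:
        # Accept full keys such as "folder/part-00000-..." by using the base name.
        base = name.rsplit("/", 1)[-1]
        match = PART_RE.match(base)
        if match:
            found.add(int(match.group(1)))
    return sorted(set(range(expected_partitions)) - found)


if __name__ == "__main__":
    uuid = "5789ebf5-b55d-4715-8bb5-dfc5c4e4b999"
    # Rebuild the listing from the issue: 20 partitions, part 00003 absent.
    listing = ["_SUCCESS"] + [
        f"part-{i:05d}-{uuid}-c000.snappy.parquet" for i in range(20) if i != 3
    ]
    print(missing_parts(listing, 20))  # prints [3]
```

A job could fail explicitly when the returned list is non-empty, rather than trusting the presence of `_SUCCESS` alone.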