Hi,

Occasionally, Spark writes Parquet files that are only 4 bytes long. Their
entire content is the magic string "PAR1". Our Spark ETL jobs cannot read
these corrupted files and skip the whole partition that contains such a
poison-pill file, which causes significant data loss.

Spark also writes 0-byte Parquet files, but Spark can handle those.
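
For anyone who wants to look for these files, here is a minimal sketch of how
the two cases can be told apart. It relies on the Parquet file layout, where a
file starts with the 4-byte magic "PAR1" and ends with a 4-byte footer length
followed by the same magic, so anything shorter than 12 bytes cannot be a
complete file. The function names, the status labels, the ".parquet" suffix
check, and the use of a local directory are all assumptions for illustration;
on HDFS or an object store the listing and reads would have to go through that
store's client instead.

```python
import os

PARQUET_MAGIC = b"PAR1"
# Leading magic (4) + footer length field (4) + trailing magic (4).
MIN_PARQUET_SIZE = 12


def classify_parquet_file(path):
    """Classify a local file by size and magic bytes only.

    Hypothetical helper: 'magic_ok' means the cheap checks passed,
    not that the footer or data pages are actually readable.
    """
    size = os.path.getsize(path)
    if size == 0:
        return "empty"
    if size < MIN_PARQUET_SIZE:
        # Covers the 4-byte "PAR1"-only files described above.
        return "too_small"
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)
        tail = f.read(4)
    if head != PARQUET_MAGIC or tail != PARQUET_MAGIC:
        return "bad_magic"
    return "magic_ok"


def find_suspect_files(root):
    """Walk root and return (path, status) for each suspect .parquet file."""
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            # Assumption: data files carry a .parquet suffix.
            if not name.endswith(".parquet"):
                continue
            path = os.path.join(dirpath, name)
            status = classify_parquet_file(path)
            if status != "magic_ok":
                suspects.append((path, status))
    return suspects
```

Running find_suspect_files() on a partition directory before the ETL job
starts would list the 0-byte and 4-byte files separately, so the 4-byte ones
can be quarantined without touching the rest of the partition.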

What could cause Spark to write such 4-byte files? Any clue is
appreciated!
