Hi, Occasionally, Spark generates Parquet files that are only 4 bytes long; their entire content is "PAR1". Our Spark ETL jobs cannot handle such corrupted files and skip the whole partition containing these poison-pill files, causing significant data loss.
Spark also generates 0-byte Parquet files, but Spark can handle those. What could cause Spark to generate such 4-byte files? Any clue is appreciated!
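
For context, "PAR1" is the magic number that the Parquet format writes at both the start and the end of a file, so a 4-byte file holds only the leading magic and no footer. Below is a minimal sketch of how such files might be detected before a job reads them. It assumes the files are unencrypted, sit on a local or mounted filesystem, and use a `.parquet` suffix. The function names are hypothetical, and on HDFS or S3 the same check would need that store's own client.

```python
import os

PARQUET_MAGIC = b"PAR1"
# Smallest structurally possible file:
# leading magic (4) + footer length (4) + trailing magic (4).
MIN_PARQUET_SIZE = 12


def is_truncated_parquet(path):
    """Return True if a non-empty file cannot be a complete Parquet file."""
    size = os.path.getsize(path)
    if size == 0:
        # 0-byte files are reported as already handled, so they are not flagged here.
        return False
    if size < MIN_PARQUET_SIZE:
        # Covers the 4-byte "PAR1"-only case.
        return True
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, os.SEEK_END)
        tail = f.read(4)
    return head != PARQUET_MAGIC or tail != PARQUET_MAGIC


def find_truncated_parquet(root):
    """Walk `root` and list .parquet files that fail the structural check."""
    bad = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(".parquet"):
                full = os.path.join(dirpath, name)
                if is_truncated_parquet(full):
                    bad.append(full)
    return sorted(bad)
```

This only checks the file's size and its leading and trailing magic bytes. It does not validate the footer metadata, so a file that passes may still be unreadable for other reasons.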