Hi all,

I'm trying to run a Spark job (written in Scala) that uses addFile to
download some small files to each node. However, one of the downloaded
files has an incorrect size (the others are fine), which causes an error
when the code uses it.

I looked further into the issue and hexdumped both the original and the
Spark-retrieved file. The beginnings of the two files are identical, but
the Spark-retrieved one is truncated. The cut-off position looked random
at first, but I noticed that it is exactly half the size of the original
file. I'm not sure whether that is a coincidence.

The original file has a size of 296 bytes (the others are larger, around
13 KB).
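
For anyone who wants to repeat the comparison without hexdump, here is a
minimal sketch of the check in plain Scala (no Spark involved). The object
name, the method name, and the file paths are placeholders for illustration,
not the ones from my job. It reports both sizes and whether the retrieved
file is a byte-for-byte prefix of the original.

```scala
import java.nio.file.{Files, Paths}
import java.util.Arrays

object FileCompare {
  // Returns (original size, retrieved size, isPrefix), where isPrefix is true
  // when the retrieved bytes equal the first bytes of the original file.
  def compareFiles(originalPath: String, retrievedPath: String): (Int, Int, Boolean) = {
    val original  = Files.readAllBytes(Paths.get(originalPath))
    val retrieved = Files.readAllBytes(Paths.get(retrievedPath))
    // take() never reads past the end, so a longer retrieved file simply
    // fails the equality check below.
    val isPrefix = Arrays.equals(original.take(retrieved.length), retrieved)
    (original.length, retrieved.length, isPrefix)
  }
}
```

Calling FileCompare.compareFiles("original.bin", "retrieved.bin") (both
paths hypothetical) on a truncated copy should give a smaller second size
and isPrefix = true, which matches what I see in the hexdump.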

I'm fairly new to Spark, so I'm stuck trying to figure out what the
problem is. Does anyone have an idea of what might be causing this?

Thank you,
Bernardo
