Hi guys,
I get this error after 5 hours of processing. I do a lot of joins: 14 left
joins
with small tables:



Everything looked OK in the Spark UI and the console log, but when it saves
the result of the last join I get this error:

Py4JJavaError: An error occurred while calling o115.parquet. _metadata is
not a Parquet file (too small)

I use 4 containers, 26 GB each, and 8 cores. I increased the number of
partitions and tried a broadcast join, without success. I have the log file,
but it is large (57 MB), so I can't share it with you.

I use PySpark 1.5.0 on Cloudera 5.5.1 with YARN, and I use
HiveContext to work with the data.
