I see. So there are actually 3000 tasks rather than 3000 jobs, right?
Would you mind providing the full stack trace of the GC issue? At first
I thought it was identical to the _metadata one in the mail thread you
mentioned.
Cheng
On 1/11/16 5:30 PM, Gavin Yue wrote:
Here is how I set the conf:
Hey Gavin,
Could you please provide a snippet of your code to show how you
disabled "parquet.enable.summary-metadata" and wrote the files?
In particular, you mentioned that you saw "3000 jobs" fail. Were you writing
each Parquet file with an individual job? (Usually people use
Hey,
I am trying to convert a bunch of JSON files into Parquet, which would
output over 7000 Parquet files. That is too many files, so I want
to repartition by id into 3000 partitions.
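
To illustrate what repartitioning by id means here, the following is a minimal pure-Python sketch, not the actual Spark job. The helper names (partition_for, repartition_by_id) and the use of CRC32 as the hash are assumptions made for illustration; Spark uses its own hash function internally.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3000  # target partition count mentioned in this message


def partition_for(record_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map an id to a partition index with a stable hash.

    CRC32 is an illustrative choice (assumption), not Spark's hash.
    """
    return zlib.crc32(record_id.encode("utf-8")) % num_partitions


def repartition_by_id(records, num_partitions: int = NUM_PARTITIONS):
    """Group records so that all records sharing an id land in one partition."""
    partitions = defaultdict(list)
    for record in records:
        index = partition_for(record["id"], num_partitions)
        partitions[index].append(record)
    return partitions
```

Every record with the same id ends up in the same partition, so the number of output files is bounded by the partition count rather than by the number of input files.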
But I got a GC error like this one: