I have written a Spark Streaming application that reads data from Kafka,
converts the JSON data to Parquet, and saves it to HDFS.
What puzzles me is that the app's processing time in YARN mode is 20% to
50% longer than in local mode. My cluster has three nodes with three
NodeManagers, and all three hosts have the same hardware: 40 cores and
256 GB of memory.

Why is YARN mode slower, and how can I fix it?

Regards,
Junfeng Chen
