Hi All,

Our data job is very complex (e.g. 100+ joins), and we recently switched
from RDD to Dataset.

Since the switch, our unit tests take much longer. We profiled them and
found that the planning phase is what's slow, not execution.

I wonder if anyone has encountered this issue before and if there's a way
to make the planning phase faster (e.g. by disabling certain optimizer
rules).
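
To make the question concrete, here is a rough sketch of the kind of thing
I have in mind. It is not our actual job: the two tiny DataFrames stand in
for the real inputs, and the object name is made up. It assumes a Spark
version that supports the spark.sql.optimizer.excludedRules setting (2.4 or
later, as far as I know). The rule named below is only an illustration; I
have not verified that excluding it helps, and I understand some rules
cannot be excluded at all.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical stand-in for our job: times planning separately from execution.
object PlanningTimeSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("planning-time-sketch")
      // Assumption: this rule is just an example of something one might exclude.
      .config("spark.sql.optimizer.excludedRules",
        "org.apache.spark.sql.catalyst.optimizer.ReorderJoin")
      .getOrCreate()
    import spark.implicits._

    // Placeholder inputs; the real job has 100+ joins.
    val left = Seq((1, "a"), (2, "b")).toDF("id", "l")
    val right = Seq((1, "x"), (2, "y")).toDF("id", "r")
    val joined = left.join(right, "id")

    val start = System.nanoTime()
    // Accessing executedPlan forces optimization and physical planning
    // without running the job. Analysis already happened when the
    // DataFrame was built, so it is not included in this measurement.
    val plan = joined.queryExecution.executedPlan
    val planningMs = (System.nanoTime() - start) / 1e6
    println(f"planning took $planningMs%.1f ms")

    spark.stop()
  }
}
```

Comparing this number with and without the excludedRules line is roughly
the experiment I would like to run, if someone can suggest which rules are
worth trying.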

Any thoughts or input would be appreciated.

Thank you,
Tanin
