Hi all,

I’m interested in hearing the community’s thoughts on best practices to do
integration testing for spark sql jobs. We run a lot of our jobs with cloud
infrastructure and hdfs - this makes debugging a challenge for us,
especially with problems that don’t occur from just initializing a
sparksession locally or testing with spark-shell. Ideally, we’d like some
sort of docker container emulating hdfs and spark cluster mode, that you
can run locally.

Any test framework, tips, or examples people can share? Thanks!
-- 
Cheers,
Ruijing Li

Reply via email to