Hi,

I have about 5 GB of data distributed across 597 sequence files. My application does a flatMap on the union of all the RDDs created from the individual files. With the default stack size, the flatMap statement throws java.lang.StackOverflowError. I increased the stack size to 1g (both system and JVM). Now the job prints "Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory" and does not move forward. It just keeps printing that message in a continuous loop.

Any ideas or suggestions would help.

Thx,
Archit