I finally isolated the issue: it is related to the ActorSystem I reuse from
SparkEnv.get.actorSystem. This ActorSystem contains the configuration
defined in my application jar's reference.conf both in the local cluster case
and when I use it directly in the buildScan method of my BaseRelation
extension. However, when it is used inside the RDD that buildScan returns,
the configuration is missing.
I work around the problem by checking whether my configuration exists in
SparkEnv.get.actorSystem.settings.config. If it does not, I create a new
ActorSystem using my class's classLoader to force the configuration to be
read from my application jar:
import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory
val classLoader = this.getClass.getClassLoader
// force config reading from my classloader
val myconfig = ConfigFactory.load(classLoader)
ActorSystem("somename..", myconfig, classLoader)
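For completeness, here is a minimal sketch of the check-then-fallback logic described above. The object name ActorSystemChooser, the method name choose, the requiredPath parameter, and the system name "fallback-system" are all made up for illustration; only ActorSystem, settings.config, hasPath and ConfigFactory.load(classLoader) are actual Akka / Typesafe Config calls.

```scala
import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

// Hypothetical helper, not part of Spark or Akka.
object ActorSystemChooser {
  // Returns `existing` if its config already contains `requiredPath`;
  // otherwise creates a new ActorSystem whose config is loaded through
  // `classLoader`, so reference.conf in the application jar is picked up.
  def choose(existing: ActorSystem,
             requiredPath: String,
             classLoader: ClassLoader): ActorSystem =
    if (existing.settings.config.hasPath(requiredPath)) existing
    else {
      val config = ConfigFactory.load(classLoader)
      // "fallback-system" is an illustrative name only
      ActorSystem("fallback-system", config, classLoader)
    }
}
```

In my case the call would look roughly like ActorSystemChooser.choose(SparkEnv.get.actorSystem, "my-app", this.getClass.getClassLoader), where "my-app" stands in for whatever top-level key the application's reference.conf defines. A fallback system created this way is not managed by Spark, so it presumably has to be shut down by the code that created it.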
I wonder whether this different behavior of SparkEnv.get.actorSystem is
working as designed, or whether something is missing in the executor setup
for this custom-RDD-driven execution case.
Thanks.
Yang