GitHub user dalaro commented on the issue:
https://github.com/apache/incubator-tinkerpop/pull/325
After discussion on Slack, I think @okram and I tentatively agreed to
proceed with this PR after I do additional work to save users who have custom
serializers the effort of maintaining separate `IoRegistry` and
`spark.kryo.registrator` implementations with near-identical contents.
I may discover other complications during the process, but I think this
involves two changes:
1. I will attempt to subclass KryoSerializer so that I can access the
SparkConf passed to its constructor and check for
`GryoPool.CONFIG_IO_REGISTRY` (similar to what GryoSerializer does now), then
apply any registrations found therein so long as each registration either:
* specifies no explicit serializer (using Kryo's internal default) or
* specifies a shim serializer
In particular, if the registry contains an old-style TP shaded Kryo
serializer that hasn't been ported to the shim, then the KryoSerializer
subclass will throw an exception.
This change is necessary to support custom-serialized,
`IoRegistry`-listed datatypes that appear in most Spark data outside of
closures (say, in the RDD itself).
2. I will replace current callsites of `HadoopPools.initialize(conf)` with
some other static method that calls `HadoopPools.initialize(conf)` and then
calls some roughly equivalent `initialize(conf)` method for the static KryoPool
backing `KryoPoolShimService`.
This change is necessary to support custom-serialized,
`IoRegistry`-listed datatypes that appear in closures.
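
To make the acceptance rule in change 1 concrete, here is a minimal, self-contained sketch of the filtering logic. It is not the actual TinkerPop or Spark code: `RegistrationFilterSketch`, `ShimSerializer`, `Registration`, and `acceptable` are hypothetical stand-ins invented for illustration, and the real `KryoSerializer` subclass would read its registrations from the `IoRegistry` named in the `SparkConf` rather than from a plain list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; none of these are TinkerPop or Spark classes.
final class RegistrationFilterSketch {

    /** Marker for serializers written against the shim abstraction (assumed name). */
    interface ShimSerializer {}

    /** One registry entry: a type plus an optional serializer (null means "use Kryo's default"). */
    static final class Registration {
        final Class<?> type;
        final Object serializer;

        Registration(Class<?> type, Object serializer) {
            this.type = type;
            this.serializer = serializer;
        }
    }

    /**
     * Returns the registrations that may be applied: those with no explicit
     * serializer and those with a shim serializer. Anything else (for example
     * an old-style serializer not yet ported to the shim) triggers an exception.
     */
    static List<Registration> acceptable(List<Registration> registry) {
        List<Registration> accepted = new ArrayList<>();
        for (Registration r : registry) {
            if (r.serializer == null || r.serializer instanceof ShimSerializer) {
                accepted.add(r);
            } else {
                throw new IllegalStateException(
                    "Serializer for " + r.type.getName() + " has not been ported to the shim");
            }
        }
        return accepted;
    }
}
```

Failing fast on an unported serializer, rather than silently skipping it, matches the behavior described above: a skipped registration would otherwise surface later as a harder-to-diagnose serialization error inside a Spark job.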