Hi all, I would like to be able to compile Spark UDFs at runtime. Right now I am using Janino for that. My problem is that, in order to make the compiled functions visible to Spark, I have to set the Janino classloader (Janino gives me a classloader containing the compiled UDF classes) as the context class loader before I create the SparkSession. This approach works locally for debugging purposes, but it is not going to work in cluster mode, because the UDF classes will not be distributed to the worker nodes.
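For reference, here is a minimal sketch of the compile-at-runtime-then-swap-the-context-classloader pattern I mean. It uses the JDK's built-in `javax.tools` compiler instead of Janino (Janino offers the same idea without requiring a full JDK), and all class and method names are illustrative:

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Function;

public class RuntimeCompileDemo {

    // Compile a tiny UDF-like class on the fly, load it through a fresh
    // classloader, install that loader as the context classloader, and
    // invoke the function.
    public static int compileAndRun() throws Exception {
        Path dir = Files.createTempDirectory("udf");
        Path src = dir.resolve("MyUdf.java");
        Files.write(src, (
            "import java.util.function.Function;\n" +
            "public class MyUdf implements Function<Integer, Integer> {\n" +
            "  public Integer apply(Integer x) { return x * 2; }\n" +
            "}\n").getBytes("UTF-8"));

        // The JDK's in-process compiler (Janino plays this role in my setup).
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        int rc = compiler.run(null, null, null, src.toString());
        if (rc != 0) throw new IllegalStateException("compilation failed");

        // Load the freshly compiled class and make it visible via the
        // thread's context classloader -- the same trick described above.
        URLClassLoader loader = new URLClassLoader(
            new URL[]{dir.toUri().toURL()},
            RuntimeCompileDemo.class.getClassLoader());
        Thread.currentThread().setContextClassLoader(loader);

        @SuppressWarnings("unchecked")
        Function<Integer, Integer> udf =
            (Function<Integer, Integer>) loader.loadClass("MyUdf")
                .getDeclaredConstructor().newInstance();
        return udf.apply(21);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(compileAndRun());
    }
}
```

This works in a single JVM, which is exactly why it is fine locally: the driver sees the classes through the context classloader, but the executors never do.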
An alternative is to register the UDF via the Hive functionality and generate a temporary jar somewhere, which, at least in standalone cluster mode, will be made available to the Spark workers via the embedded HTTP server. As far as I understand, this is not going to work in YARN mode.

I am wondering how best to approach this problem. My current best idea is to develop a small Netty-based file server of my own and use it to distribute my custom jar, which can be created on the fly, to the workers in both standalone and YARN modes. Can I reference the jar as an HTTP URL using the extra driver options and then register the UDFs contained in that jar using the spark.udf().* methods?

Does anybody have any better ideas? Any assistance would be greatly appreciated!

Thanks,
Michael
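To make the question concrete, here is a sketch of what I have in mind for the URL-based variant. The host, port, jar name, and UDF class are all hypothetical, and I am assuming (from the Spark docs on addJar/spark.jars) that http:// URLs are accepted there; I have not verified this end to end:

```java
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.api.java.UDF1;
import org.apache.spark.sql.types.DataTypes;

public class RegisterRemoteUdf {
    public static void main(String[] args) throws Exception {
        // Hypothetical URL served by the custom file server; the assumption
        // is that spark.jars ships this jar to the executors.
        SparkSession spark = SparkSession.builder()
            .appName("dynamic-udf")
            .config("spark.jars", "http://driver-host:8080/generated-udfs.jar")
            .getOrCreate();

        // Load the generated UDF class (illustrative name) through the
        // context classloader and register it for SQL use.
        @SuppressWarnings("unchecked")
        UDF1<Integer, Integer> udf = (UDF1<Integer, Integer>)
            Thread.currentThread().getContextClassLoader()
                .loadClass("com.example.MyGeneratedUdf")
                .getDeclaredConstructor().newInstance();
        spark.udf().register("my_udf", udf, DataTypes.IntegerType);

        spark.sql("SELECT my_udf(21)").show();
        spark.stop();
    }
}
```

The open question for me is whether the executors resolve the UDF class at task deserialization time from the jar shipped via spark.jars, or whether something extra is needed on the driver side.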