Hi Colleagues,

We need to call a Scala class from PySpark in an IPython notebook.
We tried something like below:

    from py4j.java_gateway import java_import
    java_import(sparkContext._jvm, '<mynamespace>')
    myScalaClass = sparkContext._jvm.SimpleScalaClass()
    myScalaClass.sayHello("World")

This works fine. But when we try to pass the SparkContext to our class, it fails. Calling

    myContext = _jvm.MySQLContext(sparkContext)

fails with:

    AttributeError                 Traceback (most recent call last)
    <ipython-input-19-34330244f574> in <module>()
    ----> 1 z = _jvm.MySQLContext(sparkContext)

    C:\Users\i033085\spark\python\lib\py4j-0.8.2.1-src.zip\py4j\java_gateway.py in __call__(self, *args)
        690
        691         args_command = ''.join(
    --> 692             [get_command_part(arg, self._pool) for arg in new_args])
        693
        694         command = CONSTRUCTOR_COMMAND_NAME +\

    C:\Users\i033085\spark\python\lib\py4j-0.8.2.1-src.zip\py4j\protocol.py in get_command_part(parameter, python_proxy_pool)
        263             command_part += ';' + interface
        264         else:
    --> 265             command_part = REFERENCE_TYPE + parameter._get_object_id()
        266
        267         command_part += '\n'

    AttributeError: 'SparkContext' object has no attribute '_get_object_id'

And

    myContext = _jvm.MySQLContext(sparkContext._jsc)

fails with:

    Constructor org.apache.spark.sql.MySQLContext([class org.apache.spark.api.java.JavaSparkContext]) does not exist

Would this be possible, or are there serialization issues that make it impossible? If not, what options do we have to instantiate our own SQLContext, written in Scala, from PySpark?

Best Regards,
Santosh
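For reference, here is a sketch of the unwrapping approach we have been looking at. The first error happens because a Python SparkContext is not a py4j object and cannot cross the gateway; the second because `sparkContext._jsc` is a JavaSparkContext, not the Scala SparkContext the constructor expects. Calling `.sc()` on the JavaSparkContext should yield the underlying Scala SparkContext. Note that `MySQLContext` and its package path are our own class, and this assumes its constructor takes a Scala `SparkContext`:

    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()

    # sc._jsc is the py4j handle to the JavaSparkContext wrapper.
    # JavaSparkContext.sc() returns the underlying Scala
    # org.apache.spark.SparkContext living on the JVM side.
    scala_sc = sc._jsc.sc()

    # Hypothetical: assumes our MySQLContext(sc: SparkContext) is on
    # the driver classpath (e.g. shipped via --jars).
    my_context = sc._jvm.org.apache.spark.sql.MySQLContext(scala_sc)

Does this look like the right direction, or is there a cleaner way to hand the context across?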