Hi Colleagues,

We need to call a Scala class from PySpark in an IPython notebook.

We tried something like the following:

from py4j.java_gateway import java_import

java_import(sparkContext._jvm, '<mynamespace>')

myScalaClass = sparkContext._jvm.SimpleScalaClass()
myScalaClass.sayHello("World")  # works fine

However, when we try to pass the SparkContext to our class, it fails:

myContext = _jvm.MySQLContext(sparkContext)


AttributeError                            Traceback (most recent call last)
<ipython-input-19-34330244f574> in <module>()
----> 1 z = _jvm.MySQLContext(sparkContext)

C:\Users\i033085\spark\python\lib\py4j-0.8.2.1-src.zip\py4j\java_gateway.py in __call__(self, *args)
    690
    691         args_command = ''.join(
--> 692                 [get_command_part(arg, self._pool) for arg in new_args])
    693
    694         command = CONSTRUCTOR_COMMAND_NAME +\

C:\Users\i033085\spark\python\lib\py4j-0.8.2.1-src.zip\py4j\protocol.py in get_command_part(parameter, python_proxy_pool)
    263             command_part += ';' + interface
    264     else:
--> 265         command_part = REFERENCE_TYPE + parameter._get_object_id()
    266
    267     command_part += '\n'

AttributeError: 'SparkContext' object has no attribute '_get_object_id'
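Reading the traceback, Py4J's get_command_part calls _get_object_id() on every argument it converts; PySpark's SparkContext is a pure-Python wrapper with no such attribute (only JVM proxy objects such as sparkContext._jsc carry one). A minimal pure-Python mock of that conversion step, to make the failure mode concrete (FakeJavaObject and FakePythonSparkContext are illustrative names of ours, not Py4J classes):

```python
REFERENCE_TYPE = 'r'  # mirrors py4j.protocol's prefix for object references

class FakeJavaObject:
    """Stands in for a Py4J JavaObject, e.g. sparkContext._jsc."""
    def __init__(self, object_id):
        self._object_id = object_id

    def _get_object_id(self):
        return self._object_id

class FakePythonSparkContext:
    """Stands in for pyspark's SparkContext: a plain Python wrapper."""
    pass

def get_command_part(parameter):
    # Simplified version of py4j.protocol.get_command_part's reference
    # branch: every non-primitive argument must expose _get_object_id().
    return REFERENCE_TYPE + parameter._get_object_id() + '\n'

print(repr(get_command_part(FakeJavaObject('o123'))))  # 'ro123\n'
try:
    get_command_part(FakePythonSparkContext())
except AttributeError as e:
    print(e)  # ... object has no attribute '_get_object_id'
```

So the Python-side SparkContext can never be sent across the gateway directly; only a JVM-side object reference can.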




And

myContext = _jvm.MySQLContext(sparkContext._jsc)

fails with:

Constructor org.apache.spark.sql.MySQLContext([class org.apache.spark.api.java.JavaSparkContext]) does not exist





Is this possible at all, or do serialization issues make it impossible?

If not, what options do we have to instantiate our own SQLContext, written in Scala, from PySpark?
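For completeness, the next variant we intend to try is sketched below — untested on our side. It assumes MySQLContext extends SQLContext and therefore takes a Scala SparkContext in its constructor; spark_context._jsc.sc() should unwrap the JavaSparkContext to that underlying Scala object:

```python
def make_my_sql_context(spark_context):
    """Sketch: instantiate org.apache.spark.sql.MySQLContext via Py4J.

    Assumes a constructor MySQLContext(sc: SparkContext). The second
    failure above suggests _jsc (a JavaSparkContext) is the wrong type;
    _jsc.sc() returns the Scala SparkContext that SQLContext expects.
    """
    from py4j.java_gateway import java_import  # deferred: needs a live gateway
    jvm = spark_context._jvm
    java_import(jvm, 'org.apache.spark.sql.MySQLContext')
    scala_sc = spark_context._jsc.sc()  # Scala SparkContext, not the Java wrapper
    return jvm.MySQLContext(scala_sc)
```

If anyone can confirm whether this pattern is the supported way, we would appreciate it.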



Best Regards,

Santosh



