Hello all,

Trying the example code from this package
(https://github.com/Parsely/pyspark-cassandra), I always get the error below.

Can you see what I am doing wrong? From Googling around, it seems the jar is
somehow not being found, although the Spark log shows the JAR was at least
processed.
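In case it helps with diagnosing: here is a small stdlib-only check that the class file is actually present inside the jar. The fully qualified name com.datastax.spark.connector.japi.CassandraJavaUtil is my assumption based on the connector's japi package; adjust the path and class name to your setup.

```python
import zipfile

def jar_contains_class(jar_path, class_name):
    """Return True if the jar (a zip archive) contains the given class.

    class_name is a fully qualified Java name, e.g. 'a.b.C'; inside the
    jar it is stored as the entry 'a/b/C.class'.
    """
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()

# Example (jar path and class name are assumptions, adjust to your setup):
# jar_contains_class("/spark/pyspark-cassandra-0.1-SNAPSHOT.jar",
#                    "com.datastax.spark.connector.japi.CassandraJavaUtil")
```

If the class is in the jar but the error persists, the jar is presumably not on the driver JVM's classpath at the time py4j resolves the name.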

Thank you so much.

I am using spark-1.2.1-bin-hadoop2.4.tgz.

test2.py is simply:

from pyspark.context import SparkConf
from pyspark_cassandra import CassandraSparkContext, saveToCassandra
conf = SparkConf().setAppName("PySpark Cassandra Sample Driver")
conf.set("spark.cassandra.connection.host", "devzero")
sc = CassandraSparkContext(conf=conf)

[root@devzero spark]# /usr/local/bin/docker-enter  spark-master bash
-c "/spark/bin/spark-submit --py-files /spark/pyspark_cassandra.py
--jars /spark/pyspark-cassandra-0.1-SNAPSHOT.jar --driver-class-path
/spark/pyspark-cassandra-0.1-SNAPSHOT.jar /spark/test2.py"
...
15/02/16 05:58:45 INFO Slf4jLogger: Slf4jLogger started
15/02/16 05:58:45 INFO Remoting: Starting remoting
15/02/16 05:58:45 INFO Remoting: Remoting started; listening on
addresses :[akka.tcp://sparkDriver@devzero:38917]
15/02/16 05:58:45 INFO Utils: Successfully started service
'sparkDriver' on port 38917.
15/02/16 05:58:45 INFO SparkEnv: Registering MapOutputTracker
15/02/16 05:58:45 INFO SparkEnv: Registering BlockManagerMaster
15/02/16 05:58:45 INFO DiskBlockManager: Created local directory at
/tmp/spark-6cdca68b-edec-4a31-b3c1-a7e9d60191e7/spark-0e977468-6e31-4bba-959a-135d9ebda193
15/02/16 05:58:45 INFO MemoryStore: MemoryStore started with capacity 265.4 MB
15/02/16 05:58:45 WARN NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where
applicable
15/02/16 05:58:46 INFO HttpFileServer: HTTP File server directory is
/tmp/spark-af61f7f5-7c0e-412c-8352-263338335fa5/spark-10b3891f-0321-44fe-ba60-1a8c102fd647
15/02/16 05:58:46 INFO HttpServer: Starting HTTP Server
15/02/16 05:58:46 INFO Utils: Successfully started service 'HTTP file
server' on port 56642.
15/02/16 05:58:46 INFO Utils: Successfully started service 'SparkUI'
on port 4040.
15/02/16 05:58:46 INFO SparkUI: Started SparkUI at http://devzero:4040
15/02/16 05:58:46 INFO SparkContext: Added JAR
file:/spark/pyspark-cassandra-0.1-SNAPSHOT.jar at
http://10.212.55.42:56642/jars/pyspark-cassandra-0.1-SNAPSHOT.jar with
timestamp 1424066326632
15/02/16 05:58:46 INFO Utils: Copying /spark/test2.py to
/tmp/spark-e8cc013e-faae-4208-8bcd-0bb6c00b1b6c/spark-54f2c41d-ae35-4efd-860c-2e5c60979b4c/test2.py
15/02/16 05:58:46 INFO SparkContext: Added file file:/spark/test2.py
at http://10.212.55.42:56642/files/test2.py with timestamp
1424066326633
15/02/16 05:58:46 INFO Utils: Copying /spark/pyspark_cassandra.py to
/tmp/spark-e8cc013e-faae-4208-8bcd-0bb6c00b1b6c/spark-54f2c41d-ae35-4efd-860c-2e5c60979b4c/pyspark_cassandra.py
15/02/16 05:58:46 INFO SparkContext: Added file
file:/spark/pyspark_cassandra.py at
http://10.212.55.42:56642/files/pyspark_cassandra.py with timestamp
1424066326642
15/02/16 05:58:46 INFO Executor: Starting executor ID <driver> on host localhost
15/02/16 05:58:46 INFO AkkaUtils: Connecting to HeartbeatReceiver:
akka.tcp://sparkDriver@devzero:38917/user/HeartbeatReceiver
15/02/16 05:58:46 INFO NettyBlockTransferService: Server created on 32895
15/02/16 05:58:46 INFO BlockManagerMaster: Trying to register BlockManager
15/02/16 05:58:46 INFO BlockManagerMasterActor: Registering block
manager localhost:32895 with 265.4 MB RAM, BlockManagerId(<driver>,
localhost, 32895)
15/02/16 05:58:46 INFO BlockManagerMaster: Registered BlockManager
15/02/16 05:58:47 INFO SparkUI: Stopped Spark web UI at http://devzero:4040
15/02/16 05:58:47 INFO DAGScheduler: Stopping DAGScheduler
15/02/16 05:58:48 INFO MapOutputTrackerMasterActor:
MapOutputTrackerActor stopped!
15/02/16 05:58:48 INFO MemoryStore: MemoryStore cleared
15/02/16 05:58:48 INFO BlockManager: BlockManager stopped
15/02/16 05:58:48 INFO BlockManagerMaster: BlockManagerMaster stopped
15/02/16 05:58:48 INFO SparkContext: Successfully stopped SparkContext
15/02/16 05:58:48 INFO RemoteActorRefProvider$RemotingTerminator:
Shutting down remote daemon.
15/02/16 05:58:48 INFO RemoteActorRefProvider$RemotingTerminator:
Remote daemon shut down; proceeding with flushing remote transports.
15/02/16 05:58:48 INFO RemoteActorRefProvider$RemotingTerminator:
Remoting shut down.
Traceback (most recent call last):
  File "/spark/test2.py", line 5, in <module>
    sc = CassandraSparkContext(conf=conf)
  File "/spark/python/pyspark/context.py", line 105, in __init__
    conf, jsc)
  File "/spark/pyspark_cassandra.py", line 17, in _do_init
    self._jcsc = self._jvm.CassandraJavaUtil.javaFunctions(self._jsc)
  File "/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py",
line 726, in __getattr__
py4j.protocol.Py4JError: Trying to call a package.
