Jeon ChangBae created ZEPPELIN-2449:
---------------------------------------

             Summary: pyspark kafka streaming second run error
                 Key: ZEPPELIN-2449
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-2449
             Project: Zeppelin
          Issue Type: Bug
          Components: pySpark
    Affects Versions: 0.7.1
            Reporter: Jeon ChangBae


run after stop(second run) below code, I have bleow message.
--------------------------------------------------------------------------------------------
%spark.pyspark
#from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

ssc = StreamingContext(sc, 3)
kvs = KafkaUtils.createDirectStream(ssc, ['fw'], {"metadata.broker.list": 
'queue:9200'})
lines = kvs.map(lambda x: x[1])
counts = lines.flatMap(lambda line: line.split(" ")) \
    .map(lambda word: (word, 1)) \
    .reduceByKey(lambda a, b: a+b)
counts.pprint()

ssc.start()
try:
    ssc.awaitTermination()
except KeyboardInterrupt:
    ssc.stop()
--------------------------------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-5081537614820689398.py", line 325, in <module>
    sc.setJobGroup(jobGroup, "Zeppelin")
  File 
"/opt/zeppelin-0.7.1-bin-all/interpreter/spark/pyspark/pyspark.zip/pyspark/context.py",
 line 902, in setJobGroup
    self._jsc.setJobGroup(groupId, description, interruptOnCancel)
AttributeError: 'NoneType' object has no attribute 'setJobGroup'



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to