Jeon ChangBae created ZEPPELIN-2449: ---------------------------------------
Summary: pyspark kafka streaming second run error Key: ZEPPELIN-2449 URL: https://issues.apache.org/jira/browse/ZEPPELIN-2449 Project: Zeppelin Issue Type: Bug Components: pySpark Affects Versions: 0.7.1 Reporter: Jeon ChangBae run after stop(second run) below code, I have bleow message. -------------------------------------------------------------------------------------------- %spark.pyspark #from pyspark import SparkContext from pyspark.streaming import StreamingContext from pyspark.streaming.kafka import KafkaUtils ssc = StreamingContext(sc, 3) kvs = KafkaUtils.createDirectStream(ssc, ['fw'], {"metadata.broker.list": 'queue:9200'}) lines = kvs.map(lambda x: x[1]) counts = lines.flatMap(lambda line: line.split(" ")) \ .map(lambda word: (word, 1)) \ .reduceByKey(lambda a, b: a+b) counts.pprint() ssc.start() try: ssc.awaitTermination() except KeyboardInterrupt: ssc.stop() -------------------------------------------------------------------------------------------- Traceback (most recent call last): File "/tmp/zeppelin_pyspark-5081537614820689398.py", line 325, in <module> sc.setJobGroup(jobGroup, "Zeppelin") File "/opt/zeppelin-0.7.1-bin-all/interpreter/spark/pyspark/pyspark.zip/pyspark/context.py", line 902, in setJobGroup self._jsc.setJobGroup(groupId, description, interruptOnCancel) AttributeError: 'NoneType' object has no attribute 'setJobGroup' -- This message was sent by Atlassian JIRA (v6.3.15#6346)