Suchintak Patnaik created SPARK-31042:
-----------------------------------------

             Summary: Error in writing a pyspark streaming dataframe created 
from Kafka source to a csv file 
                 Key: SPARK-31042
                 URL: https://issues.apache.org/jira/browse/SPARK-31042
             Project: Spark
          Issue Type: Bug
          Components: PySpark, Structured Streaming
    Affects Versions: 2.4.5
            Reporter: Suchintak Patnaik


While writing a streaming dataframe created from Kafka source to a csv file 
gives following error in PySpark.

NOTE : The same streaming dataframe is getting displayed in the console.

sdf.writeStream.format("console").start().awaitTermination()  // Working

sdf.writeStream\
    .format("csv")\
    .option("path", "C://output")\
    .option("checkpointLocation", "C://Checkpoint")\
    .outputMode("append")\
    .start().awaitTermination()    // Not working


Error
---------
 *File "C:\Spark\python\pyspark\sql\utils.py", line 63, in deco
    return f(*a, **kw)
  File "C:\Spark\python\lib\py4j-0.10.7-src.zip\py4j\protocol.py", line 328, in 
get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling 
o63.awaitTermination.
: org.apache.spark.sql.streaming.StreamingQueryException: Expected e.g. 
{"topicA":{"0":23,"1":-1},"topicB":{"0":-2}}, got {"logOffset":1}
=== Streaming Query ===
Identifier: [id = 6718625c-489e-44c8-b273-0da3429e97a8, runId = 
b64887ba-ca32-499e-9ab5-f839fd44ec26]
Current Committed Offsets: {KafkaV2[Subscribe[test1]]: {"logOffset":1}}
Current Available Offsets: {KafkaV2[Subscribe[test1]]: {"logOffset":1}}

Current State: ACTIVE
Thread State: RUNNABLE*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to