Chaitanya created SPARK-32834:
---------------------------------

             Summary: from_avro is giving empty result
                 Key: SPARK-32834
                 URL: https://issues.apache.org/jira/browse/SPARK-32834
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 3.0.0
         Environment: Ubuntu 18
Spark 3.0
Kafka 2.0.0
            Reporter: Chaitanya

I am trying to read a Kafka topic with Avro data.

Code:

df = spark\
    .readStream\
    .format("kafka")\
    .option("kafka.bootstrap.servers", "host:6667")\
    .option("subscribe", "utopic1")\
    .option("failOnDataLoss", "false")\
    .option("startingOffsets", "earliest")\
    .option("checkpointLocation", "/home/abc/wspace/spark_test/data/")\
    .load()

outputDF = df\
    .select(from_avro("value", jsonFormatSchema, options={"mode":"FASTFAIL"}).alias("user"))

outputDF.printSchema()
query = outputDF.writeStream.format("console").start()
time.sleep(10)

Input:

Avro schema file: [user.avsc|https://github.com/apache/spark/raw/4ad9bfd53b84a6d2497668c73af6899bae14c187/examples/src/main/resources/user.avsc]
Kafka topic: {'favorite_color': 'Red', 'name': 'Alyssa'}

Expected Output:

It should print the values.

Actual Output:

+----+
|user|
+----+
| [,]|
+----+

Additional information:

# I searched the internet and found that another person faced the same issue: [https://stackoverflow.com/questions/59222774/spark-from-avro-function-returning-null-values]
# I am able to print the values to the console if I cast to string using the line below:
df.selectExpr("CAST(value AS STRING)")
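
One thing that may be worth checking (an assumption, not something the report confirms) is whether the messages on the topic are Avro binary or plain text. from_avro decodes Avro binary-encoded bytes, and the fact that CAST(value AS STRING) prints readable values would be consistent with the topic holding JSON-like text instead. The sketch below uses only the Python standard library to show how different the two payloads are for the record in this report. It assumes the linked user.avsc declares a string field `name` followed by a `["string", "null"]` union field `favorite_color`; the helper names (`zigzag_varint`, `avro_string`, `encode_user`) are hypothetical and written just for this illustration.

```python
import json


def zigzag_varint(n: int) -> bytes:
    """Encode a signed integer the way Avro encodes int/long:
    zig-zag mapping first, then a base-128 variable-length integer."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while True:
        low7 = z & 0x7F
        z >>= 7
        if z:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def avro_string(s: str) -> bytes:
    """Avro string: byte length as a long, then the UTF-8 bytes."""
    data = s.encode("utf-8")
    return zigzag_varint(len(data)) + data


def encode_user(name, favorite_color) -> bytes:
    """Hand-encode one record, ASSUMING the schema fields are
    name: string, favorite_color: ["string", "null"] in that order."""
    body = avro_string(name)
    if favorite_color is None:
        body += zigzag_varint(1)  # union branch 1 = null, no payload
    else:
        body += zigzag_varint(0) + avro_string(favorite_color)  # branch 0 = string
    return body


record = {"name": "Alyssa", "favorite_color": "Red"}

avro_payload = encode_user(record["name"], record["favorite_color"])
json_payload = json.dumps(record).encode("utf-8")

print(avro_payload)  # b'\x0cAlyssa\x00\x06Red'
print(json_payload)  # readable text; not valid Avro binary for this schema
```

If the producer writes text like the second payload, a JSON parser such as from_json would be the matching reader; if it writes Avro through a schema-registry serializer, the bytes may carry an extra header that from_avro does not expect. Separately, the Spark Avro guide lists FAILFAST and PERMISSIVE as the parse modes for from_avro, while the code above passes "FASTFAIL", so the mode option is also worth double-checking.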