David Deuber created SPARK-43310:
------------------------------------

             Summary: Dataset.observe is ignored when writing to Kafka with batch query
                 Key: SPARK-43310
                 URL: https://issues.apache.org/jira/browse/SPARK-43310
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.4.0, 3.3.2
            Reporter: David Deuber


When writing to Kafka with a batch query, metrics defined with 
{{Dataset.observe}} are not recorded. 

For example, 
{code:scala}
import org.apache.spark.sql.execution.QueryExecution
import org.apache.spark.sql.functions.lit
import org.apache.spark.sql.util.QueryExecutionListener

// Listener that prints the observed metrics once the batch query finishes.
spark.listenerManager.register(new QueryExecutionListener {
  override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit = {
    println(qe.observedMetrics)
  }

  override def onFailure(funcName: String, qe: QueryExecution, exception: Exception): Unit = {
    // pass
  }
})

import spark.implicits._

val df = Seq(("k", "v")).toDF("key", "value")
val observed = df.observe("my_observation", lit("metric_value").as("some_metric"))
observed
  .write
  .format("kafka")
  .option("kafka.bootstrap.servers", "host1:port1")
  .option("topic", "topic1")
  .save()
{code}
prints {{Map()}}.
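
For comparison, a minimal sketch of the same query written through a non-Kafka batch sink (here the built-in {{noop}} format is assumed as a stand-in sink); in that case the listener is expected to see a non-empty {{qe.observedMetrics}} map containing the {{my_observation}} row, which highlights that only the Kafka write path drops the observation:
{code:scala}
// Same observed Dataset as above, but written with the "noop" batch sink
// instead of Kafka. Expectation (sketch, not a verified reproduction):
// onSuccess prints a map containing the "my_observation" entry.
observed
  .write
  .format("noop")
  .mode("overwrite")
  .save()
{code}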


