Re: Spark Streaming ElasticSearch
Hi Siva,

In that case you can use the Structured Streaming foreach / foreachBatch functions, which let you process each record (foreach) or each micro-batch (foreachBatch) and write it to a sink of your choice.

--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

To unsubscribe e-mail: user-unsubscr...@spark.apache.org
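A minimal sketch of the foreachBatch approach mentioned above, assuming the elasticsearch-hadoop Spark SQL connector is on the classpath. The index name, ES host, checkpoint path, and the "rate" test source are placeholders for illustration, not taken from the thread. The batch function is declared as an explicitly typed value, which avoids the overload ambiguity that a bare lambda passed to foreachBatch can trigger on Scala 2.12.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ForeachBatchToEs {
  // Hypothetical values; replace with your own index and cluster address.
  val EsIndex = "my-index"
  val EsNodes = "localhost"

  // Writes one micro-batch using the batch (non-streaming) ES writer.
  // batchId is unused here but could be used for idempotent writes.
  val writeBatch: (DataFrame, Long) => Unit = (batch, batchId) => {
    batch.write
      .format("org.elasticsearch.spark.sql")
      .option("es.nodes", EsNodes)
      .mode("append")
      .save(EsIndex)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("foreachBatch-to-es").getOrCreate()

    // Built-in "rate" source, used only so the sketch has some input.
    val input = spark.readStream
      .format("rate")
      .option("rowsPerSecond", "5")
      .load()

    val query = input.writeStream
      .foreachBatch(writeBatch)
      .option("checkpointLocation", "/tmp/checkpoints/es-foreach-batch") // placeholder path
      .start()

    query.awaitTermination()
  }
}
```

Because each micro-batch is written with the ordinary batch writer, this route does not depend on the connector providing a streaming sink.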
Re: Spark Streaming ElasticSearch
Hi Siva,

To emit data into ES from a Spark Structured Streaming job, you need to use an ElasticSearch jar that supports a sink for Structured Streaming. For this you can use this branch of mine, where we have integrated ES with Spark 3.0 and made it compatible with Scala 2.12:

https://github.com/ThalesGroup/spark/tree/guavus/v3.0.0

From it you need to build three jars, which together handle writing data into ES through Structured Streaming:

elasticsearch-hadoop-sql
elasticsearch-hadoop-core
elasticsearch-hadoop-mr

In your application job you can then sink the data as shown below. Note that the ES sink supports only the append output mode of Structured Streaming.

    // CHECKPOINTLOCATION and kafkaCheckpointDirPath are defined elsewhere in the job;
    // CHECKPOINTLOCATION is presumably the "checkpointLocation" option key.
    val esDf = aggregatedDF
      .writeStream
      .outputMode("append")
      .format("org.elasticsearch.spark.sql")
      .option(CHECKPOINTLOCATION, kafkaCheckpointDirPath + "/es")
      .start("aggregation-job-index-latest-1")

Let me know if you face any issues, I will be happy to help you :)
Re: Elastic Search sink showing -1 for numOutputRows
Thanks, Jungtaek Lim-2, for replying. Could you point me to where the sink API version (DSv1 vs. DSv2) is defined in the code? Where can I see it, and under which module of the Spark source?
Re: Elastic Search sink showing -1 for numOutputRows
Hi,

I am using Spark Structured Streaming to sink data into ElasticSearch. In the stats emitted for each batch, "numOutputRows" always shows -1 for the ElasticSearch sink, whereas other sinks such as Kafka show either 0 or the number of rows emitted.

What could be the reason for -1 with ElasticSearch?
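For anyone wanting to watch this metric outside the logs, here is a minimal sketch of a StreamingQueryListener that prints the sink's numOutputRows for every batch. It assumes Spark 3.0 or later, where SinkProgress exposes numOutputRows; the object name and the describe helper are made up for illustration. As far as I can tell, -1 is the default Spark reports when the sink does not provide a row count, so the helper treats negative values as "not reported" rather than as a real count.

```scala
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

object SinkRowsListener {
  // Hypothetical helper: a negative value is treated as "no count available".
  def describe(numOutputRows: Long): String =
    if (numOutputRows < 0) "not reported by sink" else numOutputRows.toString

  class Listener extends StreamingQueryListener {
    override def onQueryStarted(event: QueryStartedEvent): Unit = ()

    // Called after each micro-batch; prints the sink description and row count.
    override def onQueryProgress(event: QueryProgressEvent): Unit = {
      val progress = event.progress
      val sink = progress.sink
      println(s"batch=${progress.batchId} sink=${sink.description} " +
        s"rows=${describe(sink.numOutputRows)}")
    }

    override def onQueryTerminated(event: QueryTerminatedEvent): Unit = ()
  }
}
```

It can be registered before starting the query with spark.streams.addListener(new SinkRowsListener.Listener), where spark is the active SparkSession.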