You can try out "Dataset.observe" added in Spark 3, which enables arbitrary
metrics to be logged and exposed to streaming query listeners.
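A minimal sketch of that approach, assuming a streaming DataFrame named
inputDf with a nullable value column and an existing SparkSession named
spark (the metric names here are illustrative, not from the original
question):

import org.apache.spark.sql.functions._
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

// Attach named aggregates to the streaming Dataset; Spark computes them
// per micro-batch and reports them through the progress events.
val observed = inputDf.observe(
  "custom_metrics",
  count(lit(1)).as("processedRecords"),
  sum(when(col("value").isNull, 1).otherwise(0)).as("droppedRecords"))

// Listener that receives the observed metrics with every progress update.
spark.streams.addListener(new StreamingQueryListener {
  override def onQueryStarted(event: QueryStartedEvent): Unit = ()
  override def onQueryTerminated(event: QueryTerminatedEvent): Unit = ()
  override def onQueryProgress(event: QueryProgressEvent): Unit = {
    Option(event.progress.observedMetrics.get("custom_metrics")).foreach {
      row => println(s"droppedRecords = ${row.getAs[Long]("droppedRecords")}")
    }
  }
})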
On Tue, Nov 3, 2020 at 3:25 AM meetwes wrote:
> Hi, I am looking for the right approach to emit custom metrics for a
> Spark structured streaming job. *Actual S
Which Spark version do you use? There's a known issue with the Kafka producer
pool in Spark 2.x that was fixed in Spark 3.0, so you may want to check
whether your case is bound to that known issue or not.
https://issues.apache.org/jira/browse/SPARK-21869
On Tue, Nov 3, 2020 at 1:53 AM Eric Beabes wrote:
What's the recommended way of associating an authentication token (the
response to a successful login) with the user session from a custom
authenticator (PasswdAuthenticationProvider)?
Thanks,
Mohammad
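For reference, the custom authenticator in question implements Hive's
PasswdAuthenticationProvider interface. A minimal sketch of where the token
becomes available; the LoginService object and its token handling are
hypothetical placeholders, since handing the token to the session is
exactly the part the question is about:

import javax.security.sasl.AuthenticationException
import org.apache.hive.service.auth.PasswdAuthenticationProvider

// Hypothetical backend that issues a token on successful login.
object LoginService {
  def login(user: String, password: String): Option[String] =
    if (password.nonEmpty) Some(s"token-for-$user") else None
}

class TokenAuthProvider extends PasswdAuthenticationProvider {
  // The Thrift server calls this once per connection attempt.
  override def Authenticate(user: String, password: String): Unit =
    LoginService.login(user, password) match {
      case Some(token) =>
        // The token is available here, but associating it with the
        // resulting user session is the open question above.
        ()
      case None =>
        throw new AuthenticationException(s"Authentication failed for $user")
    }
}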
Hi, I am looking for the right approach to emit custom metrics for a Spark
structured streaming job.

*Actual Scenario:*
I have an aggregated DataFrame, let's say with (id, key, value) columns. One
of the KPIs could be 'droppedRecords', and the corresponding value column
holds the number of dropped records.
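Building on the Dataset.observe suggestion above, a sketch of pulling that
KPI out of the aggregated frame as an observed metric; the function name and
the metric group name "kpis" are made up for illustration:

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions._

// `aggregated` is the (id, key, value) frame described above; the sum
// collapses the per-id 'droppedRecords' rows into one metric per batch.
def withKpiMetrics(aggregated: DataFrame): DataFrame =
  aggregated.observe(
    "kpis",
    sum(when(col("key") === "droppedRecords", col("value")).otherwise(0L))
      .as("droppedRecords"))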
I know this is related to Kafka, but it happens during the Spark Structured
Streaming job, which is why I am asking on this mailing list.
How would you debug this or get around this in Spark Structured Streaming?
Any tips would be appreciated. Thanks.
java.lang.IllegalStateException: Cannot perform
Hi,
Sorry for the very slow reply - I am far behind in my mailing list
subscriptions.
You'll find a few slides covering the topic in this presentation:
https://www.slideshare.net/lallea/test-strategies-for-data-processing-pipelines-67244458
Video here: https://vimeo.com/192429554
Regards,
Lars
Hi,
I am running Spark in cluster mode (on K8s). When running it on the Word
Count example, the number of executors assigned differs across stages. Our
number of assigned executors is 20. While stage 1 gets all 20 of them
allotted, stage 2 gets fewer than 10 executors. Is there any particular
reason for this?
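One possible explanation, offered as a guess rather than a confirmed
diagnosis: a stage can only occupy as many executors as it has tasks, and
the reduce stage of Word Count defaults to a partition count that may be
below 20. A sketch of raising that count explicitly; the paths and the
number 40 are illustrative:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("WordCount").getOrCreate()

val counts = spark.sparkContext
  .textFile("hdfs:///input/text")   // hypothetical input path
  .flatMap(_.split("\\s+"))
  .map(word => (word, 1))
  .reduceByKey(_ + _, 40)           // 40 reduce tasks, so the second
                                    // stage can use all 20 executors
counts.saveAsTextFile("hdfs:///output/wordcount") // hypothetical output path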