Coming from a YARN background, where log files can be found after the job finishes:
with a Spark standalone master/workers setup, how do we configure things so that logs are available after the job finishes?
We have set up our Spark history server, and spark-defaults.conf includes:
spark.eventLog.enabled true
spark.eventLog.dir
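For reference, a minimal sketch of the spark-defaults.conf entries that make finished applications show up in the history server. The `hdfs:///spark-logs` path is a hypothetical example, not the original poster's value; the event log directory and the history server's log directory must point at the same shared location, and it must exist before jobs run:

```
# Write event logs as applications run (driver side)
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs:///spark-logs

# Where the history server reads completed-application logs from
spark.history.fs.logDirectory    hdfs:///spark-logs
```

Note this covers the Spark UI / event timeline for finished jobs; per-executor stdout/stderr in standalone mode still live in each worker's work directory on the worker machines.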
Hi,
IMHO this is not the best use of Spark. I would suggest using a simple Azure
Function to unzip.
Is there any specific reason to use gzip over Event Hubs?
If you can wait 10-20 seconds before processing, you can use Event Hubs Capture to write
the data to storage and then process it from there.
It all depends on the compute.
Hi everyone,
Context: I have events coming into Databricks from an Azure Event Hub in
gzip-compressed format. Currently, I extract the payloads with a UDF and write
the unzipped data into the silver layer of my Delta Lake with .write. Note
that even though the data arrives continuously, I do not
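A minimal sketch of the kind of decompression helper such a UDF could wrap. The function name and the assumption that each event body is one self-contained gzip payload are mine, not from the original post:

```python
import gzip

def decompress_event(body: bytes) -> str:
    """Decompress a single gzip-compressed Event Hub payload to UTF-8 text.

    Assumes each message body is one complete gzip stream (hypothetical;
    adjust if payloads are batched or framed differently).
    """
    return gzip.decompress(body).decode("utf-8")

# In Databricks this could be registered as a PySpark UDF, e.g.:
#   from pyspark.sql.functions import udf
#   unzip_udf = udf(decompress_event)
```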
Hi,
In Spark Structured Streaming, checkpoints are used to keep track of Kafka offsets. If there is
data loss, can we edit the checkpoint files to reduce it? Please suggest
best practices for reducing data loss under exceptional scenarios.
Regards,
Gnana
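For illustration of what "editing the file" would involve: a Kafka source's committed offsets live as small text files under the checkpoint's `offsets/` directory. The layout sketched below (a version line, a batch-metadata line, and a per-source offset JSON line) and the topic name `events` are assumptions for illustration; verify against your own checkpoint files, and treat hand-editing as a last resort:

```python
import json

# Hypothetical content of a file such as <checkpoint>/offsets/42
# (topic name and offset values are made up for this sketch).
sample = """v1
{"batchWatermarkMs":0,"batchTimestampMs":1700000000000}
{"events":{"0":1500,"1":1487}}"""

version, metadata_line, offsets_line = sample.splitlines()
offsets = json.loads(offsets_line)

# "Rewinding" partition 0 of topic "events" to reprocess records:
offsets["events"]["0"] = 1400
edited_line = json.dumps(offsets)
```

In practice it is usually safer not to touch the checkpoint at all: start a new query with a fresh checkpoint location and pass an explicit `startingOffsets` JSON (same per-topic/partition shape as above) to the Kafka source, so the rewind is declared in code rather than patched into internal files.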