RE: Spark Kafka Integration

2022-02-25 Thread Michael Williams (SSI)
Ahh, ok. So Kafka 3.1 is supported for Spark 3.2.1. Thank you very much. From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com] Sent: Friday, February 25, 2022 2:50 PM To: Michael Williams (SSI) Cc: user@spark.apache.org Subject: Re: Spark Kafka Integration these are the old and new ones …

RE: Spark Kafka Integration

2022-02-25 Thread Michael Williams (SSI)
Thank you, that is good to know. From: Sean Owen [mailto:sro...@gmail.com] Sent: Friday, February 25, 2022 2:46 PM To: Michael Williams (SSI) Cc: Mich Talebzadeh; user@spark.apache.org Subject: Re: Spark Kafka Integration Spark 3.2.1 is compiled against Kafka 2.8.0; the forthcoming Spark 3.3 …

RE: Spark Kafka Integration

2022-02-25 Thread Michael Williams (SSI)
Thank you. From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com] Sent: Friday, February 25, 2022 2:35 PM To: Michael Williams (SSI) Cc: Sean Owen; user@spark.apache.org Subject: Re: Spark Kafka Integration please see my earlier reply for 3.1.1; tested and working in Google Dataproc …

RE: Spark Kafka Integration

2022-02-25 Thread Michael Williams (SSI)
To: Michael Williams (SSI) Cc: user@spark.apache.org Subject: Re: Spark Kafka Integration and what version of Kafka do you have, 2.7? For Spark 3.1.1 I needed these jar files to make it work: kafka-clients-2.7.0.jar, commons-pool2-2.9.0.jar, spark-streaming_2.12-3.1.1.jar, spark-sql-kafka-0-10_2.12…

RE: Spark Kafka Integration

2022-02-25 Thread Michael Williams (SSI)
… exist on disk. If that makes any sense. Thank you. From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com] Sent: Friday, February 25, 2022 2:16 PM To: Michael Williams (SSI) Cc: user@spark.apache.org Subject: Re: Spark Kafka Integration What is the use case? Is this for Spark structured streaming?

Spark Kafka Integration

2022-02-25 Thread Michael Williams (SSI)
Spark's Kafka Integration guide indicates that spark-sql-kafka-0-10_2.12-3.2.1.jar and its dependencies are needed for Spark 3.2.1 (with Scala 2.12) to work with Kafka. Can anybody clarify the cleanest, most repeatable (reliable) way to acquire these jars for including in a …
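[Editor's note] A repeatable way to pick up these jars, rather than collecting them by hand, is to let Spark resolve the connector and its transitive dependencies (kafka-clients, commons-pool2, ...) from Maven Central. A minimal PySpark sketch, assuming Spark 3.2.1 with Scala 2.12 and network access to Maven Central; the app name is hypothetical:

    from pyspark.sql import SparkSession

    # spark.jars.packages makes Spark download the Kafka connector and its
    # transitive dependencies from Maven Central at startup.
    spark = (
        SparkSession.builder
        .appName("kafka-demo")  # hypothetical app name
        .config("spark.jars.packages",
                "org.apache.spark:spark-sql-kafka-0-10_2.12:3.2.1")
        .getOrCreate()
    )

The same coordinates work on the command line: spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.2.1 …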

RE: Consuming from Kafka to delta table - stream or batch mode?

2022-02-24 Thread Michael Williams (SSI)
Thank you. From: Peyman Mohajerian [mailto:mohaj...@gmail.com] Sent: Thursday, February 24, 2022 9:00 AM To: Michael Williams (SSI) Cc: user@spark.apache.org Subject: Re: Consuming from Kafka to delta table - stream or batch mode? If you want to batch-consume from Kafka, the trigger-once config …
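[Editor's note] A minimal sketch of the trigger-once pattern mentioned above: batch-like semantics over the streaming source, with Kafka offsets tracked in the checkpoint so each run resumes where the last one stopped. The broker address, topic names, and paths are hypothetical, and the Delta Lake package must already be on the classpath:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("kafka-to-delta").getOrCreate()

    # Read whatever is currently in the topics, append it to the Delta
    # table, then stop.
    df = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
        .option("subscribe", "topic_a,topic_b")            # hypothetical topics
        .option("startingOffsets", "earliest")
        .load()
    )

    (
        df.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
        .writeStream
        .format("delta")
        .option("checkpointLocation", "/tmp/ckpt/kafka_ingest")  # hypothetical path
        .trigger(once=True)
        .start("/tmp/delta/kafka_ingest")  # hypothetical table path
        .awaitTermination()
    )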

Consuming from Kafka to delta table - stream or batch mode?

2022-02-24 Thread Michael Williams (SSI)
Hello, our team is working with Spark (for the first time) and one of the sources we need to consume is Kafka (multiple topics). Are there any practical or operational issues to be aware of when deciding whether to (a) consume in batches until all messages are consumed and then shut down the Spark …

RE: Logging to determine why driver fails

2022-02-21 Thread Michael Williams (SSI)
Thank you. From: Artemis User [mailto:arte...@dtechspace.com] Sent: Monday, February 21, 2022 8:23 AM To: Michael Williams (SSI) Subject: Re: Logging to determine why driver fails Spark uses Log4j for logging. There is a log4j.properties template file located in the conf directory. You can …
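[Editor's note] A sketch of the usual steps, assuming a Spark 3.2.x distribution (which still ships Log4j 1.x): copy conf/log4j.properties.template to conf/log4j.properties and raise the log level so the driver reports why it fails. An illustrative excerpt:

    # conf/log4j.properties (copied from conf/log4j.properties.template)
    # DEBUG is verbose; INFO is often enough to see why the driver exits.
    log4j.rootCategory=DEBUG, console
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.target=System.err
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n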

Logging to determine why driver fails

2022-02-21 Thread Michael Williams (SSI)
Hello, we have a POC using Spark 3.2.1 and none of us have any prior Spark experience. Our setup uses the native Spark REST API (http://localhost:6066/v1/submissions/create) on the master node (not Livy, not Spark Job Server). While we have been successful at submitting Python jobs via this …

triggering spark python app using native REST api

2022-01-24 Thread Michael Williams (SSI)
Hello, I've been trying for a couple of weeks, without success, to work out how to replicate the execution of a Python app (as done with spark-submit on the CLI) using the native Spark REST API (http://localhost:6066/v1/submissions/create). The environment is Docker, using the latest Docker image for Spark 3.2 …
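[Editor's note] For reference, a minimal sketch of the kind of request the standalone master's REST endpoint accepts. This API is not officially documented, so the payload below is an assumption modeled on what spark-submit itself sends to port 6066; file paths, host names, and the app name are hypothetical:

    import json
    import urllib.request

    payload = {
        "action": "CreateSubmissionRequest",
        "clientSparkVersion": "3.2.1",
        # For a Python app the .py file is both the app resource and the
        # first application argument; SparkSubmit is the main class.
        "appResource": "file:/opt/spark/apps/my_app.py",  # hypothetical path
        "mainClass": "org.apache.spark.deploy.SparkSubmit",
        "appArgs": ["/opt/spark/apps/my_app.py"],
        "environmentVariables": {"SPARK_ENV_LOADED": "1"},
        "sparkProperties": {
            "spark.app.name": "my_app",
            "spark.master": "spark://spark-master:7077",  # hypothetical master URL
            "spark.submit.deployMode": "cluster",
        },
    }

    req = urllib.request.Request(
        "http://localhost:6066/v1/submissions/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json;charset=UTF-8"},
    )
    with urllib.request.urlopen(req) as resp:
        print(resp.read().decode())  # response includes a submissionId on success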