[jira] [Commented] (SPARK-32151) Kafka does not allow Partition Rebalance Handling
[ https://issues.apache.org/jira/browse/SPARK-32151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17159330#comment-17159330 ]

Ed Mitchell commented on SPARK-32151:
--------------------------------------

Yeah - I think you're right. It's going to be in the write ahead logs or the checkpoint.

> Kafka does not allow Partition Rebalance Handling
> --------------------------------------------------
>
>                 Key: SPARK-32151
>                 URL: https://issues.apache.org/jira/browse/SPARK-32151
>             Project: Spark
>          Issue Type: Improvement
>          Components: DStreams
>    Affects Versions: 2.4.5
>            Reporter: Ed Mitchell
>            Priority: Minor
[jira] [Commented] (SPARK-32151) Kafka does not allow Partition Rebalance Handling
[ https://issues.apache.org/jira/browse/SPARK-32151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17159294#comment-17159294 ]

Ed Mitchell commented on SPARK-32151:
--------------------------------------

I could do that if I wanted to start back at the beginning or the end of the topic, but in this case, I would like it to restart back at the offsets defined by my datastore.

> Kafka does not allow Partition Rebalance Handling
> --------------------------------------------------
>
>                 Key: SPARK-32151
>                 URL: https://issues.apache.org/jira/browse/SPARK-32151
>             Project: Spark
>          Issue Type: Improvement
>          Components: DStreams
>    Affects Versions: 2.4.5
>            Reporter: Ed Mitchell
>            Priority: Minor
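For context, this is roughly how starting offsets from an external datastore are wired in today. The sketch below is illustrative, not the reporter's actual code; loadOffsetsFromDatastore() is a hypothetical helper standing in for the user's database lookup. The offsets map is honored only once, at startup, which is why a later rebalance can discard those positions.

{code:java}
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.ConsumerStrategy;

public final class SubscribeWithStoredOffsets {

    public static ConsumerStrategy<String, String> build() {
        // Consumer config: "none" makes a missing position surface as
        // NoOffsetForPartitionException instead of silently resetting.
        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker:9092"); // placeholder
        kafkaParams.put("group.id", "my-group");             // placeholder
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("auto.offset.reset", "none");

        // Offsets previously persisted by the application.
        Map<TopicPartition, Long> fromDatastore = loadOffsetsFromDatastore();

        // The offsets map is applied once, when the stream starts; after a
        // rebalance the driver has no hook to re-seek to these positions,
        // which is what this ticket asks for.
        return ConsumerStrategies.<String, String>Subscribe(
            Arrays.asList("production-ad-metrics"), kafkaParams, fromDatastore);
    }

    // Hypothetical helper standing in for the user's database lookup.
    private static Map<TopicPartition, Long> loadOffsetsFromDatastore() {
        return new HashMap<>();
    }
}
{code}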
[jira] [Created] (SPARK-32151) Kafka does not allow Partition Rebalance Handling
Ed Mitchell created SPARK-32151:
-----------------------------------

             Summary: Kafka does not allow Partition Rebalance Handling
                 Key: SPARK-32151
                 URL: https://issues.apache.org/jira/browse/SPARK-32151
             Project: Spark
          Issue Type: Improvement
          Components: DStreams
    Affects Versions: 2.4.5
            Reporter: Ed Mitchell

When a consumer group rebalance occurs while the Spark driver is using the Subscribe or SubscribePattern ConsumerStrategy, the driver's offsets are cleared when partitions are revoked and then reassigned.

While this doesn't happen in the normal rebalance scenario of more consumers joining the group (though it could), it does happen when the partition leader is re-elected because a Kafka node is stopped or decommissioned.

This seems to occur only when users specify their own offsets and do not use Kafka as the persistent store of offsets (they use their own database, and possibly when using checkpointing).

This could probably affect Structured Streaming as well.

It presents itself as a NoOffsetForPartitionException:

{noformat}
20/05/13 01:37:00 ERROR JobScheduler: Error generating jobs for time 158933382 ms
org.apache.kafka.clients.consumer.NoOffsetForPartitionException: Undefined offset with no reset policy for partitions: [production-ad-metrics-1, production-ad-metrics-2, production-ad-metrics-0, production-ad-metrics-5, production-ad-metrics-6, production-ad-metrics-3, production-ad-metrics-4, production-ad-metrics-7]
	at org.apache.kafka.clients.consumer.internals.SubscriptionState.resetMissingPositions(SubscriptionState.java:391)
	at org.apache.kafka.clients.consumer.KafkaConsumer.updateFetchPositions(KafkaConsumer.java:2185)
	at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1222)
	at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1181)
	at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1115)
	at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.paranoidPoll(DirectKafkaInputDStream.scala:172)
	at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.latestOffsets(DirectKafkaInputDStream.scala:191)
	at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.compute(DirectKafkaInputDStream.scala:228)
	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342)
	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341)
	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341)
	at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416)
	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:336)
	at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:334)
	at scala.Option.orElse(Option.scala:289)
	at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:331)
	at org.apache.spark.streaming.dstream.ForEachDStream.generateJob(ForEachDStream.scala:48)
	at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:122)
	at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:121)
	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
	at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
	at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
	at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
	at org.apache.spark.streaming.DStreamGraph.generateJobs(DStreamGraph.scala:121)
	at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$3.apply(JobGenerator.scala:249)
	at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$3.apply(JobGenerator.scala:247)
	at scala.util.Try$.apply(Try.scala:192)
	at org.apache.spark.streaming.scheduler.JobGenerator.generateJobs(JobGenerator.scala:247)
	at org.apache.spark.streaming.scheduler.JobGenerator.org$apache$spark$streaming$scheduler$JobGenerator$$processEvent(JobGenerator.scala:183)
	at org.apache.spark.streaming.scheduler.JobGenerator$$anon$1.onReceive(JobGenerator.scala:89)
	at org.apache.spark.streaming.scheduler.JobGenerator$$anon$1.onReceive(JobGenerator.scala:88)
	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
{noformat}

This can be fixed by allowing the user to specify an {code:java}org.apache.kafka.clients.consumer.ConsumerRebalanceListener{code} in the KafkaConsumer#subscribe method.
[jira] [Updated] (SPARK-32151) Kafka does not allow Partition Rebalance Handling
[ https://issues.apache.org/jira/browse/SPARK-32151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ed Mitchell updated SPARK-32151:
--------------------------------

    Description: (same as the creation mail above, with the proposed fix appended)

This can be fixed by allowing the user to specify an {code:java}org.apache.kafka.clients.consumer.ConsumerRebalanceListener{code} in the KafkaConsumer#subscribe method. The documentation for ConsumerRebalanceListener states that you can use KafkaConsumer#seek inside the onPartitionsAssigned callback to move each newly assigned partition back to an offset kept in an external store.
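A minimal sketch of such a listener, assuming a hypothetical OffsetStore interface standing in for the user's database (this illustrates the proposal; it is not an existing Spark hook):

{code:java}
import java.util.Collection;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.common.TopicPartition;

// Hypothetical external offset store (e.g. the user's database).
interface OffsetStore {
    void save(TopicPartition tp, long offset);
    Long load(TopicPartition tp); // null when no offset is stored
}

public class SeekToStoredOffsets implements ConsumerRebalanceListener {
    private final Consumer<?, ?> consumer;
    private final OffsetStore store;

    public SeekToStoredOffsets(Consumer<?, ?> consumer, OffsetStore store) {
        this.consumer = consumer;
        this.store = store;
    }

    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        // Persist the current position of each partition before it is taken away.
        for (TopicPartition tp : partitions) {
            store.save(tp, consumer.position(tp));
        }
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        // Restore the stored positions so the consumer never falls back to the
        // auto.offset.reset policy, which is what raises
        // NoOffsetForPartitionException when that policy is "none".
        for (TopicPartition tp : partitions) {
            Long offset = store.load(tp);
            if (offset != null) {
                consumer.seek(tp, offset);
            }
        }
    }
}
{code}

For this to help the DStream case, Spark would also need to accept such a listener through the Subscribe/SubscribePattern ConsumerStrategy, since the driver currently owns the KafkaConsumer#subscribe call.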
[jira] [Commented] (SPARK-30055) Allow configurable restart policy of driver and executor pods
[ https://issues.apache.org/jira/browse/SPARK-30055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17087857#comment-17087857 ]

Ed Mitchell commented on SPARK-30055:
--------------------------------------

I agree with this. Defaulting to Never gives up the flexibility that lets Kubernetes restart pods when they run out of memory or terminate in some undefined way.

You can also access the logs of a previously restarted container by doing:

{noformat}
kubectl -n <namespace> logs <pod-name> --previous
{noformat}

I understand not wanting to set "Always" on the executor pods, so that Spark keeps control of graceful executor termination, but shouldn't we at least set "OnFailure", to allow OOMKilled executors to come back up?

As far as the driver is concerned, our client-mode setup has the driver pod living as a Deployment, which means the restart policy is Always. No reason we can't allow Always or OnFailure in the driver restart policy, IMO.

> Allow configurable restart policy of driver and executor pods
> --------------------------------------------------------------
>
>                 Key: SPARK-30055
>                 URL: https://issues.apache.org/jira/browse/SPARK-30055
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 3.1.0
>            Reporter: Kevin Hogeland
>            Priority: Major
>
> The current Kubernetes scheduler hard-codes the restart policy for all pods
> to be "Never". To restart a failed application, all pods have to be deleted
> and rescheduled, which is very slow and clears any caches the processes may
> have built. Spark should allow a configurable restart policy for both drivers
> and executors for immediate restart of crashed/killed drivers/executors as
> long as the pods are not evicted. (This is not about eviction resilience;
> that's described in SPARK-23980.)
> Also, as far as I can tell, there's no reason the executors should be set to
> never restart. Should that be configurable, or should it just be changed to
> Always?
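For reference, a skeleton of the client-mode pattern described in the comment above, where the driver runs under a Deployment and therefore gets restartPolicy: Always. All names and the image are placeholders, not taken from this ticket:

{code:yaml}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spark-driver            # placeholder
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spark-driver
  template:
    metadata:
      labels:
        app: spark-driver
    spec:
      # Pods owned by a Deployment may only use restartPolicy: Always,
      # so a crashed or OOMKilled driver container restarts in place.
      restartPolicy: Always
      containers:
        - name: driver
          image: spark:3.0.0    # placeholder image running the driver in client mode
{code}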
[jira] [Commented] (SPARK-31482) spark.kubernetes.driver.podTemplateFile Configuration not used by the job
[ https://issues.apache.org/jira/browse/SPARK-31482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17087195#comment-17087195 ]

Ed Mitchell commented on SPARK-31482:
--------------------------------------

Yeah. It's actually there, but there's some unescaped HTML that was fixed in a later commit: [https://github.com/apache/spark/commit/44e314edb4b86ca3a8622124539073397dbe68de#diff-b5527f236b253e0d9f5db5164bdb43e9]

> spark.kubernetes.driver.podTemplateFile Configuration not used by the job
> --------------------------------------------------------------------------
>
>                 Key: SPARK-31482
>                 URL: https://issues.apache.org/jira/browse/SPARK-31482
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.0.0
>            Reporter: Pradeep Misra
>            Priority: Blocker
>
> Spark 3.0 - running spark-submit as below, pointed at a Minikube cluster:
> {code:java}
> bin/spark-submit \
>   --master k8s://https://192.168.99.102:8443 \
>   --deploy-mode cluster \
>   --name spark-pi \
>   --class org.apache.spark.examples.SparkPi \
>   --conf spark.kubernetes.driver.podTemplateFile=../driver_1E.template \
>   --conf spark.kubernetes.executor.podTemplateFile=../executor.template \
>   --conf spark.kubernetes.container.image=spark:spark3 \
>   local:///opt/spark/examples/jars/spark-examples_2.12-3.0.0-preview2.jar 1
> {code}
>
> Spark binaries - spark-3.0.0-preview2-bin-hadoop2.7.tgz
>
> Driver template:
> {code:yaml}
> apiVersion: v1
> kind: Pod
> metadata:
>   labels:
>     spark-app-id: my-custom-id
>   annotations:
>     spark-driver-cpu: 1
>     spark-driver-mem: 1
>     spark-executor-cpu: 1
>     spark-executor-mem: 1
>     spark-executor-count: 1
> spec:
>   schedulerName: spark-scheduler
> {code}
>
> Executor template:
> {code:yaml}
> apiVersion: v1
> kind: Pod
> metadata:
>   labels:
>     spark-app-id: my-custom-id
> spec:
>   schedulerName: spark-scheduler
> {code}
>
> Kubernetes pods launched - two executor pods were launched, which is the default:
> {noformat}
> spark-pi-e608e7718f11cc69-driver   1/1   Running   0   10s
> spark-pi-e608e7718f11cc69-exec-1   1/1   Running   0   5s
> spark-pi-e608e7718f11cc69-exec-2   1/1   Running   0   5s
> {noformat}
[jira] [Commented] (SPARK-31482) spark.kubernetes.driver.podTemplateFile Configuration not used by the job
[ https://issues.apache.org/jira/browse/SPARK-31482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17087150#comment-17087150 ]

Ed Mitchell commented on SPARK-31482:
--------------------------------------

I'm not sure where the bug is here, TBH. I don't see anything in the documentation or the Spark code that implies that setting annotations on the driver pod is a valid substitute for setting "spark.executor.memory", "spark.executor.instances", or "spark.executor.cores" on the spark-submit command.

> spark.kubernetes.driver.podTemplateFile Configuration not used by the job
> --------------------------------------------------------------------------
>
>                 Key: SPARK-31482
>                 URL: https://issues.apache.org/jira/browse/SPARK-31482
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes
>    Affects Versions: 3.0.0
>            Reporter: Pradeep Misra
>            Priority: Blocker
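To make that concrete, here is a sketch of the same submission with the executor sizing passed as Spark properties rather than as pod annotations. The flag values are illustrative, carried over from the original report where possible:

{code:java}
bin/spark-submit \
  --master k8s://https://192.168.99.102:8443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=1 \
  --conf spark.executor.cores=1 \
  --conf spark.executor.memory=1g \
  --conf spark.driver.cores=1 \
  --conf spark.driver.memory=1g \
  --conf spark.kubernetes.driver.podTemplateFile=../driver_1E.template \
  --conf spark.kubernetes.executor.podTemplateFile=../executor.template \
  --conf spark.kubernetes.container.image=spark:spark3 \
  local:///opt/spark/examples/jars/spark-examples_2.12-3.0.0-preview2.jar 1
{code}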