Re: When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why

2024-03-01 Thread Prem Sahoo
Hello Mich, thanks for your reply. As an engineer I can chip in. You may have partial execution and retries, meaning that when Spark encounters a *FetchFailedException*, it may retry fetching the data from the unavailable node (the one being rebooted) a few times before marking it permanently

Re: When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why

2024-03-01 Thread Mich Talebzadeh
Hi, Your point -> "When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why. We have scenario when spark job complains *FetchFailedException* as one of the data node got rebooted middle of job running." As an engineer I

Re: Vote on Dynamic resource allocation for structured streaming [SPARK-24815]

2024-03-01 Thread Mich Talebzadeh
Hi Bhuwan et al, Thank you for passing on the Databricks Structured Streaming team's review of the SPIP document. FYI, I work closely with Pavan and other members to help deliver this piece of work. We appreciate your insights, especially regarding the cost savings potential from the PoC. Pavan

Vote on Dynamic resource allocation for structured streaming [SPARK-24815]

2024-03-01 Thread Nivedita VY
+1 Nivi

Re: Vote on Dynamic resource allocation for structured streaming [SPARK-24815]

2024-03-01 Thread Pavan Kotikalapudi
Thanks Bhuwan and the rest of the Databricks team for the reviews. I appreciate them; they were very helpful in evaluating a few options that were overlooked earlier (especially about mixed Spark apps running on notebooks). Regarding the use cases, it could handle multiple streaming queries

RE: Vote on Dynamic resource allocation for structured streaming [SPARK-24815]

2024-03-01 Thread Nivedita VY
+1 Nivi

Re: Vote on Dynamic resource allocation for structured streaming [SPARK-24815]

2024-03-01 Thread Bhuwan Sahni
Hi Pavan, I am from the Databricks Structured Streaming team, and we did a review of the SPIP internally. I wanted to pass on the points discussed in the meeting. Thanks for putting together the SPIP document. It's useful to have dynamic resource allocation for streaming queries, and it's

Re: When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why

2024-03-01 Thread Prem Sahoo
Hello All, in the list of JIRAs I didn't find anything related to FetchFailedException. As mentioned above, "When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing , please explain why. We have a scenario when spark job complains