From: Anil Dasari
Date: Tuesday, July 12, 2022 at 3:42 PM
To: user@spark.apache.org
Subject: Spark streaming pending microbatches queue max length
Hello,
Spark adds an entry to the pending microbatches queue at each batch interval.
Is there a config to set the max size of the pending microbatches queue?
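As far as I know, the DStream API has no config that directly caps the length of the pending-batch queue; the usual workaround is to enable backpressure or cap the ingestion rate so batches stop piling up when processing lags. A sketch of the relevant settings (values are illustrative, not recommendations):

```
# spark-defaults.conf (illustrative values)
spark.streaming.backpressure.enabled       true
spark.streaming.receiver.maxRate           10000
spark.streaming.kafka.maxRatePerPartition  1000
```

With backpressure enabled, Spark adjusts the receiving rate dynamically based on batch scheduling delays, which keeps the queue from growing without bound.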
Hello,
As of now, Spark Continuous Processing does not support logical relation
operations like "dataframe.join()". Are there any plans to support this
in future releases?
Thanks in advance for your work.
Mikołaj
Hi,
I think this is a clear example of over-engineering.
Ayan's advice is best: use the Spark SQL function
input_file_name() to join the tables. People no longer think in terms of RDDs
unless absolutely required.
Also if you have different JSON schemas, just use the
Hi Team,
I have a dataset like the below one in .dat file:
13/07/2022abc
PWJ PWJABC 513213217ABC GM20 05. 6/20/39
#01000count
Now I want to extract the header and tail records, which I was able to do.
From the header, I need to extract the date and match it with the
current system date.
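The header check above can be sketched in plain Python, assuming (as the sample suggests) that the header's first 10 characters are a dd/mm/yyyy date; the function names are made up for illustration:

```python
from datetime import date, datetime


def header_date(header_line: str) -> date:
    """Extract the leading dd/mm/yyyy date from a header record."""
    return datetime.strptime(header_line[:10], "%d/%m/%Y").date()


def header_matches_today(header_line: str) -> bool:
    """True if the header date equals the current system date."""
    return header_date(header_line) == date.today()


# header_date("13/07/2022abc") -> date(2022, 7, 13)
```

In Spark, the same logic could run on the header record after it has been split off from the body, before validating the remaining rows.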
Yeah, I understand that now.
Thanks for the explanation, Bjørn.
Sid
On Wed, Jul 6, 2022 at 1:46 AM Bjørn Jørgensen
wrote:
> Ehh.. What is "*duplicate column*" ? I don't think Spark supports that.
>
> duplicate column = duplicate rows
>
>
> On Tue, Jul 5, 2022 at 22:13, Bjørn Jørgensen wrote <
>