I had a similar issue reading an external Parquet table. In my case I
had a permission issue in one partition, so I added a filter to exclude that
partition, but Spark still didn't prune it. Then I read that, in order
for Spark to be aware of all the partitions, it first reads the folders and
then
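
The workaround I was trying looks roughly like this (the table path and the
partition column "dt" are placeholders, not my real names):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// Read the external partitioned Parquet table; "dt" stands in for the
// partition column. Filtering on a partition column is what should let
// Spark skip the inaccessible partition entirely (partition pruning).
val df = spark.read.parquet("/data/external/events")
  .filter("dt != '2019-08-01'")

df.show()
```

Note that pruning only applies to filters on partition columns; for tables
registered in the Hive metastore, the `spark.sql.hive.metastorePartitionPruning`
setting may also be relevant.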
Does Spark 2.4.0 support Python UDFs with Continuous Processing mode?
I tried it and got an error like the one below:
WARN scheduler.TaskSetManager: Lost task 4.0 in stage 0.0 (TID 4,
172.22.9.179, executor 1): java.util.NoSuchElementException: None.get
at scala.None$.get(Option.scala:347)
at
Hi All,
I'm attempting to clean up some Spark code which performs groupByKey /
mapGroups to compute custom aggregations, and I could use some help
understanding the Spark APIs necessary to make my code more modular and
maintainable.
In particular, my current approach is as follows:
- Start
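
For context, the kind of refactor I have in mind is replacing the groupByKey /
mapGroups pair with a typed Aggregator, which can be defined once and reused.
This is a hypothetical sketch (Event and AvgAmount are made-up names, not my
actual code):

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}
import org.apache.spark.sql.expressions.Aggregator

case class Event(user: String, amount: Double)

// A reusable typed aggregator: average amount per key.
// Buffer is (running sum, count).
object AvgAmount extends Aggregator[Event, (Double, Long), Double] {
  def zero: (Double, Long) = (0.0, 0L)
  def reduce(b: (Double, Long), e: Event): (Double, Long) =
    (b._1 + e.amount, b._2 + 1)
  def merge(a: (Double, Long), b: (Double, Long)): (Double, Long) =
    (a._1 + b._1, a._2 + b._2)
  def finish(r: (Double, Long)): Double =
    if (r._2 == 0) 0.0 else r._1 / r._2
  def bufferEncoder: Encoder[(Double, Long)] =
    Encoders.tuple(Encoders.scalaDouble, Encoders.scalaLong)
  def outputEncoder: Encoder[Double] = Encoders.scalaDouble
}

val spark = SparkSession.builder().getOrCreate()
import spark.implicits._

val events = Seq(Event("a", 1.0), Event("a", 3.0), Event("b", 2.0)).toDS()

// Equivalent to groupByKey(_.user).mapGroups(...), but the aggregation
// logic lives in a separate, testable, composable object:
val avgPerUser = events.groupByKey(_.user).agg(AvgAmount.toColumn)
```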
Hi,
I think it should be possible to write a query on the streaming data
frame and then write the output of the query to S3 or any other sink.
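
For example, a minimal sketch (the bucket name, checkpoint path, and the
"rate" source are placeholders; substitute your actual input and extraction
logic):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// Hypothetical streaming source for illustration only.
val stream = spark.readStream
  .format("rate")
  .load()

// Any query on the streaming DataFrame to extract the message you need...
val extracted = stream.selectExpr("value", "timestamp")

// ...then write its output to S3. A file sink such as Parquet requires
// a checkpoint location.
val query = extracted.writeStream
  .format("parquet")
  .option("path", "s3a://my-bucket/output/")
  .option("checkpointLocation", "s3a://my-bucket/checkpoints/")
  .start()
```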
Regards,
Gourav Sengupta
On Sat, Aug 10, 2019 at 9:24 AM zenglong chen wrote:
> How to extract some message in streaming dataframe and make