Interesting! You've piqued my interest. Will the sessions be available
after the conference? (I'm in the wrong timezone to see this during
daylight hours.)
On Wed, Jul 8, 2020 at 2:40 AM ldazaa11 wrote:
> Hello Sparkers,
>
> If you’re interested in how Spark is being applied in cloud data lake …
Write to a local temp directory via file:// ?
> On 07.07.2020 at 20:07, Dark Crusader wrote:
>
>
> Hi everyone,
>
> I have a function which reads and writes a parquet file from HDFS. When I'm
> writing a unit test for this function, I want to mock this read & write.
>
> How do you achieve this?
Hi everyone,
I have a function which reads and writes a parquet file from HDFS. When I'm
writing a unit test for this function, I want to mock this read & write.
How do you achieve this?
Any help would be appreciated. Thank you.
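One common approach is to hand the function a mocked SparkSession so the test never touches HDFS at all. A minimal sketch with Python's `unittest.mock` — the function `load_and_save`, its transformation, and the paths are hypothetical stand-ins for whatever the real function does:

```python
from unittest.mock import MagicMock

def load_and_save(spark, in_path, out_path):
    """Hypothetical function under test: reads a parquet file,
    drops duplicates, and writes the result back."""
    df = spark.read.parquet(in_path)
    deduped = df.dropDuplicates()
    deduped.write.parquet(out_path, mode="overwrite")
    return deduped

def test_load_and_save():
    # MagicMock stands in for the SparkSession; attribute and call
    # chains are recorded automatically, so no Spark/HDFS is needed.
    spark = MagicMock()
    fake_df = spark.read.parquet.return_value

    result = load_and_save(spark, "hdfs:///in", "hdfs:///out")

    # The read was issued against the input path ...
    spark.read.parquet.assert_called_once_with("hdfs:///in")
    # ... and the deduplicated frame was written to the output path.
    fake_df.dropDuplicates.return_value.write.parquet.assert_called_once_with(
        "hdfs:///out", mode="overwrite")
    assert result is fake_df.dropDuplicates.return_value

test_load_and_save()
```

The `file://` suggestion above is the complementary approach: keep the real read/write but point both paths at a local temp directory, which tests the actual parquet round trip at the cost of needing a running Spark session.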
Hello Sparkers,
If you’re interested in how Spark is being applied in cloud data lake
environments, then you should check out a new 1-day LIVE, virtual conference
on July 30. This conference is called Subsurface and the focus is technical
talks tailored specifically for data architects and engineers.
If not set explicitly with spark.default.parallelism, it will default
to the number of cores currently available (minimum 2). At the very
start, some executors haven't finished registering, which I think
explains why it goes up after a short time. (In the case of dynamic
allocation it will change as executors come and go.)
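If you want the value to be independent of how many executors happen to have registered at job start, you can pin it explicitly at submit time. A sketch — the value 200 and the job name are purely illustrative:

```shell
# Pin parallelism explicitly instead of relying on the core-count default.
# spark.default.parallelism covers RDD operations;
# spark.sql.shuffle.partitions covers DataFrame/SQL shuffles.
spark-submit \
  --conf spark.default.parallelism=200 \
  --conf spark.sql.shuffle.partitions=200 \
  my_job.py
```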
Does anyone know if ANALYZE TABLE is supported on Spark 2.3.2? The command
doesn't appear in the documentation
(spark.apache.org/docs/2.3.2/sql-programming-guide.html), although we can
launch it, with strange results.
The ANALYZE TABLE job takes hours and doesn't launch any executors; it just
runs in the driver.
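As far as I can tell the command is supported on the 2.3.x line even though that page doesn't list it. For reference, the two forms of the syntax (database and table names are placeholders):

```sql
-- Table-level statistics (row count, size in bytes):
ANALYZE TABLE my_db.my_table COMPUTE STATISTICS;

-- Per-column statistics, used by the cost-based optimizer:
ANALYZE TABLE my_db.my_table COMPUTE STATISTICS FOR COLUMNS col_a, col_b;
```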
Hi community,

I am running hundreds of Spark jobs at the same time, which causes the number
of Hive Metastore connections to be very high (> 1K). Since the jobs do not
really use HMS, I wish to disable it. I have tried setting spark.s
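The message above is cut off mid-setting, so this may or may not be the option it was about to name; but if the jobs never touch Hive tables at all, one real knob is to run with Spark's built-in in-memory catalog instead of the Hive one:

```shell
# Assumption: the jobs do not read or write Hive tables. With the
# in-memory catalog, Spark never opens a Hive Metastore connection.
spark-submit \
  --conf spark.sql.catalogImplementation=in-memory \
  my_job.py
```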