Usually the job never reaches that point; it fails during shuffle. And storage
memory and executor memory are usually low when it fails.
On Fri, Sep 8, 2023 at 16:49 Jack Wells wrote:
Assuming you’re not writing to HDFS in your code, Spark can spill to HDFS
if it runs out of memory on a per-executor basis. This could happen when
evaluating a cache operation like you have below, or during shuffle
operations in joins, etc. You might try to increase executor memory or tune
shuffle operations.
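To make the suggestion concrete, here is a minimal sketch of the kind of settings involved. The property names are real Spark configuration keys, but every value is an illustrative assumption, not a recommendation from this thread:

```python
# Hypothetical spark-submit settings (values are illustrative, not tuned):
conf = {
    "spark.executor.memory": "8g",           # more heap per executor
    "spark.executor.memoryOverhead": "2g",   # off-heap headroom for shuffle buffers
    "spark.sql.shuffle.partitions": "400",   # smaller shuffle blocks per task
}

# Render as spark-submit flags:
submit_flags = " ".join(f"--conf {k}={v}" for k, v in conf.items())
print(submit_flags)
```

Which settings actually help depends on whether the spill happens during the cache or the shuffle, so treat these as starting points to experiment with.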
Sure
df = spark.read.option("basePath",
    some_path).parquet(*list_of_s3_file_paths())
(
    df
    .where(SOME_FILTER)  # placeholder filter expression
    .repartition(6)
    .cache()
)
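One hedged sanity check on that `repartition(6)` (my observation, not something raised in the thread): with only six partitions, each task holds roughly one sixth of the filtered data, which can easily exceed executor memory. A rough back-of-envelope sizing, with entirely assumed numbers:

```python
import math

# Illustrative assumptions, not measurements from this job:
dataset_gib = 60            # filtered data size in GiB
target_partition_mib = 256  # common rule-of-thumb partition size

# Partitions needed to keep each one near the target size:
num_partitions = math.ceil(dataset_gib * 1024 / target_partition_mib)
print(num_partitions)  # 240 partitions instead of 6
```

If the real filtered size is anywhere near this, six partitions would make each one enormous, which matches the symptom of failing during shuffle.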
On Fri, Sep 8, 2023 at 14:56 Jack Wells wrote:
Hi Nebi, can you share the code you’re using to read and write from S3?
On Sep 8, 2023 at 10:59:59, Nebi Aydin wrote:
Hi all,
I am using Spark on EMR to process data. Basically I read data from AWS S3,
do the transformation, and after the transformation I load/write the data
back to S3.
Recently we have found that HDFS (/mnt/hdfs) utilization is getting too high.
I disabled `yarn.log-aggregation-enable` by setting it to false.
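For readers following along, a yarn-site.xml fragment for that change might look like this; the property name is from the message above, and setting it in yarn-site.xml (rather than, say, an EMR configurations JSON) is an assumption:

```xml
<!-- Sketch: disable YARN log aggregation in yarn-site.xml -->
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>false</value>
</property>
```

Note that disabling log aggregation only addresses log growth; it does not stop Spark shuffle or cache spill from filling /mnt/hdfs.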
Hi Yasukazu,
I tried replacing the jar; the Spark code didn’t work, but the
vulnerability was removed. I agree that even 3.1.3 has other
vulnerabilities listed on its Maven page, but those are medium-level
vulnerabilities. We are currently targeting Critical and High vulnerabilities
only.
@Alfie Davidson: Awesome, it worked with "org.elasticsearch.spark.sql".
But as soon as I switched to elasticsearch-spark-20_2.12, "es" also
worked.
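To summarize the two format names from this exchange in one place: the full data source name works on its own, and the short alias resolves once the connector jar is on the classpath. The write call below is a hypothetical sketch; the host, options, and index name are illustrative, not from the thread:

```python
# Full data source name (worked per the thread):
ES_FORMAT = "org.elasticsearch.spark.sql"
# Short alias; also resolved once elasticsearch-spark-20_2.12 was on the classpath:
ES_ALIAS = "es"

# Hypothetical usage, assuming a live SparkSession `df` and the connector jar:
# (df.write.format(ES_FORMAT)
#    .option("es.nodes", "es-host:9200")   # illustrative host
#    .mode("append")
#    .save("my-index"))                    # illustrative index name
print(ES_FORMAT, ES_ALIAS)
```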
On Fri, Sep 8, 2023 at 12:45 PM Dipayan Dev wrote:
Let me try that and get back. Just wondering, is there a change in the way
we pass the format to the connector from Spark 2 to 3?
On Fri, 8 Sep 2023 at 12:35 PM, Alfie Davidson wrote:
> I am pretty certain you need to change the write.format from “es” to
> “org.elasticsearch.spark.sql”