Hi All,
What will happen if the jar is available on the local network, which
is accessible to the Driver but not to the Executors?
Is there any good study resource where the deployment of external jars is
explained nicely?
Regards,
Rishi
On Fri, Aug 14, 2020 at 11:15 AM Henoc wrote:
> If you are
If you are running Spark on YARN, the spark-submit utility will download
the jar from S3 and copy it to HDFS in the distributed cache. The driver
shares this location with the YARN NodeManagers via the
ContainerLaunchContext. The NodeManagers localize the jar and place it on the
container classpath before they
Looking back at the code:
all --jars args and such run through
https://github.com/apache/spark/blob/7f275ee5978e00ac514e25f5ef1d4e3331f8031b/core/src/main/scala/org/apache/spark/SparkContext.scala#L493-L500
which calls
The driver hosts a file server from which the executors download the jar.
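To make the YARN path above concrete, here is a minimal submit sketch. The bucket, class, and jar names are hypothetical, and reading s3a:// paths assumes the Hadoop AWS connector is available on the cluster:

```shell
# Hypothetical example: in YARN cluster mode, spark-submit ships the app jar
# (and anything passed via --jars) through the YARN distributed cache, and
# NodeManagers localize them onto each container's classpath.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.MyApp \
  --jars s3a://my-bucket/libs/extra-lib.jar \
  s3a://my-bucket/jars/my-app.jar
```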
On Thu, Aug 13, 2020, 5:33 PM James Yu wrote:
> Hi,
>
> When I spark submit a Spark app with my app jar located in S3, obviously
> the Driver will download the jar from the s3 location. What is not clear
> to me is:
It depends on how much memory is available and how much data you are
processing. Please provide the data size and cluster details so we can help.
On Fri, Aug 14, 2020 at 12:54 AM km.santanu wrote:
> Hi
> I am using Kafka stateless structured streaming. I have enabled a watermark of
> 1
> hour. After long
Can you keep the fields in your case class as Option?
Thanks
Amit
On Thu, Aug 13, 2020 at 12:47 PM manjay kumar
wrote:
> Hi ,
>
> I have a use case,
>
> where I need to merge three data sets and build one wherever data is
> available.
>
> And my dataset is a complex object.
>
> Customer
> - name -
Hi,
When I spark-submit a Spark app with my app jar located in S3, obviously the
Driver will download the jar from the S3 location. What is not clear to me is:
where do the Executors get the jar from? From the same S3 location, somehow
from the Driver, or do they not need the jar?
Thanks
Hi
I am using Kafka stateless structured streaming. I have enabled a watermark of 1
hour. After running for about 2 hours, my job terminates
automatically. Checkpointing has been enabled.
I am computing an average on the input data.
Can you please suggest how to avoid the out-of-memory error?
Hi ,
I have a use case,
where I need to merge three data sets and build one wherever data is
available.
And my dataset is a complex object.
Customer
- name - String
- accounts - List
Account
- type - String
- addresses - List
Address
- name - String
---
And it goes on.
These file
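A minimal non-Spark sketch of the Option-field idea Amit suggests, using Python dataclasses in place of a Scala case class. The field names follow the sketch above; the "first non-None wins" coalesce rule is an assumption about how the merge should behave:

```python
from dataclasses import dataclass
from typing import List, Optional

# Every field is Optional, so a record from any of the three datasets
# can carry only the data it actually has.
@dataclass
class Address:
    name: Optional[str] = None

@dataclass
class Account:
    type: Optional[str] = None
    addresses: Optional[List[Address]] = None

@dataclass
class Customer:
    name: Optional[str] = None
    accounts: Optional[List[Account]] = None

def coalesce(*values):
    """Return the first non-None value, mimicking SQL COALESCE."""
    return next((v for v in values if v is not None), None)

def merge(a: Customer, b: Customer) -> Customer:
    # Field-level merge: prefer a's value, fall back to b's.
    return Customer(
        name=coalesce(a.name, b.name),
        accounts=coalesce(a.accounts, b.accounts),
    )

# Merging three partial records left to right:
c1 = Customer(name="Alice")
c2 = Customer(accounts=[Account(type="savings")])
c3 = Customer(name="ignored", accounts=[Account(type="current")])
merged = merge(merge(c1, c2), c3)
# merged keeps "Alice" and the savings account, since they came first.
```

In Spark itself the same shape would be a Scala case class with `Option[...]` fields, or a DataFrame join followed by `coalesce` on each column.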
That's the kind of solution I need, Ed. Can you elaborate on how I can do this on the
Spark side? Or do I need to update the table configuration in the DB?
Siavash
On Wed, Aug 12, 2020 at 5:55 PM ed elliott wrote:
> You’ll need to do an insert and use a trigger on the table to change it
> into an upsert, also make
Hi guys,
Has anyone tried Spark 3 on k8s reading data from HDFS encrypted with KMS in
HA mode (with Kerberos)?
I have a wordcount job running with Spark 3 reading data on HDFS (Hadoop
3.1), everything secured with Kerberos. Everything works fine if the data
folder is not encrypted (Spark on k8s). If
Hi,
we have Spark jobs written entirely in Python, similar to the repo
https://github.com/AlexIoannides/pyspark-example-project,
we are using spark-submit to submit the application in local mode, but
want to send metrics when the job ends (on SIGTERM as well); to do so we
need something similar to