Hi Mich,

If the driver always started on the edge node even in cluster mode, then I
don't see what the difference between client and cluster deploy mode would be.

In cluster mode, it is the responsibility of the resource manager (YARN,
etc.) to decide where to run the driver (at least for Spark 1.6, this is
what I have experienced).
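
For example, a minimal sketch on YARN (only the relevant flags; application
jar and arguments are omitted):

  spark-submit --master yarn --deploy-mode client ...   # driver runs on the edge node you submit from
  spark-submit --master yarn --deploy-mode cluster ...  # driver runs in a container on a node chosen by YARN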

Best,
Anastasios

On Sun, Jun 25, 2017 at 11:14 AM, Mich Talebzadeh <mich.talebza...@gmail.com
> wrote:

> Hi Anastasios.
>
> Are you implying that in YARN cluster mode, even if you submit your Spark
> application on an edge node, the driver can start on any node? I was under
> the impression that the driver starts on the edge node and the executors
> can be on any node in the cluster (where Spark agents are running)?
>
> Thanks
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> Disclaimer: Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 25 June 2017 at 09:39, Anastasios Zouzias <zouz...@gmail.com> wrote:
>
>> Just to note that in cluster mode the Spark driver might run on any node
>> of the cluster, hence you need to make sure that the file exists on *all*
>> nodes. Push the file to all nodes or use client deploy-mode.
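>>
>> A rough sketch of the two options, reusing the command from the original
>> mail (flags unchanged; adjust paths to your setup):
>>
>>   # Option 1: copy /home/sql/first.sql to the same path on every node and keep cluster mode
>>   spark-submit --deploy-mode cluster --class com.check.Driver --files /home/sql/first.sql test.jar 20170619
>>
>>   # Option 2: switch to client deploy-mode so the driver stays on the host that has the file
>>   spark-submit --deploy-mode client --class com.check.Driver --files /home/sql/first.sql test.jar 20170619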
>>
>> Best,
>> Anastasios
>>
>>
>> Am 24.06.2017 23:24 schrieb "Holden Karau" <hol...@pigscanfly.ca>:
>>
>>> addFile is supposed to not depend on a shared FS unless the semantics
>>> have changed recently.
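>>>
>>> A minimal sketch of the addFile pattern referred to above (assuming sc is
>>> the SparkContext; the file name is taken from the original mail):
>>>
>>>   import org.apache.spark.SparkFiles
>>>
>>>   sc.addFile("/home/sql/first.sql")        // ship the file with the job; no shared FS needed
>>>   val path = SparkFiles.get("first.sql")   // local path to the shipped copy on the current node
>>>   val sql  = scala.io.Source.fromFile(path).mkString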
>>>
>>> On Sat, Jun 24, 2017 at 11:55 AM varma dantuluri <dvsnva...@gmail.com>
>>> wrote:
>>>
>>>> Hi Sudhir,
>>>>
>>>> I believe you have to use a shared file system that is accessible to
>>>> all nodes.
>>>>
>>>>
>>>> On Jun 24, 2017, at 1:30 PM, sudhir k <k.sudhi...@gmail.com> wrote:
>>>>
>>>>
>>>> I am new to Spark and I need some guidance on how to fetch files passed
>>>> with the --files option of spark-submit.
>>>>
>>>> I read on some forums that we can fetch the files with
>>>> SparkFiles.get(fileName) and use them in our code, and that all nodes
>>>> should be able to read them.
>>>>
>>>> But I am facing an issue.
>>>>
>>>> Below is the command I am using:
>>>>
>>>> spark-submit --deploy-mode cluster --class com.check.Driver --files
>>>> /home/sql/first.sql test.jar 20170619
>>>>
>>>> So when I use SparkFiles.get("first.sql"), I should be able to read the
>>>> file path, but it is throwing a FileNotFoundException.
>>>>
>>>> I tried SparkContext.addFile("/home/sql/first.sql") and then
>>>> SparkFiles.get("first.sql"), but still got the same error.
>>>>
>>>> It works in standalone mode but not in cluster mode. Any help is
>>>> appreciated. I am using Spark 2.1.0 and Scala 2.11.
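>>>>
>>>> For reference, this is roughly what the code looks like (simplified;
>>>> variable names here are just placeholders):
>>>>
>>>>   import org.apache.spark.SparkFiles
>>>>   import scala.io.Source
>>>>
>>>>   // submitted with: spark-submit --deploy-mode cluster --class com.check.Driver --files /home/sql/first.sql test.jar 20170619
>>>>   val sqlPath = SparkFiles.get("first.sql")        // works in standalone mode
>>>>   val sqlText = Source.fromFile(sqlPath).mkString  // FileNotFoundException in cluster mode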
>>>>
>>>> Thanks.
>>>>
>>>>
>>>> Regards,
>>>> Sudhir K
>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Sudhir K
>>>>
>>>>
>>>> --
>>> Cell : 425-233-8271
>>> Twitter: https://twitter.com/holdenkarau
>>>
>>
>


-- 
-- Anastasios Zouzias
<a...@zurich.ibm.com>
