RE: Spark-Locality: Hinting Spark location of the executor does not take effect.

2020-09-18 Thread Nasrulla Khan Haris
I was providing an IP address instead of the FQDN. Providing the FQDN fixed it.

Thanks,
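
For anyone hitting the same issue, here is a minimal sketch of converting IP addresses to FQDNs before putting them into the hosts returned by a custom block-location hint. It uses only java.net.InetAddress from the JDK. The class name FqdnHint and the helpers toFqdn/toFqdns are hypothetical, and the sketch assumes reverse DNS on the cluster returns the same name the executors register with, which may not hold everywhere.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Spark or Hadoop.
final class FqdnHint {

    // Returns the canonical host name for a host name or IP literal.
    // If the name cannot be resolved, the input is returned unchanged.
    // Note: when no reverse mapping exists, the JDK returns the textual
    // IP address, so callers should not assume the result is a name.
    static String toFqdn(String hostOrIp) {
        try {
            return InetAddress.getByName(hostOrIp).getCanonicalHostName();
        } catch (UnknownHostException e) {
            return hostOrIp;
        }
    }

    // Converts every entry of a hosts array, preserving order.
    static String[] toFqdns(String[] hostsOrIps) {
        String[] result = new String[hostsOrIps.length];
        for (int i = 0; i < hostsOrIps.length; i++) {
            result[i] = toFqdn(hostsOrIps[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Pass the addresses to check on the command line.
        for (String arg : args) {
            System.out.println(arg + " -> " + toFqdn(arg));
        }
    }
}
```

The converted array would then be used wherever the hint previously carried raw IP addresses.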

From: Nasrulla Khan Haris
Sent: Wednesday, September 16, 2020 4:11 PM
To: dev@spark.apache.org
Subject: Spark-Locality: Hinting Spark location of the executor does not take 
effect.

Hi Spark developers,

I want to hint Spark to run tasks on a particular list of hosts. I see that 
getBlockLocations is used to get the list of hosts from HDFS:

https://github.com/apache/spark/blob/7955b3962ac46b89564e0613db7bea98a1478bf2/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala#L386


Hinting Spark through a custom getBlockLocation that returns an array of 
BlockLocations containing the host IP addresses doesn't help; Spark continues 
to schedule the tasks on other executor hosts.

Is there something I am doing wrong?

Test:
spark.read.csv()


I appreciate your input 😊

Thanks,
Nasrulla



Re: Spark-Locality: Hinting Spark location of the executor does not take effect.

2020-09-17 Thread nakhanha
Even though I provide a hint with wn4's IP address, Spark schedules the task 
on wn1:

2020-09-17 01:21:50,038 DEBUG [dag-scheduler-event-loop]
scheduler.TaskSetManager: Valid locality levels for TaskSet 11.0:
NODE_LOCAL, RACK_LOCAL, ANY
2020-09-17 01:21:50,039 DEBUG [dispatcher-event-loop-0]
cluster.YarnScheduler: parentName: , name: TaskSet_11.0, runningTasks: 0
2020-09-17 01:21:50,040 INFO  [dispatcher-event-loop-0]
scheduler.TaskSetManager: Starting task 0.0 in stage 11.0 (TID 13,
wn1-vegasr.r0erhw3gxezevknl0vbc42vctb.dx.internal.cloudapp.net, executor 6,
partition 5, NODE_LOCAL, 4994 bytes)
2020-09-17 01:21:50,040 DEBUG [dispatcher-event-loop-0]
scheduler.TaskSetManager: No tasks for locality level NODE_LOCAL, so moving
to locality level RACK_LOCAL
2020-09-17 01:21:50,041 DEBUG [dispatcher-event-loop-0]
scheduler.TaskSetManager: No tasks for locality level RACK_LOCAL, so moving
to locality level ANY
2020-09-17 01:21:50,042 DEBUG [dispatcher-event-loop-0]
cluster.YarnSchedulerBackend$YarnDriverEndpoint: Launching task 13 on
executor id: 6 hostname:
wn1-vegasr.r0erhw3gxezevknl0vbc42vctb.dx.internal.cloudapp.net.
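
The log above shows the executor registered under its fully qualified host name. A simplified model of why an IP-address hint would not match is sketched below. It assumes the scheduler compares preferred-location strings with executor host strings as-is, which is consistent with the fix reported in this thread but is not a copy of Spark's code. The class name, method name, host names and IP address are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified model of host-string matching; not Spark's implementation.
final class LocalityMatchSketch {

    // True if at least one preferred host equals an executor host exactly.
    static boolean hasNodeLocalMatch(List<String> preferredHosts,
                                     Set<String> executorHosts) {
        for (String host : preferredHosts) {
            if (executorHosts.contains(host)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical executor hosts, registered by FQDN.
        Set<String> executorHosts = new HashSet<>(Arrays.asList(
                "wn1.example.internal", "wn4.example.internal"));

        // An IP-address hint matches no executor host string: prints false.
        System.out.println(
                hasNodeLocalMatch(Arrays.asList("10.0.0.7"), executorHosts));

        // The FQDN of the same node matches: prints true.
        System.out.println(
                hasNodeLocalMatch(Arrays.asList("wn4.example.internal"), executorHosts));
    }
}
```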





--
Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org






Spark-Locality: Hinting Spark location of the executor does not take effect

2020-09-16 Thread Priyanka Gomatam
Sending on behalf of a colleague whose mail isn’t reaching the dev list for 
some reason 😊

===

Hi Spark developers,

I want to hint Spark to run tasks on a particular list of hosts. I see that 
getBlockLocations is used to get the list of hosts from HDFS:

https://github.com/apache/spark/blob/7955b3962ac46b89564e0613db7bea98a1478bf2/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala#L386


Hinting Spark through a custom getBlockLocation that returns an array of 
BlockLocations containing the host IP addresses doesn't help; Spark continues 
to schedule the tasks on other executor hosts.

Is there something I am doing wrong?

Test:
spark.read.csv()


Appreciate your inputs 😊

Thanks,
Nasrulla



Spark-Locality: Hinting Spark location of the executor does not take effect.

2020-09-16 Thread Nasrulla Khan Haris
Hi Spark developers,

I want to hint Spark to run tasks on a particular list of hosts. I see that 
getBlockLocations is used to get the list of hosts from HDFS:

https://github.com/apache/spark/blob/7955b3962ac46b89564e0613db7bea98a1478bf2/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala#L386


Hinting Spark through a custom getBlockLocation that returns an array of 
BlockLocations containing the host IP addresses doesn't help; Spark continues 
to schedule the tasks on other executor hosts.

Is there something I am doing wrong?

Test:
spark.read.csv()


Appreciate your inputs 😊

Thanks,
Nasrulla


