[jira] [Commented] (SPARK-25982) Dataframe write is non blocking in fair scheduling mode

2019-03-03 Thread Sean Owen (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782808#comment-16782808
 ] 

Sean Owen commented on SPARK-25982:
---

Can you clarify with a more complete example? What is running in parallel, and 
what is the "next stage" that starts executing?

> Dataframe write is non blocking in fair scheduling mode
> ---
>
> Key: SPARK-25982
> URL: https://issues.apache.org/jira/browse/SPARK-25982
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.3.1
> Reporter: Ramandeep Singh
> Priority: Major
>
> Hi,
> I have noticed that a DataFrame write does not block as expected in fair 
> scheduling mode.
> Ideally, while a DataFrame write is in progress and a future is blocked on 
> Await.result, no other job should start, but this is not the case: other jobs 
> are started while the partitions are still being written.
>  
> Regards,
> Ramandeep Singh
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25982) Dataframe write is non blocking in fair scheduling mode

2019-03-02 Thread Ramandeep Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782527#comment-16782527
 ] 

Ramandeep Singh commented on SPARK-25982:
-

No. As I said, the operations within a stage are independent, and I explicitly 
await their completion before launching the next stage. The problem is that 
operations from the next stage start running before all the futures have completed.




[jira] [Commented] (SPARK-25982) Dataframe write is non blocking in fair scheduling mode

2019-03-02 Thread Sean Owen (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782521#comment-16782521
 ] 

Sean Owen commented on SPARK-25982:
---

I don't understand this; you're running operations in parallel on purpose, but 
expecting one to wait for the other?




[jira] [Commented] (SPARK-25982) Dataframe write is non blocking in fair scheduling mode

2018-11-14 Thread Ramandeep Singh (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16687034#comment-16687034
 ] 

Ramandeep Singh commented on SPARK-25982:
-

Sure,

a) The scheduler is set to fair scheduling mode:

--conf 'spark.scheduler.mode'='FAIR'

b) Independent jobs within one stage are scheduled together. This is fine; 
each of them blocks until its DataFrame write completes.

```
val futures = steps.par.map(stepId => Future {
  processWrite(stepsMap(stepId))
}).par
futures.foreach(Await.result(_, Duration.create(timeout, TimeUnit.MINUTES)))
```

Here, the processWrite calls run in parallel and each one is awaited, but the 
persist or write operation returns before all partitions of the DataFrames 
have been written, so jobs from a later stage end up running.
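
For reference, the blocking behavior I expect from the pattern above can be checked in isolation with a minimal sketch that involves no Spark at all. Everything named here is an assumption for illustration: `processWrite` is a hypothetical stand-in that only sleeps, and `BlockingStagesSketch`, `runStage`, and `completedWrites` are invented names. It may help separate the Future/Await logic from anything scheduler-specific.

```scala
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

object BlockingStagesSketch {
  // Counts how many stand-in writes have fully finished.
  val completedWrites = new AtomicInteger(0)

  // Hypothetical stand-in for processWrite: a real version would perform
  // a DataFrame write here instead of sleeping.
  def processWrite(stepId: Int): Int = {
    Thread.sleep(100L * stepId) // simulate time spent writing partitions
    completedWrites.incrementAndGet()
    stepId
  }

  // Starts every step of one stage concurrently, then blocks on each future
  // in turn, so it returns only after all steps have completed.
  def runStage(steps: Seq[Int], timeoutMinutes: Long): Seq[Int] = {
    val futures = steps.map(stepId => Future(processWrite(stepId)))
    futures.map(Await.result(_, Duration.create(timeoutMinutes, TimeUnit.MINUTES)))
  }

  def main(args: Array[String]): Unit = {
    val results = runStage(Seq(1, 2, 3), timeoutMinutes = 1)
    // By this point every stand-in write in the stage has returned.
    println(s"stage done: results=$results, completed=${completedWrites.get}")
  }
}
```

In this sketch the counter always equals the number of steps once `runStage` returns, which is the guarantee I am relying on before launching the next stage.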

 




[jira] [Commented] (SPARK-25982) Dataframe write is non blocking in fair scheduling mode

2018-11-08 Thread Hyukjin Kwon (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16680827#comment-16680827
 ] 

Hyukjin Kwon commented on SPARK-25982:
--

Can you post reproducible code that demonstrates the problem, and elaborate on 
the current input and expected input?
