[ https://issues.apache.org/jira/browse/SPARK-15247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281221#comment-15281221 ]

Johnny W. edited comment on SPARK-15247 at 5/12/16 5:28 AM:
------------------------------------------------------------

Takeshi, thanks for your reply. However, I don't think those issues are 
relevant: even though I have only 1 (very small) file, Spark creates 
n_executors * n_cores tasks, and this only happens for Parquet. This issue 
has been confirmed by others in the community:
--
https://www.mail-archive.com/user@spark.apache.org/msg50568.html

Is there any workaround for this without rebuilding Spark?
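
For reference, a minimal sketch of how the behavior can be observed in a 
PySpark 1.6 shell (assuming sqlCtx/sqlContext and sc are available; the 
path /tmp/tiny_parquet is just a hypothetical example):

    # Write a single-row Parquet file, then read it back and compare the
    # partition count of the scan with the cluster's default parallelism.
    path = "/tmp/tiny_parquet"
    sqlContext.createDataFrame([(1,)], ["a"]).coalesce(1) \
        .write.mode("overwrite").parquet(path)

    df = sqlCtx.read.parquet(path)
    print(df.rdd.getNumPartitions())  # would expect ~1 for such a tiny file
    print(sc.defaultParallelism)      # roughly n_executors * n_cores

Even if the partition count reported here is not inflated, the extra tasks 
should be visible in the Spark UI when an action such as df.count() runs.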


was (Author: johnnyws):
Takeshi, thanks for your reply. However, I don't think those issues are 
relevant: even though I have only 1 (very small) file, Spark creates 
n_executors * n_cores tasks, and this only happens for Parquet. This issue 
has been confirmed by others in the community:
--
https://www.mail-archive.com/user@spark.apache.org/msg50568.html

Could you try to reproduce it? Is there any workaround for this?

> sqlCtx.read.parquet yields at least n_executors * n_cores tasks
> ---------------------------------------------------------------
>
>                 Key: SPARK-15247
>                 URL: https://issues.apache.org/jira/browse/SPARK-15247
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: Johnny W.
>
> sqlCtx.read.parquet always yields at least n_executors * n_cores tasks, even 
> though there is only 1 very small file.
> This issue can increase the latency of small jobs.
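
A possible mitigation, sketched below as an assumption rather than a 
confirmed fix (coalesce does not change how the scan itself is split; it 
only reduces the number of tasks in the stages that follow the read, and 
/tmp/tiny_parquet is again a hypothetical path):

    # Coalesce immediately after the read so that downstream stages run on
    # a single partition instead of n_executors * n_cores.
    df = sqlCtx.read.parquet("/tmp/tiny_parquet").coalesce(1)
    df.count()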


