[ 
https://issues.apache.org/jira/browse/SPARK-24674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro resolved SPARK-24674.
--------------------------------------
    Resolution: Invalid

> Spark on Kubernetes BLAS performance
> ------------------------------------
>
>                 Key: SPARK-24674
>                 URL: https://issues.apache.org/jira/browse/SPARK-24674
>             Project: Spark
>          Issue Type: Question
>          Components: Build, Kubernetes, MLlib
>    Affects Versions: 2.3.1
>         Environment: Spark 2.3.1 SNAPSHOT (as of June 25th)
> Kubernetes version 1.7.5
> Kubernetes cluster, consisting of 4 Nodes with 16 GB RAM, 8 core Intel 
> processors.
>            Reporter: Dennis Aumiller
>            Priority: Minor
>              Labels: performance
>
>  
> Native BLAS libraries usually speed up CPU-heavy operations, such as those 
> in MLlib, quite significantly.
>  The initial warning
> {code:java}
> WARN  BLAS:61 - Failed to load implementation from: 
> com.github.fommil.netlib.NativeSystemBLAS
> {code}
> is not easy to resolve: as reported 
> [here|https://github.com/apache/spark/pull/19717/files/7d2b30373b2e4d8d5311e10c3f9a62a2d900d568],
>  the cause appears to be the underlying base image used by the Spark 
> Dockerfile.
>  Rebuilding Spark with
> {code:java}
> -Pnetlib-lgpl
> {code}
> does not solve the problem either, but I managed to build BLAS and LAPACK 
> into the Alpine image, which required a number of workarounds.
> Interestingly, PCA became significantly slower in my case with native BLAS 
> support than with the netlib-java fallback. I am aware of [#SPARK-21305] as 
> well, but that did not help in my case either.
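
For context, SPARK-21305 concerns the threading of native BLAS libraries: OpenBLAS and Intel MKL may start their own threads, which can compete with Spark's task-level parallelism. A minimal sketch of passing the corresponding environment variables to the executors is given here. The helper name `blas_thread_conf` is hypothetical; `spark.executorEnv.*` is the standard Spark property prefix for executor environment variables. As the description says, this setting did not resolve the slowdown in the reporter's setup, so treat it as a first check rather than a fix.

```python
from typing import List


def blas_thread_conf(threads: int = 1) -> List[str]:
    """Build spark-submit flags that cap native BLAS threads per executor.

    Hypothetical helper for illustration. OPENBLAS_NUM_THREADS and
    MKL_NUM_THREADS are the environment variables read by OpenBLAS and
    Intel MKL respectively.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    flags: List[str] = []
    for name in ("OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
        # spark.executorEnv.<NAME> sets an environment variable on executors.
        flags += ["--conf", f"spark.executorEnv.{name}={threads}"]
    return flags


if __name__ == "__main__":
    # Print the flags so they can be pasted into a spark-submit command line.
    print(" ".join(blas_thread_conf()))
```

The resulting flags would be appended to an existing `spark-submit` invocation; whether they change the PCA timings depends on which native library is actually loaded inside the container.
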
>  Furthermore, computing an SVD of a matrix of size only 5000x5000 (density 
> 1%) already throws an error with native ARPACK, but runs perfectly fine 
> with the fallback version.
> My question is whether this has already been investigated.
>  If not, would it be of interest to the Spark community if I provided
>  * a more detailed report with respect to timings/configurations/test setup
>  * a Dockerfile that builds Spark with BLAS/LAPACK/ARPACK, using the 
> shipped Dockerfile as a basis
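
A timing report of the kind offered above is easier to compare across configurations when every measurement is taken the same way. The following is a rough sketch of such a harness, not the reporter's actual setup: `time_runs` and its parameters are hypothetical names, and the workload passed in stands in for a PCA or SVD call under native BLAS or the fallback.

```python
import statistics
import time
from typing import Callable, Dict


def time_runs(fn: Callable[[], object], runs: int = 5, warmup: int = 1) -> Dict[str, float]:
    """Time a zero-argument callable and summarize wall-clock seconds.

    Warm-up calls are executed first and not recorded, so one-off costs
    such as JIT compilation or library loading do not skew the samples.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "runs": float(len(samples)),
        "min_s": min(samples),
        "median_s": statistics.median(samples),
        "max_s": max(samples),
    }


if __name__ == "__main__":
    # Placeholder workload; replace with the operation under test.
    print(time_runs(lambda: sum(i * i for i in range(100_000))))
```

Reporting the median together with the minimum and maximum makes it visible whether a difference between the native and fallback runs exceeds the run-to-run noise.
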
>   
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
