Github user srowen commented on the issue:
https://github.com/apache/spark/pull/18936
Interesting, I wouldn't have expected much difference at all. Once it's in
native code these are all just SSE instructions on the silicon... I don't know
how it could be much different. But
Github user VinceShieh commented on the issue:
https://github.com/apache/spark/pull/18936
Hi Sean, sorry for the late reply. Yeah, actually we do have some performance
data on F2J vs. OpenBLAS. It seems there is no performance gain from OpenBLAS,
not even at the unit test level. We are
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/18936
BTW I do think this is a promising idea. I'd welcome more info about the
performance implications, but if it seems like a net win for most users we
should do it.
---
Github user VinceShieh commented on the issue:
https://github.com/apache/spark/pull/18936
Okay. We will benchmark on OpenBLAS. Thanks.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/18936
I don't see what that has to do with it. The threading setting only affects
what native code does. A single-thread configuration should be able to saturate
CPU, too. Allowing multiple threads on top
Github user VinceShieh commented on the issue:
https://github.com/apache/spark/pull/18936
@srowen Currently, what we see is that, with the default thread setting (which
takes up all available computation resources) for native BLAS, the No. 1 hot
spot (with 95%+ self time) is
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/18936
Yeah I'd believe it's faster with better native threading settings. I think
we're back to the issue that with default settings for BLAS libs, this setting
would make things somewhat slower.
Github user VinceShieh commented on the issue:
https://github.com/apache/spark/pull/18936
Thanks, Sean and Nick.
To @srowen, I think the difference is the finding from our previous
investigation that the thread setting in the native BLAS impacts the overall
performance of a
Github user MLnick commented on the issue:
https://github.com/apache/spark/pull/18936
I'm not actually sure why the f2jblas implementation was originally forced
in level 1, but IIRC it was due to some benchmarking at that time. The
intuition is that the overhead of calling into
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/18936
BLAS doesn't work on sparse data. All of those invocations are on dense
data of some kind. Many of the remaining ones operate on dense matrices even;
they're not even level 1. I think all of them
Github user VinceShieh commented on the issue:
https://github.com/apache/spark/pull/18936
Yes, these are not the only places, but we only tested on the dense dataset
and got the performance data shown above. We are being conservative with sparse
data, so we kept the sparse path the way it was.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18936
**[Test build #80620 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80620/testReport)**
for PR 18936 at commit
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18936
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80620/
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/18936
Merged build finished. Test PASSed.
---
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/18936
These aren't the only places where these operations are called from F2JBLAS.
You'd need all of them, right?
CC @dbtsai
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/18936
**[Test build #80620 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80620/testReport)**
for PR 18936 at commit