Quick update: the poor runtime of scenario (3) is now fixed in master. The
causes were an unnecessary shuffle and load imbalance in Spark rexpand
operations with a small input vector and a large, ultra-sparse output matrix.
Thanks for pointing this out, Mingyang.
Regards,
Matthias
On Mon, May 8, 2017
OK, thanks for sharing. I'll have a look later this week.
Regards,
Matthias
On Mon, May 8, 2017 at 2:20 PM, Mingyang Wang wrote:
Hi Matthias,
With a driver memory of 10GB, all operations were executed on CP, and I did
observe that the version that reads FK as a vector and then converts it
was faster: it took 8.337s (6.246s in GC), while the version that reads
FK as a matrix took 31.680s (26.256s in GC).
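For illustration only, the "read as a vector, then convert" step can be sketched in plain Python. This is not SystemML code. It assumes FK holds 1-based column positions and that the converted matrix has exactly one non-zero (value 1.0) per row; the function name `expand_to_csr` and its parameters are hypothetical.

```python
def expand_to_csr(fk, num_cols):
    """Expand a key vector into a CSR-style indicator matrix.

    Assumption: fk[i] is a 1-based column position, so row i of the
    result has a single 1.0 at column fk[i] - 1.
    Returns (row_ptr, col_idx, values).
    """
    row_ptr = [0]
    col_idx = []
    values = []
    for key in fk:
        if not 1 <= key <= num_cols:
            raise ValueError(f"key {key} outside 1..{num_cols}")
        col_idx.append(key - 1)        # store as 0-based column index
        values.append(1.0)
        row_ptr.append(len(col_idx))   # one non-zero added per row
    return row_ptr, col_idx, values


# Three rows, four columns: one non-zero per row.
row_ptr, col_idx, values = expand_to_csr([3, 1, 3], 4)
```

The input vector stays small (one entry per row), while the output is mostly empty, which is the ultra-sparse shape discussed in this thread.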
For the distrib
Yes, even with the previous patch for improved memory efficiency of
ultra-sparse matrices in MCSR format, there is still some unnecessary
overhead that leads to garbage collection. For this reason, I would
recommend reading it as a vector and converting it in memory to an
ultra-sparse matrix. I also jus
Out of curiosity, I increased the driver memory to 10GB, and then all
operations were executed on CP. The run took 37.166s, with 30.534s spent in
JVM GC. Is this the expected behavior?
Total elapsed time: 38.093 sec.
Total compilation time: 0.926 sec.
Total execution time: 37.166 sec.
Hi Matthias,
Thanks for the patch.
I have re-run the experiment and observed that there was indeed no more
memory pressure, but it still took ~90s for this simple script. What is
the bottleneck in this case?
Total elapsed time: 94.800 sec.
Total compilation time: 1.826 sec.
Tot
To summarize, this was an issue of selecting serialized representations
for large ultra-sparse matrices. Thanks again for sharing your feedback
with us.
1) In-memory representation: In CSR, every non-zero requires 12 bytes,
which is 240MB in your case. The overall memory consumption, howeve
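
As a rough check of the CSR figure above, here is a small Python sketch of the arithmetic. It assumes the 12 bytes split into a 4-byte column index plus an 8-byte double value, and that MB means 10^6 bytes; under those assumptions, 240MB corresponds to about 20 million non-zeros. Row pointers and object overhead are not included.

```python
BYTES_PER_INDEX = 4  # assumption: 32-bit column index per non-zero
BYTES_PER_VALUE = 8  # assumption: 64-bit double value per non-zero
BYTES_PER_NNZ = BYTES_PER_INDEX + BYTES_PER_VALUE  # 12 bytes, as stated above


def csr_nonzero_bytes(nnz):
    """Bytes used by the column-index and value arrays for nnz non-zeros."""
    return nnz * BYTES_PER_NNZ


def nnz_from_bytes(total_bytes):
    """Number of non-zeros that fit in total_bytes at 12 bytes each."""
    return total_bytes // BYTES_PER_NNZ


# 240MB (taken as 240 * 10**6 bytes) implies this many non-zeros.
implied_nnz = nnz_from_bytes(240 * 10**6)
print(implied_nnz)
```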