Hi all,
Is there any way to specify "rows_in_block" and "cols_in_block" when
writing out a matrix in binary format?
I have tried to set these two attributes in write(...),
but it does not seem to take effect.
Regards,
Mingyang
gain (e.g., num executors, data distribution, etc). Thanks.
>
> Regards,
> Matthias
>
>
> On Thu, May 4, 2017 at 9:55 PM, Mingyang Wang wrote:
>
> > Out of curiosity, I increased the driver memory to 10GB, and then all
> > operations were executed on CP. It took 37.166s
-- 3) + 0.000 sec 1
-- 4) print 0.000 sec 1
-- 5) rmvar 0.000 sec 5
-- 6) createvar 0.000 sec 1
-- 7) assignvar 0.000 sec 1
-- 8) cpvar 0.000 sec 1
Regards,
Mingyang
On Thu, May 4, 2017 at 9:48 PM Mingyang Wang wrote:
> Hi Matthias,
>
> Thanks for the patch.
>
> I have re-run the
when introducing an improvement that stores sparse matrices that are in MCSR
> format in CSR format on checkpoints, which eliminated the need to use a
> serialized storage level. I just delivered a fix: we now store such
> ultra-sparse matrices in serialized form again, which should
> significantl
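To see why the storage format matters so much for this ultra-sparse case, here is a back-of-the-envelope size sketch in Python. The 8-byte-double / 4-byte-int layout and the 32-byte per-row-object estimate are assumptions for illustration; SystemML's actual MCSR/CSR block implementations carry additional overhead.

```python
# Rough sizes for the 2e7 x 1e6 ultra-sparse matrix from this thread
# (one nonzero per row, so nnz = 2e7). Byte counts are assumptions:
# 8-byte doubles, 4-byte int indices, ~32 B per MCSR row object.
rows, cols, nnz = int(2e7), int(1e6), int(2e7)

dense_bytes = rows * cols * 8                 # dense is hopeless at this shape
csr_bytes = nnz * (8 + 4) + (rows + 1) * 4    # values + col indices + row pointers
mcsr_bytes = nnz * (8 + 4) + rows * 32        # per-row object overhead dominates

print(f"dense: {dense_bytes/2**40:.0f} TiB, "
      f"CSR: {csr_bytes/2**20:.0f} MiB, "
      f"MCSR-ish: {mcsr_bytes/2**20:.0f} MiB")
```

With one nonzero per row, the fixed per-row overhead of MCSR is pure waste relative to CSR, which is why checkpoint storage format and serialization choices dominate here.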
Hi all,
I was playing with a super sparse matrix FK, 2e7 by 1e6, with only one
non-zero value on each row, that is 2e7 non-zero values in total.
With driver memory of 1GB and executor memory of 100GB, I found the HOP
"Spark chkpoint", which is used to pin the FK matrix in memory, is really
expensive
store it as a column vector with FK2 = rowIndexMax(FK) and
>> subsequently reconstruct it via FK = table(seq(1,nrow(FK2)), FK2,
>> nrow(FK2), N), for which we will compile a dedicated operator that does row
>> expansions. You don't necessarily need the last two arguments, which only
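The compress-and-reconstruct trick described above can be sketched in Python with SciPy on a tiny stand-in matrix (sizes and variable names here are illustrative, not the real 2e7 x 1e6 data; note Python indices are 0-based where DML's rowIndexMax/table are 1-based):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Tiny stand-in for the FK matrix: one nonzero (a 1) per row.
n, N = 6, 4                                   # illustration sizes only
rng = np.random.default_rng(42)
col_of_row = rng.integers(0, N, size=n)
FK = csr_matrix((np.ones(n), (np.arange(n), col_of_row)), shape=(n, N))

# Compress: the analogue of FK2 = rowIndexMax(FK) in DML.
FK2 = np.asarray(FK.argmax(axis=1)).ravel()

# Reconstruct: the analogue of FK = table(seq(1, nrow(FK2)), FK2, nrow(FK2), N),
# i.e. a one-hot "row expansion" of the index vector.
FK_rebuilt = csr_matrix((np.ones(n), (np.arange(n), FK2)), shape=(n, N))

assert (FK != FK_rebuilt).nnz == 0            # round trip is lossless
```

The point of the rewrite is that the n x 1 index vector is tiny and cheap to checkpoint, while the full one-hot matrix is only materialized where the dedicated row-expansion operator needs it.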
the largest operation
> fits into 70% of the max heap. Additionally, memory configurations also
> impact operator selection - for example, we only compile broadcast-based
> matrix multiplications if the smaller input fits twice in the driver and in
> the broadcast budget of executors (w
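The memory-based operator selection rules quoted above can be sketched as simple arithmetic checks. This is a minimal illustration of the stated rules, not SystemML's actual optimizer code; the function names, budget parameters, and dense-size formula are assumptions.

```python
def dense_mb(rows, cols):
    """Dense double-precision size in MB (8 bytes per cell, an assumption)."""
    return rows * cols * 8 / 1e6

def can_broadcast(small_rows, small_cols, driver_heap_mb, exec_broadcast_mb):
    """Rule from the thread: compile a broadcast-based matrix multiply only
    if the smaller input fits twice in the driver heap and once in the
    executors' broadcast budget."""
    size = dense_mb(small_rows, small_cols)
    return 2 * size <= driver_heap_mb and size <= exec_broadcast_mb

def fits_in_cp(op_mb, max_heap_mb, frac=0.70):
    """Rule from the thread: an operation runs in CP (driver) only if it
    fits into roughly 70% of the max heap."""
    return op_mb <= frac * max_heap_mb

# A 1e6 x 100 dense input is 800 MB: it fits once, but not twice,
# in a 1 GB driver heap, so no broadcast would be compiled.
print(can_broadcast(int(1e6), 100, driver_heap_mb=1000, exec_broadcast_mb=10000))
```

Checks like these also give a rough rule of thumb for sizing the driver: estimate the dense size of the largest intermediate and make sure ~70% of the heap covers it.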
separately?
4. Any rule of thumb to estimate the memory needed for a program in
SystemML?
I really appreciate your inputs!
Best,
Mingyang Wang
> would, however, not recommend this.
>
> Thanks again for the feedback. While writing this comment, I actually came
> to the conclusion that we could handle even the case with csv input better
> in order to avoid evictions in these scenarios.
>
>
> Regards,
> Matthias
>
>
ks in my case. Any suggestion about how to choose a better
configuration or make some detours so I can obtain fair benchmarks on a
wide range of data dimensions?
If needed, I can attach the logs.
I really appreciate your help!
Regards,
Mingyang Wang
Graduate Student in UCSD CSE Dept.