You could consider streaming the data through multiple instances of OnlineStats.jl
in parallel and merging the partial results. Memory usage shouldn't be a problem
as long as you don't explicitly load your whole data set at once.
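A minimal sketch of the idea, assuming each worker can stream its own chunk of the data (the `randn` calls below stand in for reading chunks from disk; the chunk count and sizes are made up for illustration):

```julia
using Distributed, OnlineStats

addprocs(2)
@everywhere using OnlineStats

# Each task fits a statistic on one chunk; memory stays bounded per chunk.
stats = pmap(1:4) do chunk_id
    o = Mean()                    # any mergeable OnlineStat, e.g. Variance()
    fit!(o, randn(10_000))        # stand-in for streaming a chunk from disk
    o
end

# Merge the partial statistics into a single result.
combined = reduce(merge!, stats)
value(combined)                   # ≈ 0 for standard-normal chunks
```

The same fit-then-`merge!` pattern applies to any of the mergeable statistics OnlineStats provides, so the per-chunk work parallelises without ever holding the full array in memory.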

On Friday, June 24, 2016, Matthew Pearce <matt...@refute.me.uk> wrote:

> Hello
>
> I'm trying to solve a largish elasticnet type problem (convex
> optimisation).
>
>    - The LARS.jl package produces Out of Memory errors for a test (1000,
>    262144) problem. /proc/meminfo suggests I have 17x this array size free so
>    not sure what's going on there.
>    - I have access to multiple GPUs and nodes.
>    - I would potentially need to solve problems of the above sort of size
>    or bigger (10k, 200k) many, many times.
>
> Looking for thoughts on the appropriate way to go about tackling this:
>
>    - Rewrap an existing glmnet library for Julia (e.g. this CUDA enabled
>    one https://github.com/jeffwong/cudaglmnet or
>    http://www-hsc.usc.edu/~garykche/gpulasso.pdf)
>    - Go back to basics and use an optimisation package on the objective
>    function (https://github.com/JuliaOpt), but which one? Would this be
>    inefficient compared to specific glmnet solvers which do some kind of
>    coordinate descent?
>    - Rewrite some CUDA library from scratch (OK - probably a bad idea).
>
> Thoughts on the back of a postcard would be gratefully received.
>
>
> Cheers
>
>
> Matthew
>
