Re: [R] Matrix scalar operation that saves memory?

2023-04-13 Thread Richard O'Keefe
"wear your disc quite badly"?
If you can afford a computer with 512 GB of memory,
you can afford to pay $100 for a 2 TB external SSD,
use it as scratch space, and throw it away after a
month of use.  A hard drive is expected to last for
more than 40,000 hours of constant use.  Are you
sure that your own disc is so fragile?  Hard drives
are pretty cheap these days.  You could afford to
pay $50 for a 2 TB external hard drive, use it as
scratch space, and throw it away.

I've been around long enough to remember when the idea
of processing a 1000x1000 matrix in memory was greeted
with hysterical laughter and a recommendation to stop
smoking whatever I was smoking.   (But not old enough
to remember shuffling matrices around on tape.  Shudder.)

If you want to work on two 300GB matrices using a machine
with 512GB of RAM, you are going to be using a disc or
SSD, like it or not.  You can leave it up to the paging
subsystem of your OS, which will do its poor best, or you
can explicitly schedule reads and writes in your program,
and if you use asynchronous I/O it might overlap quite nicely.
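
As a rough illustration of explicitly scheduled reads and writes, the
sketch below streams a file of raw doubles through a fixed-size buffer
using base R's readBin() and writeBin(). The helper name
subtract_from_file, the file layout (column-major doubles, no header)
and the chunk size are assumptions made for the example, and the I/O
shown is synchronous rather than asynchronous.

```r
# Hypothetical helper: read doubles from `infile` in chunks and write
# `value - x` to `outfile`, so only one chunk is in memory at a time.
subtract_from_file <- function(infile, outfile, value = 100, chunk = 1e6) {
  inp <- file(infile, "rb")
  out <- file(outfile, "wb")
  on.exit({ close(inp); close(out) })
  repeat {
    x <- readBin(inp, what = "double", n = chunk)
    if (length(x) == 0L) break          # end of file reached
    writeBin(value - x, out)
  }
  invisible(outfile)
}

# Toy demonstration on a 3x4 matrix stored column-major in a temp file.
src <- tempfile()
dst <- tempfile()
m <- matrix(as.numeric(1:12), nrow = 3)
writeBin(as.vector(m), src)
subtract_from_file(src, dst, value = 100, chunk = 5)
res <- matrix(readBin(dst, "double", n = length(m)), nrow = nrow(m))
print(res)
```

Peak memory is governed by the chunk size rather than by the size of
the matrix, at the price of one full read and one full write.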

Assuming for the sake of arithmetic that your matrix
elements are complex numbers represented as pairs of
double precision floats, that's 16 bytes per element,
or 300e9/16 = 1.875e10 elements = n^2 elements where n = 136,930.
Other than adding, subtracting, and multiplying by a scalar,
there's not much you can do with an nxn matrix that won't take
time proportional to n^3.
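
The arithmetic above can be checked with a few lines of R. This sketch
takes "300G" to mean 300e9 bytes (an assumption) and also shows the
dimension for plain 8-byte doubles, which comes out at about 193,649.

```r
bytes <- 300e9                          # "300G" read as 300e9 bytes (assumption)
n_complex <- floor(sqrt(bytes / 16))    # 16 bytes per complex element
n_double  <- floor(sqrt(bytes / 8))     # 8 bytes per double element
print(c(complex = n_complex, double = n_double))
```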

Is there any way you can divide the matrix into (possibly
overlapping) blocks and do the work on a cluster?  Or a
block at a time?


On Wed, 12 Apr 2023 at 15:54, Shunran Zhang <
szh...@ngs.gen-info.osaka-u.ac.jp> wrote:

> Thanks for the info.
>
> For the data type, my matrix as of now is indeed a matrix in a perfect
> square shape filled in a previous segment of code, but I believe I could
> extract one row/column at a time to do some processing. I can also
> change that earlier part of the code to use a different data type
> if that helps.
>
> Saving it to a file for manipulation and reading it back seems to be
> quite IO intensive - writing 600G of data and reading 300G back from a
> hard drive would make the code extremely heavy as well as wear my disk
> quite badly.
>
> For now I'll try the row-by-row method and hope it works...
>
> Sincerely,
> S. Zhang

Re: [R] Matrix scalar operation that saves memory?

2023-04-12 Thread Iago Giné Vázquez
You may take a look at the bigmemory package, or at the other packages
for large-memory and out-of-memory data listed at
https://cran.r-project.org/web/views/HighPerformanceComputing.html#large-memory-and-out-of-memory-data
Some extra explanation is in https://stackoverflow.com/a/11127229/997979
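
A minimal sketch of what this could look like with bigmemory, assuming
the package is installed. It uses a tiny in-memory big.matrix as a
stand-in; for data of the size discussed here one would presumably use
a file-backed matrix (for example via filebacked.big.matrix()), which
is not shown.

```r
library(bigmemory)

# Small stand-in for the real data: a 4x5 double matrix filled with 1.
x <- big.matrix(nrow = 4, ncol = 5, type = "double", init = 1)

# Update one column at a time; only a single column is ever held
# as an ordinary R vector.
for (j in seq_len(ncol(x))) {
  x[, j] <- 100 - x[, j]
}

res <- x[, ]   # copy out as an ordinary matrix (only sensible at toy size)
print(res)
```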

Iago


From: R-help on behalf of Eric Berger
Sent: Wednesday, 12 April 2023 8:38
To: Bert Gunter
Cc: R-help
Subject: Re: [R] Matrix scalar operation that saves memory?

One possibility might be to use Rcpp.
An R matrix is stored in contiguous memory, which can be considered as a
vector.
Define a C++ function which operates on a vector in place, as in the
following:

library(Rcpp)
cppFunction(
  'void subtractConst(NumericVector x, double c) {
     // R_xlen_t rather than int: a 300G matrix has more than 2^31 - 1 elements
     for (R_xlen_t i = 0; i < x.size(); ++i)
       x[i] = x[i] - c;
   }')

Try this function out on a matrix. Here we define a 5x2 matrix

m <- matrix(150.5 + 1:10, nrow=5)
print(m)

      [,1]  [,2]
[1,] 151.5 156.5
[2,] 152.5 157.5
[3,] 153.5 158.5
[4,] 154.5 159.5
[5,] 155.5 160.5

Now call the C++ function

subtractConst(m,100.0)
print(m)

     [,1] [,2]
[1,] 51.5 56.5
[2,] 52.5 57.5
[3,] 53.5 58.5
[4,] 54.5 59.5
[5,] 55.5 60.5

HTH,
Eric


[[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Matrix scalar operation that saves memory?

2023-04-12 Thread Eric Berger
One possibility might be to use Rcpp.
An R matrix is stored in contiguous memory, which can be considered as a
vector.
Define a C++ function which operates on a vector in place, as in the
following:

library(Rcpp)
cppFunction(
  'void subtractConst(NumericVector x, double c) {
     // R_xlen_t rather than int: a 300G matrix has more than 2^31 - 1 elements
     for (R_xlen_t i = 0; i < x.size(); ++i)
       x[i] = x[i] - c;
   }')

Try this function out on a matrix. Here we define a 5x2 matrix

m <- matrix(150.5 + 1:10, nrow=5)
print(m)

      [,1]  [,2]
[1,] 151.5 156.5
[2,] 152.5 157.5
[3,] 153.5 158.5
[4,] 154.5 159.5
[5,] 155.5 160.5

Now call the C++ function

subtractConst(m,100.0)
print(m)

     [,1] [,2]
[1,] 51.5 56.5
[2,] 52.5 57.5
[3,] 53.5 58.5
[4,] 54.5 59.5
[5,] 55.5 60.5
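
The function above computes x - c, whereas the original question asks
for 100 - mat. A hedged variant for that case might look like the
sketch below; the name subtractFromConst is made up for illustration.
It indexes with R_xlen_t so that vectors longer than 2^31 - 1 elements
are handled, and it assumes the matrix is stored as double: if an
integer matrix is passed, Rcpp converts it to a temporary
NumericVector and the caller's matrix is left unchanged.

```r
library(Rcpp)

cppFunction('
void subtractFromConst(NumericVector x, double c) {
  // Overwrite each element with c - x[i], in place.
  for (R_xlen_t i = 0; i < x.size(); ++i)
    x[i] = c - x[i];
}')

m <- matrix(150.5 + 1:10, nrow = 5)
stopifnot(is.double(m))        # guard against a silent copy of integer input
subtractFromConst(m, 100.0)
print(m)
```

Because the update bypasses R's copy-on-modify semantics, any other
variable that refers to the same matrix sees the change as well.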

HTH,
Eric


On Wed, Apr 12, 2023 at 7:34 AM Bert Gunter  wrote:

> I doubt that R's basic matrix capabilities can handle this, but have a look
> at the Matrix package, especially if your matrix has some special form.
>
> Bert


Re: [R] Matrix scalar operation that saves memory?

2023-04-11 Thread Bert Gunter
I doubt that R's basic matrix capabilities can handle this, but have a look
at the Matrix package, especially if your matrix has some special form.

Bert

On Tue, Apr 11, 2023, 19:21 Shunran Zhang 
wrote:

> Hi all,
>
> I am currently working with a quite large matrix that takes 300G of
> memory. My computer only has 512G of memory. I need to do a few
> arithmetic operations on it with a scalar value. My current code looks
> like this:
>
> mat <- 100 - mat
>
> However, such code quickly uses up all of the remaining memory and gets
> the R script killed by the OOM killer.
>
> Is there a more memory-efficient way of doing such an operation?
>
> Thanks,
>
> S. Zhang


Re: [R] Matrix scalar operation that saves memory?

2023-04-11 Thread Shunran Zhang

Thanks for the info.

For the data type, my matrix as of now is indeed a matrix in a perfect 
square shape filled in a previous segment of code, but I believe I could 
extract one row/column at a time to do some processing. I can also
change that earlier part of the code to use a different data type
if that helps.


Saving it to a file for manipulation and reading it back seems to be 
quite IO intensive - writing 600G of data and reading 300G back from a 
hard drive would make the code extremely heavy as well as wear my disk 
quite badly.


For now I'll try the row-by-row method and hope it works...

Sincerely,
S. Zhang


On 2023/04/12 12:39, avi.e.gr...@gmail.com wrote:

The example given does not leave room for even a single copy of your matrix
so, yes, you need alternatives.

Your example was fairly trivial as all you wanted to do is subtract each
value from 100 and replace it. Obviously, something like squaring a matrix
cannot trivially be done without multiple copies around that won't fit.

One technique that might work is a nested loop that changes one cell of the
matrix at a time and in place. A variant of this might be a single loop that
changes a single row (or column) at a time and in place.

Another odd concept is to save your matrix in a file with some format you
can read back in such as a line or row at a time, and then do the
subtraction from 100 and write it back to disk in another file. If you need
it again, I assume you can read it in but perhaps you should consider how to
increase some aspects of your "memory".

Is your matrix a real matrix type or something like a list of lists or a
data.frame? You may do better with some data structures that are more
efficient than others.

Some operating systems let you use virtual memory that is swapped to and from
disk, which allows larger things to be done, albeit often much slower. I also
note that you can remove some things you are not using and hope garbage
collection happens soon enough.



Re: [R] Matrix scalar operation that saves memory?

2023-04-11 Thread avi.e.gross
The example given does not leave room for even a single copy of your matrix
so, yes, you need alternatives.

Your example was fairly trivial as all you wanted to do is subtract each
value from 100 and replace it. Obviously, something like squaring a matrix
cannot trivially be done without multiple copies around that won't fit.

One technique that might work is a nested loop that changes one cell of the
matrix at a time and in place. A variant of this might be a single loop that
changes a single row (or column) at a time and in place.
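
A minimal sketch of the single-loop variant, working on blocks of
columns because R stores matrices column by column. The block width is
an arbitrary choice, and the small matrix stands in for the real one.
The loop is written at top level on purpose: wrapping it in a function
that takes mat as an argument would normally make R copy the whole
matrix on the first assignment. Whether R really updates mat in place
also depends on no other variable referring to it.

```r
mat <- matrix(as.numeric(1:20), nrow = 4)   # stand-in for the large matrix
expected <- 100 - mat                       # affordable only at this toy size

block_cols <- 2L                            # hypothetical block width
for (start in seq(1L, ncol(mat), by = block_cols)) {
  idx <- start:min(start + block_cols - 1L, ncol(mat))
  # Temporaries here are the size of one block, not of the whole matrix.
  mat[, idx] <- 100 - mat[, idx]
}
print(mat)
```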

Another odd concept is to save your matrix in a file with some format you
can read back in such as a line or row at a time, and then do the
subtraction from 100 and write it back to disk in another file. If you need
it again, I assume you can read it in but perhaps you should consider how to
increase some aspects of your "memory".

Is your matrix a real matrix type or something like a list of lists or a
data.frame? You may do better with some data structures that are more
efficient than others.
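
One concrete case where the storage type matters: an integer vector
takes roughly half the memory of a double vector of the same length.
The sketch below assumes the values are whole numbers that fit in R's
integer type, which may not hold for the matrix in question.

```r
n <- 1e6
size_int <- as.numeric(object.size(integer(n)))   # 4 bytes per element
size_dbl <- as.numeric(object.size(numeric(n)))   # 8 bytes per element
print(c(integer = size_int, double = size_dbl))

m_int <- matrix(1:6, nrow = 2)
print(typeof(100L - m_int))   # integer constant keeps integer storage
print(typeof(100 - m_int))    # double constant promotes the result to double
```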

Some operating systems let you use virtual memory that is swapped to and from
disk, which allows larger things to be done, albeit often much slower. I also
note that you can remove some things you are not using and hope garbage
collection happens soon enough.
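
A small sketch of removing an unused object and asking for a collection
right away instead of hoping for one. The object name scratch is a
placeholder.

```r
scratch <- numeric(1e6)   # stand-in for an intermediate no longer needed
rm(scratch)               # drop the only reference to it
invisible(gc())           # request a garbage collection now
```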

-----Original Message-----
From: R-help  On Behalf Of Shunran Zhang
Sent: Tuesday, April 11, 2023 10:21 PM
To: r-help@r-project.org
Subject: [R] Matrix scalar operation that saves memory?

Hi all,

I am currently working with a quite large matrix that takes 300G of
memory. My computer only has 512G of memory. I need to do a few
arithmetic operations on it with a scalar value. My current code looks
like this:

mat <- 100 - mat

However, such code quickly uses up all of the remaining memory and gets
the R script killed by the OOM killer.

Is there a more memory-efficient way of doing such an operation?

Thanks,

S. Zhang



[R] Matrix scalar operation that saves memory?

2023-04-11 Thread Shunran Zhang
Hi all,

I am currently working with a quite large matrix that takes 300G of
memory. My computer only has 512G of memory. I need to do a few
arithmetic operations on it with a scalar value. My current code looks
like this:

mat <- 100 - mat

However, such code quickly uses up all of the remaining memory and gets
the R script killed by the OOM killer.

Is there a more memory-efficient way of doing such an operation?

Thanks,

S. Zhang
