Re: [R] Autofilling a large matrix in R

2012-10-12 Thread Pieter Schoonees
I think the issue is that with expand.grid and times >= 4 you are likely to 
run out of memory before you ever get to the subscripting step (at least on 
my machine). 

A simplification is to realize that you are looking for points in a lattice in 
the interior of a (p - 1)-dimensional simplex for p columns/factors/groups. 

As a start, the xsimplex() function in the combinat package generates all the 
lattice points in such a simplex, i.e. all vectors of p non-negative integers 
that sum to a specific value (and nsimplex() calculates how many there are). 

If you then still want to remove the instances on the edges of the simplex 
(where one of the percentages is 0), at least you have a more memory efficient 
base within which to search.

For p = 4 you will then start with 

> require(combinat)
> nsimplex(4,100)
[1] 176851

candidate points instead of 

> 100^4
[1] 1e+08

points.
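
Both counts can be checked without the package. A minimal sketch, assuming the 
usual stars-and-bars count choose(n + p - 1, p - 1) for vectors of p 
non-negative integers that sum to n (the variable names are illustrative only):

```r
p <- 4    # number of columns/factors
n <- 100  # value each point must sum to

# lattice points of the simplex, zeros allowed (stars and bars)
simplex_points <- choose(n + p - 1, p - 1)

# rows of the full grid built by expand.grid(rep(list(1:100), times = p))
grid_points <- n^p

simplex_points                # 176851, matching nsimplex(4, 100)
grid_points / simplex_points  # the full grid is several hundred times larger
```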

As an example, to generate all combinations for 4 factors excluding any 0's, 
you could do

> mat <- xsimplex(4,100)

> ncol(mat)
[1] 176851

> print(object.size(mat),unit="Mb")
5.4 Mb

> mat <- mat[,apply(mat,2,function(x)!any(x==0))]

> ncol(mat)
[1] 156849
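
The apply() call above visits every column in turn. A vectorised filter should 
give the same answer and is usually faster on a matrix this size. The sketch 
below uses a tiny hand-built matrix (an assumption, so that it runs without 
combinat) laid out like the xsimplex() result, with one point per column:

```r
# four points with three parts each, one point per column
mat <- matrix(c(3, 0, 0,
                2, 1, 0,
                1, 1, 1,
                0, 2, 1), nrow = 3)

# column-by-column test, as in the thread
keep_apply <- apply(mat, 2, function(x) !any(x == 0))

# vectorised equivalent: count the zeros in each column
keep_vec <- colSums(mat == 0) == 0

identical(keep_apply, keep_vec)           # TRUE
interior <- mat[, keep_vec, drop = FALSE] # drop = FALSE keeps a matrix
interior

# the number of zero-free points can also be counted directly
choose(99, 3)                             # 156849
```

drop = FALSE matters when only one column survives, since R would otherwise 
return a plain vector.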

Of course the curse of dimensionality will still get you as the number of 
factors increases. E.g.

> mat <- xsimplex(5,100)

> ncol(mat)
[1] 4598125

> print(object.size(mat),unit="Mb")
175.4 Mb

which is still manageable (but for p = 6 your lattice has nearly 100 million 
points).
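
To see how fast this grows, the size can be estimated before anything is 
allocated. A rough sketch, assuming the points are stored as 8-byte doubles 
(consistent with the 5.4 Mb reported for p = 4); lattice_size() is a 
hypothetical helper, not a combinat function:

```r
# Hypothetical helper: number of lattice points and approximate size in Mb
# of a p-row matrix of doubles holding all of them.
lattice_size <- function(p, n = 100) {
  points <- choose(n + p - 1, p - 1)
  c(points = points, Mb = points * p * 8 / 2^20)
}

round(lattice_size(4), 1)  # 176851 points, about 5.4 Mb
round(lattice_size(6), 1)  # about 96.6 million points, about 4420 Mb
```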

Perhaps you can modify the code of xsimplex to automatically discard zeros.
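
One way to avoid generating the zeros at all, sketched in base R rather than 
as a patch to xsimplex(): build the points recursively, letting each entry 
start at 1. positive_compositions() is a hypothetical helper written for this 
illustration, and it returns one point per row (the layout the original 
question asks for), not one per column.

```r
# Hypothetical helper: all vectors of p positive integers that sum to n,
# one vector per row.
positive_compositions <- function(p, n) {
  stopifnot(p >= 1, n >= p)
  if (p == 1) return(matrix(n, nrow = 1))
  # the first entry k runs over 1..(n - p + 1), which leaves at least 1
  # for each of the remaining p - 1 entries
  blocks <- lapply(seq_len(n - p + 1), function(k) {
    cbind(k, positive_compositions(p - 1, n - k))
  })
  out <- do.call(rbind, blocks)
  dimnames(out) <- NULL
  out
}

g <- positive_compositions(3, 100)
nrow(g)                  # 4851, i.e. choose(99, 2)
all(rowSums(g) == 100)   # TRUE
all(g >= 1)              # TRUE
```

An alternative that keeps xsimplex() unchanged: subtracting 1 from every entry 
of a zero-free point gives a non-negative point summing to n - p, so 
xsimplex(p, n - p) + 1 should produce the same set of points.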

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Rui Barradas
> Sent: Friday, October 12, 2012 18:04
> To: wwreith
> Cc: r-help@r-project.org
> Subject: Re: [R] Autofilling a large matrix in R
> 
> Hello,
> 
> Something like this?
> 
> g[rowSums(g) == 100, ]
> 
> Hope this helps,
> 
> Rui Barradas

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Autofilling a large matrix in R

2012-10-12 Thread Rui Barradas

Hello,

Something like this?

g[rowSums(g) == 100, ]
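
With times >= 4 the full grid may not fit in memory before this subsetting can 
run. A possible workaround, sketched on the assumption that every entry must 
be an integer from 1 to 100: build the grid for the first p - 1 columns only 
and compute the last column as whatever is left over. rows_summing_to() is a 
hypothetical helper, not an existing function.

```r
# Hypothetical helper: all rows of p integers in 1..total that sum to total.
rows_summing_to <- function(p, total = 100) {
  stopifnot(p >= 2)
  # grid over the first p - 1 columns only (total^(p - 1) rows)
  free <- as.matrix(expand.grid(rep(list(seq_len(total)), times = p - 1)))
  # the last column is determined by the others
  last <- total - rowSums(free)
  ok <- last >= 1
  g <- cbind(free[ok, , drop = FALSE], last[ok])
  dimnames(g) <- NULL
  g
}

g <- rows_summing_to(4)
dim(g)                   # 156849 rows, 4 columns
all(rowSums(g) == 100)   # TRUE
```

For p = 4 the intermediate grid has 100^3 rows instead of 100^4, which is 
usually small enough to hold in memory.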

Hope this helps,

Rui Barradas
On 12-10-2012 15:30, wwreith wrote:

I wish to create a matrix of all possible percentages with two decimal place
precision. I then want each row to sum to 100%. I started with the code
below with the intent to then subset the data based on the row sum. This
works great for 2 or 3 columns, but if I try 4 or more columns the number of
rows becomes too large. I would like to find a way to break it down into some
kind of for loop, so that I can remove the rows that don't sum to 100%
inside the for loop rather than outside it. My first thought was to take
lists from 1:10, 11:20, etc., but that does not get all of the points.

g<-as.matrix(expand.grid(rep(list(1:100), times=3)))

Any thoughts how to split this into pieces?





Re: [R] Autofilling a large matrix in R

2012-10-12 Thread jim holtman
To avoid FAQ 7.31, you probably should use:

seq(0, 10000) / 10000
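
R FAQ 7.31 ("Why doesn't R think these numbers are equal?") is about 
floating-point rounding: multiples of a decimal step are not always the same 
double as the decimal literal you compare them with. A small sketch of the 
effect on a shorter sequence (the variable names are illustrative only):

```r
by_step  <- seq(0, 1, by = 0.1)  # built as multiples of a decimal step
by_ratio <- seq(0, 10) / 10      # integers first, one division at the end

by_step[4] == 0.3                     # FALSE: 3 * 0.1 is not exactly 0.3
by_ratio[4] == 0.3                    # TRUE
isTRUE(all.equal(by_step, by_ratio))  # TRUE: equal up to a small tolerance
```

The same reasoning suggests keeping the percentages as integers for as long as 
possible, for example testing rowSums(g) == 100 on integer entries and 
dividing only once the rows have been selected.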

On Fri, Oct 12, 2012 at 11:12 AM, Mark Lamias wrote:
> If you are after all the possible percentages with two decimal places, why 
> don't you use this:
>
> seq(from=0, to=100, by=.01)/100
>
> I'm not really sure what you are trying to do in terms of rows and columns, 
> however.  Can you be a bit more specific on what each row/column is?
> Are you trying to group the numbers so that all the entries in a row add up 
> to 100% and then, once it does, split the following entries onto the next row 
> until they add up to 100%, etc.?
> Thanks.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



Re: [R] Autofilling a large matrix in R

2012-10-12 Thread Mark Lamias
If you are after all the possible percentages with two decimal places, why 
don't you use this:

seq(from=0, to=100, by=.01)/100

I'm not really sure what you are trying to do in terms of rows and columns, 
however.  Can you be a bit more specific on what each row/column is?  
Are you trying to group the numbers so that all the entries in a row add up to 
100% and then, once it does, split the following entries onto the next row 
until they add up to 100%, etc.?
Thanks.




 From: wwreith 
To: r-help@r-project.org 
Sent: Friday, October 12, 2012 10:30 AM
Subject: [R] Autofilling a large matrix in R

I wish to create a matrix of all possible percentages with two decimal place
precision. I then want each row to sum to 100%. I started with the code
below with the intent to then subset the data based on the row sum. This
works great for 2 or 3 columns, but if I try 4 or more columns the number of
rows becomes too large. I would like to find a way to break it down into some
kind of for loop, so that I can remove the rows that don't sum to 100%
inside the for loop rather than outside it. My first thought was to take
lists from 1:10, 11:20, etc., but that does not get all of the points.

g<-as.matrix(expand.grid(rep(list(1:100), times=3)))

Any thoughts how to split this into pieces?





[R] Autofilling a large matrix in R

2012-10-12 Thread wwreith
I wish to create a matrix of all possible percentages with two decimal place
precision. I then want each row to sum to 100%. I started with the code
below with the intent to then subset the data based on the row sum. This
works great for 2 or 3 columns, but if I try 4 or more columns the number of
rows becomes too large. I would like to find a way to break it down into some
kind of for loop, so that I can remove the rows that don't sum to 100%
inside the for loop rather than outside it. My first thought was to take
lists from 1:10, 11:20, etc., but that does not get all of the points.

g<-as.matrix(expand.grid(rep(list(1:100), times=3)))

Any thoughts how to split this into pieces?



--
View this message in context: 
http://r.789695.n4.nabble.com/Autofilling-a-large-matrix-in-R-tp4645991.html
Sent from the R help mailing list archive at Nabble.com.
