In addition, if you go the route of a data frame then the functions to look at are tapply, aggregate, and ave.
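As a rough sketch of how those three fit together on a data frame (HairEyeColor is just a convenient built-in table, and the target totals below are made up purely for illustration, not anything from your data):

```r
# One row per cell of the table, with the count in column Freq.
df <- as.data.frame(HairEyeColor)

# tapply: a margin as a named array (here the total count for each Sex).
sex_totals <- tapply(df$Freq, df$Sex, sum)

# aggregate: a margin as a data frame (here the Hair x Eye totals).
hair_eye <- aggregate(Freq ~ Hair + Eye, data = df, FUN = sum)

# ave: the group total repeated on every row, lined up with the original
# rows, which is the form needed to rescale each cell.
current <- ave(df$Freq, df$Sex, FUN = sum)

# Hypothetical forcing totals for the Sex margin (illustrative values only).
target <- c(Male = 300, Female = 300)

# One forcing step: multiply each cell by target / current margin.
df$Scaled <- df$Freq * target[as.character(df$Sex)] / current
```

After this step the sums of Scaled within each Sex should match the made-up targets; repeating the same step over each of your margins in turn would be the data frame version of the loop in the quoted message below.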
On Wed, Nov 28, 2012 at 2:02 PM, Greg Snow <538...@gmail.com> wrote:

> Yes, I meant FAQ 7.21, must have stuttered in typing, but it is always
> good to read other FAQs while looking for a specific one.
>
> I did read your full description, though whether I fully understand or not
> is yet to be seen.
>
> It seems like a lot of what you want to do could be simplified by using
> the apply and sweep functions, or possibly by the aaply function from the
> plyr package.
>
> The apply function works on any dimension of arrays, for example a piece
> of code like out <- apply(myarray, c(1,5,7), sum) will sum over all the
> dimensions except 1, 5, and 7 and will return a 3 dimensional array, then
> myarray2 <- sweep(myarray, c(1,5,7), out, FUN="/") will divide each element
> of the original array by the appropriate value of out. In this case running
> the apply again would give all 1's, but you could divide the out array by
> your target margins. If you had 2 lists, the first has the vectors of
> margins to sum/sweep and the 2nd has your target margin arrays, then you
> could do a for loop passing the current elements of the lists to the
> correct place in the function calls.
>
> Here is some example code (but the forced margins probably don't make any
> sense):
>
> # dimensions kept by each margin, and the target table for each margin
> dims <- list( 1:2, c(1,3), 2:3 )
> margins <- list( 10*matrix(1:16, 4),
>                  20*17/9*matrix(1:8, ncol=2),
>                  20*17/9*matrix(1:8, ncol=2) )
>
> old <- array(0, c(4,4,2))
> new <- HairEyeColor
> i <- 0
> # stop once a full pass over all margins no longer changes the array
> while( max(abs(old-new)) > 0.0000001 ) {
>   i <- i + 1
>   cat('Iteration ',i,'\n')
>   flush.console()
>   if(i > 100) {
>     cat('did not converge\n')
>     break
>   }
>
>   old <- new
>
>   # divide each cell by (current margin / target margin)
>   for(j in seq_along(dims) ) {
>     new <- sweep(new, dims[[j]],
>                  apply(new, dims[[j]], sum)/margins[[j]],
>                  FUN="/")
>   }
> }
>
>
> You might also want to look at the loglin function, as part of its
> computations it starts with a starting matrix/array (which by default is
> all 1's) then finds the array that has the same margins as the passed in
> table. It probably uses an algorithm similar to what you want to do. If
> you can pass the appropriate pieces to loglin then it may compute what you
> want for you (and probably much quicker since it uses compiled code).
>
>
> On Tue, Nov 27, 2012 at 8:29 PM, andrewH <ahoer...@rprogress.org> wrote:
>
>> Dear Greg
>>
>> You mean FAQ 7.21, not 7.22, correct? Though 7.12 also seems relevant.
>> Though I would say I was asking about turning a string into an expression
>> rather than a variable. At any rate, thanks for the pointer. I am sure I
>> would benefit from rereading the FAQ on a monthly basis, until I actually
>> know most of what is in it.
>>
>> As to your question about my question, I've wanted to do this exact thing
>> several times in different contexts. However, you are quite correct that I
>> am struggling with this problem in a particular context. I have a large,
>> multi-dimensional object containing count data.
>> Currently this object is implemented as a 26 dimensional (and growing)
>> array with two to thirteen dimnames per dimension, though I am thinking of
>> switching it to a data frame with dimensions as factors and
>> dimname-equivalent factor levels.
>>
>> I need to take a lot of complicated partitions of this object, mainly,
>> though not always, summing to the entire object. Most of the partitions
>> are subsets of --
>>
>> OK, now I have to digress to address a terminological uncertainty. Think
>> of a 4X4X4 cube. It has three dimensions, and each dimension has four of
>> what? I'm going to call them levels right now, though I don't think that
>> is right -- it would be confusing if there were factors in the picture.
>> Also, the dimnames do not name the dimensions, but the thing I am calling
>> levels, which is also confusing. --
>>
>> Anyway, most of the partitions consist of two to four dimensions out of
>> 24, but sometimes with some levels omitted or summed, and occasionally the
>> partitions are much more complicated (to deal with censored data,
>> mainly). I have to use each partition multiple times, doing a very
>> different thing each time (and then repeat the whole set many times). The
>> next 4 paragraphs describe what I am actually doing with the partitions,
>> but you can skip over them and cut to the chase if you are not so
>> interested.
>>
>> I am summing over the dimensions in each partition, dividing a table of
>> forcing totals for that partition by those sums (element by element), and
>> then taking the resulting ratios and multiplying each of the terms in the
>> original, non-summed object by the corresponding ratio.
>>
>> This is easiest to understand by analogy to the two-dimensional case. You
>> take a vector of pre-determined row forcing totals and divide it element
>> by element by the row sums, to get a vector of forcing ratios.
>> Then you multiply each row by the corresponding forcing ratio, so that the
>> row sum will then match the forcing total. Then you do the same thing with
>> the columns. Repeat, alternating rows and columns, to convergence. Each
>> column has a corresponding column forcing total, and each row has a
>> corresponding row forcing total. The elements of the matrix have two
>> partitions that we use, one into rows, and the other into columns. This is
>> sometimes called RAS balancing, or biproportional matrix adjustment. It is
>> an algorithm that is used a lot to update big matrices in national income
>> accounting and input-output analysis.
>>
>> What I am doing is the same, but I have forcing totals in two to four
>> dimensional tables instead of one-dimensional vectors. Each partition
>> divides the array into groups of elements that I want to sum to my forcing
>> totals. Again, you go around in a circle, doing forcing with each of the
>> (currently 18) tables, to convergence. On count data it should always
>> converge.
>>
>> The thing is, I need to keep track of all these partitions, and then
>> multiply the forcing ratios by the exact same elements of the array as I
>> previously summed. I got up to five dimensions, coding by hand, and then
>> realized that 1. the amount of work in going from, e.g., 19 dimensions to
>> 20 was going to be very great, and 2. the likelihood that I would get all
>> the nesting and partition-matching right was vanishingly small.
>>
>> So I am looking for a way to encode the partitions that I use, that would
>> allow me to use the same encoding to represent both the subsets of the
>> array to sum over, crunching the array down to a set of totals
>> corresponding to my forcing totals, and also defining the subsets of the
>> array that should be multiplied by each forcing ratio. And I thought,
>> maybe I could do it with strings of indexing commands, one per table of
>> forcing totals.
>> But this will only work if I can sum the array over the subdivisions that
>> the partition defines, multiply all the elements in partition subdivisions
>> by the corresponding constants, and then assign the results back to the
>> array, or to a new array. Hence my question.
>>
>> I'm afraid that this explanation is too long for people to read, but hope
>> springs eternal. I'd be remarkably pleased and eternally grateful if I got
>> a solution to the problem of keeping track of partitions that can be used
>> in the three ways described in the previous paragraph, even if it has
>> nothing to do with executing strings.
>>
>> Warmest regards,
>> andrewH
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/Can-you-turn-a-string-into-a-working-symbol-tp4648343p4651073.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> --
> Gregory (Greg) L. Snow Ph.D.
> 538...@gmail.com

--
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com