Yes, I meant FAQ 7.21; I must have stuttered while typing. Still, it is
always good to read other FAQ entries while looking for a specific one.

I did read your full description, though whether I fully understood it
remains to be seen.

It seems like a lot of what you want to do could be simplified by using the
apply and sweep functions, or possibly by the aaply function from the plyr
package.

The apply function works over any set of dimensions of an array. For
example, out <- apply(myarray, c(1,5,7), sum) sums over all the dimensions
except 1, 5, and 7 and returns a 3-dimensional array. Then
myarray2 <- sweep(myarray, c(1,5,7), out, FUN="/") divides each element of
the original array by the matching value of out. Running the same apply on
myarray2 would give all 1's, but if you first divide out by your target
margins, the swept array will have those target margins instead.  If you had
2 lists, the first holding the vectors of margins to sum/sweep and the 2nd
holding your target margin arrays, then you could use a for loop that passes
the current elements of the lists to the correct places in the function
calls.
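
As a small illustration of the apply/sweep pair described above, here is a
sketch on a made-up 3-dimensional array. The names myarray, out, scaled,
target and forced, and the target value of 100, are only assumptions for the
example and are not taken from your data.

```r
# Small 3-d array standing in for the real data (values are arbitrary).
myarray <- array(1:24, dim = c(2, 3, 4))

# Sum over dimension 2, keeping dimensions 1 and 3: a 2 x 4 matrix.
out <- apply(myarray, c(1, 3), sum)

# Divide every cell by the sum of its (dim 1, dim 3) group.
scaled <- sweep(myarray, c(1, 3), out, FUN = "/")
scaled_margins <- apply(scaled, c(1, 3), sum)   # all 1's, up to rounding

# Hypothetical target margins: dividing out by them first makes the
# swept array reproduce the targets instead of 1's.
target <- matrix(100, nrow = 2, ncol = 4)
forced <- sweep(myarray, c(1, 3), out / target, FUN = "/")
forced_margins <- apply(forced, c(1, 3), sum)   # all 100, up to rounding
```

The same vector c(1, 3) tells apply which groups to sum and tells sweep which
groups to rescale, so one encoding serves both purposes.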

Here is some example code (the forced margins are arbitrary: they agree on
the grand total of 1360 but are not consistent with one another, so they
probably don't make any sense):

# dims[[j]] lists the dimensions kept for the j-th set of target margins
dims <- list( 1:2, c(1,3), 2:3 )
margins <- list( 10*matrix(1:16, 4),
                 20*17/9*matrix(1:8, ncol=2),
                 20*17/9*matrix(1:8, ncol=2) )

old <- array(0, c(4,4,2))
new <- HairEyeColor
i <- 0
# repeat until a full pass over all margins no longer changes the array
while( max(abs(old-new)) > 0.0000001 ) {
  i <- i + 1
  cat('Iteration ',i,'\n')
  flush.console()
  if(i > 100) {
    cat('did not converge\n')
    break
  }

  old <- new

  for(j in seq_along(dims) ) {
    # divide each cell by (current margin / target margin)
    new <- sweep(new, dims[[j]],
                 apply(new, dims[[j]], sum)/margins[[j]],
                 FUN="/")
  }
}
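
The loop above could also be wrapped in a function so that the lists of
dimensions and targets are passed in together. This is only a sketch: the
name ipf_sketch, its arguments, and the target array tgt are hypothetical.
The targets here are taken as margins of a single made-up array so that they
are mutually consistent, which lets the fit be checked against them.

```r
# Hypothetical helper: iterative proportional fitting with apply() and sweep().
# seed    : starting array of counts
# dims    : list of integer vectors, the dimensions each target table covers
# targets : list of target margin arrays, one per element of dims
ipf_sketch <- function(seed, dims, targets, tol = 1e-7, max_iter = 1000) {
  stopifnot(length(dims) == length(targets))
  new <- seed
  for (i in seq_len(max_iter)) {
    old <- new
    for (j in seq_along(dims)) {
      current <- apply(new, dims[[j]], sum)      # current margins
      # scale each cell by target / current for the margin cell it belongs to
      new <- sweep(new, dims[[j]], targets[[j]] / current, FUN = "*")
    }
    if (max(abs(old - new)) < tol) {
      return(list(fit = new, iterations = i, converged = TRUE))
    }
  }
  list(fit = new, iterations = max_iter, converged = FALSE)
}

dims <- list(1:2, c(1, 3), 2:3)
# Made-up array whose margins serve as consistent targets (an assumption).
tgt <- array(1:32, dim = c(4, 4, 2))
targets <- lapply(dims, function(d) apply(tgt, d, sum))

res <- ipf_sketch(HairEyeColor, dims, targets)

# Largest gap between fitted and target margins, one value per target table.
gaps <- sapply(seq_along(dims), function(j)
  max(abs(apply(res$fit, dims[[j]], sum) - targets[[j]])))
```

Adding another forcing table then only means appending one element to dims
and one to targets; the function body does not change with the number of
dimensions.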


You might also want to look at the loglin function. As part of its
computations it begins with a starting matrix/array (all 1's by default)
and then finds the array that has the same margins as the table passed in.
It probably uses an algorithm similar to what you want to do.  If you can
pass the appropriate pieces to loglin, it may compute what you want for you
(and probably much more quickly, since it uses compiled code).
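
A minimal sketch of such a loglin call, using HairEyeColor both as the table
that supplies the margins and as the shape for a starting array of 1's. The
choice of margins and the eps and iter values are assumptions for
illustration only.

```r
# loglin() lives in the stats package, which is attached by default.
margin_sets <- list(1:2, c(1, 3), 2:3)

# Start from an array of 1's and fit the three 2-way margins of HairEyeColor.
fit <- loglin(HairEyeColor,
              margin = margin_sets,
              start  = array(1, dim(HairEyeColor)),
              fit = TRUE, eps = 1e-6, iter = 100, print = FALSE)

# Largest gap between fitted and observed margins, one value per margin set.
margin_gaps <- sapply(margin_sets, function(d)
  max(abs(apply(fit$fit, d, sum) - apply(HairEyeColor, d, sum))))
```

Note that loglin takes its targets from the margins of the table argument, so
to use separate forcing totals you would need a table that has those margins,
with your current array passed as start. I have not tried that on a problem
of your size.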




On Tue, Nov 27, 2012 at 8:29 PM, andrewH <ahoer...@rprogress.org> wrote:

> Dear Greg—
>
> You mean FAQ 7.21, not 7.22, correct? Though 7.12 also seems relevant.
> Though I would say I was asking about turning a string into an expression
> rather than a variable. At any rate, thanks for the pointer. I am sure I would
> benefit from rereading the FAQ on a monthly basis, until I actually know
> most of what is in it.
>
> As to your question about my question, I’ve wanted to do this exact thing
> several times in different contexts. However, you are quite correct that I
> am struggling with this problem in a particular context.  I have a large,
> multi-dimensional object containing count data. Currently this object is
> implemented as a 26 dimensional (and growing) array with two to thirteen
> dimnames per dimension, though I am thinking of switching it to a data frame
> with dimensions as factors and dimname-equivalent factor levels.
>
> I need to take a lot of complicated partitions of this object, mainly,
> though not always, summing to the entire object. Most of the partitions are
> subsets of --
>
> – OK, now I have to digress to address a terminological uncertainty. Think
> of a 4X4X4 cube. It has three dimensions, and each dimension has four of
> what?  I’m going to call them levels right now, though I don’t think that is
> right -- it would be confusing if there were factors in the picture. Also,
> the dimnames do not name the dimensions, but the thing I am calling levels,
> which is also confusing. --
>
> Anyway, most of the partitions consist of two to four dimensions out of 24,
> but sometimes with some levels omitted or summed, and occasionally the
> partitions are much more complicated (to deal with censored data, mainly).
> I have to use each partition multiple times, doing a very different thing
> each time (and then repeat the whole set many times). The next 4
> paragraphs describe what I am actually doing with the partitions, but you
> can skip over them and cut to the chase if you are not so interested.
>
> I am summing over the dimensions in each partition, dividing a table of
> “forcing totals” for that partition by those sums (element by element), and
> then taking the resulting ratios and multiplying each of the terms in the
> original, non-summed object by the corresponding ratio.
>
> This is easiest to understand by analogy to the two-dimensional case. You
> take a vector of pre-determined row “forcing totals” and divide it element
> by element by the row sums, to get a vector of forcing ratios. Then
> you multiply each row by the corresponding forcing ratio, so that the row
> sum will then match the forcing total. Then you do the same thing with the
> columns. Repeat, alternating row and columns, to convergence. Each column
> has a corresponding column forcing total, and each row has a corresponding
> row forcing total. The elements of the matrix have two partitions that we
> use, one into rows, and the other into columns.  This is sometimes called
> RAS balancing, or biproportional matrix adjustment. It is an algorithm that
> is used a lot to update big matrices in national income accounting and
> input-output analysis.
>
> What I am doing is the same, but I have forcing totals in two to four
> dimensional tables instead of one dimensional vectors.  Each partition
> divides the array into groups of elements that I want to sum to my forcing
> totals. Again, you go around in a circle, doing forcing with each of the
> (currently 18) tables, to convergence. On count data it should always
> converge.
>
> The thing is, I need to keep track of all these partitions, and then
> multiply the exact same elements of the array as I previously summed by
> the forcing ratios.  I got up to five dimensions, coding by hand, and then
> realized that 1. the amount of work in going from, e.g., 19 dimensions to 20
> was going to be very great, and 2. the likelihood that I would get all the
> nesting and partition-matching right was vanishingly small.
>
> So I am looking for a way to encode the partitions that I use, that would
> allow me to use the same encoding to represent both the subsets of the array
> to sum over, crunching the array down to a set of totals corresponding to my
> forcing totals, and also defining the subsets of the array that should be
> multiplied by each forcing ratio.  And I thought, maybe I could do it with
> strings of indexing commands, one per table of forcing totals. But this will
> only work if I can sum the array over the subdivisions that the partition
> defines, multiply all the elements in partition subdivisions by the
> corresponding constants, and then assign the results back to the array, or
> to a new array. Hence my question.
>
> I’m afraid that this explanation is too long for people to read, but hope
> springs eternal.  I’d be remarkably pleased and eternally grateful if I got
> a solution to the problem of keeping track of partitions that can be used in
> the three ways described in the previous paragraph, even if it has nothing
> to do with executing strings.
>
> Warmest regards,
> andrewH
>
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Can-you-turn-a-string-into-a-working-symbol-tp4648343p4651073.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com


