In addition, if you go the route of a data frame then the functions to look
at are tapply, aggregate, and ave.
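
As a rough sketch of how those three functions differ, here is a small made-up example. The data frame `df`, its columns, and the `target` totals are all hypothetical, chosen only for illustration:

```r
# Hypothetical long-format count data: one row per cell of a 2 x 3 table.
df <- data.frame(
  sex   = rep(c("F", "M"), each = 3),
  age   = rep(c("young", "mid", "old"), times = 2),
  count = c(10, 20, 30, 5, 15, 25)
)

# tapply returns the margin as a named array indexed by the grouping variable.
sex_totals <- tapply(df$count, df$sex, sum)

# aggregate returns the same totals as a data frame.
sex_totals_df <- aggregate(count ~ sex, data = df, FUN = sum)

# ave returns a vector as long as the data holding each row's group total,
# so a rescaling step needs no merge back onto the original rows.
target <- c(F = 120, M = 90)  # assumed forcing totals, one per level of sex
ratio <- unname(target[as.character(df$sex)]) / ave(df$count, df$sex, FUN = sum)
df$scaled <- df$count * ratio
```

After this, the `scaled` column sums to the assumed targets within each level of `sex`. Of the three, ave is probably the closest data-frame analogue of a sweep step, since its result lines up row by row with the data.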


On Wed, Nov 28, 2012 at 2:02 PM, Greg Snow <538...@gmail.com> wrote:

> Yes, I meant FAQ 7.21, must have stuttered in typing, but it is always
> good to read other FAQs while looking for a specific one.
>
> I did read your full description, though whether I fully understand or not
> is yet to be seen.
>
> It seems like a lot of what you want to do could be simplified by using
> the apply and sweep functions, or possibly by the aaply function from the
> plyr package.
>
> The apply function works on any dimension of arrays, for example a piece
> of code like  out <- apply(myarray, c(1,5,7), sum) will sum over all the
> dimensions except 1, 5, and 7 and will return a 3 dimensional array, then
> myarray2 <- sweep(myarray, c(1,5,7), out, FUN="/") will divide each element
> of the original array by the appropriate value of out. In this case running
> the apply again would give all 1's, but if you first divide the out array by
> your target margins, the swept array will sum to those targets instead.  If
> you had 2 lists, the first holding the vectors of margins to sum/sweep and
> the 2nd holding your target margin arrays, then you could use a for loop
> that passes the current elements of the lists to the correct place in the
> function calls.
>
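
A minimal sketch of the apply/sweep pairing described above, on a small made-up array. The array `a` and the constant `target` matrix are assumptions for illustration only:

```r
# A small 2 x 3 x 4 array of made-up positive values.
a <- array(1:24, dim = c(2, 3, 4))

# Sum over dimension 2, keeping dimensions 1 and 3: a 2 x 4 matrix.
out <- apply(a, c(1, 3), sum)

# Divide every element by the total of its (dim 1, dim 3) group.
a_unit <- sweep(a, c(1, 3), out, FUN = "/")
unit_margin <- apply(a_unit, c(1, 3), sum)  # every entry is 1

# Assumed target margin: each (dim 1, dim 3) group should sum to 100.
target <- matrix(100, nrow = 2, ncol = 4)

# Dividing out by the target first makes the swept array hit the target.
a_forced <- sweep(a, c(1, 3), out / target, FUN = "/")
forced_margin <- apply(a_forced, c(1, 3), sum)
```

The same vector `c(1, 3)` is passed to both apply and sweep, so a single encoding of the margin serves for both the summing and the rescaling.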
> Here is some example code (but the forced margins probably don't make any
> sense):
>
> # Margins to force (as dimension indices) and the target table for each.
> dims <- list( 1:2, c(1,3), 2:3 )
> margins <- list( 10*matrix(1:16, 4),
>                  20*17/9*matrix(1:8, ncol=2),
>                  20*17/9*matrix(1:8, ncol=2) )
>
> old <- array(0, c(4,4,2))
> new <- HairEyeColor
> i <- 0
> # Cycle through the margins until the array stops changing.
> while( max(abs(old-new)) > 0.0000001 ) {
>   i <- i + 1
>   cat('Iteration ',i,'\n')
>   flush.console()
>   if(i > 100) {
>     cat('did not converge\n')
>     break
>   }
>
>   old <- new
>
>   # Rescale so that each margin in turn matches its target.
>   for(j in seq_along(dims) ) {
>     new <- sweep(new, dims[[j]],
>                  apply(new, dims[[j]], sum)/margins[[j]],
>                  FUN="/")
>   }
> }
>
>
> You might also want to look at the loglin function. As part of its
> computations it begins with a starting matrix/array (all 1's by default)
> and then finds the array that has the same margins as the table passed in.
> It probably uses an algorithm similar to what you want to do.  If you can
> pass the appropriate pieces to loglin then it may compute what you want for
> you (and probably much quicker, since it uses compiled code).
>
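
A hedged sketch of what such a loglin call might look like. loglin takes a full table and reads the target margins from it, so HairEyeColor is used here to supply them; that choice, the uniform `seed` array, and the tolerance settings are assumptions for illustration. If only the margin tables are available, this does not apply directly.

```r
# Starting array: all 1's, which is also loglin's default start.
seed <- array(1, dim = dim(HairEyeColor))

# Ask for an array whose three two-way margins match those of HairEyeColor.
res <- loglin(HairEyeColor,
              margin = list(c(1, 2), c(1, 3), c(2, 3)),
              start  = seed,
              fit    = TRUE,    # return the fitted array
              eps    = 1e-8,    # tighter than the default of 0.1
              iter   = 1000,
              print  = FALSE)

fitted_array <- res$fit

# Largest discrepancy between a fitted margin and the corresponding target.
margin_gap <- max(abs(as.vector(apply(fitted_array, c(1, 2), sum)) -
                      as.vector(apply(HairEyeColor, c(1, 2), sum))))
```

A non-uniform `seed` could carry the structure of an existing array into the fit, which seems closer to the forcing problem described in the original question.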
>
>
>
> On Tue, Nov 27, 2012 at 8:29 PM, andrewH <ahoer...@rprogress.org> wrote:
>
>> Dear Greg—
>>
>> You mean FAQ 7.21, not 7.22, correct? Though 7.12 also seems relevant.
>> Though I would say I was asking about turning a string into an expression
>> rather than a variable. At any rate, thanks for the pointer. I am sure I
>> would benefit from rereading the FAQ on a monthly basis, until I actually
>> know most of what is in it.
>>
>> As to your question about my question, I’ve wanted to do this exact thing
>> several times in different contexts. However, you are quite correct that I
>> am struggling with this problem in a particular context.  I have a large,
>> multi-dimensional object containing count data. Currently this object is
>> implemented as a 26 dimensional (and growing) array with two to thirteen
>> dimnames per dimension, though I am thinking of switching it to a data
>> frame with dimensions as factors and dimname-equivalent factor levels.
>>
>> I need to take a lot of complicated partitions of this object, mainly,
>> though not always, summing to the entire object. Most of the partitions
>> are
>> subsets of --
>>
>> – OK, now I have to digress to address a terminological uncertainty. Think
>> of a 4X4X4 cube. It has three dimensions, and each dimension has four of
>> what?  I’m going to call them levels right now, though I don’t think that
>> is
>> right -- it would be confusing if there were factors in the picture. Also,
>> the dimnames do not name the dimensions, but the thing I am calling
>> levels,
>> which is also confusing. --
>>
>> Anyway, most of the partitions consist of two to four dimensions out of
>> 24, but sometimes with some levels omitted or summed, and occasionally the
>> partitions are much more complicated (to deal with censored data, mainly).
>> I have to use each partition multiple times, doing a very different thing
>> each time (and then repeat the whole set many times). The next 4
>> paragraphs describe what I am actually doing with the partitions, but you
>> can skip over them and cut to the chase if you are not so interested.
>>
>> I am summing over the dimensions in each partition, dividing a table of
>> “forcing totals” for that partition by those sums (element by element),
>> and then taking the resulting ratios and multiplying each of the terms in
>> the original, non-summed object by the corresponding ratio.
>>
>> This is easiest to understand by analogy to the two-dimensional case. You
>> take the row sums and divide a vector of pre-determined row “forcing
>> totals” by them, element by element, to get a vector of forcing ratios.
>> Then you multiply each row by the corresponding forcing ratio, so that the
>> row sum will then match the forcing total. Then you do the same thing with
>> the columns. Repeat, alternating rows and columns, to convergence. Each
>> column has a corresponding column forcing total, and each row has a
>> corresponding row forcing total. The elements of the matrix have two
>> partitions that we use, one into rows, and the other into columns.  This is
>> sometimes called RAS balancing, or biproportional matrix adjustment. It is
>> an algorithm that is used a lot to update big matrices in national income
>> accounting and input-output analysis.
>>
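
The two-dimensional procedure just described can be sketched as follows. The matrix `m`, the forcing totals `row_target` and `col_target`, and the stopping tolerance are all made up for illustration; the two sets of totals are chosen so that they share the same grand total.

```r
# Made-up 3 x 3 matrix of positive counts (grand total 100).
m <- matrix(c( 5, 10, 15,
              10, 10, 10,
               5, 20, 15), nrow = 3, byrow = TRUE)

# Assumed forcing totals; both sum to 100.
row_target <- c(20, 30, 50)
col_target <- c(40, 35, 25)

for (i in 1:100) {
  # Forcing ratio = forcing total / current sum; scale each row by its ratio.
  m <- sweep(m, 1, row_target / rowSums(m), FUN = "*")
  # Same step for the columns.
  m <- sweep(m, 2, col_target / colSums(m), FUN = "*")
  # Columns are exact after the column step, so only the rows need checking.
  if (max(abs(rowSums(m) - row_target)) < 1e-10) break
}
```

Here the margin is a single dimension (1 for rows, 2 for columns); in the higher-dimensional version the margin argument would be a vector of dimensions and the targets would be tables rather than vectors.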
>> What I am doing is the same, but I have forcing totals in two to four
>> dimensional tables instead of one dimensional vectors.  Each partition
>> divides the array into groups of elements that I want to sum to my forcing
>> totals. Again, you go around in a circle, doing forcing with each of the
>> (currently 18) tables, to convergence. On count data it should always
>> converge.
>>
>> The thing is, I need to keep track of all these partitions, and then
>> multiply the exact same elements of the array as I previously summed by
>> the forcing ratios.  I got up to five dimensions, coding by hand, and then
>> realized that (1) the amount of work in going from, e.g., 19 dimensions to
>> 20 was going to be very great, and (2) the likelihood that I would get all
>> the nesting and partition-matching right was vanishingly small.
>>
>> So I am looking for a way to encode the partitions that I use, that would
>> allow me to use the same encoding to represent both the subsets of the
>> array to sum over, crunching the array down to a set of totals
>> corresponding to my forcing totals, and also defining the subsets of the
>> array that should be multiplied by each forcing ratio.  And I thought,
>> maybe I could do it with strings of indexing commands, one per table of
>> forcing totals. But this will only work if I can sum the array over the
>> subdivisions that the partition defines, multiply all the elements in
>> partition subdivisions by the corresponding constants, and then assign the
>> results back to the array, or to a new array. Hence my question.
>>
>> I’m afraid that this explanation is too long for people to read, but hope
>> springs eternal.  I’d be remarkably pleased and eternally grateful if I
>> got a solution to the problem of keeping track of partitions that can be
>> used in the three ways described in the previous paragraph, even if it has
>> nothing to do with executing strings.
>>
>> Warmest regards,
>> andrewH
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/Can-you-turn-a-string-into-a-working-symbol-tp4648343p4651073.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Gregory (Greg) L. Snow Ph.D.
> 538...@gmail.com
>



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com


