Hi,
 
Thanks for your reply. Based on your suggestions, I managed to simplify the 
code, but only a little. I don't see how I could do without a loop, given the 
nestedness of the hierachy. See the code below, which is working, but I'd like 
to simplify it.
 
# sample data
theCodes <- c('STAT.01', 'STAT.01.01', 'STAT.01.01.01', 'STAT.01.01.02', 
'STAT.01.01.03', 'STAT.01.01.04', 'STAT.01.01.05', 'STAT.01.01.06', 
'STAT.01.01.06.01', 'STAT.01.01.06.02', 'STAT.01.01.06.03', 'STAT.01.01.06.04', 
'STAT.01.01.06.05', 'STAT.01.01.06.06', 'STAT.01.02', 'STAT.01.02.01', 
'STAT.01.02.02', 'STAT.01.02.03', 'STAT.01.02.03.01', 'STAT.01.02.03.02', 
'STAT.01.02.03.03', 'STAT.01.02.03.04', 'STAT.01.02.03.05', 'STAT.01.03')
theValues <- as.numeric(c(NA, NA, 15074.23366, 4882.942034, 1619.59628, 
1801.722877, 1019.973666, NA, 503.9239317, 917.2189347, 6018.830465, 
1944.11311, 1427.575402, 1965.725428, NA, 5857.293612, 5933.770263, NA, 
6077.089518, 1427.180073, 455.9387993, 859.766603, 1002.983331, 2225.328211))
df <- as.data.frame(cbind(code=theCodes, value=theValues))
df$value <- as.numeric(df$value)
 
# actual code
getDepth <- function(df) {
    df$diepte <- do.call(rbind, lapply(strsplit(df$code, "\\."), length)) - 1
    return(df)
    }
getParents <- function(df) {
    df$parent <- substr(df$code, 1, 4 + (df$diepte - 1) * 3)
    return(df)
    }
getTotals <- function(df, depth) {
    s <- subset(df, diepte==depth)
    if(!"parent" %in% names(df)) s <- getParents(s)
    agg <- aggregate(s["value"], s["parent"], FUN=sum, na.rm=TRUE)
    merged <- merge(df, agg, by.x="code", by.y="parent", all=TRUE, 
suffixes=c("", "_summed"))
    isSum <- !is.na(merged$value_summed)
    merged[isSum, "value"] <- merged[isSum, "value_summed"] 
    merged$value_summed <- merged$parent <- NULL
    return(merged)
    }
#library(debug)
#mtrace(getTotals)
df <- getDepth(df)
for( depth in max(df$diepte):2 ) {
    if (depth == max(df$diepte)) {
        x <- getTotals(df, depth) 
    } else {
        x <- getTotals(x, depth)
        }
    }

Cheers!!
Albert-Jan


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All right, but apart from the sanitation, the medicine, education, wine, public 
order, irrigation, roads, a fresh water system, and public health, what have 
the Romans ever done for us?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


>________________________________
>From: "ONKELINX, Thierry" <thierry.onkel...@inbo.be>
>To: Albert-Jan Roskam <fo...@yahoo.com>; R Mailing List <r-help@r-project.org>
>Sent: Wednesday, November 16, 2011 2:34 PM
>Subject: RE: [R] hierachical code system
>
>Dear Albert-Jan,
>
>The easiest way is to create extra variables with the corresponding 
>aggregation level. substr() en strsplit() can be your friends. Once you have 
>those variables you can use aggregate() or any other aggregating function. You 
>don't need loops.
>
>Best regards,
>
>Thierry
>
>> -----Oorspronkelijk bericht-----
>> Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
>> Namens Albert-Jan Roskam
>> Verzonden: woensdag 16 november 2011 14:28
>> Aan: R Mailing List
>> Onderwerp: [R] hierachical code system
>> 
>> Hi,
>> 
>> I have a hierachical code system such as the example below (the printed data
>> are easiest to read). I would like to write a function that returns an 
>> 'imputed'
>> data frame, ie. where the the parent values are calculated as the sum of the
>> child values. So, for instance, STAT.01.01.06  is the sum of STAT.01.01.06.01
>> through STAT.01.01.06.06. The code I have written uses two for loops, and,
>> moreover, does not work as intended. My starting point was to determine the
>> code depth by counting the dots in the variable 'code' (using strsplit), then
>> iterate over the tree from deep to shallow. Does anybody have a good idea as
>> to how to approach this in R?
>> 
>> theCodes <- c('STAT.01', 'STAT.01.01', 'STAT.01.01.01', 'STAT.01.01.02',
>> 'STAT.01.01.03', 'STAT.01.01.04', 'STAT.01.01.05', 'STAT.01.01.06',
>> 'STAT.01.01.06.01', 'STAT.01.01.06.02', 'STAT.01.01.06.03', 
>> 'STAT.01.01.06.04',
>> 'STAT.01.01.06.05', 'STAT.01.01.06.06', 'STAT.01.02', 'STAT.01.02.01',
>> 'STAT.01.02.02', 'STAT.01.02.03', 'STAT.01.02.03.01', 'STAT.01.02.03.02',
>> 'STAT.01.02.03.03', 'STAT.01.02.03.04', 'STAT.01.02.03.05', 'STAT.01.03')
>> theValues <- c('NA', 'NA', '15074.23366', '4882.942034', '1619.59628',
>> '1801.722877', '1019.973666', 'NA', '503.9239317', '917.2189347',
>> '6018.830465', '1944.11311', '1427.575402', '1965.725428', 'NA', 
>> '5857.293612',
>> '5933.770263', '6077.089518', 'NA', '1427.180073', '455.9387993', 
>> '859.766603',
>> '1002.983331', '2225.328211') df <- as.data.frame(cbind(code=theCodes,
>> value=theValues))
>> print(df)
>>                code       value
>> 1           STAT.01          NA
>> 2        STAT.01.01          NA
>> 3     STAT.01.01.01 15074.23366
>> 4     STAT.01.01.02 4882.942034
>> 5     STAT.01.01.03  1619.59628
>> 6     STAT.01.01.04 1801.722877
>> 7     STAT.01.01.05 1019.973666
>> 8     STAT.01.01.06          NA
>> 9  STAT.01.01.06.01 503.9239317
>> 10 STAT.01.01.06.02 917.2189347
>> 11 STAT.01.01.06.03 6018.830465
>> 12 STAT.01.01.06.04  1944.11311
>> 13 STAT.01.01.06.05 1427.575402
>> 14 STAT.01.01.06.06 1965.725428
>> 15       STAT.01.02          NA
>> 16    STAT.01.02.01 5857.293612
>> 17    STAT.01.02.02 5933.770263
>> 18    STAT.01.02.03 6077.089518
>> 19 STAT.01.02.03.01          NA
>> 20 STAT.01.02.03.02 1427.180073
>> 21 STAT.01.02.03.03 455.9387993
>> 22 STAT.01.02.03.04  859.766603
>> 23 STAT.01.02.03.05 1002.983331
>> 24       STAT.01.03 2225.328211
>> >
>> 
>> Thank you in advance!
>> 
>> Cheers!!
>> Albert-Jan
>> 
>> 
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> ~~~~~
>> All right, but apart from the sanitation, the medicine, education, wine, 
>> public
>> order, irrigation, roads, a fresh water system, and public health, what have 
>> the
>> Romans ever done for us?
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> ~~~~~
>>     [[alternative HTML version deleted]]
>
>
>
>
        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to