on 11/20/2008 02:28 PM Tul Gan wrote: > Hi ! > ���������� I am new to R. Can somebody help me in reformatting�huge output > files ,i.e, rearranging sets of columns in specific order. > For example: I have data for three compunds 1, 2 and 3 > file1: > ID CA1 CA3 CA2 MA2 MA1 MA3 > 1 14 15 13 7 12 3 > 2 19 7 12 10 14 5 > 3 21 12 19 6 8 9 > � > to > File 2: > � > ID CA1 CA2 CA3 MA1 MA2 MA3 > 1 14 13 15 12 7 3 > 2 19 12 7 14 10 5 > 3 21 19 12 8 6 9 > � > or File3: > ID CA1 MA1 CA2 MA2 CA3 MA3 > 1 14 12 13 7 15 3 > 2 19 14 12 10 7 5 > 3 21 8 19 6 12 9 > � > Thanks for your help, > Tul Gan
A general approach to the first case, where 'DF' is the initial data frame: > DF[, c("ID", sort(names(DF)[-1]))] ID CA1 CA2 CA3 MA1 MA2 MA3 1 1 14 13 15 12 7 3 2 2 19 12 7 14 10 5 3 3 21 19 12 8 6 9 Essentially, you are sorting the column names less "ID", then reordering the columns of 'DF' using indexing. > names(DF)[-1] [1] "CA1" "CA3" "CA2" "MA2" "MA1" "MA3" > sort(names(DF)[-1]) [1] "CA1" "CA2" "CA3" "MA1" "MA2" "MA3" The second case is a little more complicated, since you are independently sorting on both alpha and numeric values, rather than just alpha. If there is a predictable sequence of names, you could do something like: > paste(c("CA", "MA"), rep(1:3, each = 2), sep = "") [1] "CA1" "MA1" "CA2" "MA2" "CA3" "MA3" Thus: > DF[, c("ID", paste(c("CA", "MA"), rep(1:3, each = 2), sep = ""))] ID CA1 MA1 CA2 MA2 CA3 MA3 1 1 14 12 13 7 15 3 2 2 19 14 12 10 7 5 3 3 21 8 19 6 12 9 If the sequence is not predictable, such that you would generate non-existing column names with the above, then we need to split the existing column names and sort them separately. Something along the lines of the following: # use gsub() to split out the alpha and numeric values. Note that we # need to create a data frame to handle an alpha and a numeric column names.df <- data.frame(Alpha = gsub("[[:digit:]]", "", names(DF)[-1]), Digits = as.numeric(gsub("[[:alpha:]]", "", names(DF)[-1]))) > names.df Alpha Digits 1 CA 1 2 CA 3 3 CA 2 4 MA 2 5 MA 1 6 MA 3 Now, get the indices for the sorted rows, based first upon the digits value, subsorted by the Alpha value: > order(names.df$Digits, names.df$Alpha) [1] 1 5 3 4 2 6 Now use those values to index the columns in 'DF'. Remember that we need to add 1 to each index, to account for the "ID" column, which is not included in 'names.mat': > c(1, order(names.df$Digits, names.df$Alpha) + 1) [1] 1 2 6 4 5 3 7 Thus: > DF[, c(1, order(names.df$Digits, names.df$Alpha) + 1)] ID CA1 MA1 CA2 MA2 CA3 MA3 1 1 14 12 13 7 15 3 2 2 19 14 12 10 7 5 3 3 21 8 19 6 12 9 Note, that we could also solve the first case, by reversing the sorting levels from names.df: > DF[, c(1, order(names.df$Alpha, names.df$Digits) + 1)] ID CA1 CA2 CA3 MA1 MA2 MA3 1 1 14 13 15 12 7 3 2 2 19 12 7 14 10 5 3 3 21 19 12 8 6 9 See ?gsub, ?sort and ?order HTH, Marc Schwartz ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.