Re: [datatable-help] Create matrix of columns from grouped rows
Thanks! That's about 2-3x faster. I wasn't familiar with stack/unstack. Now I have another tool.

Cheers,
e.

Eric Archer, Ph.D.
Southwest Fisheries Science Center (NMFS/NOAA)
8901 La Jolla Shores Drive
La Jolla, CA 92037 USA
858-546-7121 (work)
858-546-7003 (FAX)
Marine Mammal Genetics Group: swfsc.noaa.gov/mmtd-mmgenetics
GitHub: github/ericarcher
&
Adjunct Professor, Marine Biology
Scripps Institution of Oceanography
University of California, San Diego
http://profiles.ucsd.edu/frederick.archer

On Wed, Nov 23, 2016 at 11:11 AM, nachti [via R] <ml-node+s789695n472677...@n4.nabble.com> wrote:

> Try this:
>
> t1 <- dt[, unlist(.SD), by = id]
> t(unstack(t1, form = V1 ~ id))
>
> (I think you get done with the colnames yourself ...)
>
> Cheers,
> ~g

--
View this message in context: http://r.789695.n4.nabble.com/Create-matrix-of-columns-from-grouped-rows-tp4726777p4726779.html
Sent from the datatable-help mailing list archive at Nabble.com.
___
datatable-help mailing list
datatable-help@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
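The reply in this message leaves the column names as an exercise. A minimal sketch of one way to finish that step follows, using small fixed data instead of sample() so the result is reproducible. The toy values, the column name allele, and the variables loci, n.alleles and wide are assumptions made for illustration and are not part of the original thread.

```r
library(data.table)

# Hypothetical, deterministic toy data: 2 individuals, 2 diploid loci
dt <- data.table(
  id   = rep(c("a", "b"), each = 2),
  loc1 = factor(c("C", "T", "T", "T")),
  loc2 = factor(c("T", "C", "C", "C")),
  key  = "id"
)

# Long format: all alleles of one individual in a single column,
# with loci concatenated in column order (loc1 first, then loc2)
t1 <- dt[, .(allele = unlist(lapply(.SD, as.character))), by = id]

# unstack() splits the allele column by id (one column per individual);
# t() turns that into one row per individual
wide <- unstack(t1, form = allele ~ id)
mat  <- t(wide)

# Build names such as loc1.1, loc1.2, loc2.1, loc2.2
loci      <- setdiff(names(dt), "id")
n.alleles <- ncol(mat) / length(loci)
colnames(mat) <- paste(rep(loci, each = n.alleles), seq_len(n.alleles), sep = ".")

print(mat)
```

Note that unstack() returns a data.frame, so ids that are not syntactically valid names (such as "1", "2", "3" in the original example) will likely be altered, for example to X1, X2, X3, when they become column names. Assigning rownames(mat) explicitly afterwards avoids relying on that.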
[datatable-help] Create matrix of columns from grouped rows
I am trying to find the most efficient (fastest) way of manipulating a data.table object that contains genetic data. The format is shown in the following toy example, where rows represent alleles for individuals and columns are separate loci. Each individual is represented in one or more rows, depending on the ploidy of the loci. In this example, there are three tetraploid loci (4 alleles per locus) genotyped for three individuals (1:3). In my real data, all loci will always have the same ploidy.

> library(data.table)
> dt <- data.table(
+   id = as.character(rep(1:3, each = 4)),
+   loc1 = factor(sample(c("C", "T"), 12, rep = T)),
+   loc2 = factor(sample(c("C", "T"), 12, rep = T)),
+   loc3 = factor(sample(c("C", "T"), 12, rep = T)),
+   key = "id"
+ )
> dt
    id loc1 loc2 loc3
 1:  1    T    T    T
 2:  1    C    T    C
 3:  1    T    C    T
 4:  1    C    C    T
 5:  2    T    C    C
 6:  2    T    T    T
 7:  2    C    T    T
 8:  2    C    T    C
 9:  3    C    T    T
10:  3    T    T    T
11:  3    T    C    T
12:  3    T    T    C

What I'm looking for is the fastest way to convert this data.table to a matrix where each row holds the entire genotype for one individual, with the alleles for each locus in sequential columns. The code I currently have for this follows.

> ids <- dt[, unique(id)]
> .cbindColFunc <- function(x) {
+   do.call(cbind, as.list(as.character(x)))
+ }
> mat <- do.call(rbind, lapply(ids, function(i) {
+   dt[i, do.call(cbind, lapply(.SD, .cbindColFunc)), .SDcols = !"id"]
+ }))
> num.alleles <- ncol(mat) / (ncol(dt) - 1)
> colnames(mat) <- paste(rep(colnames(dt)[-1], each = num.alleles),
+   1:num.alleles, sep = ".")
> mat <- cbind(id = ids, mat)
> mat
     id  loc1.1 loc1.2 loc1.3 loc1.4 loc2.1 loc2.2 loc2.3 loc2.4 loc3.1 loc3.2 loc3.3 loc3.4
[1,] "1" "T"    "C"    "T"    "C"    "T"    "T"    "C"    "C"    "T"    "C"    "T"    "T"
[2,] "2" "T"    "T"    "C"    "C"    "C"    "T"    "T"    "T"    "C"    "T"    "T"    "C"
[3,] "3" "C"    "T"    "T"    "T"    "T"    "T"    "C"    "T"    "T"    "T"    "T"    "C"

Is there a faster, more data.table friendly way to do it?

Thanks in advance!
Eric
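The target layout described in this message can also be sketched with plain matrix reshaping. This sketch rests on two assumptions taken from the question: every individual has the same number of rows (equal ploidy at all loci), and the table is keyed, and therefore sorted, by id. The toy values and the names ploidy, loci and blocks are hypothetical, and this is an illustration rather than the method anyone in the thread proposed or benchmarked.

```r
library(data.table)

# Hypothetical, deterministic toy data: 3 individuals, 2 diploid loci
dt <- data.table(
  id   = rep(c("a", "b", "c"), each = 2),
  loc1 = factor(c("C", "T", "T", "T", "C", "C")),
  loc2 = factor(c("T", "C", "C", "C", "T", "T")),
  key  = "id"
)

# Assumption: the same number of rows (alleles) for every individual
ploidy <- dt[, .N, by = id][, unique(N)]
stopifnot(length(ploidy) == 1L)

loci <- setdiff(names(dt), "id")

# Rows are sorted by id, so filling a matrix row-wise with `ploidy`
# columns puts the alleles of one individual on one row
blocks <- lapply(loci, function(loc) {
  matrix(as.character(dt[[loc]]), ncol = ploidy, byrow = TRUE)
})

# Place the per-locus blocks side by side and label rows and columns
mat <- do.call(cbind, blocks)
dimnames(mat) <- list(
  unique(dt$id),
  paste(rep(loci, each = ploidy), seq_len(ploidy), sep = ".")
)

print(mat)
```

Here the ids are stored as row names rather than as a first column; cbind(id = rownames(mat), mat) would give the layout with an explicit id column shown in the question.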