Re: [R] Question: RMySQL bulk load/update one column, dbWriteTable()?
> I have a question reading using RMySQL trying to load one R vector into a > table column. To be more specifically, the table is there populated. Now I > add a new column and want to populate this. > Okay, this is more of an SQL question now, but you could just use dbWriteTable and then do an multi-table update. dbGetQuery(con, "select * from tmp") id name 1 1A 2 2B 3 3C 4 4D 5 5E dbSendQuery(con, "alter table tmp add column r2 float") ## calculate some statistic for all or some ids in table x<-dataframe(id=1:5, r2=c(.1, .4, .9, .4,.7)) dbWriteTable(con, "r2tmp", x ) dbSendQuery(con, "update tmp t, r2tmp r set t.r2=r.r2 where t.id=r.id") dbGetQuery(con, "select * from tmp") id name r2 1 1A 0.1 2 2B 0.4 3 3C 0.9 4 4D 0.4 5 5E 0.7 Chris __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] RMySQL question, sql with R vector or list
> dynamically, rather than statically like using paste. One concern using > paste to construct the SQL command is this: what about if the ID list in > your sample becomes very large, is this a problem to construct this way? > I have not messed with procedures in mysql 5, so I have no idea about the first question. However, the IN operator should be fast, even for large vectors. It takes me a second to retrieve 10,000 random records from a table with 100,000+ rows. system.time(id.in( sample(1:10, 1) ))[3] [1] 1.07 Also, I forgot to mention this before - if your ID is a character field, you can quote strings in the IN clause using the shell quote function. paste(shQuote(ids),collapse=",") Chris __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] RMySQL question, sql with R vector or list
> I am trying to write a RMySQL sql script inside R such that part of the SQL > would be R list or vector. For example, I want to select * from Atable > where ID would equal to a members of R list or vector of "1, 2, 3". Here > the ID list was generated inside R and then try to feed to mysql to call to > get additional data sets. > You could pass a comma-separated list of IDs to the sql IN operator ## in MySQL CREATE table tmp (id int, name char(1)); insert into tmp values (1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E"); ### in R library(RMySQL) con <- dbConnect("MySQL", dbname="test" ) id.in<-function(ids) { dbGetQuery(con, paste("select * from tmp where id IN (", paste(ids,collapse=","), ")") ) } id.in(2:4) id name 1 2B 2 3C 3 4D ## simple lists also work id.in(list(1,4,5)) id name 1 1A 2 4D 3 5E Chris __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Identifying last record in individual growth data over different time intervalls
> Finally I would like to have a data.frame t2 which only contains the > entries of the last measurements. > You could also use aggregate to get the max year per plate then join that back to the original dataframe using merge on year and plate (common columns in both dataframes). x<-data.frame(id=(1:8), plate=c(15,15,15,20,20,33,43,43), year=c(2004,2005,2006,2004,2005,2004,2005,2006), height=c(0.40,0.43,0.44,0.90,0.94,0.15,0.30,0.38)) merge(x, aggregate(list(year=x$year), list(plate=x$plate), max)) plate year id height 115 2006 3 0.44 220 2005 5 0.94 333 2004 6 0.15 443 2006 8 0.38 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] abbreviate dataframe for Sweave output
Thanks Charles and Max, both functions work great. I also used a hack to replace row names with '.', '. ', '. ', etc. Chris dot.df <- function(x, head = 3, tail=1, dotrows=2) { x <- format(rbind(head(x,head + dotrows), tail(x,tail))) if(dotrows>0) { x[(head + 1):(head + dotrows),] <- "." for(i in 1:dotrows){ rownames(x)[head+i]<-paste(".", substring(" ", 1, i-1))} } x } dot.df(crabs) sp sex index FL RW CL CW BD 1B M 1 8.1 6.7 16.1 19.0 7.0 2B M 2 8.8 7.7 18.1 20.8 7.4 3B M 3 9.2 7.8 19.0 22.4 7.7 .. . ...... .. . ...... 200 O F50 23.1 20.2 46.2 52.5 21.1 ## or dot.matrix (replacing latex commands for display here) dot.matrix( crabs) sp sex ... CW BD 1B M ... 197 2B M ... 20.8 7.4 .. . . .. 199 O F ... 48.7 19.8 200 O F ... 52.5 21.1 >> foo(crabs) > sp sex index FL RW CL CW BD > 1 "B" "M" " 1" " 8.1" " 6.7" "16.1" "19.0" " 7.0" > 2 "B" "M" " 2" " 8.8" " 7.7" "18.1" "20.8" " 7.4" > 3 "B" "M" " 3" " 9.2" " 7.8" "19.0" "22.4" " 7.7" > . "." "." "." "."".""."".""." > . "." "." "." "."".""."".""." > 200 "O" "F" "50" "23.1" "20.2" "46.2" "52.5" "21.1" > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] abbreviate dataframe for Sweave output
I wanted to print the first and last rows of some dataframes in Sweave using dots in columns to separate the two parts. Head and tail almost work, but I have problems with factors and row names. z<-data.frame(id=letters[1:26], x=sample(1:26,26)) rbind(head(z,3), ".", tail(z,1)) id x 1 a 18 2 b 8 3 c 14 4. 26z 10 Warning message: invalid factor level, NAs generated in... I would like something like this if possible. Any ideas? id x 1 a 18 2 b 8 3 c 14 . . . . . . 26z 10 Thanks, Chris Stubben -- - Los Alamos National Lab BioScience Division MS M888 Los Alamos, NM 87545 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to store an array in MySQL
> Hi all, does somebody know how to store an array in MySQL with the > package RMySQL. Thanks in advance. > A similar question was asked last month. http://finzi.psych.upenn.edu/R/Rhelp02a/archive/81429.html In a normalized database, you should store the index and value in separate columns. Try this... a<-array(1:3, c(2, 4)) [,1] [,2] [,3] [,4] [1,]1321 [2,]2132 d <-dim(a) z<-cbind(expand.grid(r=1:d[1],c=1:d[2]),x=as.vector(a)) r c x 1 1 1 1 2 2 1 2 3 1 2 3 4 2 2 1 5 1 3 2 6 2 3 3 7 1 4 1 8 2 4 2 library(RMySQL) con<-dbConnect(MySQL(), dbname="test") dbWriteTable(con, "array", z, row.names=FALSE) [1] TRUE --- Now in Mysql select * from array; +--+--+--+ | r| c| x| +--+--+--+ |1 |1 |1 | |2 |1 |2 | |1 |2 |3 | |2 |2 |1 | |1 |3 |2 | |2 |3 |3 | |1 |4 |1 | |2 |4 |2 | +--+--+--+ select group_concat(x order by c separator ' ' ) as a from array group by r; +-+ | a | +-+ | 1 3 2 1 | | 2 1 3 2 | +-+ Chris Stubben __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Bootstraping for groups and subgroups and joing with other table
> I have a table with following collumns: State, SamplePlot, Species and BodySize. I sampled bird species at > 34 SamplePlots and 5 States (regions) monthly during two years. On each bird record I measured bodysize > and identified the species. So I have many records of each species (about 150 species) at each SamplePlot > and each Region (State). > > Now I would like bootstrap these data, selecting 50 records for each State/SamplePlot combinations and > count how many species (richness) were sampled at bootstrap. I need to do this 1.000 times. > > After that and need join the number of species [obtained at each bootstrap and for each State/SamplePlot > combination] with a dataframe that have other attributes for SamplePlot (like Area, Perimeter etc). I asked a similar question earlier... IF you have data frame birds and bird.plots, maybe something like this. #initialize empty variable boot<-NULL for(i in 1:100) { ## split on state and site and create list a a <-split(birds, paste(birds$State,birds$SampleSite), drop=T) # sample 50 rows each OR by number of observations (better?) # b<-lapply( a, function(x) x[sample(nrow(x), 50, replace=T),]) b<-lapply( a, function(x) x[sample(nrow(x), replace=T),]) ## count number of unique species or other statistic? ### and add row to boot matrix boot<-rbind(boot, unlist( lapply(b, function (x) length(unique(x$Species)) ) )) } ## mean y<-apply(boot, 2, mean) ## convert to data frame for merge y<-data.frame(y) names(y)<-"boot.count" ## add row names to bird.plots for easy join rownames(bird.plots)<-paste(bird.plots$State,bird.plots$SampleSite) merge(bird.plots,y, by=0) Row.names State SampleSite Area boot.count 1 Bahia Site1 Bahia Site1 10 1.00 2 Bahia Site2 Bahia Site2 25 1.96 3 Bahia Site3 Bahia Site3 70 1.72 4 Bahia Site4 Bahia Site4 15 1.73 5 Bahia Site5 Bahia Site55 1.42 6 RioJaneiro Site1 RioJaneiro Site1 32 2.49 7 RioJaneiro Site2 RioJaneiro Site2 45 1.63 8 RioJaneiro Site3 RioJaneiro Site3 10 2.37 9SaoPaulo Site1 SaoPaulo Site1 23 2.41 10 SaoPaulo Site2 SaoPaulo Site2 45 2.57 Chris Stubben __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] analyze amino acid sequence (composition)of proteins
Have you checked the package seqinR at CRAN? It may help. Also, if you split a string into single characters using strsplit, then use table to count characters. seq<-"ATGAAC" table(strsplit(seq, "")) A C G T 3 1 1 1 > 3. based on "AAABBB",how can i get some statistics of this string such as how > many letters,how many "A"s in the string. -- - Chris Stubben Los Alamos National Lab BioScience Division MS M888 Los Alamos, NM 87545 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] In which application areas is R used?
Population biology It's dominated by matlab, but I see R used more and more. http://popstudies.stanford.edu/summer_course.html Chris -- - Chris Stubben Los Alamos National Lab BioScience Division MS M888 Los Alamos, NM 87545 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Sample rows in data frame by subsets
Hi, I need to resample rows in a data frame by subsets L3 <- LETTERS[1:3] d <- data.frame(cbind(x=1, y=1:10), fac=sample(L3, 10, repl=TRUE)) x y fac 1 1 1 A 2 1 2 A 3 1 3 A 4 1 4 A 5 1 5 C 6 1 6 C 7 1 7 B 8 1 8 A 9 1 9 C 10 1 10 A I have seen this used to sample rows with replacement d[sample(nrow(d), replace=T), ] x y fac 7 1 7 B 2 1 2 A 1 1 1 A 3 1 3 A 2.1 1 2 A 10 1 10 A 8 1 8 A 9 1 9 C 1.1 1 1 A 8.1 1 8 A but I would like to sample based on the original number in fac summary(d$fac) A B C 6 1 3 rbind(subset(d, fac=="A")[sample(6, replace=T), ], subset(d, fac=="B")[sample(1, replace=T), ], subset(d, fac=="C")[sample(3, replace=T), ] ) x y fac 2 1 2 A 3 1 3 A 3.1 1 3 A 1 1 1 A 10 1 10 A 1.1 1 1 A 7 1 7 B 5 1 5 C 6 1 6 C 5.1 1 5 C Is there an easy way to do this in one step or with a short function? I have lots of dataframes to resample. Thanks, Chris -- - Chris Stubben Los Alamos National Lab BioScience Division MS M888 Los Alamos, NM 87545 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Constructing a transition matrix
Hi Peter, Thanks for pointing out the set functions. I can use setdiff to find missing rows setdiff(dev, rownames(A)) [1] "seed" and intersect to find common rows d1<- intersect(dev, rownames(A) ) [1] "veg" "rep" I was trying to use a negative index like A[-1,] to remove the dead row, but d1 is a better solution. Now I can add the missing seed row and get a square matrix. rbind( seed=numeric(3), A[d1,] )[dev,dev] Another post by Hans Gardfjell suggested reordering factor levels before using prop.table(table()) and this solution works great! trans$class <- ordered(trans$class, levels=dev) trans$fate <- ordered(trans$fate, levels=c(dev,"dead") ) A <- t(prop.table(table(trans$class, trans$fate),1))[-4,] seed veg rep seed 0.000 0 0.0 veg 0.667 0 0.5 rep 0.000 1 0.5 Thanks for the help, Chris Peter Dalgaard wrote: > Are you looking for something like > > d1 <- setdiff(dev,"seed") > A0[d1,dev] <- A[d1,dev] > > ? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Constructing a transition matrix
Hi again, I almost figured this out, but still need some help on the last part. I can use prop.table to get survival probabilities... A <- t(prop.table( table(trans$class, trans$fate),1) ) rep seed veg dead 0.000 0.333 0.000 rep 0.500 0.000 1.000 veg 0.500 0.667 0.000 so now I just need to format the matrix. I thought I could create a matrix of zeroes using size class names, dev<- c("seed","veg", "rep"). A0<-matrix(numeric(9), nrow=3, dimnames=list(dev,dev) ) seed veg rep seed0 0 0 veg 0 0 0 rep 0 0 0 but how do I assign values in A to the corresponding rows and columns in A0? I hope there is an easy solution that I'm overlooking. seed veg rep seed0 0 0 veg 0.67 0 0.5 rep 0 1 0.5 Thanks, Chris Chris Stubben wrote: > Hi, > > I would like to construct a transition matrix from a data frame with > annual transitions of marked plants. > > plant<-c(1:6) > class<-c("seed","seed", "seed", "veg", "rep", "rep") > fate<-c("dead", "veg","veg","rep", "rep", "veg") > > trans<-data.frame(plant, class, fate) > > plant class fate > 1 1 seed dead > 2 2 seed veg > 3 3 seed veg > 4 4 veg rep > 5 5 rep rep > 6 6 rep veg > > I have been using sql queries to do this, but I would like to construct > the matrix in R since I plan to resample transitions using > trans[sample(nrow(trans), 6, replace=T), ] > > I know I can get the original size vector using table() > > data.matrix(table(trans$class)) > [,1] > rep 2 > seed3 > veg 1 > > > but I don't know how to get counts of each class-fate combination where > fate does NOT equal dead > > seed veg = 2 > veg rep = 1 > rep rep = 1 > rep veg = 1 > > > or how to divide the class-fate count by the original class count in the > size vector to get survival probabilities > > seed veg = 2 / 3 seed = 0.67 > veg rep = 1 / 1 veg = 1 > rep rep = 1 / 2 rep = 0.5 > rep veg = 1 / 2 rep = 0.5 > > > or construct the square matrix with rows and columns in the same > developmental sequence like dev<- c("seed","veg", "rep"). > > seed veg rep > seed0 0 0 > veg 0.67 0 0.5 > rep 0 1 0.5 > > Any help or suggestions would be appreciated. > Thanks, > > > Chris Stubben > > > -- > Los Alamos National Lab > BioScience Division > MS M888 > Los Alamos, NM 87545 > > > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Constructing a transition matrix
Hi, I would like to construct a transition matrix from a data frame with annual transitions of marked plants. plant<-c(1:6) class<-c("seed","seed", "seed", "veg", "rep", "rep") fate<-c("dead", "veg","veg","rep", "rep", "veg") trans<-data.frame(plant, class, fate) plant class fate 1 1 seed dead 2 2 seed veg 3 3 seed veg 4 4 veg rep 5 5 rep rep 6 6 rep veg I have been using sql queries to do this, but I would like to construct the matrix in R since I plan to resample transitions using trans[sample(nrow(trans), 6, replace=T), ] I know I can get the original size vector using table() data.matrix(table(trans$class)) [,1] rep 2 seed3 veg 1 but I don't know how to get counts of each class-fate combination where fate does NOT equal dead seed veg = 2 veg rep = 1 rep rep = 1 rep veg = 1 or how to divide the class-fate count by the original class count in the size vector to get survival probabilities seed veg = 2 / 3 seed = 0.67 veg rep = 1 / 1 veg = 1 rep rep = 1 / 2 rep = 0.5 rep veg = 1 / 2 rep = 0.5 or construct the square matrix with rows and columns in the same developmental sequence like dev<- c("seed","veg", "rep"). seed veg rep seed0 0 0 veg 0.67 0 0.5 rep 0 1 0.5 Any help or suggestions would be appreciated. Thanks, Chris Stubben -- Los Alamos National Lab BioScience Division MS M888 Los Alamos, NM 87545 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html