Re: [R] retaining characters in a csv file
data.table's fread reads this as expected. Quoted strings aren't coerced.

    sapply(fread('5724550,"000202075214",2005.02.17,2005.02.17,"F"\n'), class)
    #          V1          V2          V3          V4          V5
    #   "integer" "character" "character" "character" "character"

Best,
Arun.

On Wed, Sep 23, 2015 at 12:00 AM, Therneau, Terry M., Ph.D. wrote:
> I have a csv file from an automatic process (so this will happen thousands
> of times), for which the first row is a vector of variable names and the
> second row often starts something like this:
>
> 5724550,"000202075214",2005.02.17,2005.02.17,"F", .
>
> Notice the second variable which is
>   a character string (note the quotation marks)
>   a sequence of numeric digits
>   leading zeros are significant
>
> The read.csv function insists on turning this into a numeric. Is there any
> simple set of options that
> will turn this behavior off? I'm looking for a way to tell it to "obey the
> bloody quotes" -- I still want the first, third, etc columns to become
> numeric. There can be more than one variable like this, and not always in
> the second position.
>
> This happens deep inside the httr library; there is an easy way for me to
> add more options to the read.csv call but it is not so easy to replace it
> with something else.
>
> Terry T
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
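For the situation where the call has to remain read.csv, `colClasses` is one option that may be enough. This is only a minimal sketch: it assumes the position of the quoted column is known in advance (the original post says that is not always the case), and it uses `text=` purely to keep the example self-contained.

```r
# Sample row from the post, read with base R.
line <- '5724550,"000202075214",2005.02.17,2005.02.17,"F"\n'

# NA lets read.csv guess a column's type; "character" pins the second
# column to a string so the leading zeros survive.
x <- read.csv(text = line, header = FALSE,
              colClasses = c(NA, "character", NA, NA, NA))

x$V2          # kept as a string, leading zeros intact
class(x$V1)   # still converted automatically
```

Note that this does not make read.csv honour quotes in general; it only fixes the type of the positions you name.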
Re: [R] more complex by with data.table???
Ramiro,

`dt[, lapply(.SD, mean), by=name]` is the idiomatic way.

I suggest reading through the new HTML vignettes at
https://github.com/Rdatatable/data.table/wiki/Getting-started

Ista, thanks for linking to the new vignette.

On Wed, Jun 10, 2015 at 2:17 AM, Ista Zahn istaz...@gmail.com wrote:

Hi Ramiro,

There is a demonstration of this on the data.table wiki at
https://rawgit.com/wiki/Rdatatable/data.table/vignettes/datatable-intro-vignette.html.
You can do

    dt[, lapply(.SD, mean), by=name]

or

    dt[, as.list(colMeans(.SD)), by=name]

BTW, there are pretty straightforward ways to do this in base R as well, e.g.,

    data.frame(t(sapply(split(df[-1], df$name), colMeans)))

Best,
Ista

On Tue, Jun 9, 2015 at 4:22 PM, Ramiro Barrantes ram...@precisionbioassay.com wrote:

Hello,

I am trying to do something that I am able to do with the by function within
data.frame but can't figure out how to achieve with data.table. Consider

    dt <- data.table(name=c(rep("a",5),rep("b",6)),var1=0:10,var2=20:30,var3=40:50)
    myFunction <- function(x) { mean(x) }

I am aware that I can do something like:

    dt[, .(meanVar1=myFunction(var1)), by=.(name)]

but how could I do the equivalent of:

    df <- data.frame(name=c(rep("a",5),rep("b",6)),var1=0:10,var2=20:30,var3=40:50)
    myFunction <- function(x) { mean(x) }
    columnNames <- c("var1","var2","var3")
    result <- by(df, df$name, function(x) {
      output <- c()
      for(col in columnNames) {
        output[col] <- myFunction(x[,col])
      }
      output
    })
    do.call(rbind,result)

Thanks in advance,
Ramiro
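To make the `.SD` idiom concrete, here is a small sketch on the data from the question. The use of `.SDcols` to restrict the columns is an addition for illustration, not something the replies above require; without it, `.SD` covers every non-grouping column.

```r
library(data.table)

# Data from the original question.
dt <- data.table(name = c(rep("a", 5), rep("b", 6)),
                 var1 = 0:10, var2 = 20:30, var3 = 40:50)
myFunction <- function(x) mean(x)

# Apply myFunction to each listed column, separately within each name.
columnNames <- c("var1", "var2", "var3")
res <- dt[, lapply(.SD, myFunction), by = name, .SDcols = columnNames]
res
```

The result has one row per `name` and one column per entry in `columnNames`, which is the same shape as the `do.call(rbind, result)` output in the question.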
Re: [R] data.frame: data-driven column selections that vary by row??
David,

In data.table v1.9.5 (current development version, which you can get from
here: https://github.com/Rdatatable/data.table/wiki/Installation), new
features were added to both `melt` and `cast` for data.tables. They both can
handle multiple columns simultaneously. I think this would be of interest to
you.

Using 1.9.5, here's how I'd do it.

    require(data.table) ## v1.9.5+
    cols <- grep("^da2.*$", names(bw), value=TRUE)       ## (1)
    splt <- split(cols, seq_len(length(cols)/2L))        ## (2)
    vars <- unique(gsub("(.*?)_(.*$)", "\\1", cols))     ## (3)
    vals <- unique(gsub("(.*?)_(.*$)", "\\2", cols))     ## (4)
    ans1 = melt(setDT(bw), measure=splt,
                variable.name="disc", value.name=vals)   ## (5)
    setattr(ans1$disc, 'levels', vars)                   ## (6)

Explanation:
---
1. Get all cols you've to melt.
2. Split them into column pairs that should be combined together.
3. Get levels for 'variable' column.
4. Get column names for molten result.
5. Melt by providing list of columns with each element containing the
   columns you'd want to combine together in the molten result directly.
6. Set levels for variable column appropriately.

Advantages:
--
1. Melting by combining corresponding columns together, directly, is
   straightforward and easy to understand, since that's the task you want to
   perform. Having to combine all columns together and then split them back
   seems roundabout.
2. Casting (tidyr::spread internally uses reshape2::dcast) is a relatively
   complicated operation, and in this case it can be completely avoided,
   which will save both time and memory (see benchmark at the bottom of
   post). It also reorders the result, which may not be desirable.
3. In 'bw', columns `da20_dev_type` and `da2_dev_type` are type 'factor'
   while others are type 'numeric'. reshape2::melt (or) tidyr::gather, since
   it combines all columns, will have to coerce these different types to a
   common type, here 'character'. So, you'll have to convert the columns back
   to the right type after casting. I think you'll agree that's unnecessary.
   `melt.data.table` preserves the type as it combines only relevant columns
   together.
4. Since the operation is performed in a straightforward manner (and in C
   for speed), it's incredibly fast *and* memory efficient.

Benchmark (on ~180,000 rows)
-
    library(tidyr)
    library(dplyr)
    require(data.table) ## v1.9.5+

    # replacing timestamp so that rows are unique (for spread to work correctly)
    bw.large = rbindlist(replicate(1e4, bw, simplify=FALSE))[, timestamp := .I][]
    object.size(bw.large)/1024^2 # ~38MB

The data is 38MB, which is not at all large... but enough to illustrate.

    # data.table
    system.time({
    cols <- grep("^da2.*$", names(bw), value=TRUE)       ## (1)
    splt <- split(cols, seq_len(length(cols)/2L))        ## (2)
    vars <- unique(gsub("(.*?)_(.*$)", "\\1", cols))     ## (3)
    vals <- unique(gsub("(.*?)_(.*$)", "\\2", cols))     ## (4)
    ans1 = melt(setDT(bw.large), measure=splt,
                variable.name="disc", value.name=vals)   ## (5)
    setattr(ans1$disc, 'levels', vars)                   ## (6)
    })
    #    user  system elapsed
    #   0.260   0.013   0.275

Memory used: 56MB

    # tidyr
    system.time({
    ans2 <- gather(setDF(bw.large), key = tmp, value = value, matches("^d[a-z]+[0-9]+"))
    ans2 <- separate(ans2, tmp, c("disc", "var"), "_", extra = "merge")
    ans2 <- spread(ans2, var, value)
    })
    #    user  system elapsed
    #  15.818   1.128  17.063

Memory used: 750MB

And that's ~62x speedup.

HTH
Arun Srinivasan
Co-developer, data.table.

On Tue, Mar 31, 2015 at 8:35 PM, Tom Wright t...@maladmin.com wrote:

Nice clean-up!!!

On Tue, 2015-03-31 at 14:19 -0400, Ista Zahn wrote:

    library(tidyr)
    library(dplyr)
    bw <- gather(bw, key = tmp, value = value, matches("^d[a-z]+[0-9]+"))
    bw <- separate(bw, tmp, c("disc", "var"), "_", extra = "merge")
    bw <- spread(bw, var, value)
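Since `bw` itself is not shown in this thread, the multi-column melt can be tried on a made-up table. Everything below (`bw_toy`, its column names and values) is hypothetical and only illustrates passing a list of column groups to `melt`; it assumes a data.table version that supports list-valued `measure.vars` (1.9.5 or later, per the message above).

```r
library(data.table)

# Hypothetical stand-in for 'bw': two discs, each with a numeric and a
# character column.
bw_toy <- data.table(id       = 1:2,
                     da1_size = c(10, 20),
                     da1_type = c("ssd", "hdd"),
                     da2_size = c(30, 40),
                     da2_type = c("hdd", "ssd"))

# Each list element names the columns that end up stacked in one output
# column, so the numeric and character columns are never mixed.
long <- melt(bw_toy, id.vars = "id",
             measure.vars  = list(c("da1_size", "da2_size"),
                                  c("da1_type", "da2_type")),
             variable.name = "disc",
             value.name    = c("size", "type"))
long
```

Here `size` stays numeric and `type` stays character, which is the type-preservation point made in advantage 3.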
Re: [R] DESeq vs DESeq2 different DEGs results
You're on the wrong list. This is more appropriate on the bioconductor
mailing list.

On Mon, May 5, 2014 at 9:42 AM, Catalina Aguilar Hurtado cata...@gmail.com wrote:

Hi,

I want to compare DESeq vs DESeq2 and I am getting a different number of DEGs,
which I would expect to be normal. However, when I compare the 149 gene IDs
that I get with DESeq with the 869 from DESeq2, there are only ~10 genes that
are in common, which I don't understand (using FDR 0.05 for both). I want to
block the Subject effect, for which I am including the reduced formula of ~1.

Shouldn't these two methods output similar results? Because at the moment I
could interpret my results in different ways.

Thanks for your help,
Catalina

This is the DESeq script that I am using:

DESeq

    library(DESeq)
    co=as.matrix(read.table("2014_04_01_6h_LP.csv",header=T, sep=",", row.names=1))
    Subject=c(1,2,3,4,5,1,2,4,5)
    Treatment=c(rep("co",5),rep("c2",4))
    a.con=cbind(Subject,Treatment)
    cds=newCountDataSet(co,a.con)
    cds <- estimateSizeFactors( cds)
    cds <- estimateDispersions(cds,method="pooled-CR", modelFormula=count~Subject+Treatment)
    #filtering
    rs = rowSums ( counts ( cds ))
    theta = 0.2
    use = (rs > quantile(rs, probs=theta))
    table(use)
    cdsFilt= cds[ use, ]
    fit0 <- fitNbinomGLMs (cdsFilt, count~1)
    fit1 <- fitNbinomGLMs (cdsFilt, count~Treatment)
    pvals <- nbinomGLMTest (fit1, fit0)
    padj <- p.adjust( pvals, method="BH" )
    padj <- data.frame(padj)
    row.names(padj)=row.names(cdsFilt)
    padj_fil <- subset (padj,padj < 0.05 )
    dim (padj_fil)
    [1] 149 1

------

    library (DESeq2)
    countdata=as.matrix(read.table("2014_04_01_6h_LP.csv",header=T, sep=",", row.names=1))
    coldata= read.table ("targets.csv", header = T, sep=",",row.names=1)
    coldata
       Subject Treatment
    F1       1        co
    F2       2        co
    F3       3        co
    F4       4        co
    F5       5        co
    H1       1        c2
    H2       2        c2
    H4       4        c2
    H5       5        c2
    dds <- DESeqDataSetFromMatrix( countData = countdata, colData = coldata, design = ~ Subject + Treatment)
    dds
    dds$Treatment <- relevel (dds$Treatment, "co")
    dds <- estimateSizeFactors( dds)
    dds <- estimateDispersions(dds)
    rs = rowSums ( counts ( dds ))
    theta = 0.2
    use = (rs > quantile(rs, probs=theta))
    table(use)
    ddsFilt= dds[ use, ]
    dds <- nbinomLRT(ddsFilt, full = design(dds), reduced = ~ 1)
    resLRT <- results(dds)
    sum( resLRT$padj < 0.05, na.rm=TRUE )
    #[1] 869
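The two package-independent steps in the scripts above, the quantile filter on row sums and the comparison of significant gene IDs, can be sketched in base R. The count matrix and ID vectors below are invented for illustration and say nothing about the actual DESeq or DESeq2 results.

```r
# Hypothetical count matrix: 5 genes, 3 samples.
counts <- matrix(c(  0,   0,   0,
                     1,   2,   3,
                    10,  20,  30,
                   100, 200, 300,
                     5,   5,   5),
                 ncol = 3, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:5), paste0("s", 1:3)))

# Keep genes whose total count lies above the 20% quantile of row sums.
rs    <- rowSums(counts)
theta <- 0.2
use   <- rs > quantile(rs, probs = theta)
filtered <- counts[use, , drop = FALSE]

# Overlap between two (made-up) sets of significant gene IDs.
ids_a  <- c("gene2", "gene3")
ids_b  <- c("gene3", "gene4", "gene5")
common <- intersect(ids_a, ids_b)
```

Comparing the ID sets with `intersect` after both pipelines have applied the same filter keeps the overlap count from being affected by genes that only one of them tested.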
Re: [R] Reshape large Data Frame to new format
Hi Dark,

Sorry for the late response. Since you asked for a `data.table` solution as
well, here's one:

    require(data.table)
    dt <- as.data.table(rawData)
    dt[, GRP := (0:(.N-1L))%/%25L, by=PersonID]
    dt[, `:=`(var="codes", N = 1:.N), by=list(PersonID, GRP)]
    dcast.data.table(dt, PersonID+GRP ~ var+N, value.var="codes")

Arun
Co-developer of data.table package.

On Mon, Mar 24, 2014 at 9:44 PM, David Carlson dcarl...@tamu.edu wrote:

78023, 43785, 69884, 12840, 54021 are listed as PersonID 3 in rawData, but
PersonID 4 in resultData. Here is another way to get there:

    # Split codes by PersonID creating a single vector for each
    step1 <- split(rawData$codes, rawData$PersonID)
    # Figure out how many lines we need - here 3 lines
    maxlines <- ceiling(max(sapply(step1, length))/25)
    # Figure out how many entries we need - here 75 entries
    max <- maxlines*25
    # Fill in blank entries to pad each line to 75
    step2 <- lapply(step1, function(x) c(x, rep("", max-length(x))))
    # Wrap each single line into three lines
    step3 <- lapply(step2, function(x) matrix(x, maxlines, 25, byrow=TRUE))
    # Create PersonID vector
    PersonID <- rep(names(step1), each=maxlines)
    # Create data frame
    step4 <- data.frame(PersonID, do.call(rbind, step3), stringsAsFactors=FALSE)
    # Label columns
    colnames(step4) <- gsub("X", "Code", colnames(step4))
    # Delete empty rows
    step4 <- step4[apply(step4[, -1], 1, function(x) sum(x!="")>0),]

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of arun
Sent: Monday, March 24, 2014 9:57 AM
To: r-help@r-project.org
Cc: Dark
Subject: Re: [R] Reshape large Data Frame to new format

Hi,

In your 'resultData', some observations seem to be omitted.
    with(rawData,tapply(codes, PersonID,FUN=function(x) x))$Person3
    #[1] 56177 61704 70879 69033 87224 68670 65602 25476 81209 62086 35492 39771
    #[13] 14380 43858 53679 78023 43785 69884 12840 54021
    resultData[4,]
    #  PersonId Code1 Code2 Code3 Code4 Code5 Code6 Code7 Code8 Code9 Code10 Code11
    #4  Person3 56177 61704 70879 69033 87224 68670 65602 25476 81209  62086  35492
    #  Code12 Code13 Code14 Code15 Code16 Code17 Code18 Code19 Code20 Code21 Code22
    #4  39771  14380  43858  53679
    #  Code23 Code24 Code25

One way would be:

    rawData$Seq <- with(rawData,ave(codes,PersonID,FUN=function(x) rep(1:25,length.out=length(x))))
    rawData$Seq1 <- with(rawData,ave(codes,PersonID,FUN=function(x) rep(seq(length(x) %/%25 +1),each=25,length.out=length(x))))
    res <- reshape(rawData,v.names="codes",idvar=c("PersonID","Seq1"),timevar="Seq",direction="wide",sep="")[,-2]
    res[is.na(res)] <- ""
    colnames(res) <- colnames(resultData)
    rownames(res) <- rownames(resultData)

A.K.

On Monday, March 24, 2014 10:15 AM, Dark i...@software-solutions.nl wrote:

Hi R-experts,

I have a data.frame that I want to reshape to a certain format so I can use
it in a tool for further analysis. Basically I have a very long list with IDs
of persons and their codes. I create a row for every person with 25 of their
codes. If a person has more than 25 codes, I want to add another row for that
person. If a row contains fewer than 25 codes I want to fill with empty
string values.

I have manually created a sample rawData and resultData and used dput so you
can see my starting DF and the wanted result DF. The sample is of very
limited size; the real data would contain a few million(!) records.
rawData - structure(list(PersonID = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c(Person1, Person2, Person3, Person4, Person5), class = factor), codes = c(34396L, 81878L, 67829L, 13428L, 12992L, 63724L, 85930L, 78497L, 59578L, 50733L, 26154L, 47205L, 74578L, 12204L, 42435L, 96643L, 35242L, 29836L, 73031L, 11326L, 96686L, 55849L, 56415L, 11064L, 78509L, 55715L, 75851L, 60682L, 16277L, 52763L, 23429L, 39723L, 95809L, 60081L, 19618L, 46012L, 79188L, 54664L, 64420L, 72875L, 97428L, 74897L, 75615L, 12023L, 21572L, 56177L, 61704L, 70879L, 69033L, 87224L, 68670L, 65602L, 25476L, 81209L, 62086L, 35492L, 39771L, 14380L, 43858L, 53679L, 78023L, 43785L, 69884L, 12840L, 54021L, 68002L, 79249L, 61784L, 7L, 28935L, 91406L, 42045L, 97716L, 65690L, 57310L, 57627L, 32227L, 43121L, 22251L, 31255L, 90660L, 89118L, 14558L, 99824L, 25005L, 62186L, 10527L, 99438L, 85656L, 79465L,
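The chunk-then-widen idea used in the replies above can be followed on a much smaller case. The sketch below is base R only and uses an invented `raw_toy` with a chunk size of 3 instead of 25; the names `raw_toy`, `chunk`, `GRP` and `N` are illustrative assumptions, not part of the posted data.

```r
# Hypothetical miniature of rawData: P1 has 5 codes, P2 has 2.
raw_toy <- data.frame(PersonID = c(rep("P1", 5), rep("P2", 2)),
                      codes    = c(11L, 12L, 13L, 14L, 15L, 21L, 22L),
                      stringsAsFactors = FALSE)
chunk <- 3L  # stands in for the 25 codes per row in the thread

# Zero-based position of each code within its person.
pos <- ave(seq_along(raw_toy$codes), raw_toy$PersonID, FUN = seq_along) - 1L
raw_toy$GRP <- pos %/% chunk        # which output row the code belongs to
raw_toy$N   <- pos %% chunk + 1L    # which column within that row

# One output row per (PersonID, GRP); short rows are padded with NA.
wide <- reshape(raw_toy, v.names = "codes", idvar = c("PersonID", "GRP"),
                timevar = "N", direction = "wide", sep = "")
wide
```

P1 is split over two rows (three codes, then two) and P2 gets one row; the NA padding can then be replaced with empty strings as in the reply above, at the cost of turning the code columns into character.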
Re: [R] creating table with sequences of numbers based on the table
I think this'll be way simpler and also faster:

    ans <- data.frame(pop = rep.int(tab$pop, tab$Freq), ind=sequence(tab$Freq))

Arun

From: Dennis Murphy djmu...@gmail.com
Reply: Dennis Murphy djmu...@gmail.com
Date: March 13, 2014 at 9:57:20 PM
To: arun smartpink...@yahoo.com
Cc: R help r-help@r-project.org
Subject: Re: [R] creating table with sequences of numbers based on the table

Less coding with plyr:

    tab <- read.table(text="pop Freq
    1 1 30
    2 2 25
    3 3 30
    4 4 30
    5 5 30
    6 6 30
    7 7 30",sep="",header=TRUE)

    # Function to do the work on each row
    f <- function(pop, Freq) data.frame(ind = seq_len(Freq))

    library(plyr)
    u <- mdply(tab, f)[, -2]

Dennis

On Thu, Mar 13, 2014 at 8:01 AM, arun smartpink...@yahoo.com wrote:

Hi,
Try:
Either

    tab <- read.table(text="pop Freq
    1 1 30
    2 2 25
    3 3 30
    4 4 30
    5 5 30
    6 6 30
    7 7 30",sep="",header=TRUE)
    indx <- rep(1:nrow(tab),tab$Freq)
    tab1 <- transform(tab[indx,],ind=ave(seq_along(indx),indx,FUN=seq_along))[,-2]

    #or
    tab2 <- transform(tab[indx,],ind=unlist(sapply(tab$Freq,seq)))[,-2]
    identical(tab1,tab2)
    #[1] TRUE

    #or
    tab3 <- transform(tab[indx,], ind= with(tab,seq_len(sum(Freq))-rep(cumsum(c(0L,Freq[-length(Freq)])),Freq)))[,-2]
    identical(tab1,tab3)
    #[1] TRUE

A.K.

I have a problem with transferring one table to another automatically.
From a table like this:

    tab
      pop Freq
    1   1   30
    2   2   25
    3   3   30
    4   4   30
    5   5   30
    6   6   30
    7   7   30

I want to use the number of individuals (Freq) and then in the next table
just list them with consecutive numbers (depending on the total number of
individuals). Like this:

    in
    pop ind
    1   1
    1   2
    1   3
    1   4
    .   .
    .   .
    1   30
    2   1
    2   2
    2   3
    2   4
    .   .
    2   25
    3   1
    3   2
    .   .
    .   .

How can I do it? I think I have to use loops but so far I failed.

Thank you in advance,
Best,
Malgorzata Gazda
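A scaled-down run makes the `rep.int` / `sequence` combination easy to check by eye. The table below is a hypothetical three-row version of `tab`, not the data from the thread.

```r
# Hypothetical small frequency table.
tab <- data.frame(pop = 1:3, Freq = c(3L, 2L, 4L))

# rep.int repeats each pop Freq times; sequence() restarts the counter
# at 1 for every pop.
ans <- data.frame(pop = rep.int(tab$pop, tab$Freq),
                  ind = sequence(tab$Freq))
ans
```

`ans` has `sum(tab$Freq)` rows, and within each `pop` the `ind` column runs from 1 up to that population's `Freq`.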
Re: [R] Assign numbers in R
Here's another one: match(d, unique(d)).

Arun

From: Greg Snow 538...@gmail.com
Reply: Greg Snow 538...@gmail.com
Date: March 12, 2014 at 8:41:31 PM
To: T Bal studentt...@gmail.com
Cc: r-help r-help@r-project.org
Subject: Re: [R] Assign numbers in R

Here are a couple more options if you want some variety:

    d <- c(8,7,5,5,3,3,2,1,1,1)
    as.numeric( factor(d, levels=unique(d)) )
    [1] 1 2 3 3 4 4 5 6 6 6
    cumsum( !duplicated(d) )
    [1] 1 2 3 3 4 4 5 6 6 6

What would you want the output to be if your d vector had another 8 after
the last 1? The different solutions will give different output.

On Wed, Mar 12, 2014 at 3:13 AM, T Bal studentt...@gmail.com wrote:

Hi,
I have the following numbers:

    d <- c(8,7,5,5,3,3,2,1,1,1)

I want to convert these into the following numbers:

    r: 1,2,3,3,4,4,5,6,6,6

So if two numbers are different, increment it; if they are the same, then
assign the same number:

    r <- NULL
    for (i in 1:length(d)) {
      if (d[i] != d[i+1]) {
        r[i] =i+1;
      } else {
        r[i] = i;
      }
    }

But this is not correct. How can I solve this problem? Or how can I solve it
in a different way? Thanks a lot!

--
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com
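Greg's question about a repeated 8 can be answered by running the candidates on the extended vector. The `rle`-based variant below is an additional option not mentioned in the thread; it numbers consecutive runs rather than distinct values.

```r
d  <- c(8, 7, 5, 5, 3, 3, 2, 1, 1, 1)
d2 <- c(d, 8)  # the same vector with another 8 after the last 1

# Index of each value among the distinct values: the trailing 8 gets 1 again.
by_value <- match(d2, unique(d2))

# Running count of first occurrences: the trailing 8 keeps the last count.
by_first_seen <- cumsum(!duplicated(d2))

# Run-based numbering (not from the thread): the trailing 8 starts a new run.
runs   <- rle(d2)
by_run <- rep(seq_along(runs$lengths), runs$lengths)

rbind(by_value, by_first_seen, by_run)
```

All three agree on the original `d`; they differ only in the last position of `d2`, so the choice depends on whether a value that reappears should reuse its old number, keep the current count, or start a new one.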
[R] On ^ returning a matrix when operated on a data.frame
Dear R-users,

I am wondering why the ^ operator alone returns a matrix when operated on a
data.frame (as opposed to all other arithmetic operators). Here's an example:

    DF <- data.frame(x=1:5, y=6:10)
    class(DF*DF)
    # [1] "data.frame"
    class(DF^2)
    # [1] "matrix"

I posted here on SO:
http://stackoverflow.com/questions/19964897/why-does-on-a-data-frame-return-a-matrix-instead-of-a-data-frame-like-do
and got a very nice answer - it happens because a matrix is returned (obvious
by looking at `Ops.data.frame`). However, what I'd like to understand is,
*why* a matrix is returned for ^ alone?

Here's an excerpt from Ops.data.frame (thanks to Neal Fultz):

    if (.Generic %in% c("+", "-", "*", "/", "%%", "%/%")) {
        names(value) <- cn
        data.frame(value, row.names = rn, check.names = FALSE, check.rows = FALSE)
    }
    else matrix(unlist(value, recursive = FALSE, use.names = FALSE),
                nrow = nr, dimnames = list(rn, cn))

It's clear that a matrix will be returned unless `.Generic` is one of those
arithmetic operators. My question therefore is: is there any particular
reason why the ^ operator is missing from the if-statement here? I can't
think of a reason where this would break. Also, ?`^` doesn't seem to mention
anything about this coercion.

Please let me know if I should be posting this to R-devel list instead.

Thank you very much,
Arun
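If the goal is simply to square every column and keep a data.frame, one way that does not depend on what `Ops.data.frame` returns for ^ is to apply the operator column by column. This is a sketch of a workaround, not an explanation of the behaviour asked about; the class of `DF^2` itself is deliberately not shown because it may differ between R versions.

```r
DF <- data.frame(x = 1:5, y = 6:10)

# Assigning into DF[] keeps the data.frame structure (names, row names)
# and only replaces the column contents.
sq <- DF
sq[] <- lapply(DF, function(col) col^2)

class(sq)        # a data.frame, independent of Ops.data.frame
class(DF == DF)  # comparison operators return a logical matrix
```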
Re: [R] On ^ returning a matrix when operated on a data.frame
Duncan,

Thank you. What I meant was that ^ is the only *arithmetic operator* to
result in a matrix when operating on a data.frame. I understand it's quite
old code. Also, your explanation makes sense, with the exception of the /
operator, I suppose (I could be wrong here).

Arun

On Thursday, November 14, 2013 at 12:32 AM, Duncan Murdoch wrote:

It's not just ^ that is missing, the logical relations like <, ==, etc. also
return matrices. This is very old code (I think from 1999), but I would guess
that the reason is that the ^ and < operators always return values of a
single type (numeric and logical respectively), whereas the other operators
can take mixed type inputs and return mixed type outputs.

Duncan Murdoch

Please let me know if I should be posting this to R-devel list instead.

Thank you very much,
Arun
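Duncan's distinction between single-type and mixed-type results can be seen on a data.frame with one integer and one double column. The data below is made up for illustration.

```r
DF <- data.frame(i = 1:3, d = c(0.5, 1.5, 2.5))

# "+" can yield a different type per column: integer stays integer,
# double stays double, so a data.frame is the natural container.
plus <- DF + 1L
sapply(plus, class)

# "^" yields doubles even for integer input ...
class((1:3)^2)

# ... and comparisons yield logicals only, so one matrix can hold it all.
gt <- DF > 1
class(gt[1, 1])
```

This also bears on the remark about `/`: division always produces doubles as well, so by the single-type argument it could equally have been placed in the matrix branch.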
Re: [R] calculating mean matrix
One way using `Reduce`:

    set.seed(45)
    grp <- factor(rep(letters[1:10], each=10)) # equivalent of your column x
    # dummy data
    df <- as.data.frame(matrix(sample(1:1000, replace=T), ncol=length(levels(grp))))
    # solution
    Reduce('+', split(df, grp))/length(levels(grp))

Arun

On Saturday, January 19, 2013 at 3:49 PM, ya wrote:

Hi list,

Thank you very much for reading this post.

I have a data frame, and I am trying to split it into a couple of data frames
using one of the columns, say, x. After I get the data frames, I am planning
to treat them as matrices and trying to calculate an element-by-element mean
matrix. Could anyone give me some advice on how to do it?

So far, I know that if I have a couple of matrices, say
data1,data2,data3,data4...dataN, I can do it like this:

    data=array(cbind(data1,data2,data3,data4,dataN), c(2, 3, N))
    # 2 refers to row number of matrix, 3 refers to column number of matrix,
    # N refers to number of matrices to be averaged.
    meanmtrx=apply(data,1:2,mean)

but I do not know how to use the resulting data frames with cbind(). Maybe
there are other better ways.

Any advice is appreciated. Thank you very much.

Have a nice day.

ya
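On a small deterministic example the `Reduce` route and the `array` / `apply` route from the question can be compared directly. The data frame below is hypothetical, and converting each piece with `as.matrix` is a choice made here so that both routes work on plain matrices.

```r
# Hypothetical data: two groups of two rows each.
grp <- factor(rep(c("a", "b"), each = 2))
df  <- data.frame(v1 = c(1, 2, 3, 4), v2 = c(10, 20, 30, 40))

# Split by group and turn each piece into a matrix.
parts <- lapply(split(df, grp), as.matrix)

# Route 1: add the matrices element by element, then divide by their count.
mean_reduce <- Reduce(`+`, parts) / length(parts)

# Route 2: stack the matrices into a 3-d array and average over the third
# dimension, as in the question.
arr        <- array(unlist(parts), c(2, 2, length(parts)))
mean_apply <- apply(arr, 1:2, mean)

mean_reduce
```

Both give the same 2 x 2 matrix of element-wise means; the `Reduce` version avoids having to spell out the array dimensions by hand.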