Re: [R] How to convert "c:\a\b" to "c:/a/b"?
If you have 'copied' the path from DOS, then you can use 'scan' to read it into a variable with the proper characters. Here is the string that I 'copied':

D:\spencerg\statmtds\R\Rnews

Here are the results after 'scan':

> x.1 <- scan('clipboard', what='', allowEscapes=FALSE)
Read 1 item
> x.1
[1] "D:\\spencerg\\statmtds\\R\\Rnews"
>

Jim
__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant -- Convergys Labs
[EMAIL PROTECTED]
+1 (513) 723-2929

From: Henrik Bengtsson <[EMAIL PROTECTED]>
To: Spencer Graves <[EMAIL PROTECTED]>
cc: r-help@stat.math.ethz.ch, Dirk Eddelbuettel <[EMAIL PROTECTED]>
Date: 06/27/2005 14:53
Subject: Re: [R] How to convert "c:\a\b" to "c:/a/b"?

Spencer Graves wrote:
> Hi, Henrik:
>
> Several functions, e.g., "grep", "sub", "gsub", and "regexpr",
> have an argument "perl", FALSE by default. Moreover, "?regexp" has a
> section on "Perl Regular Expressions". If you can do it in perl, might
> that transfer to "gsub(..., perl=TRUE)"?

I do not know the details behind the different "dialects" of regular expressions, but you can _not_ get the R parser to interpret the two ASCII characters "\n" as the two characters "\" and "n". The R parser is used when code is read by source() or when expressions are typed at the R prompt. The parser will always read it as the newline character (ASCII 10). The result from the parser is then passed to the R engine. Thus, you cannot write your program such that it fools the parser, because your program is evaluated only after the parser has run. In other words, there is no way you can get nchar("\n") to equal 2.

Cheers

Henrik

> Thanks,
> spencer graves
> p.s. I skimmed the discussion of "Perl Regular Expressions", and
> experimented with "gsub(..., perl=TRUE)" without success. However,
> there may be a way to do it, and I just don't know perl and regexp well
> enough to have figured it out in the time available.
>
> Henrik Bengtsson wrote:
>
>> Spencer Graves wrote:
>>
>>> Thanks, Dirk, Gabor, Eric:
>>>
>>> You all provided appropriate solutions for the stated problem.
>>> Sadly, I oversimplified the problem I was trying to solve: I copy a
>>> character string giving a DOS path from MS Windows Explorer into an R
>>> script file, and I get something like the following:
>>>
>>> D:\spencerg\statmtds\R\Rnews
>>>
>>> I want to be able to use this in R with its non-R meaning,
>>> e.g., in readLines, count.fields, read.table, etc., after appending a
>>> file name. Your three solutions all work for my oversimplified toy
>>> example but are inadequate for the problem I really want to solve.
>>
>> Hmmm. It should work as long as you do not source() the file (see
>> below). There are two things to watch out for here.
>>
>> First, you have to be careful with backslashes, that is, a backslash
>> is a single character ('\') in memory, but to be typed at the R
>> prompt, you have to escape it (with a backslash), which is why we type
>> "\\", cf. nchar("\\") == 1. Consider the file foo.txt containing the
>> 28 characters (==28 bytes in plain ASCII format)
>>
>> D:\spencerg\statmtds\R\Rnews
>>
>> You can create
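
A minimal sketch of the backslash point discussed in this thread, assuming the path has already reached R with real backslashes (for example via scan() or readLines()). The variable names p, p1 and p2 are only illustrative and are not from the original messages.

```r
# A backslash is ONE character in memory, although it is typed as "\\";
# likewise "\n" is the single newline character.
nchar("\\")   # 1
nchar("\n")   # 1

# The path as it would look after being read from the clipboard or a file.
p <- "D:\\spencerg\\statmtds\\R\\Rnews"
nchar(p)      # 28, one per visible character of D:\spencerg\statmtds\R\Rnews

# Two ways to swap backslashes for forward slashes.
p1 <- gsub("\\", "/", p, fixed = TRUE)  # fixed = TRUE avoids regex escaping
p2 <- chartr("\\", "/", p)              # character-for-character translation
p1
```

Using fixed = TRUE sidesteps the double escaping that a regular expression would need ("\\\\"). R 4.0.0 and later also accept raw string constants such as r"(D:\a\b)", which postdates this thread and lets a pasted Windows path be typed without doubling the backslashes.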
Re: [R] grep negation
?setdiff

e.g.,

> txt <- c("arm","foot","lefroo", "bafoobar")
> i <- grep("foo",txt); i
[1] 2 4
> setdiff(seq(length(txt)),grep("foo",txt))
[1] 1 3
>

Jim
__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant -- Convergys Labs
[EMAIL PROTECTED]
+1 (513) 723-2929

From: Marcus Leinweber <[EMAIL PROTECTED]>
To: r-help@stat.math.ethz.ch
Date: 06/23/2005 08:59
Subject: [R] grep negation

hi,

using the example in the grep help:

txt <- c("arm","foot","lefroo", "bafoobar")
i <- grep("foo",txt); i
[1] 2 4

but how can i get the negation (1,3) when looking for 'foo'?

thanks,
m.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
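
As a hedged alternative to setdiff, the negation can also be taken on a logical vector. This sketch assumes a reasonably recent R, since grepl() and the invert argument of grep() were added after this thread was written; the name hit is only illustrative.

```r
txt <- c("arm", "foot", "lefroo", "bafoobar")

hit <- grepl("foo", txt)                       # one TRUE/FALSE per element
neg_idx  <- which(!hit)                        # indices that do NOT match
neg_idx2 <- grep("foo", txt, invert = TRUE)    # same indices via 'invert'
txt[!hit]                                      # the non-matching strings
```

Working with the logical vector is convenient when the result is used for subsetting, because an empty match does not need special handling.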
Re: [R] vectorisation suggestion
v3 <- numeric()
v3[v1] <- table(v2)[v1]

Jim
__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant -- Convergys Labs
[EMAIL PROTECTED]
+1 (513) 723-2929

From: Federico Calboli <[EMAIL PROTECTED]>
To: r-help
Date: 06/20/2005 16:15
Subject: [R] vectorisation suggestion

Hi All,

I am counting the number of occurrences of the terms listed in one vector in another vector.

My code runs:

for( i in 1:length(vector3)){
  vector3[i] = sum(1*is.element(vector2, vector1[i]))
}

where
vector1 = vector containing the terms whose occurrences I want to count
vector2 = made up of a number of repetitions of all the elements of vector1
vector3 = a vector of NAs that is meant to get the result of the counting

My problem is that vector1 is about 6 terms, and vector2 is 62... can anyone suggest a faster code than the one I wrote?

Cheers,

Federico Calboli

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG
Tel +44 (0)20 75941602 Fax +44 (0)20 75943193
f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com
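
A small sketch of the counting idea, assuming the terms are stored as character strings. The example data are invented for illustration; restricting the factor levels to vector1 is one way to keep the original order and to get a zero for terms that never occur.

```r
vector1 <- c("a", "b", "c", "z")             # hypothetical terms to count
vector2 <- c("a", "b", "a", "c", "a", "b")   # hypothetical data

# table() on a factor whose levels are exactly vector1 counts each term once,
# in the order of vector1, including terms with zero occurrences.
counts  <- table(factor(vector2, levels = vector1))
vector3 <- as.vector(counts)

# An equivalent formulation via match() and tabulate().
alt <- tabulate(match(vector2, vector1), nbins = length(vector1))
vector3
```

Both versions make a single pass over vector2 instead of one pass per term, which is where the speed-up over the explicit loop would be expected to come from.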
Re: [R] (no subject)
'rle' might be your friend. It finds the runs in a sequence. Here is some code working off the 'visit' data that you created.

# $Log$
x.1 <- matrix(visit, ncol=4)  # your data
x.rle <- apply(x.1, 1, rle)   # compute 'rle' for each row
Passed <- lapply(x.rle, function(x){  # now process each row; see if it meets the criteria
    .len <- length(x$lengths)
    if (x$lengths[.len] > 1 && x$values[.len] == 1) return(TRUE)  # last two passed
    else if (.len == 2){  # two sequences
        if (x$lengths[.len] == 1 && x$values[.len] == 1) return(TRUE)  # only last passed
    }
    return(FALSE)
})
cbind(unlist(Passed), x.1)  # put results in first column with the data

Jim
__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant -- Convergys Labs
[EMAIL PROTECTED]
+1 (513) 723-2929

From: [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Date: 06/20/2005 11:58
Subject: [R] (no subject)

R friends,

I am using R 2.1.0 on Win XP. I have a problem working with lists; probably I do not understand how to use them.

Let's suppose that a set of patients visit a clinic once a year for 4 years. On each visit a test, say 'eib', is performed with results 0 or 1. The patients do not all visit the clinic the 4 times; they miss a lot of visits.

The test is considered positive if it is positive at the last 2 visits of that patient or, by a more lenient definition, if it is positive at the last visit and never before. Otherwise it is Negative (always negative) or a YoYo (unstable: changes from positive to negative).
So, if I code the visits with 1,2,4,8 if present at year 1,2,3,4, and similarly the positive tests, I get the last2 list giving the test codes that correspond to each possible visit pattern, and similarly the last1 list. 20 here means NULL.

nobs <- 400
# visits       0    1   2    3     4    5      6       7       8     9
last1 <- list((20),(1),(2),c(3,2),(4),c(5,4),c(6,4),c(7,6,4),(8),c(9,8),
# visits       10      11         12      13         14         15
              c(10,8),c(11,10,8),c(12,8),c(13,12,8),c(14,12,8),c(15,14,12,8))
# visits       0    1    2    3    4    5   6    7      8    9
last2 <- list((20),(20),(20),(3),(20),(5),(6),c(7,6),(20),(9),
# visits       10   11       12   13       14       15
              (10),c(11,10),(12),c(13,12),c(14,12),c(15,14,12))
#
# simulate the visits
#
visit <- rbinom(nobs,1,0.7)
eib <- visit
#
# simulate a positive test at a given visit
#
eib <- ifelse(runif(nobs) > 0.7,visit,0)
#
# create the codes
#
viskode <- matrix(visit,ncol=4) %*% c(1,2,4,8)
eibkode <- matrix(eib,ncol=4) %*% c(1,2,4,8)
#
# this is the brute force method, slow, of computing the Results according to
# the 2 definitions above. Add 16 to the test kode to signify YoYos. Exactly
# 16 will be the negatives
#
eibnoyoyo <- eibkode+16
eiblast2 <- eibkode+16
for(i in 1:nobs){
  if(eibkode[i] %in% last1[[viskode[i]+1]]) eibnoyoyo[i] <- eibkode[i]
  if(eibkode[i] %in% last2[[viskode[i]+1]]) eiblast2[i] <- eibkode[i]
}
#
# why is it that these statements do not work?
#
eeibnoyoyo <- eeiblast2 <- rep(0,nobs)
eeibnoyoyo <- ifelse(eibkode %in% last1[viskode+1],eibkode,eibkode+16)
eeiblast2 <- ifelse(eibkode %in% last2[viskode+1],eibkode,eibkode+16)
#
table(viskode,eibkode)
table(viskode,eibnoyoyo)
table(viskode,eiblast2)
#
# these two tables must be diagonal!!
#
table(eibnoyoyo,eeibnoyoyo)
table(eiblast2,eeiblast2)
#

Thanks for any help

Heberto Ghez
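
One possible reading of why the vectorised ifelse() lines in the question differ from the loop: %in% tests each left-hand value against the whole right-hand object, not against the list element in the same position. The sketch below pairs the elements up with mapply(); the five patients are invented for illustration, and only last1 is taken from the question.

```r
last1 <- list(20, 1, 2, c(3, 2), 4, c(5, 4), c(6, 4), c(7, 6, 4), 8, c(9, 8),
              c(10, 8), c(11, 10, 8), c(12, 8), c(13, 12, 8), c(14, 12, 8),
              c(15, 14, 12, 8))

viskode <- c(3, 3, 7, 15, 8)   # hypothetical visit codes
eibkode <- c(2, 1, 6, 0, 8)    # hypothetical test codes

# Loop version, as in the question.
eibnoyoyo <- eibkode + 16
for (i in seq_along(eibkode)) {
  if (eibkode[i] %in% last1[[viskode[i] + 1]]) eibnoyoyo[i] <- eibkode[i]
}

# Element-wise version: compare code i only with the allowed set for patient i.
ok <- mapply(function(code, allowed) code %in% allowed,
             eibkode, last1[viskode + 1])
eeibnoyoyo <- ifelse(ok, eibkode, eibkode + 16)
eeibnoyoyo
```

mapply() still calls a function once per patient, so this is tidier than the loop rather than dramatically faster. The same pattern would apply to last2.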
Re: [R] vectorization
try this:

> x.1 <- data.frame(income=runif(100)*1, educ=sample(c('hs','col','none'),100,T))
> x.1
      income educ
1 5930.30882  col
2 5528.83222   hs
3 5967.04041   hs
4 3926.30682   hs
5 2603.75924 none
...
> x.2 <- tapply(x.1$income, x.1$educ, mean)
> x.2
     col       hs     none
5575.310 4994.921 5481.962
> x.1$median <- x.2[x.1$educ]
> x.1
      income educ   median
1 5930.30882  col 5575.310
2 5528.83222   hs 4994.921
3 5967.04041   hs 4994.921
4 3926.30682   hs 4994.921
5 2603.75924 none 5481.962
6 7398.83325  col 5575.310
7265.06895 hs 4994.921
.
>

Jim
__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant -- Convergys Labs
[EMAIL PROTECTED]
+1 (513) 723-2929

From: "Dimitri Joe" <[EMAIL PROTECTED]>
To: "R-Help"
Date: 06/17/2005 14:00
Subject: [R] vectorization

Hi there,

I have a data frame (mydata) with 1 numeric variable (income) and 1 factor (education). I want a new column in this data with the median income for each education level. An obviously inefficient way to do this is

for ( k in 1: nrow(mydata) ){
  l <- mydata$education[k]
  mydata$md[k] <- median(mydata$income[mydata$education==l],na.rm=T)
}

Since mydata has nearly 30.000 rows, this will not be done until the end of this month. I thus need some help vectorizing this, please.

Thanks,
Dimitri
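
The transcript above uses mean inside tapply() although the question asks for the median. A minimal sketch with the median and with NA handling, using ave() from base R; the small data frame is invented for illustration.

```r
mydata <- data.frame(
  income    = c(10, 20, 60, 5, 7, NA, 100),   # hypothetical incomes
  education = factor(c("hs", "hs", "hs", "col", "col", "col", "none"))
)

# ave() applies FUN within each group and returns a vector as long as the
# input, so the group median lands on every row of that group.
mydata$md <- ave(mydata$income, mydata$education,
                 FUN = function(v) median(v, na.rm = TRUE))
mydata
```

ave() avoids the intermediate lookup table, which also removes the dependence on how a factor is used as an index.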
Re: [R] Dateticks
try this example:

> x.1 <- strptime("6/17/03",'%m/%d/%y')
> x.1
[1] "2003-06-17"
> plot(0:250, xaxt='n')
> dates <- x.1 + c(0,50,100,150,200,250) * 86400  # 'dates' is in seconds, so add the appropriate number of days
> dates
[1] "2003-06-17 00:00:00 EDT" "2003-08-06 00:00:00 EDT" "2003-09-25 00:00:00 EDT"
[4] "2003-11-13 23:00:00 EST" "2004-01-02 23:00:00 EST" "2004-02-21 23:00:00 EST"
> axis(1, at=c(0,50,100,150,200,250), labels=format(dates,"%m/%d/%y"))  # format the output
>

Jim
__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant -- Convergys Labs
[EMAIL PROTECTED]
+1 (513) 723-2929

From: "Bernard L. Dillard" <[EMAIL PROTECTED]>
To: r-help@stat.math.ethz.ch
Date: 06/14/2005 12:27
Subject: [R] Dateticks

Hello. I am having the worst time converting x-axis date ticks to real dates. I have tried several suggestions in online help tips and books to no avail.

For example, the x-axis has 0, 50, 100, etc, and I want it to have "6/17/03", "8/6/03" etc. See attached (sample). Can anybody help me with this?

Here's my code:

ts.plot(date.attackmode.table[,1], type="l", col="blue", lty=2,ylab="IED Attacks", lwd=2,xlab="Attack Dates",main="Daily Summary of Attack Mode")
grid()

Thanks for your help if possible.

(See attached file: sample.pdf)
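
In the transcript above, the last three tick dates print as 23:00 on the previous day, which looks like the effect of adding multiples of 86400 seconds across a daylight-saving change. A hedged sketch that avoids this by using the Date class, where arithmetic is in whole days; the names start, offsets and tick_labels are illustrative.

```r
start   <- as.Date("6/17/03", format = "%m/%d/%y")
offsets <- c(0, 50, 100, 150, 200, 250)

dates       <- start + offsets             # Date arithmetic counts days
tick_labels <- format(dates, "%m/%d/%y")
tick_labels

# Drawing is skipped when the sketch is run non-interactively.
if (interactive()) {
  plot(0:250, xaxt = "n")
  axis(1, at = offsets, labels = tick_labels)
}
```

With Date objects there is no time-of-day component, so the labels do not depend on the time zone of the session.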
Re: [R] Manipulating dates
Use POSIX. To convert:

my.dates <- strptime(your.characters, format='%d/%m/%Y')

Once you have that, you can use 'min' to find the minimum. 'difftime' will give you the differences.

Jim
__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant -- Convergys Labs
[EMAIL PROTECTED]
+1 (513) 723-2929

From: Richard Hillary <[EMAIL PROTECTED]>
To: r-help@stat.math.ethz.ch
Date: 06/14/2005 10:07
Subject: [R] Manipulating dates
Please respond to r.hillary

Hello,

Given a vector of characters, or factors, denoting the date in the following way: 28/03/2000, is there a method of
1) Computing the earliest of these dates;
2) Using this as a base, then converting all the other dates into merely the number of days after this minimum date

Many thanks

Richard Hillary
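
A short sketch of both steps, assuming the Date class is acceptable in place of strptime(); the three dates are invented for illustration. If the input is a factor, it would need as.character() first.

```r
chars <- c("28/03/2000", "01/03/2000", "15/04/2000")   # hypothetical input

d          <- as.Date(chars, format = "%d/%m/%Y")
first      <- min(d)                   # 1) the earliest date
days_after <- as.numeric(d - first)    # 2) whole days since that date
days_after
```

Subtracting two Date objects yields a difftime in days, so as.numeric() gives plain day counts.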
Re: [R] transform large matrix into list
> x.1
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]   NA    6
> cbind(x.1[!is.na(x.1)], which(!is.na(x.1), arr.ind=TRUE))
       row col
[1,] 1   1   1
[2,] 2   2   1
[3,] 4   1   2
[4,] 5   2   2
[5,] 6   3   2
>

Jim
__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant -- Convergys Labs
[EMAIL PROTECTED]
+1 (513) 723-2929

From: Stefan Mischke <[EMAIL PROTECTED]>
To: r-help@stat.math.ethz.ch
Date: 06/07/2005 08:55
Subject: [R] transform large matrix into list

Dear List

I need to transform a large matrix M with many NAs into a list L with one row for each non missing cell. Every row should contain the cell value in the first column, and its coordinates of the matrix in column 2 and 3.

M:
   x1 x2
y1  1  2
y2  4  5
y3  7  8

L:
v x y
1 1 1
4 1 2
7 1 2
2 2 1
5 2 2
8 2 3

I'm trying to do this with a loop, but since my matrix is quite large (around 10k^2) this just takes a very long time. There must be a more efficient and elegant way to do this. Any hints?

Thanks,
Stefan
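
The same idea as a self-contained sketch that returns a data frame with named columns. The matrix is the small one from the transcript; the names M, idx and L are illustrative.

```r
M <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 3)

# which(..., arr.ind = TRUE) returns one (row, col) pair per non-missing cell,
# and indexing M with that two-column matrix picks out the matching values.
idx <- which(!is.na(M), arr.ind = TRUE)
L   <- data.frame(v = M[idx], row = idx[, "row"], col = idx[, "col"])
L
```

Everything here is vectorised, so no explicit loop over the cells is needed even for a large matrix.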
Re: [R] weighted.mean and tapply (again)
> x.1 <- read.table('clipboard',header=T)
> x.1
   GROUP VALUE FREQUENCY
1      2     2        78
2      2     3        40
3      2     4        16
4      2     5         3
5      2     6         1
6      2     8         1
7      3     3        19
8      3     4        10
9      3     5        19
10     3     6         4
> by(x.1, x.1$GROUP, function(x) weighted.mean(x$VALUE, x$FREQUENCY))
x.1$GROUP: 2
[1] 2.654676
---
x.1$GROUP: 3
[1] 4.153846
>

Jim
__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929

From: Dan Bolser <[EMAIL PROTECTED]>
To: R mailing list
Date: 05/25/2005 11:33
Subject: [R] weighted.mean and tapply (again)

I read answers to questions including the words "tapply" and "weighted.mean", but I didn't understand either the problem (data) or the solution provided.

Here is my question ...

> dat[1:10,]
   GROUP VALUE FREQUENCY
1      2     2        78
2      2     3        40
3      2     4        16
4      2     5         3
5      2     6         1
6      2     8         1
7      3     3        19
8      3     4        10
9      3     5        19
10     3     6         4

For each GROUP, I would like to calculate the weighted.mean of VALUE using the FREQUENCY as the weight, so for the snippet of data shown that would be...

group.2 <- weighted.mean(c(2,3,4,5,6,8),c(78,40,16,3,1,1))
group.3 <- weighted.mean(c(3,4,5,6),c(19,10,19,4))

> cbind(rbind(2,3),rbind(group.2,group.3))
        [,1]     [,2]
group.2    2 2.654676
group.3    3 4.153846

I would like to use tapply to automatically do this across the whole dataset (dat) - which includes lots of other distinct grouping factors, however, like I said, I couldn't understand (and therefore apply to my data) any of the other solutions I found, so any help here would be greatly appreciated!

All the best,
Dan.
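
Since the question asks specifically about tapply, here is a hedged sketch of two equivalent formulations on the data shown in the thread. tapply() passes only one vector to its function, so the usual workaround is to group the row numbers and index both columns inside the function.

```r
dat <- data.frame(
  GROUP     = c(2, 2, 2, 2, 2, 2, 3, 3, 3, 3),
  VALUE     = c(2, 3, 4, 5, 6, 8, 3, 4, 5, 6),
  FREQUENCY = c(78, 40, 16, 3, 1, 1, 19, 10, 19, 4)
)

# Variant 1: split the data frame by group, then apply weighted.mean().
wm <- sapply(split(dat, dat$GROUP),
             function(d) weighted.mean(d$VALUE, d$FREQUENCY))

# Variant 2: tapply() over row indices.
wm2 <- tapply(seq_len(nrow(dat)), dat$GROUP,
              function(i) weighted.mean(dat$VALUE[i], dat$FREQUENCY[i]))
wm
```

For several grouping factors at once, a list of factors can be passed as the grouping argument of either split() or tapply().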
Re: [R] plot question
tt <- data.frame(c(0.5, 1, 0.5))
names(tt) <- "a"
plot(tt$a, type = 'o',xlim=c(0,4))

__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929

From: Christoph Lehmann <[EMAIL PROTECTED]>
To: r-help@stat.math.ethz.ch
Date: 03/03/2005 11:29
Subject: [R] plot question

I have the following simple situation:

tt <- data.frame(c(0.5, 1, 0.5))
names(tt) <- "a"
plot(tt$a, type = 'o')

gives the following plot ('I' and '.' represent the axis):

[ASCII sketch of the current plot, x-axis ticks at 1, 2, 3; layout lost in the archive]

what do I have to change to get the following:

[ASCII sketch of the desired plot, x-axis ticks at 1, 2, 3; layout lost in the archive]

i.e. the plot-region should be widened at the left and right side

thanks for a hint

christoph
Re: [R] (no subject)
use 'gsub'

> x <- c('1,200.44', '23,345.66')
> gsub(',','',x)
[1] "1200.44"  "23345.66"
> as.numeric(gsub(',','',x))
[1]  1200.44 23345.66
>

__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929

From: Jim Gustafsson <[EMAIL PROTECTED]>
To: r-help@stat.math.ethz.ch
Date: 02/16/2005 09:08
Subject: [R] (no subject)

R-people

I wonder if one could change a list of table with number of the form 1,200.44 , to 1200.44

Regards
JG
Re: [R] Programming/scripting with "expressions - variables"
Here is one way. It is the custom to return a value that will be assigned to the variable, so I changed your 'macro' to a function that returns the value and then assigns it to your variable:

> test <- function(name, value){
+     .result <- NULL  # initialize to NULL
+     .result[name] <- value
+     .result[paste('other_', name, sep='')] <- paste("other_", value, sep='')
+     .result
+ }
> Gregor <- test('Gorjanc', '25')
> Gregor  # print out the vector
      Gorjanc other_Gorjanc
         "25"    "other_25"
>

__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929

From: "Gorjanc Gregor" <[EMAIL PROTECTED]>
Date: 02/07/2005 09:52
Subject: [R] Programming/scripting with "expressions - variables"

Hello to Rusers!

I am puzzled with R and I really do not know where to look for my problem. I am moving from SAS and I have difficulties in translating SAS to the R world. I hope I will get some hints or pointers so I can study from there on.

I would like to do something like this. In SAS I can write a macro as in the example below, which is of course a silly one but shows what I don't know how to do in R.

%macro test(data, colname, colvalue);
data &data;
...
&colname="&colvalue";
other_&colname="other_&colvalue";
run;
%mend;

And if I run it with this call:

%test(Gregor, Gorjanc, 25);

I get a table with name 'Gregor' and columns 'Gorjanc' and 'other_Gorjanc' with values:

Gorjanc other_Gorjanc
"25"    "other_25"

So can one show me the way to do the same thing in R?

Thanks!

--
Lep pozdrav / With regards,
Gregor GORJANC

---
University of Ljubljana
Biotechnical Faculty          URI: http://www.bfro.uni-lj.si
Zootechnical Department       email: gregor.gorjanc bfro.uni-lj.si
Groblje 3                     tel: +386 (0)1 72 17 861
SI-1230 Domzale               fax: +386 (0)1 72 17 888
Slovenia
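
If a table rather than a named vector is wanted, as in the SAS output, one possible sketch builds a one-row data frame and sets the column names from the arguments. The function name make_table is hypothetical and not from the thread; paste0() assumes a reasonably recent R.

```r
make_table <- function(colname, colvalue) {
  # Build the two columns first, then name them from the argument.
  out <- data.frame(colvalue, paste0("other_", colvalue),
                    stringsAsFactors = FALSE)
  names(out) <- c(colname, paste0("other_", colname))
  out
}

Gregor <- make_table("Gorjanc", "25")
Gregor
```

As in the reply above, the caller chooses the object name by assignment, so the function itself does not need the equivalent of the SAS data argument.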
Re: [R] Frequency of Data
Try this using strsplit and table:

> dates <- c('29.02.1997','15.02.2001','15.02.2001','23.12.2002')
> x.1 <- do.call('rbind',strsplit(dates,'\\.'))
> x.1
     [,1] [,2] [,3]
[1,] "29" "02" "1997"
[2,] "15" "02" "2001"
[3,] "15" "02" "2001"
[4,] "23" "12" "2002"
> class(x.1) <- 'integer'
> x.1
     [,1] [,2] [,3]
[1,]   29    2 1997
[2,]   15    2 2001
[3,]   15    2 2001
[4,]   23   12 2002
> table(list(x.1[,2], x.1[,3]))
    .2
.1   1997 2001 2002
  2     1    2    0
  12    0    0    1
>

__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929

From: "Carsten Steinhoff" <[EMAIL PROTECTED]>
Date: 02/02/2005 14:44
Subject: [R] Frequency of Data

Hello,

just another problem in R, maybe it's simple to solve for you. I didn't find a solution up to now, but I'm convinced that I'm not the only one who has/had a similar problem. Maybe there's a ready-made function in R?

The problem:

I've imported a CSV file into R with 1000 dates of an observed event (there's only information on the date; when no event happened the date is not recorded, and when there have been two events it's recorded twice). Now I want to COUNT the frequency of events in every month or year.

The CSV data is structured as:

date
25.02.2003
29.07.1997
...

My desired output would be a matrix with n rows for the years and m columns for the months.

What could a solution look like? If the format is not a matrix it doesn't matter. What is important is the extraction of the frequencies from my data.

Thanks for all replies,

Carsten
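
A hedged variant that produces the year-by-month matrix asked for in the question, using the Date class. The five dates are made up from those quoted in the thread. Note that as.Date() validates calendar dates, so an entry such as 29.02.1997 would be expected to come back as NA, whereas the strsplit approach above accepts it.

```r
dates <- c("25.02.2003", "29.07.1997", "15.02.2001", "15.02.2001", "23.12.2002")

d     <- as.Date(dates, format = "%d.%m.%Y")
year  <- as.integer(format(d, "%Y"))
# Fixing the levels to 1:12 keeps months without events as zero columns.
month <- factor(as.integer(format(d, "%m")), levels = 1:12)

freq <- table(year, month)
freq
```

Only years that occur in the data get a row; a factor with explicit levels could be used for year as well if empty years should appear.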
Re: [R] Finding "runs" of TRUE in binary vector
use 'rle';

> a <- rnorm(20)
> b <- a < .5
> b
 [1] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE
[13] FALSE FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE
> rle(b)
Run Length Encoding
  lengths: int [1:9] 1 7 2 2 2 3 1 1 1
  values : logi [1:9] FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE
>

__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929

From: Sean Davis <[EMAIL PROTECTED]>
To: r-help
Date: 01/27/2005 17:13
Subject: [R] Finding "runs" of TRUE in binary vector

I have a binary vector and I want to find all "regions" of that vector that are runs of TRUE (or FALSE).

> a <- rnorm(10)
> b <- a<0.5
> b
 [1]  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE

My function would return something like a list:
region[[1]] 1,3
region[[2]] 5,5
region[[3]] 7,10

Any ideas besides looping and setting start and ends directly?

Thanks,
Sean
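
rle() gives run lengths, not positions. A minimal sketch of turning them into the start and end indices requested in the question, using the vector b printed there; the names start, end and regions are illustrative.

```r
b <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE)

r     <- rle(b)
end   <- cumsum(r$lengths)        # last index of every run
start <- end - r$lengths + 1      # first index of every run

# Keep only the runs whose value is TRUE.
regions <- cbind(start, end)[r$values, , drop = FALSE]
regions
```

Using !r$values as the row selector would give the FALSE runs instead.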
Re: [R] Avoiding a Loop?
Does this do what you want?

nr.of.columns <- 4
myconstant <- 27.5
mymatrix <- matrix(myconstant, nrow=5, ncol=nr.of.columns)
mymatrix[,1] <- 1:5
t(apply(mymatrix, 1, function(x) cumprod(x)))

__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929

From: "Rau, Roland" <[EMAIL PROTECTED]>
Date: 01/21/2005 07:31
Subject: [R] Avoiding a Loop?

Dear R-Helpers,

I have a matrix where the first column is known. The second column is the result of multiplying this first column with a constant "const". The third column is the result of multiplying the second column with "const".

So far, I did it like this (as a simplified example):

nr.of.columns <- 4
myconstant <- 27.5
mymatrix <- matrix(numeric(0), nrow=5, ncol=nr.of.columns)
mymatrix[,1] <- 1:5
for (i in 2:nr.of.columns) {
  mymatrix[,i] <- myconstant * mymatrix[,i-1]
}

Can anyone give me some advice whether it is possible to avoid this loop (and if yes: how)?

Any suggestions are welcome!

Thanks,
Roland
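
Because column j is just the first column times the constant raised to the power j - 1, the matrix can also be written as an outer product. A hedged sketch, with the loop from the question kept only as a reference for comparison; first and by_loop are illustrative names.

```r
nr.of.columns <- 4
myconstant    <- 27.5
first         <- 1:5

# Column j equals first * myconstant^(j - 1).
mymatrix <- outer(first, myconstant^(0:(nr.of.columns - 1)))

# Reference result from the original loop.
by_loop <- matrix(NA_real_, nrow = 5, ncol = nr.of.columns)
by_loop[, 1] <- first
for (i in 2:nr.of.columns) by_loop[, i] <- myconstant * by_loop[, i - 1]

mymatrix
```

For very large powers the two versions could differ in the last floating-point digits, since the loop multiplies repeatedly while the outer product exponentiates.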
Re: [R] recoding large number of categories (select in SAS)
Here is a way of doing it by setting up a matrix of values to test against. Easier than writing all the 'select' statements.

> x.trans <- matrix(c(  # translation matrix; first column is min, second is max,
+     149, 150, 150,    # and third is the value to be returned
+     186, 187, 187,
+     438, 438, 438,
+     430, 430, 430,
+     808, 826, 808,
+     830, 832, 808,
+     997, 998, 792,
+     792, 796, 792), ncol=3, byrow=T)
> colnames(x.trans) <- c('min', 'max', 'value')
>
> x.default <-   # default/nomatch value
>
> x.test <- c(150, 149, 148, 438, 997, 791, 795, 810, 820, 834)  # test data
> #
> # this function tests each value and, if it lies between min and max, returns the third column
> #
> newValues <- sapply(x.test, function(x){
+     .value <- x.trans[(x >= x.trans[,'min']) & (x <= x.trans[,'max']),'value']
+     if (length(.value) == 0) .value <- x.default  # on no match, take default
+     .value[1]  # return first value if multiple matches
+ })
> newValues
[1] 150 150 438 792 792 808 808
>

__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929

From: Denis Chabot <[EMAIL PROTECTED]>
To: r-help@stat.math.ethz.ch
Date: 01/19/2005 08:56 AM
Subject: [R] recoding large number of categories (select in SAS)

Hi,

I have data on stomach contents. Possible prey species are in the hundreds, so a list of prey codes has been in use in many labs doing this kind of work.

When it comes time to do analyses on these data, one often wants to regroup prey into broader categories, especially for rare prey.
In SAS you can nest a large number of "if-else", or do this more cleanly with "select" like this:

select;
  when (149 <= prey <=150) preyGr= 150;
  when (186 <= prey <= 187) preyGr= 187;
  when (prey= 438) preyGr= 438;
  when (prey= 430) preyGr= 430;
  when (prey= 436) preyGr= 436;
  when (prey= 431) preyGr= 431;
  when (prey= 451) preyGr= 451;
  when (prey= 461) preyGr= 461;
  when (prey= 478) preyGr= 478;
  when (prey= 572) preyGr= 572;
  when (692 <= prey <= 695 ) preyGr= 692;
  when (808 <= prey <= 826, 830 <= prey <= 832 ) preyGr= 808;
  when (997 <= prey <= 998, 792 <= prey <= 796) preyGr= 792;
  when (882 <= prey <= 909) preyGr= 882;
  when (prey in (999, 125, 994)) preyGr= 9994;
  otherwise preyGr= 1;
end; *select;

The number of transformations is usually much larger than this short example.

What is the best way of doing this in R?

Sincerely,

Denis Chabot
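
A hedged, vectorised sketch of the range-table idea from the reply. The default value of 1 is an assumption taken from the "otherwise" branch of the SAS example as it appears in the archive, and recode is a hypothetical function name.

```r
x.trans <- matrix(c(149, 150, 150,
                    186, 187, 187,
                    438, 438, 438,
                    430, 430, 430,
                    808, 826, 808,
                    830, 832, 808,
                    997, 998, 792,
                    792, 796, 792),
                  ncol = 3, byrow = TRUE,
                  dimnames = list(NULL, c("min", "max", "value")))
x.default <- 1   # assumption: the 'otherwise' value in the SAS example

recode <- function(x) {
  # inside[i, j] is TRUE when x[i] lies within range j (bounds inclusive).
  inside <- outer(x, x.trans[, "min"], ">=") & outer(x, x.trans[, "max"], "<=")
  hit <- apply(inside, 1, function(r) which(r)[1])  # first match, NA if none
  ifelse(is.na(hit), x.default, x.trans[hit, "value"])
}

x.test <- c(150, 149, 148, 438, 997, 791, 795, 810, 820, 834)
newValues <- recode(x.test)
newValues
```

The comparison matrix has one cell per value and range, so for very long vectors and tables it may be worth processing the data in chunks.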
Re: [R] subsampling
Consider using a list. This will create a list with 10 entries of your 20 samples:

x <- 1:200
myList <- split(x, cut(sample(x,200),breaks=10))

__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929

From: [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Date: 01/14/2005 11:23
Subject: [R] subsampling

hi,

I would like to subsample the array c(1:200) at random into ten subsamples v1,v2,...,v10.

I tried to go progressively like this:

> x<-c(1:200)
> v1<-sample(x,20)
> y<-x[-v1]
> v2<-sample(y,20)

and then I want to do:

> x<-y[-v2]
Error: subscript out of bounds.
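
A sketch of why the progressive approach in the question runs into trouble, and two ways around it. x[-v1] treats the sampled values as positions, which happens to work for x because x[i] equals i, but no longer holds for y. The seed is only there to make the sketch reproducible; groups and remaining are illustrative names.

```r
set.seed(42)
x <- 1:200

# One step: shuffle once, then cut the shuffled vector into 10 blocks of 20.
groups <- split(sample(x), rep(1:10, each = 20))

# Progressive version: remove by VALUE with setdiff(), not by position.
remaining <- x
v1 <- sample(remaining, 20); remaining <- setdiff(remaining, v1)
v2 <- sample(remaining, 20); remaining <- setdiff(remaining, v2)
length(remaining)
```

The one-step version guarantees disjoint subsamples that together cover x, without any bookkeeping.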
Re: [R] plotting percent of incidents within different 'bins'
You can use 'cut' to create the breaks. Actually there are 8 in the 3-4 range:

   Outcome predictor
1        0         1
2        1         2
3        1         2
4        0         3
5        0         3
6        0         2
7        1         3
8        1         4
9        1         4
10       0         4
11       0         4
12       0         4
> cut(x.1$p, breaks=c(0,2,4))
 [1] (0,2] (0,2] (0,2] (2,4] (2,4] (0,2] (2,4] (2,4] (2,4] (2,4] (2,4] (2,4]
Levels: (0,2] (2,4]
> x.c <- cut(x.1$p, breaks=c(0,2,4))
> tapply(x.1$O, x.c, function(x)sum(x==1)/length(x))
(0,2] (2,4]
0.500 0.375
>

__
James Holtman    "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED]
+1 (513) 723-2929

From: "Stephen Choularton" <[EMAIL PROTECTED]>
To: "R Help"
Date: 01/05/2005 14:34
Subject: [R] plotting percent of incidents within different 'bins'

Hi

Say I have some data, two columns in a table being a binary outcome plus a predictor, and I want to plot a graph that shows the percentage of positives of the binary outcome within bands of the predictor, e.g.

Outcome predictor
0       1
1       2
1       2
0       3
0       3
0       2
1       3
1       4
1       4
0       4
0       4
0       4
etc

In this case there are 4 cases in the band 1 - 2 of the predictor, 2 of them are true so the percent is 50%, and there are 7 cases in the band 3 - 4, 3 of which are true, making the percentage 43%.

Is there some function in R that will sum these outcomes by bands of predictor and produce a one by two data set with the percentages in one column and the ordered bands in the other, or alternatively is there some sort of special plot that does it all for you?

Thanks

Stephen
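
The same calculation as a self-contained sketch that also builds the two-column result asked for in the question. The mean of a logical vector is the proportion of TRUE values, which is what makes the tapply() call short; band, pct and result are illustrative names.

```r
outcome   <- c(0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0)
predictor <- c(1, 2, 2, 3, 3, 2, 3, 4, 4, 4, 4, 4)

band <- cut(predictor, breaks = c(0, 2, 4))     # bands (0,2] and (2,4]
pct  <- 100 * tapply(outcome == 1, band, mean)  # percent positive per band

result <- data.frame(band = names(pct), percent = as.vector(pct))
result

# Drawing is skipped when the sketch is run non-interactively.
if (interactive()) barplot(pct, ylab = "% positive")
```

With these data the second band holds 8 cases, 3 of them positive, which gives 37.5% as in the reply.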
Re: [R] How to duplicate rows in dataframe?
> x.1 <- data.frame(a=1:5, b=1:5)
> x.1
  a b
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
> x.1[c(1,2,2,2,3,3,4,4,5,4,3,2,1),]
    a b
1   1 1
2   2 2
2.1 2 2
2.2 2 2
3   3 3
3.1 3 3
4   4 4
4.1 4 4
5   5 5
4.2 4 4
3.2 3 3
2.3 2 2
1.1 1 1
>

______
James Holtman "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED] +1 (513) 723-2929

From: cstrato <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Date: 12/13/2004 14:02
Subject: [R] How to duplicate rows in dataframe?

Dear all:

I have the following (simple?) problem:

Consider a dataframe where the first column contains integers used as index, e.g.

index
24
13
46
32

Now I have the following vector used to sort the dataframe:

x <- c(13,24,32,46)

Sorting the dataframe can be done by using order. However, consider the following vector:

x <- c(13,32,13,24,46,24,24)

Now I want to get the dataframe in the order of the rows defined in x, i.e. the dataframe contains duplicate rows. One way to achieve this would be to use rbind in a for-loop.

My question is: Is there an easier and - more important - faster way to obtain the dataframe as defined in x?

Thank you in advance.
Best regards
Christian

_._._._._._._._._._._._._._._._
C.h.i.s.t.i.a.n S.t.r.a.t.o.w.a
V.i.e.n.n.a A.u.s.t.r.i.a
_._._._._._._._._._._._._._._._
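The reply indexes by row position; when the rows have to be picked by the value in the index column, as in the question, 'match' can translate values into positions first. A minimal sketch, where the `value` column is a hypothetical payload added only so the result is easy to read:

```r
# Data frame keyed by an 'index' column, as described in the question.
df <- data.frame(index = c(24, 13, 46, 32),
                 value = c("a", "b", "c", "d"))   # 'value' is hypothetical

x <- c(13, 32, 13, 24, 46, 24, 24)

# match() gives, for each entry of x, the row of df holding that index;
# repeated entries in x simply repeat the row.
out <- df[match(x, df$index), ]
out
```

No loop or rbind is needed; the whole lookup is one vectorised subscript.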
Re: [R] lists within a list / data-structure problem
Construct your list in the loop:

x.all <- list()  # initialize
for (i in 1:limit){
    ...
    x.all[[i]] <- result.list
}

Now you want to name them, e.g., run1:

names(x.all) <- paste('run', seq(length(x.all)), sep='')

To access, you can do

x.all$run1$Dom

To extract all the 'Dom's:

lapply(x.all, function(x) x$Dom)

HTH

______
James Holtman "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED] +1 (513) 723-2929

From: Jan Wantia <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Date: 12/13/2004 10:59
Subject: [R] lists within a list / data-structure problem

Dear all,

this is a rather basic question; I am not sure how to structure my data well:

I want to extract various measures from my raw data. These measures are of different sizes, so I decided to store them in a list, like:

run1 <- list(Dom = (my_vector), mean = (my_single_number))

I can do that in a for loop for 40 runs, ending up with 40 lists: run1, run2, ..., run40. To have all the measurements neatly together I thought of making another list, containing 40 sub-lists:

> ALL <- list(run1, run2,..., run40)
> ALL
[[1]]
[[1]]$Dom
[1] "my_vector"

[[1]]$mean
[1] "my_single_number"


[[2]]
[[2]]$Dom
[1] "my_vector"

[[2]]$mean
[1] "my_single_number"

...

1) This may be a bit clumsy as I have to type all the sub-lists' names in by hand in order to produce my ALL-list: Is there a better way?

2) I have problems addressing the data now. I can easily access any single value; for example, for the second component of the second sub-list:

> ALL[[2]][[2]]
[1] "my_single_number"

but: how could I get the second component of all sub-lists, to plot, for example, all the $mean in one plot? For a matrix, mat[,2] would give me the whole second column, but ALL[[]][[2]] does not return all the second components.

I feel that 'lapply' might help me here, but I could not figure out exactly how to use it, and it always comes down to the problem of how to correctly address the components in the sublists.

Or is there maybe a smarter way to do that instead of using a list of lists?

Any hint would be warmly appreciated!

Jan

(R 2.0.1 on Windows XP)

--
__
Jan Wantia
Department of Informatics, University of Zurich
Andreasstr. 15
CH 8050 Zurich
Switzerland
Tel.: +41 (0) 1 635 4315
Fax: +41 (0) 1 635 45 07
email: [EMAIL PROTECTED]
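For the list-of-lists question in this thread, a small self-contained sketch of building the runs in a loop and then pulling one component out of every run. The three fake runs and their contents are assumptions made up for illustration; only the `Dom`/`mean` component names come from the posts.

```r
# Build three runs in a loop; each run holds a vector and a single number.
runs <- vector("list", 3)
for (i in seq_along(runs)) {
  runs[[i]] <- list(Dom = seq_len(i), mean = i / 2)   # made-up contents
}
names(runs) <- paste0("run", seq_along(runs))

# sapply() simplifies to a named vector when every piece has length one;
# lapply() keeps a list, which suits the variable-length 'Dom' component.
means <- sapply(runs, function(r) r$mean)
doms  <- lapply(runs, function(r) r$Dom)
means
```

`plot(means)` would then show all the `$mean` values in one plot, which is what the second question asks for.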
[R] 'object.size' takes a long time to return a value
I was using 'object.size' to see how much memory a list was taking up. After executing the command, I had thought that my computer had locked up. After further testing, I determined that it was taking 241 seconds for object.size to return a value. I did notice in the release notes that 'object.size' did take longer when the list contained character vectors. Is the time that it is taking 'object.size' to return a value to be expected for such a list? Much better results were obtained when the character vectors were converted to factors. ## Results from the testing ### > str(x.1) List of 10 $ : chr [1:227299] "sadc" "sar" "date" "ksh" ... $ : chr [1:227299] "aprperf" "aprperf" "aprperf" "aprperf" ... $ : num [1:227299] 23 23 0 23 23 0 0 0 0 23 ... $ : num [1:227299] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:227299] 3600 3600 0.01 3600 3600 0.01 0.01 0.01 0.01 3600 ... $ : num [1:227299] 0.01 0 0.01 0 0.01 0 0.01 0 0 0.01 ... $ : num [1:227299] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:227299] 0.01 0 0.01 0 0.01 0 0.01 0 0 0.01 ... $ : num [1:227299] 62608 6796829 10208 13128 ... $ : num [1:227299] 0 1 0 0 1 0 0 0 0 0 ... 
# takes a long time (241 seconds) to report the size > gc();system.time(print(object.size(x.1))) used (Mb) gc trigger (Mb) Ncells 711007 19.02235810 59.8 Vcells 5191294 39.7 14409257 110.0 [1] 34154972 [1] 241.07 0.00 241.08 NA NA # trying list of 1000 > x.2 <- list.subset(x.1, 1:1000);gc();system.time(print(object.size(x.2))) used (Mb) gc trigger (Mb) Ncells 711006 19.02235810 59.8 Vcells 4300288 32.9 14409257 110.0 [1] 145860 [1] 0.01 0.00 0.01 NA NA # trying list of 10,000 > x.2 <- list.subset(x.1, 1:1);gc();system.time(print(object.size(x.2))) used (Mb) gc trigger (Mb) Ncells 711006 19.02235810 59.8 Vcells 4381288 33.5 14409257 110.0 [1] 1491948 [1] 0.28 0.00 0.28 NA NA # list of 40,000 > x.2 <- list.subset(x.1, 1:4);gc();system.time(print(object.size(x.2))) used (Mb) gc trigger (Mb) Ncells 711006 19.02235810 59.8 Vcells 4651288 35.5 14409257 110.0 [1] 5988460 [1] 7.15 0.00 7.15 NA NA # list of 60,000 > x.2 <- list.subset(x.1, 1:6);gc();system.time(print(object.size(x.2))) used (Mb) gc trigger (Mb) Ncells 711006 19.02235810 59.8 Vcells 4831288 36.9 14409257 110.0 [1] 9001556 [1] 17.33 0.00 17.32NANA # list of 100,000 > x.2 <- list.subset(x.1, 1:10);gc();system.time(print(object.size(x.2))) used (Mb) gc trigger (Mb) Ncells 711006 19.02235810 59.8 Vcells 5191288 39.7 14409257 110.0 [1] 15044780 [1] 51.85 0.00 51.86NANA # list structure of the last object > str(x.2) List of 10 $ : chr [1:10] "sadc" "sar" "date" "ksh" ... $ : chr [1:10] "aprperf" "aprperf" "aprperf" "aprperf" ... $ : num [1:10] 23 23 0 23 23 0 0 0 0 23 ... $ : num [1:10] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:10] 3600 3600 0.01 3600 3600 0.01 0.01 0.01 0.01 3600 ... $ : num [1:10] 0.01 0 0.01 0 0.01 0 0.01 0 0 0.01 ... $ : num [1:10] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:10] 0.01 0 0.01 0 0.01 0 0.01 0 0 0.01 ... $ : num [1:10] 62608 6796829 10208 13128 ... $ : num [1:10] 0 1 0 0 1 0 0 0 0 0 ... 
# with the first two items on the list converted to factors, # 'object.size' performs a lot better > str(x.1) List of 10 $ : Factor w/ 175 levels "#bpbkar","#bpcd",..: 132 133 60 93 13 160 60 84 60 132 ... $ : Factor w/ 8 levels "apra3g","aprperf",..: 2 2 2 2 2 2 2 2 2 2 ... $ : num [1:227299] 23 23 0 23 23 0 0 0 0 23 ... $ : num [1:227299] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:227299] 3600 3600 0.01 3600 3600 0.01 0.01 0.01 0.01 3600 ... $ : num [1:227299] 0.01 0 0.01 0 0.01 0 0.01 0 0 0.01 ... $ : num [1:227299] 0 0 0 0 0 0 0 0 0 0 ... $ : num [1:227299] 0.01 0 0.01 0 0.01 0 0.01 0 0 0.01 ... $ : num [1:227299] 62608 6796829 10208 13128 ... $ : num [1:227299] 0 1 0 0 1 0 0 0 0 0 ... > system.time(print(object.size(x.1))) # now it is fast [1] 16374176 [1] 0 0 0 NA NA > version _ platform i386-pc-mingw32 arch i386 os mingw32 system i386, mingw32 status major2 minor0.1 year 2004 month11 day 15 language R > __ James Holtman"What is the problem you are trying to solve?" Executive Technical Consultant -- Office of Technology, Convergys [EMAIL PROTECTED] +1 (513) 723-2929 -- "NOTICE: The information contained in this electronic mail ...{{dropped}} __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] how can I get the coefficients of x^0, x^1, x^2, . , x^6 from expansion of (1+x+x^2)^3
Use the 'polynom' package:

> library(polynom)
> p <- as.polynomial(c(1,1,1))
> p
1 + x + x^2
> p^3
1 + 3*x + 6*x^2 + 7*x^3 + 6*x^4 + 3*x^5 + x^6
> unclass(p^3)
[1] 1 3 6 7 6 3 1
>

__________
James Holtman "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED] +1 (513) 723-2929

From: "Peter Yang" <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
To: <[EMAIL PROTECTED]>
Date: 12/03/2004 14:56
Subject: [R] how can I get the coefficients of x^0, x^1, x^2, ..., x^6 from expansion of (1+x+x^2)^3

Hi,

I would like to get the coefficients of x^0, x^1, x^2, ..., x^6 from the expansion of (1+x+x^2)^3. The result should be 1, 3, 6, 7, 6, 3, 1. How can I calculate this in R?

Your help will be greatly appreciated.

Peter

[[alternative HTML version deleted]]
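If the 'polynom' package is not at hand, the same coefficients can be had in base R by treating polynomial multiplication as a convolution of coefficient vectors. This is a sketch of a swapped-in technique, not what the reply used; `poly_mult` is a hypothetical helper name.

```r
# Multiply two polynomials given as coefficient vectors (lowest power first).
poly_mult <- function(a, b) {
  out <- numeric(length(a) + length(b) - 1)
  for (i in seq_along(a)) {
    idx <- i + seq_along(b) - 1          # positions that a[i] * b lands on
    out[idx] <- out[idx] + a[i] * b
  }
  out
}

p  <- c(1, 1, 1)                          # 1 + x + x^2
p3 <- poly_mult(poly_mult(p, p), p)       # (1 + x + x^2)^3
p3
```

The result is read the same way as `unclass(p^3)` in the reply: entry k is the coefficient of x^(k-1).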
Re: [R] scatterplot of 100000 points and pdf file format
Have you tried

plot(...,pch='.')

This will use the period as the plotting character instead of the default circle. This should reduce the size of the PDF file.

I have done scatter plots with 2M points and they are typically meaningless with that many points overlaid. Check out 'hexbin' on Bioconductor (you can download the package from the RGUI window). This is a much better way of showing some information, since it plots the number of points that fall within each hexagon. I have found this to be a better way of looking at some data.

______
James Holtman "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED] +1 (513) 723-2929

From: Witold Eryk Wolski <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
To: R Help Mailing List <[EMAIL PROTECTED]>
Date: 11/24/2004 10:34
Subject: [R] scatterplot of 100000 points and pdf file format

Hi,

I want to draw a scatter plot with 1M and more points and save it as pdf. This makes the pdf file large. So I tried to save the file first as png and then convert it to pdf. This looks OK if printed, but if viewed e.g. with Acrobat as a document figure the quality is bad.

Does anyone know a way to reduce the size but keep the quality?

/E

--
Dipl. bio-chem. Witold Eryk Wolski
MPI-Moleculare Genetic
Ihnestrasse 63-73 14195 Berlin
tel: 0049-30-83875219
http://www.molgen.mpg.de/~wolski
http://r4proteomics.sourceforge.net
mail: [EMAIL PROTECTED]
      [EMAIL PROTECTED]
Re: [R] How to extract data?
By 'ignore', can we delete those from the list of data? I would then assume that if you have a sequence of +0+0+, you would want the last "+" for the increase of three.

If that is the case, then do a 'diff' and delete the entries that are 0. Then create a new 'diff' and use 'rle' to see what the lengths of the sequences are:

> x <- c(1,2,2,3,3,4,3,3,2,2,2,1)
> x
 [1] 1 2 2 3 3 4 3 3 2 2 2 1
> x.d <- diff(x)
> x.d
 [1]  1  0  1  0  1 -1  0 -1  0  0 -1
> x.new <- x[c(x.d,1) != 0]
> x.new
[1] 1 2 3 4 3 2 1
> x.d1 <- diff(x.new)
> x.d1
[1]  1  1  1 -1 -1 -1
> rle(x.d1)
Run Length Encoding
  lengths: int [1:2] 3 3
  values : num [1:2] 1 -1
>

You can check the results of 'rle' to determine where the changes are.

__
James Holtman "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED] +1 (513) 723-2929

From: ebashi <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
Date: 11/23/2004 15:54
Subject: [R] How to extract data?

I would appreciate it if anyone can help me. I have a table as follows:

> rate
         DATE VALUE
1  1997-01-10  5.30
2  1997-01-17  5.30
3  1997-01-24  5.28
4  1997-01-31  5.30
5  1997-02-07  5.29
6  1997-02-14  5.26
7  1997-02-21  5.24
8  1997-02-28  5.26
9  1997-03-07  5.30
10 1997-03-14  5.30
...

I want to extract the DATE(s) on which the VALUE has already dropped twice and the DATE(s) on which the VALUE has already increased three times (ignoring where VALUE(i+1)-VALUE(i)=0). I tried to use the diff() function, however that works only for one increase or decrease.

Sincerely,

Sean
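Applied to the ten VALUE entries quoted in the question, the diff-then-rle idea can be sketched as below. Using `sign()` on the differences is an added assumption so that the run lengths count steps up or down regardless of their size; the names `value`, `s`, and `runs` are illustrative.

```r
# The VALUE column as posted in the question.
value <- c(5.30, 5.30, 5.28, 5.30, 5.29, 5.26, 5.24, 5.26, 5.30, 5.30)

s <- sign(diff(value))   # +1 for a rise, -1 for a drop, 0 for no change
s <- s[s != 0]           # ignore the unchanged steps, as the poster asked
runs <- rle(s)           # consecutive rises or drops and how long they last
runs
```

A run of `-1` with length at least 2 marks two consecutive drops, and a run of `+1` with length at least 3 marks three consecutive rises. Mapping the runs back to DATE needs the positions of the non-zero steps, which this sketch does not track.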
Re: [R] timeDate
You might want to check out 'chron'. This stores the time as days and fractions of a day. If you take the current date,

> as.numeric(chron(dates.="11/23/2004"))
[1] 12745
>

you get the value above. If you change this to milliseconds, you get

> as.numeric(chron(dates.="11/23/2004")) * 86400 * 1000
[1] 1.101168e+12
>

This value requires 41 bits, and since a double-precision floating point number has 53 bits of precision, it should be enough to give you millisecond resolution and still maintain the 'date'.

__________
James Holtman "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED] +1 (513) 723-2929

From: Yasser El-Zein <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Date: 11/23/2004 09:55
Subject: Re: [R] timeDate
Please respond to Yasser El-Zein

I am looking for up to the millisecond resolution. Is there a package that has that?

On Mon, 22 Nov 2004 21:48:20 +0000 (UTC), Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
> Yasser El-Zein <[EMAIL PROTECTED]> writes:
>
> > From the document it is apparent to me that I need as.POSIXct (I have
> > a double representing the number of millis since 1/1/1970 and I need
> > to construct a datetime object). I see it showing how to construct the
> > time object from a string representing the time but not from a double
> > of millis. Does anyone know how to do that?
>
> If by millis you mean milliseconds (i.e. one thousandth of a second)
> then POSIXct does not support that resolution, but if rounding to
> seconds is ok then
>
> structure(round(x/1000), class = c("POSIXt", "POSIXct"))
>
> should give it to you assuming x is the number of milliseconds.
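In current versions of R a POSIXct value is stored as a double count of seconds, so the fractional part can be kept rather than rounded away. The sketch below assumes a reasonably recent R and a made-up millisecond count `ms`; it is not what was available to the posters in 2004.

```r
# Hypothetical input: milliseconds since 1970-01-01 00:00:00 UTC.
ms <- 1101168000123

# Convert to seconds and attach the epoch; the fraction is preserved.
stamp <- as.POSIXct(ms / 1000, origin = "1970-01-01", tz = "UTC")

# How many bits the integer part of the millisecond count needs,
# compared with the 53-bit significand of a double.
bits_needed <- ceiling(log2(ms))

format(stamp, "%Y-%m-%d %H:%M:%S")
```

Fractional seconds can be shown with the `%OS3` format or `options(digits.secs = 3)`; since the printed digits are truncated rather than rounded, the last digit may differ by one from the input.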
Re: [R] RE : Create sequence for dataset
I think this might do it. > x.1 <- data.frame(x=sample(1:3,20,T), y=sample(10:12,20,T)) # create test data > x.1 # print it out x y 1 2 11 2 3 11 3 2 10 4 1 12 5 3 11 6 1 10 7 3 10 8 1 11 9 1 12 10 1 11 11 1 12 12 1 12 13 2 11 14 3 11 15 3 10 16 3 10 17 2 12 18 2 10 19 3 11 20 2 11 # split the data by the numbers in 'x' (would be your 'amnl_key) # and add a column containing the sequence number > x.s <- by(x.1, x.1$x, function(x){x$seq <- seq(along=x$x); x}) # the result in 'x.s' is a list and the rows have to be recombined (rbind) to form the result > x.s # print out the data x.1$x: 1 x y seq 4 1 12 1 6 1 10 2 8 1 11 3 9 1 12 4 10 1 11 5 11 1 12 6 12 1 12 7 x.1$x: 2 x y seq 1 2 11 1 3 2 10 2 13 2 11 3 17 2 12 4 18 2 10 5 20 2 11 6 x.1$x: 3 x y seq 2 3 11 1 5 3 11 2 7 3 10 3 14 3 11 4 15 3 10 5 16 3 10 6 19 3 11 7 > do.call('rbind', x.s) # bind the rows and print out the result x y seq 1.4 1 12 1 1.6 1 10 2 1.8 1 11 3 1.9 1 12 4 1.10 1 11 5 1.11 1 12 6 1.12 1 12 7 2.1 2 11 1 2.3 2 10 2 2.13 2 11 3 2.17 2 12 4 2.18 2 10 5 2.20 2 11 6 3.2 3 11 1 3.5 3 11 2 3.7 3 10 3 3.14 3 11 4 3.15 3 10 5 3.16 3 10 6 3.19 3 11 7 > ______ James Holtman"What is the problem you are trying to solve?" Executive Technical Consultant -- Office of Technology, Convergys [EMAIL PROTECTED] +1 (513) 723-2929 [EMAIL PROTECTED] Sent by: To: [EMAIL PROTECTED] [EMAIL PROTECTED]cc: ath.ethz.ch Subject: [R] RE : Create sequence for dataset 11/21/2004 16:28 Dear members, I want to create a sequence of numbers for the multiple records of individual animal in my dataset. The SAS code below will do the trick, but I want to learn to do it in R. Can anyone help ? data ht&ssn; set ht&ssn; by anml_key; if first.anml_key then do; seq_ht_rslt=0; end; seq_ht_rslt+1; Thanks in advance. Stella ___ This message, including attachments, is confidential. If you are not the intended recipient, please contact us as soon as possible and then destroy the message. Do not copy, disclose or use the contents in any way. 
The recipient should check this email and any attachments for viruses and other defects. Livestock Improvement Corporation Limited and any of its subsidiaries and associates are not responsible for the consequences of any virus, data corruption, interception or unauthorised amendments to this email. Because of the many uncertainties of email transmission we cannot guarantee that a reply to this email will be received even if correctly sent. Unless specifically stated to the contrary, this email does not designate an information system for the purposes of section 11(a) of the New Zealand Electronic Transactions Act 2002. __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
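For the per-animal sequence number asked about in the "Create sequence for dataset" thread, base R's 'ave' offers a compact alternative to the by/rbind route in the reply. A minimal sketch with made-up keys; `anml_key` and `seq_ht_rslt` follow the names in the SAS snippet, everything else is an assumption.

```r
# Made-up animal keys; unlike the SAS BY step, they need not be sorted.
anml_key <- c(2, 3, 2, 1, 3, 1, 3, 1)

# Within each key, number the records 1, 2, 3, ... in order of appearance.
seq_ht_rslt <- ave(seq_along(anml_key), anml_key, FUN = seq_along)

data.frame(anml_key, seq_ht_rslt)
```

The result stays in the original row order, so it can be assigned straight back as a new column of the data frame.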
Re: [R] scan or source a text file into a list
use an 'environment' to read in the values; e.g., > with(e1 <- new.env(),source('/tempxx.txt', local=T)) # read in the file to a new environment > myList <- list() # define empty list > for (i in ls(e1)){ # process each element + myList[i] <- get(i, e1) + } > > ls(e1) # show objects in the list [1] "fnr""nYears" "qe" "year0" > myList # output my list $fnr [1] 0.3 $nYears [1] 50 $qe [1] 0.04 $year0 [1] 1970 > __ James Holtman"What is the problem you are trying to solve?" Executive Technical Consultant -- Office of Technology, Convergys [EMAIL PROTECTED] +1 (513) 723-2929 "Andy Bunn" <[EMAIL PROTECTED]> To: "R-Help" <[EMAIL PROTECTED]> Sent by: cc: [EMAIL PROTECTED]Subject: [R] scan or source a text file into a list ath.ethz.ch 11/11/2004 10:57 I've ported somebody else's rather cumbersome Matlab model to R for colleagues that want a free build of the model with the same type of I/O. The Matlab model reads a text file with the initial parameters specified as: C:\Data\Carluc\Rport>more Params.R # Number of years to simulate nYears = 50; # Initial year for graphing purposes year0 = 1970; # NPP/GPP ratio (cpp0 unitless) fnr = 0.30; # Quantum efficency qe = 0.040; That is, there are four input variables (for this run - there can be many more) written in a way that R can understand them. In R, I can have the model source the parameter text file easily enough and have the objects in the workspace. The model function in R takes a list at runtime. How can I have R read that file and put the contents into the list I need? E.g., > rm(list = ls()) > source("Params.R") > ls() [1] "fnr""nYears" "qe" "year0" > fnr [1] 0.3 > nYears [1] 50 > foo.list <- list(fnr = fnr, nYears = nYears) > > foo.list $fnr [1] 0.3 $nYears [1] 50 The model is then run with > CarlucR(inputParamList = foo.list, ...) I can't build inputParamList "by hand" as above because the number of initial parameters changes with the model run and this runs in a wrapper. Any thoughts? 
Some combination of paste with scan or parse? -Andy > version _ platform i386-pc-mingw32 arch i386 os mingw32 system i386, mingw32 status major2 minor0.0 year 2004 month10 day 04 language R > __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
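The environment approach from the reply can also be written without the explicit loop. The sketch below writes a small parameter file to a temporary location so it is self-contained; the temporary file and the names `param_file` and `paramList` are assumptions, while the four parameters are those shown in the question.

```r
# Stand-in for Params.R, written to a temporary file.
param_file <- tempfile(fileext = ".R")
writeLines(c("# Number of years to simulate",
             "nYears = 50;",
             "year0 = 1970;",
             "fnr = 0.30;",
             "qe = 0.040;"),
           param_file)

# Evaluate the file in its own environment, then collect every object
# defined there into a named list (ls() returns the names sorted).
e1 <- new.env()
sys.source(param_file, envir = e1)
paramList <- mget(ls(e1), envir = e1)

str(paramList)
```

The list can then be passed on, e.g. as `inputParamList = paramList`, however many parameters the file happens to define.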
[R] Error in PDF output in R 2.0.0
The following script works fine in R 1.9.1. It was creating a PDF file with the graphs in it. In R 2.0.0, I got the error message below. I tried the same script just outputting to Windows and postscript and the output was OK. The error message only showed up when trying to create a PDF file. > version _ platform i386-pc-mingw32 arch i386 os mingw32 system i386, mingw32 status major2 minor0.0 year 2004 month10 day 04 language R ## output to windows -- OK > print(xyplot(I(usr/sys) ~ time|factor(cpu), memIn, + panel=function(x,y)panel.xyplot(x,y,type='l'))) > print(xyplot(csw ~ time|factor(cpu), memIn, + panel=function(x,y)panel.xyplot(x,y,type='l'))) ## ouput to postscript file -- OK > postscript('out.ps') > print(xyplot(I(usr/sys) ~ time|factor(cpu), memIn, + panel=function(x,y)panel.xyplot(x,y,type='l'))) > print(xyplot(csw ~ time|factor(cpu), memIn, + panel=function(x,y)panel.xyplot(x,y,type='l'))) > dev.off() windows 2 ## output to PDF file -- ERRORS > pdf('out.pdf') > print(xyplot(I(usr/sys) ~ time|factor(cpu), memIn, + panel=function(x,y)panel.xyplot(x,y,type='l'))) Error in "[<-"(`*tmp*`, pos.heights[[nm]], value = numeric(0)) : nothing to replace with > print(xyplot(csw ~ time|factor(cpu), memIn, + panel=function(x,y)panel.xyplot(x,y,type='l'))) Error in "[<-"(`*tmp*`, pos.heights[[nm]], value = numeric(0)) : nothing to replace with > dev.off() windows 2 ## traceback on error > traceback() 3: calculateGridLayout(x, rows.per.page, cols.per.page, number.of.cond, panel.height, panel.width, main, sub, xlab, ylab, x.alternating, y.alternating, x.relation.same, y.relation.same, xaxis.rot, yaxis.rot, xaxis.cex, yaxis.cex, par.strip.text, legend) 2: print.trellis(xyplot(csw ~ time | factor(cpu), memIn, panel = function(x, y) panel.xyplot(x, y, type = "l"))) 1: print(xyplot(csw ~ time | factor(cpu), memIn, panel = function(x, y) panel.xyplot(x, y, type = "l"))) > __ James Holtman"What is the problem you are trying to solve?" 
Executive Technical Consultant -- Office of Technology, Convergys [EMAIL PROTECTED] +1 (513) 723-2929 -- "NOTICE: The information contained in this electronic mail ...{{dropped}} __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Error with repeat lines() in function
The problem was is that you were not return a value from the apply function. It was trying to store the result of the apply into an array and there was no value. See the line I added in your function. __ James Holtman"What is the problem you are trying to solve?" Executive Technical Consultant -- Office of Technology, Convergys [EMAIL PROTECTED] +1 (513) 723-2929 Sean Davis <[EMAIL PROTECTED]To: Uwe Ligges <[EMAIL PROTECTED]> >cc: r-help <[EMAIL PROTECTED]> Sent by: Subject: Re: [R] Error with repeat lines() in function [EMAIL PROTECTED] ath.ethz.ch 09/24/2004 13:09 Here is an example that seems to reproduce the error: rf1 <- matrix(sort(abs(round(runif(4)*100))),nrow=1) annot1 <- sort(abs(round(runif(193)*100))) annot2 <- annot1 + 70 annot3 <- cbind(annot1,annot2) rat2 <- rnorm(193) rat1 <- rnorm(193) plotter <- function(annot,rat1,rat2,rf1,...) { par(las=2) xmax <- max(annot[,2]) xmin <- min(annot[,1]) par(mfrow=c(2,1)) plot(annot[,1],rat1,type="l",xlab="",ylab="log2 Ratio",...) points(annot[,1],rat1) apply(rf1,1,function(z) { if (z[4]=="+") { color <- 'green' yoffset=1 } else { color <- 'red' yoffset=-1 } lines(list(x=c(z[1],z[4]),y=c(-2-yoffset/10,-2-yoffset/ 10)),lwd=2,col=color) lines(list(x=c(z[2],z[3]),y=c(-2-yoffset/10,-2-yoffset/ 10)),lwd=4,col=color) 1 # fake a return value }) abline(h=0,lty=2) } plotter(annot3,rat1,rat2,rf1) Error in ans[[1]] : subscript out of bounds Enter a frame number, or 0 to exit 1:plotter(annot3, rat1, rat2, rf1) 2:apply(rf1, 1, function(z) { Selection: 0 On Sep 24, 2004, at 12:05 PM, Uwe Ligges wrote: > Sean Davis wrote: > >> I have a function that does some plotting. I then add lines to the >> plot. If executed one line at a time, there is not a problem. If I >> execute the function, though, I get: >> Error in ans[[1]] : subscript out of bounds >> This always occurs after the second lines command, and doesn't happen >> with all of my data points (some do not have errors). Any ideas? 
> > Please give an example how to produce the error, > i.e. specify a very small toy example (including generated data and > the call to your function). > Many people on this list are quite busy these days and don't want to > think about how to call your function and invent an example ... > > Uwe Ligges > > > >> Thanks, >> Sean >> function(x,annot,rat1,rat2,rf,...) { >> par(las=2) >> wh <- which(annot[,5]==x) >> xmax <- max(annot[wh,4]) >> xmin <- min(annot[wh,3]) >> chr <- annot[wh,2][1] >> wh.rf <- rf$chrom==as.character(chr) & rf$txStart>xmin & >> rf$txEnd> par(mfrow=c(2,1)) >> plot(annot[wh,3],rat1[wh],type="l",xlab="",ylab="log2 >> Ratio",main=x,...) >> points(annot[wh,3],rat1[wh]) >> apply(rf[wh.rf,],1,function(z) { >> browser() >> if (z[4]=="+") { >> color <- 'green' >> yoffset=1 >> } else { >> color <- 'red' >> yoffset=-1 >>
Re: [R] Spare some CPU cycles for testing lme?
I tried out your example and it abended. It ran through 22472 times and ended with an error message that the instruction at 0x77f5b2ab could not reference location 0x0028.

> version
         _
platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    1
minor    9.1
year     2004
month    06
day      21
language R

HTH

__
James Holtman "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED] +1 (513) 723-2929

From: Frank Samuelson <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Date: 09/13/2004 08:40
Subject: [R] Spare some CPU cycles for testing lme?

If anyone has a few extra CPU cycles to spare, I'd appreciate it if you could verify a problem that I have encountered. Run the code below and tell me if it crashes your R before completion.

library(lme4)
data(bdf)
dump<-sapply( 1:5, function(i) {
    fm <- lme(langPOST ~ IQ.ver.cen + avg.IQ.ver.cen, data = bdf,
              random = ~ IQ.ver.cen | schoolNR);
    cat(" ",i,"\r")
    0
})

The above code simply reruns the example from the lme help page a large number of times and returns a bunch of 0's, so you'll need to have the lme4 and Matrix packages installed. It might take a while to complete, but you can always nice it and let it run.

I'm attempting to bootstrap lme() from the lme4 package, but it causes a segfault after a couple hundred iterations. This happens on my Linux x86 RedHat 7.3, 8.0, 9.0, FC1 systems w/ 1.9.1 and devel 2.0.0 (not all possible combinations actually tested).

I've communicated w/ Douglas Bates about this and he doesn't appear to have the problem.

Thanks for any help.

-Frank
Re: [R] calculating error
This is in the FAQs. It has to do with the representation of floating point numbers. You cannot represent 'pi' exactly in the 53 bits of precision of a floating point number. If you notice, 2^-53 is 1.1e-16, which indicates that the 'roundoff' is in the least significant bit of the precision; this is to be expected with floating point numbers.

__________
James Holtman "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED] +1 (513) 723-2929

From: "Branimir K. Hackenberger" <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
To: <[EMAIL PROTECTED]>
Date: 09/12/2004 14:28
Subject: [R] calculating error

Could anybody explain these results?

>sin(2*pi)
-2.449213e-16 #should be zero
>(10^16)*sin(log2(4)*pi)
-2.449213 #should be zero too

and explain what to do to correct this?

Thanks!!!

Branimir K. Hackenberger
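As for what to do about it: the usual advice is to compare with a tolerance instead of testing for exact zero. A minimal sketch using base R; `sinpi()` is only available in newer R versions (3.1.0 and later), which postdates this thread.

```r
x <- sin(2 * pi)             # tiny non-zero value caused by rounding of pi

# Compare with a tolerance rather than with '=='.
near_zero <- isTRUE(all.equal(x, 0))

# The size of the error is on the order of the machine epsilon.
small <- abs(x) < 4 * .Machine$double.eps

# sinpi(t) computes sin(pi * t) without first rounding pi * t.
exact <- sinpi(2)

c(near_zero = near_zero, small = small)
```

Multiplying the result by 10^16, as in the second example of the question, just scales the rounding error up to a visible size; it does not indicate a larger problem.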
Re: [R] Unique lists from a list
Try this:

l.1 <- list(list(name='a', addr='123'), list(name='b', addr='234'),
            list(name='b', addr='234'), list(name='a', addr='123'))  # create a list
l.names <- unlist(lapply(l.1, '[[', 'name'))  # get the 'name'
l.u <- unique(l.names)                        # make unique
new.list <- l.1[match(l.u, l.names)]          # create new list with just one 'name'

__
James Holtman "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED] +1 (513) 723-2929

From: "michael watson (IAH-C)" <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
To: <[EMAIL PROTECTED]>
Date: 09/01/2004 10:31
Subject: [R] Unique lists from a list

Hi

I have a list. Two of the elements of this list are "Name" and "Address", both of which are character vectors. Name and Address are linked, so that the same "Name" always associates with the same "Address".

What I want to do is pull out the unique values, as a new list of the same format (i.e. two elements of character vectors).

Now I've worked out that unique(list$Name) will give me a list of the unique names, but how do I then go and link those to the correct (unique) addresses so I end up with a new list which is the same format as the rest, but now unique?

Cheers
Mick
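The reply works on a list of records; if the data are, as the question describes, one list holding two parallel character vectors, 'duplicated' gives a direct route. A minimal sketch with made-up names and addresses; `contacts`, `keep`, and `uniq` are illustrative names.

```r
# One list with two parallel character vectors, as in the question.
contacts <- list(Name    = c("a", "b", "b", "a"),
                 Address = c("123", "234", "234", "123"))

# TRUE for the first occurrence of each name, FALSE for repeats.
keep <- !duplicated(contacts$Name)

# Apply the same selection to every element so the vectors stay aligned.
uniq <- lapply(contacts, function(v) v[keep])
uniq
```

Because the same logical index is applied to every element, each unique name keeps the address that went with its first occurrence.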
Re: [R] naive question
It is amazing the amount of time that has been spent on this issue. In most cases, if you do some timing studies using 'scan', you will find that you can read some quite large data structures in a reasonable time. If your initial concern was having to wait 10 minutes to have your data read in, you could have read in quite a few data sets by now.

When comparing speeds/feeds of processors, you also have to consider what is being done on them. Back in the "dark ages" we had a 1 MIP computer with 4M of memory handling input from 200 users on a transaction system. Today I need a 1GHz computer with 512M just to handle me. Now true, I am doing a lot of different processing on it.

With respect to I/O, you have to consider what is being read in and how it is converted. Each system/program has different requirements. I have some applications (running on a laptop) that can read in approximately 100K rows of data per second (of course they are already binary). On the other hand, I can easily slow that down to 1K rows per second if I do not specify the correct parameters to 'read.table'.

So go back and take a look at what you are doing, and instrument your code to see where time is being spent. The nice thing about R is that there are a number of ways of approaching a solution, and if you don't like the timing of one way, try another. That is half the fun of using R.

______
James Holtman "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED] +1 (513) 723-2929

<[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
cc: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
Subject: Re: [R] naive question
06/30/2004 16:25

> <[EMAIL PROTECTED]> writes:
>
>> I did not use R ten years ago, but "reasonable" RAM amounts have
>> multiplied by roughly a factor of 10 (from 128Mb to 1Gb), CPU speeds
>> have gone up by a factor of 30 (from 90Mhz to 3Ghz), and disk space
>> availability has gone up probably by a factor of 10. So, unless the I/O
>> performance scales nonlinearly with size (a bit strange but not
>> inconsistent with my R experiments), I would think that things should
>> have gotten faster (by the wall clock, not slower). Of course, it is
>> possible that the other components of the R system have been worked on
>> more -- I am not equipped to comment...
>
> I think your RAM calculation is a bit off. in late 1993, 4MB systems
> were the standard PC, with 16 or 32 MB on high-end workstations.

I beg to differ. In 1989, Mac II came standard with 8MB, NeXT came standard with 16MB. By 1994, 16MB was pretty much standard on good quality (= Pentium, of which the 90Mhz was the first example) PCs, with 32Mb pretty common (though I suspect that most R/S-Plus users were on SUNs, which were somewhat more plushly equipped).

> Comparable figures today are probably 256MB for the entry-level PC and
> a couple GB in the high end. So that's more like a factor of 64. On the
> other hand, CPU's have changed by more than the clock speed; in
> particular, the number of clock cycles per FP calculation has
> decreased considerably and is currently less than one in some
> circumstances.
>

I think that FP performance has increased more than integer performance, which has pretty much kept pace with the clock speed. The compilers have also improved a bit...
Igor
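The advice in this thread to run timing studies and to give 'read.table' the right parameters can be tried with a small experiment such as the sketch below. The file contents, the row count n, and the variable names are invented for illustration; the timings depend entirely on the machine and the R version, so no result is claimed here.

```r
# Hypothetical test file: n rows of three numeric columns (sizes are assumptions)
tmp <- tempfile(fileext = ".txt")
n <- 20000
write.table(data.frame(a = runif(n), b = runif(n), c = runif(n)),
            tmp, row.names = FALSE, col.names = FALSE)

# 1. read.table with defaults: column types have to be guessed
t.default <- system.time(d1 <- read.table(tmp))

# 2. read.table told the column types and the row count up front
t.typed <- system.time(
  d2 <- read.table(tmp, colClasses = "numeric", nrows = n, comment.char = ""))

# 3. scan with an explicit template for each record
t.scan <- system.time(
  d3 <- scan(tmp, what = list(a = 0, b = 0, c = 0), quiet = TRUE))

# user, system and elapsed seconds for each approach
print(rbind(default = t.default, typed = t.typed, scan = t.scan)[, 1:3])
unlink(tmp)
```

All three calls read the same data, so any difference in the printed times comes from how the input is parsed and converted rather than from the data itself.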
Re: [R] binding rows from different matrices
Try:

> veca=matrix(1:25,5,5)
> vecb=matrix(letters[1:25],5,5)
> vecc=matrix(LETTERS[1:25],5,5)
> x.1 <- lapply(1:5,function(x)rbind(veca[x,],vecb[x,],vecc[x,]))
> do.call('rbind',x.1)
      [,1] [,2] [,3] [,4] [,5]
 [1,] "1"  "6"  "11" "16" "21"
 [2,] "a"  "f"  "k"  "p"  "u"
 [3,] "A"  "F"  "K"  "P"  "U"
 [4,] "2"  "7"  "12" "17" "22"
 [5,] "b"  "g"  "l"  "q"  "v"
 [6,] "B"  "G"  "L"  "Q"  "V"
 [7,] "3"  "8"  "13" "18" "23"
 [8,] "c"  "h"  "m"  "r"  "w"
 [9,] "C"  "H"  "M"  "R"  "W"
[10,] "4"  "9"  "14" "19" "24"
[11,] "d"  "i"  "n"  "s"  "x"
[12,] "D"  "I"  "N"  "S"  "X"
[13,] "5"  "10" "15" "20" "25"
[14,] "e"  "j"  "o"  "t"  "y"
[15,] "E"  "J"  "O"  "T"  "Y"
>
__
James Holtman "What is the problem you are trying to solve?"
Executive Technical Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED] +1 (513) 723-2929

Stephane DRAY <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
cc:
Sent by: [EMAIL PROTECTED]
Subject: [R] binding rows from different matrices
06/29/2004 11:00

Hello list,

I have 3 matrices with same dimension :

> veca=matrix(1:25,5,5)
> vecb=matrix(letters[1:25],5,5)
> vecc=matrix(LETTERS[1:25],5,5)

I would like to obtain a new matrix composed by alternating rows of these different matrices (row 1 of mat 1, row 1 of mat 2, row 1 of mat 3, row 2 of mat 1.)

I have found a solution to do it but it is not very pretty and I wonder if I can do it in an other way (perhaps with apply ) ?

> res=matrix(0,1,5)
> for(i in 1:5)
+ res=rbind(res,veca[i,],vecb[i,],vecc[i,])
> res=res[-1,]
> res
      [,1] [,2] [,3] [,4] [,5]
 [1,] "1"  "6"  "11" "16" "21"
 [2,] "a"  "f"  "k"  "p"  "u"
 [3,] "A"  "F"  "K"  "P"  "U"
 [4,] "2"  "7"  "12" "17" "22"
 [5,] "b"  "g"  "l"  "q"  "v"
 [6,] "B"  "G"  "L"  "Q"  "V"
 [7,] "3"  "8"  "13" "18" "23"
 [8,] "c"  "h"  "m"  "r"  "w"
 [9,] "C"  "H"  "M"  "R"  "W"
[10,] "4"  "9"  "14" "19" "24"
[11,] "d"  "i"  "n"  "s"  "x"
[12,] "D"  "I"  "N"  "S"  "X"
[13,] "5"  "10" "15" "20" "25"
[14,] "e"  "j"  "o"  "t"  "y"
[15,] "E"  "J"  "O"  "T"  "Y"
>

Thanks in advance !
Stéphane DRAY
--
Département des Sciences Biologiques
Université de Montréal, C.P. 6128, succursale centre-ville
Montréal, Québec H3C 3J7, Canada
Tel : 514 343 6111 poste 1233
E-mail : [EMAIL PROTECTED]
--
Web http://www.steph280.freesurf.fr/
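Another way to get the same interleaving, not taken from the thread, is to stack the three matrices once and then reorder the rows with an index vector. This is only a sketch; the names stacked, idx and res are introduced here for illustration.

```r
veca <- matrix(1:25, 5, 5)
vecb <- matrix(letters[1:25], 5, 5)
vecc <- matrix(LETTERS[1:25], 5, 5)

# stack once: rows 1-5 come from veca, 6-10 from vecb, 11-15 from vecc
stacked <- rbind(veca, vecb, vecc)
n <- nrow(veca)

# matrix(seq_len(3 * n), nrow = n) has columns 1:5, 6:10, 11:15;
# reading its transpose column by column gives 1, 6, 11, 2, 7, 12, ...
idx <- as.vector(t(matrix(seq_len(3 * n), nrow = n)))

res <- stacked[idx, ]
res[1:6, ]
```

The index approach avoids growing a matrix inside a loop, and the same idx can be reused if several sets of matrices with the same dimensions need interleaving.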
Re: [R] Specifying suitable PC to run R
If you are running Windows, do you have the Performance Monitor running? This will help identify the reasons that programs are running slowly. Most likely, you are low on memory and are paging a lot. I always have it running, and when I am running a large R script, if I am not using 100% of the CPU, then I must be paging (assuming that I am not reading in my data).

You can also sprinkle the following function throughout your code to see how much CPU and memory you are using. I bracket all my major computational sections with it:

my.stats <- function(text = "stats")
{
    cat(text, "-", sys.call(sys.parent())[[1]], ":", proc.time()[1:3],
        " : ", round(memory.size()/2.^20., 1.), "MB\n")
    invisible(flush.console())
}

This prints out a message like:

> my.stats('Begin Reading')
Begin Reading - my.stats : 5.61 3.77 22309.67 : 18.7 MB

This says that I have used 5.61 CPU seconds of 'user' time and 3.77 CPU seconds of 'system' time, that the R session has been running for 22309 seconds (I always have one waiting for simple calculations), and that I have 18.7MB of memory allocated to objects.

My first choice is to get as much memory on your machine as you can: 1GB, since this is the most that R can use. I noticed a big difference in upgrading from 256M -> 512M. I also watch the Performance Monitor, and when memory gets low and I want to run a large job, I restart R. Most of my scripts are set up to run R without saving any data in the .Rdata file. If I need to save a large object, I do it explicitly, since memory is the key performance-limiting factor and Windows is not that good at freeing up memory after you have used a lot of it.

A faster CPU will also help, but it would be the second choice, since if you are paging, most of your time is spent on data transfer and not computation.

__
James Holtman "What is the problem you are trying to solve?"
Executive Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED] (513) 723-2929

Michael Dewey <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
cc:
Sent by: [EMAIL PROTECTED]
Subject: [R] Specifying suitable PC to run R
10/09/2003 14:04

If I am buying a PC where the most compute intensive task will be running R and I do not have unlimited resources what trade-offs should I make?

Specifically should I go for
1 - more memory, or
2 - faster processor, or
3 - something else?

If it makes a difference I shall be running Windows on it and I am thinking about getting a portable which I understand makes upgrading more difficult.

Extra background: the tasks I notice going slowly at the moment are fitting models with lme which have complex random effects and bootstrapping. By the standards of r-help posters I have small datasets (few thousand cases, few hundred variables). In order to facilitate working with colleagues I need to stick with windows even if linux would be more efficient

Michael Dewey
[EMAIL PROTECTED]
http://www.aghmed.fsnet.co.uk/home.html
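The my.stats function in the reply relies on memory.size(), which is specific to R on Windows. A rough portable sketch of the same bracketing idea is given below; the name my.stats2 is made up, and it assumes that the "used" totals reported by gc() are an acceptable stand-in for the memory figure, which is not the same measurement as the original.

```r
# Hypothetical portable variant of the reply's my.stats():
# uses gc() instead of the Windows-only memory.size().
my.stats2 <- function(text = "stats") {
  times  <- proc.time()[1:3]      # user, system and elapsed seconds
  mem.mb <- sum(gc()[, 2])        # column 2 of gc(): memory in use, in Mb
  cat(text, ":", round(times, 2), ":", round(mem.mb, 1), "MB\n")
  flush.console()
  invisible(c(times, mem.mb = mem.mb))
}

s1 <- my.stats2("Begin section")
x  <- rnorm(1e5)                  # some work to bracket
s2 <- my.stats2("End section")
```

Returning the numbers invisibly makes it possible to subtract two snapshots, for example s2 - s1, to see what a bracketed section cost. Note that calling gc() forces a garbage collection, which itself takes some time.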
Re: [R] timezones
Part of the problem is that 'now' is POSIXct and 'now.gmt' is POSIXlt. If you use as.POSIXct, you get the right answer.

> (now <- Sys.time())
[1] "2003-08-03 18:29:38 EDT"
> str(now)
`POSIXct', format: chr "2003-08-03 18:29:38"
> (now.gmt <- as.POSIXlt(now,tz="GMT"))
[1] "2003-08-03 22:29:38 GMT"
> str(now.gmt)
`POSIXlt', format: chr "2003-08-03 22:29:38"
> (now.gmt <- as.POSIXct(now,tz="GMT"))
[1] "2003-08-03 18:29:38 EDT"
> str(now.gmt)
`POSIXct', format: chr "2003-08-03 18:29:38"
> now-now.gmt
Time difference of 0 secs
>
__
James Holtman "What is the problem you are trying to solve?"
Executive Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED] (513) 723-2929

Jerome Asselin <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
cc:
Sent by: [EMAIL PROTECTED]
Subject: Re: [R] timezones
07/31/2003 12:30

I share your concerns regarding Problems 1 and 2. However, I am unable to provide help on those at this moment.

As for Problem 3, an alternative for the time being would be to use another package such as chron or date, although it would be preferable to use the classes of the base package if possible.

Sorry I can't be more helpful.
Jerome

On July 30, 2003 09:19 pm, Gabor Grothendieck wrote:
> I have some questions and comments on timezones.
>
> Problem 1.
>
> # get current time in current time zone
>
> > (now <- Sys.time())
>
> [1] "2003-07-29 18:23:58 Eastern Daylight Time"
>
> # convert this to GMT
>
> > (now.gmt <- as.POSIXlt(now,tz="GMT"))
>
> [1] "2003-07-29 22:23:58 GMT"
>
> # take difference
>
> > now-now.gmt
>
> Time difference of -5 hours
>
> Note that the difference between the times displayed by the first two
> R expressions is -4 hours. Why does the last expression return
> -5 hours?
>
>
> Problem 2. Why do the two expressions below give different answers?
> I take the difference between two dates in GMT and then repeat it in the
> current time zone (EDT).
>
> # days since origin in GMT
>
> > julian(as.POSIXct("2003-06-29",tz="GMT"),origin=as.POSIXct("1899-12-30",tz="GMT"))
>
> Time difference of 37801 days
>
> # days since origin in current timezone
>
> > julian(as.POSIXct("2003-06-29"),origin=as.POSIXct("1899-12-30"))
>
> Time difference of 37800.96 days
>
> I thought this might be daylight savings time related but even with
> standard time I get:
>
> > julian(as.POSIXct("2003-06-29",tz="EST"),origin=as.POSIXct("1899-12-30",tz="EST"))
>
> Time difference of 37800.96 days
>
>
> Problem 3. What is the general strategy of dealing with dates, as
> opposed to datetimes, in R?
>
> I have had so many problems and a great deal of frustration, mostly
> related to timezones.
>
> The basic problem is that various aspects of the date such as the year,
> the month, the day of the month, the day of the week can be different
> depending on the timezone you use. This is highly undesirable since
> I am not dealing with anything more granular than a day yet timezones,
> which are completely extraneous to dates and by all rights should not
> have to enter into my problems, keep fouling me up.
>
> A lesser problem is that I find myself using irrelevant constants such
>
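The points raised under Problems 1 and 3 can be illustrated with a short sketch. It assumes that both operands are converted to POSIXct before subtracting, as the reply suggests, and it uses the Date class for whole-day arithmetic; Date was added to base R after this thread, so that part is an addition and not something the posters had available. The variable names d and days are illustrative.

```r
now     <- Sys.time()                       # POSIXct: seconds since the epoch
now.gmt <- as.POSIXlt(now, tz = "GMT")      # same instant, broken into GMT fields

# convert back to POSIXct before subtracting so both sides are the same class
d <- difftime(as.POSIXct(now.gmt), now, units = "secs")
d   # the same instant, so zero up to floating-point rounding

# whole days between two calendar dates, with no time zone involved
# (assumes the Date class, which postdates this thread)
days <- as.numeric(as.Date("2003-06-29") - as.Date("1899-12-30"))
days
```

Because a Date carries no time of day and no time zone, the day count does not depend on the zone the session happens to run in, which is the behaviour asked for under Problem 3.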
[R] Problem reading a PDF output
I generated a PDF output file of 10 plots. When I try to view it with Adobe Reader (R4 & R5), it locks up the reader (which then consumes 100% of the CPU) after presenting the 4th plot. I can generate the plots just fine in Windows and as a postscript file, reading it with GSview. Is there any way to tell what might be wrong with the PDF output? The file is 890KB in size if anyone would like to look at it. The postscript file is 783KB in size.

I was able to isolate it to a single plot in a PDF file, and it had 38,000 lines of the following that composed 99% of the file (I was plotting out individual events, which were about that many):

86.66 88.67 m
86.66 92.81 l
86.66 96.95 l
86.66 101.09 l
86.66 105.23 l
86.66 109.37 l
86.66 113.51 l
   : 38,000 more of the same
388.87 96.95 l
388.87 92.81 l
388.97 88.67 l
389.07 84.53 l
S
Q q
0.000 0.000 0.000 RG
0.75 w
[] 0 d

Is this breaking some limit in PDF?

I am running:

platform i386-pc-mingw32
arch     i386
os       mingw32
system   i386, mingw32
status
major    1
minor    7.1
year     2003
month    06
day      16
language R

__
James Holtman "What is the problem you are trying to solve?"
Executive Consultant -- Office of Technology, Convergys
[EMAIL PROTECTED] (513) 723-2929
Re: [R] Convert char vector to numeric table
use 'textConnection':

> x.1 <- c('1 2 3','4 5 6','7 8 9','8 7 6','6 5 4')  # create character vector
> x.in <- textConnection(x.1)  # setup connection
> x.data <- read.table(x.in)   # read in the character vector
> x.data
  V1 V2 V3
1  1  2  3
2  4  5  6
3  7  8  9
4  8  7  6
5  6  5  4
>

"Nurnberg-LaZerte" <[EMAIL PROTECTED]>
To: "R's help mailing list" <[EMAIL PROTECTED]>
cc:
Sent by: [EMAIL PROTECTED]
Subject: [R] Convert char vector to numeric table
03/31/03 17:09
Please respond to Nurnberg-LaZerte

I'm a great fan of read.table(), but this time the data had a lot of cruft. So I used readLines() and edited the char vector to eventually get something like this:

" 23.4 1.5 4.2"
" 19.1 2.2 4.1"

and so on. To get that into a 3 col numeric table, I first just used:

writeLines(data,"tempfile")
read.table("tempfile",col.names=c("A","B","C"))

Works fine, but writing to a temporary file seems ... inelegant? And read.table() doesn't take a char vector as a file or connection argument. The following works but it seems like a lot of code:

data <- sub(" +","",data)     # remove leading blanks for strsplit
data <- strsplit(data," +")   # strsplit returns a list of char vectors
ndata <- character(0)         # vectorize the list of char vectors
for (ii in 1:length(data)) ndata <- c(ndata,data[[ii]])
ndata <- as.numeric(ndata)
dim(ndata) <- c(3,length(data))
data <- t(ndata)
data.frame(A=data[,1],B=data[,2],C=data[,3])

Am I missing something?

Thanks,
Bruce L.

--
"NOTICE: The information contained in this electronic mail transmission is intended by Convergys Corporation for the use of the named individual or entity to which it is directed and may contain information that is privileged or otherwise confidential.
If you have received this electronic mail transmission in error, please delete it from your system without copying or forwarding it, and notify the sender of the error by reply email or by telephone (collect), so that the sender's address records can be corrected." __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help
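In later versions of R, read.table() also accepts the character vector directly through its text argument, which wraps the text connection internally. A minimal sketch, using the two sample lines from the question and the illustrative names x and tab:

```r
# the two sample lines quoted in the question, leading blanks included
x <- c(" 23.4 1.5 4.2", " 19.1 2.2 4.1")

# 'text' is read as if it were the contents of a file;
# leading blanks are skipped with the default whitespace separator
tab <- read.table(text = x, col.names = c("A", "B", "C"))
str(tab)
```

This is available only where read.table() has a text argument; on older installations the textConnection() approach from the reply does the same job.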
Re: [R] overlapping pattern match (errata 2.0)
Another way to find all the multiple occurrences of a character in a string is to use 'rle':

> x.s <- 'aaabbcdeeeffffggiijjysbbddeffghjjjsdkkkkk'
> x <- unlist(strsplit(x.s, NULL))
> x
 [1] "a" "a" "a" "b" "b" "c" "d" "e" "e" "e" "f" "f" "f" "f" "g" "g" "i" "i" "j"
[20] "j" "y" "s" "b" "b" "d" "d" "e" "f" "f" "g" "h" "j" "j" "j" "s" "d" "k" "k"
[39] "k" "k" "k"
> rle(x)
Run Length Encoding
  lengths: int [1:21] 3 2 1 1 3 4 2 2 2 1 ...
  values : chr [1:21] "a" "b" "c" "d" "e" "f" "g" "i" "j" "y" "s" "b" "d" "e" "f" "g" ...
>

When the lengths are >1, the corresponding 'values' are the repeated characters.

FMGCFMGC <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
cc: [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
Subject: Re: [R] overlapping pattern match (errata 2.0)
03/28/03 17:36

well! excuse me again but...

your.string <- "aaacdf"
nc1 <- nchar(your.string)-1
x <- unlist(strsplit(your.string, NULL))   # CORRECT
x2 <- c()
for (i in 1:nc1) x2 <- c(x2, paste(x[i], x[i+1], sep=""))   # ERRATA 2
cat("ocurrences of in : ", length(grep("aa", x2)), sep="", fill=TRUE)

Fran

PD: sorry again
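To make the last step of the rle approach concrete, the sketch below pulls out the repeated characters and also counts overlapping occurrences of "aa", which is what the quoted loop is after. The short test string and the names r, repeated and n.aa are chosen here for illustration.

```r
x.s <- "aaabbcd"                          # hypothetical short test string
x   <- unlist(strsplit(x.s, NULL))        # split into single characters
r   <- rle(x)

# characters that occur in a run longer than one
repeated <- r$values[r$lengths > 1]

# a run of n identical characters contains n - 1 overlapping adjacent pairs,
# so summing (length - 1) over the runs of "a" counts overlapping "aa" matches
n.aa <- sum(r$lengths[r$values == "a"] - 1)

repeated
n.aa
```

Counting from run lengths avoids building the vector of adjacent pairs, although it only works for patterns made of one repeated character.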
Re: [R] Finding Missing Data Patterns
use 'rle' to test for the sequence of data--NAs. Depending on what you want to test for, a length > 2 of 'data/NA' would say that you have a mix. If you want 'data' first, then check the first value. Here is an example:

> x.1
     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,]    1    1   NA   NA    1   NA   NA
[2,]    1    1    1    1    1   NA   NA
[3,]   NA   NA   NA    1    1    1    1
[4,]    1   NA    1   NA    1   NA    1
> x.2 <- apply(x.1,1,function(x)rle(is.na(x)))
> x.2
[[1]]
Run Length Encoding
  lengths: int [1:4] 2 2 1 2
  values : logi [1:4] FALSE TRUE FALSE TRUE

[[2]]
Run Length Encoding
  lengths: int [1:2] 5 2
  values : logi [1:2] FALSE TRUE

[[3]]
Run Length Encoding
  lengths: int [1:2] 3 4
  values : logi [1:2] TRUE FALSE

[[4]]
Run Length Encoding
  lengths: int [1:7] 1 1 1 1 1 1 1
  values : logi [1:7] FALSE TRUE FALSE TRUE FALSE TRUE FALSE

> sapply(x.2,function(x)length(x$lengths)>2)
[1]  TRUE FALSE FALSE  TRUE
> # first and fourth cases have the sequence you want and both
> # start with 'data' because 'values' is FALSE

Wolfgang Viechtbauer <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
cc:
Sent by: [EMAIL PROTECTED]
Subject: [R] Finding Missing Data Patterns
02/02/2003 00:09

Dear R-Helpers,

I have a large data matrix, which contains missing data. The matrix looks something like this:

1) X  X  X  X  X  X  NA NA NA
2) NA NA NA NA X  X  X  X  X
3) NA NA X  X  X  X  NA NA NA
4) X  X  X  X  X  X  X  X  X
5) X  X  NA NA X  NA NA NA NA

and so on. Notice that the first row starts with complete data but ends with missing. The second row starts with missing, but the rest is complete. The third starts and ends with missing, but the middle part is complete. The fourth is complete.

What I want to do is filter out patterns like in row 5, where the data are interrupted by missing data. Basically, I need to test each row for a "data, at least one NA, data" pattern.

Is there some kind of way of doing this? I am at a loss for an easy way to accomplishing this. Any suggestions would be most appreciated!
--
Wolfgang Viechtbauer
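A run count greater than 2 also flags a row such as number 3 in the question (NA, data, NA), which is not interrupted. One possible refinement of the rle idea, offered only as a sketch, is to drop a leading and a trailing run of NAs and then ask whether any NA run is left. The function name interrupted and the small matrix m, which mimics the five patterns in the question, are made up for illustration.

```r
# TRUE when a row contains "data, at least one NA, data"
interrupted <- function(row) {
  v <- rle(is.na(row))$values             # TRUE marks a run of NAs
  # drop a leading and a trailing run of NAs, if present
  if (length(v) > 0 && v[1]) v <- v[-1]
  if (length(v) > 0 && v[length(v)]) v <- v[-length(v)]
  any(v)                                  # an NA run left between two data runs
}

# hypothetical matrix with the same five patterns as in the question
m <- rbind(c( 1,  1, 1,  1, NA, NA),
           c(NA, NA, 1,  1,  1,  1),
           c(NA,  1, 1,  1, NA, NA),
           c( 1,  1, 1,  1,  1,  1),
           c( 1, NA, 1, NA, NA, NA))

flag <- apply(m, 1, interrupted)
flag
m[!flag, ]                                # rows without an interruption
```

A row that is entirely NA is treated as not interrupted here; whether that is the desired behaviour is an assumption that depends on the data.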