Re: [R] Adding axis limit to log-log plot
You need to come up with some better limits: log(0) is -Inf, so a log axis cannot start at 0.

On 3/16/06, Abd Rahman Kassim [EMAIL PROTECTED] wrote:

Dear All,

I've tried to add an axis limit (e.g. ylim=c(0,50)) to a log-log plot, but it results in an error (see below). Any solution is much appreciated.

    > plot(maxsd$tph, maxsd$qdbh, log="xy", xlab="Stand density(trees per ha)", ylab="Quadratic mean dbh (cm)", ylim=c(0,50))
    Error in plot.window(xlim, ylim, log, asp, ...) :
            Infinite axis extents [GEPretty(0,1.#INF,5)]
    In addition: Warning message:
    Nonfinite axis limits [GScale(-1.#INF,1.69897,2, .); log=1]

Thanks

Abd. Rahman Kassim, PhD
Forest Management Ecology Program
Forestry Conservation Division
Forest Research Institute Malaysia
Kepong 52109 Selangor
Malaysia
Fax: 603-62729852
Tel: 603-62797179

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390 (Cell)
+1 513 247 0281 (Home)

What is the problem you are trying to solve?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
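
A minimal sketch of the kind of fix suggested in this thread. The data values are made up (they are not from the original post), and the lower limit of 1 is an arbitrary assumption; any strictly positive value would do.

```r
# Hypothetical data standing in for maxsd$tph and maxsd$qdbh
tph  <- c(200, 400, 800, 1600, 3200)
qdbh <- c(45, 32, 22, 15, 10)

# Both limits must be strictly positive on a log axis
ylim_ok <- c(1, 50)

pdf(NULL)  # null device: the plot is drawn but no file is written
plot(tph, qdbh, log = "xy",
     xlab = "Stand density (trees per ha)",
     ylab = "Quadratic mean dbh (cm)",
     ylim = ylim_ok)
invisible(dev.off())
```

If the data themselves contain zeros, those points would also need to be dropped or shifted before plotting on a log scale.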
Re: [R] how to create analog stripchart plots of x vs t (t=mm/dd/yyyy hh:mm:ss)
I think that your problem is that your format statement should be "%m/%d/%Y %H:%M"; notice the capital 'Y' for a 4-digit year (your data also has no seconds field, so %S is dropped). You were probably getting NAs for dates. Here is what I got after reading in your partial data:

    > x <- read.csv('clipboard', as.is=T, header=F)
    > x
             V1    V2       V3
    1 3/11/2006 15:09 0.014652
    2 3/11/2006 15:08 0.014652
    3 3/11/2006 15:06 0.00
    > x.date <- strptime(paste(x$V1, x$V2), format="%m/%d/%Y %H:%M")
    > x.date  # make sure the dates look correct
    [1] "2006-03-11 15:09:00" "2006-03-11 15:08:00" "2006-03-11 15:06:00"
    > plot(x.date, x$V3)

On 3/11/06, Richard Evans [EMAIL PROTECTED] wrote:

Hello r-experts,

I sure could use a little help. I have an ever-updating text file with timestamped data in it. I can reformat it in any way I want if need be, but currently I have chosen to make columns of date, time and measured value (comma delimited, with the dates and times in quotes so that they are interpreted as strings). Here is a small section of my text data file:

    3/11/2006,15:09,0.014652
    3/11/2006,15:08,0.014652
    3/11/2006,15:06,0.00
    ...etc...

I am trying to make a plot of 'x' vs. 't', where 'x' is a simple numerical vector of N elements and 't' is a timestamp like mm/dd/yyyy hh:mm:ss (also of N elements).

Here is how I am getting the data:

    TheData <- scan("MyDataFile.txt", sep=",", list("","",0))
    TheDates <- TheData[[1]]
    TheTimes <- TheData[[2]]
    TheValue <- TheData[[3]]

and here is how I am making the datetime objects:

    TimeStampStr <- paste(TheDates,TheTimes)
    TimeStampObj <- strptime(xStamp,"%m/%d/%y %H:%M:%S")

and then I try to plot it with:

    plot(TimeStampObj,TheValues)

but this is where I get an error like this:

    Read 54 records
    Error in plot.window(xlim, ylim, log, asp, ...) :
            need finite 'ylim' values
    In addition: Warning messages:
    1: no finite arguments to min; returning Inf
    2: no finite arguments to max; returning -Inf

Can someone help me understand what I need to do? Am I going about this the wrong way? Is there a smarter/more elegant way to do this? Any advice is greatly appreciated.

Sincerest thanks (in advance),
- revansx

P.S. here is a small section of my input file:

--
Jim Holtman
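
As a small self-contained illustration of the format fix, the sketch below parses the three timestamps quoted in the thread. The `tz = "UTC"` argument is an assumption added here only to make the result independent of the local time zone; it was not part of the original reply.

```r
# The three records quoted in the thread, as date, time, value
dates  <- c("3/11/2006", "3/11/2006", "3/11/2006")
times  <- c("15:09", "15:08", "15:06")
values <- c(0.014652, 0.014652, 0.00)

stamp <- paste(dates, times)

# %Y = four-digit year; there are no seconds in the data, so no %S
good <- strptime(stamp, format = "%m/%d/%Y %H:%M", tz = "UTC")

# Always check for NAs before plotting: a format that does not match
# the strings yields NA rather than an error
n_bad <- sum(is.na(good))

pdf(NULL)
plot(as.POSIXct(good), values, xlab = "time", ylab = "value")
invisible(dev.off())
```

Checking `n_bad` before calling `plot()` catches a mismatched format early, instead of surfacing later as a "need finite 'xlim' values" style error.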
Re: [R] how to get f(x)=___ from a piecewise function
?approxfun

    > x <- c(-100.4, 32.0, 99.8, 200.2, 300.6, 399.8, 500.0, 600.0, 699.6, 799.6,
    +        899.8)
    > y <- c(0.4, 0.0, 0.2, -0.2, -0.6, 0.2, 0.0, 0.0, 0.4, 0.4, 0.2)
    > x.f <- approxfun(x,y)
    > x.f(356)
    [1] -0.1532258

On 2/25/06, Eric C. Jennings [EMAIL PROTECTED] wrote:

From actual real-world readings, I have two vectors:

    x <- c(-100.4, 32.0, 99.8, 200.2, 300.6, 399.8, 500.0, 600.0, 699.6, 799.6, 899.8)
    y <- c(0.4, 0.0, 0.2, -0.2, -0.6, 0.2, 0.0, 0.0, 0.4, 0.4, 0.2)

which, in the usual way, constitute a continuous piecewise-linear function.

What I want to do is find an easy method to get at f(x) for some x I have NOT specified in the above vector. For example, I want f(356).

I have already put the time and effort in to write a program to compute this by breaking the function into the various pieces and computing the slopes of the individual lines etc. I am just looking for an easier method.

Thank you for your help.
Eric

--
Jim Holtman
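
To make the interpolation concrete, this sketch recomputes f(356) by hand from the two neighbouring knots and compares it with approxfun. The variable names `by_hand` and `outside` are illustrative only.

```r
x <- c(-100.4, 32.0, 99.8, 200.2, 300.6, 399.8, 500.0, 600.0, 699.6, 799.6, 899.8)
y <- c(0.4, 0.0, 0.2, -0.2, -0.6, 0.2, 0.0, 0.0, 0.4, 0.4, 0.2)

f <- approxfun(x, y)  # linear interpolation between the knots

# 356 lies between the knots (300.6, -0.6) and (399.8, 0.2)
x0 <- 300.6; y0 <- -0.6
x1 <- 399.8; y1 <- 0.2
by_hand <- y0 + (356 - x0) * (y1 - y0) / (x1 - x0)

# With the default rule = 1, points outside range(x) give NA
outside <- f(1000)
```

If values outside the observed range should be held at the end points instead of returning NA, `approxfun(x, y, rule = 2)` does that.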
Re: [R] Working with lists with numerical names
    > x
       playerID yearID stint teamID lgID   G  AB  R   H 2B 3B HR RBI SB CS BB
    1 robleos01   2005     1    LAN   NL 110 364 44  99 18  1  5  34  0  8 31
    2 iguchta01   2005     1    CHA   AL 135 511 74 142 25  6 15  71 15  5 47
    3 molinya01   2005     1    SLN   NL 114 385 36  97 15  1  8  49  2  3 23
    > x[["H"]]
    [1]  99 142  97
    > x[["2B"]]
    [1] 18 25 15

On 2/23/06, Christopher Swingley [EMAIL PROTECTED] wrote:

Greetings!

I'm having a hard time working with some data I imported from a baseball database. Several of the database column names start with a digit (2B, 3B), and when I try to use these vectors from the data frame, I get syntax errors, probably because R is interpreting the name as a number:

    > show(batting2005)
       playerID yearID stint teamID lgID   G  AB  R   H 2B 3B HR RBI SB CS BB
    1 robleos01   2005     1    LAN   NL 110 364 44  99 18  1  5  34  0  8 31
    2 iguchta01   2005     1    CHA   AL 135 511 74 142 25  6 15  71 15  5 47
    3 molinya01   2005     1    SLN   NL 114 385 36  97 15  1  8  49  2  3 23
    . . .
    > print(batting2005$HR)
    [1] 5 15 8 3 14 3 6 21 8 7 9 27 12 5 14 8 28 9 22 15 5 22 9 10 1 . . .
    > print(batting2005$2B)
    Error: syntax error in "print(batting2005$2"
    > SLG<-(H + 2B + 3B * 2 + HR * 3) / AB;
    Error: syntax error in "SLG<-(H + 2B"
    > # batting2005 is attached
    > SLG<-(H + "2B" + "3B" * 2 + HR * 3) / AB;
    Error in H + "2B" : non-numeric argument to binary operator

Is there a way to escape the '2B' somehow or encapsulate it so that R knows I'm talking about that particular numeric vector?

Thanks,
Chris
--
Christopher S. Swingley email: [EMAIL PROTECTED]
Intl. Arctic Research Center
University of Alaska Fairbanks
www.frontier.iarc.uaf.edu/~cswingle/

--
Jim Holtman
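
Besides `[["2B"]]`, non-syntactic names can be quoted with backticks, which also works inside expressions. The sketch below is an illustration, not part of the original reply; it rebuilds only the columns needed from the three rows shown in the thread and evaluates the poster's SLG formula as written.

```r
# Columns copied from the three rows shown in the thread.
# check.names = FALSE keeps the names "2B" and "3B" unchanged.
batting <- data.frame(
  H    = c(99, 142, 97),
  `2B` = c(18, 25, 15),
  `3B` = c(1, 6, 1),
  HR   = c(5, 15, 8),
  AB   = c(364, 511, 385),
  check.names = FALSE
)

doubles <- batting$`2B`   # backticks make the name syntactically valid

# The poster's formula, evaluated inside the data frame
SLG <- with(batting, (H + `2B` + `3B` * 2 + HR * 3) / AB)
```

Using `with()` avoids `attach()`, so the column names cannot clash with other objects in the workspace.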
Re: [R] multinomial test
?sample

    sample(1:3, 6, TRUE, prob=c(2/9, 1/6, 11/18))

On 2/22/06, Li,Qinghong,ST.LOUIS,Molecular Biology [EMAIL PROTECTED] wrote:

Hi All,

What is the R function for computing the multinomial distribution, e.g. f(2,1,3; 2/9, 1/6, 11/18, 6)? That is, a total of 6 trials, event 1's p1=2/9, x1=2, event 2's p2=1/6, x2=1, and event 3's p3=11/18, x3=3.

thanks,
Johnny

--
Jim Holtman
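
The reply above draws random outcomes with the given probabilities. If what is wanted is the probability f(2,1,3; 2/9, 1/6, 11/18, 6) itself, one option (not mentioned in the original reply) is `dmultinom` from the stats package. The sketch below also computes the same value from the textbook formula as a cross-check.

```r
x <- c(2, 1, 3)             # counts for events 1, 2, 3
p <- c(2/9, 1/6, 11/18)     # event probabilities, summing to 1

# Multinomial probability mass at x for 6 trials
prob <- dmultinom(x, size = 6, prob = p)

# Cross-check: n! / (x1! x2! x3!) * p1^x1 * p2^x2 * p3^x3
by_hand <- factorial(sum(x)) / prod(factorial(x)) * prod(p^x)
```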
Re: [R] read.table
?read.table

The documentation describes the parameter 'check.names'.

On 2/13/06, Diethelm Wuertz [EMAIL PROTECTED] wrote:

I have a file named test.csv with the following 3 lines:

    %y-%m-%d;VALUE
    1999-01-01;100
    2000-12-31;999

    > read.table("test.csv", header = TRUE, sep = ";")

delivers:

       X.y..m..d VALUE
    1 1999-01-01   100
    2 2000-12-31   999

I would like to see the following ...

        %y-%m-%d VALUE
    1 1999-01-01   100
    2 2000-12-31   999

Note,

    > readLines("test.csv", 1)

delivers

    [1] "%y-%m-%d;VALUE"

Is this possible ???

Thanks DW

--
Jim Holtman
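
A short sketch of what `check.names = FALSE` does, using the three lines from the question. Reading from a temporary file instead of test.csv is an assumption made here so that the example is self-contained.

```r
# Write the three lines from the question to a temporary file
path <- tempfile(fileext = ".csv")
writeLines(c("%y-%m-%d;VALUE", "1999-01-01;100", "2000-12-31;999"), path)

# Default: names are passed through make.names(), giving "X.y..m..d"
mangled <- read.table(path, header = TRUE, sep = ";")

# check.names = FALSE keeps the header exactly as written
kept <- read.table(path, header = TRUE, sep = ";", check.names = FALSE)

unlink(path)
```

A column with such a name then has to be accessed as ``kept$`%y-%m-%d` `` or `kept[["%y-%m-%d"]]`.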
Re: [R] question about binary data file and data.frame
Read it into a vector, convert that into a matrix, and then convert the matrix to a data frame:

    df <- readBin("d:/sim.data", what='double', n=1, size=4)  # make n greater than the number of values in the file
    dim(df) <- c(2, length(df) / 2)
    df <- t(df)
    colnames(df) <- c('age', 'weight')
    df <- as.data.frame(df)

On 2/7/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

Each data (edge,weight) is numerical data with type double (4 bytes)

    df<-file("d:/sim.data","rb")
    age[1]<-readBin(df,double())
    weigt[1]<-readBin(df,double())
    ...
    age[10]<-readBin(df,double())
    weigt[10]<-readBin(df,double())

It is not an ASCII file.

=== 2006-02-07 13:03:17 === [EMAIL PROTECTED] wrote:

> I have a binary file with data sequence in the order

What do you mean by 'binary file'?

> [age,weight][age,weight]

How are age and weight encoded in this 'binary file'?

> I know the length of the data and I want to load it into a data.frame.
> Of course a way to do this is to read age and weight separately and then use
> cbind(age,weight) to combine them into a dataframe, but is there a better
> solution?

Is it really an ASCII file? With age and weight separated by commas, and then age-weight pairs separated by spaces? Are there really square bracket pairs in there too?

Or is it really a binary file, a series of 4 or 8-byte binary representations of age and weight?

Barry

--
Jim Holtman
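
The approach can be tried end to end with a small round trip. The file contents below are invented for illustration (the original d:/sim.data is not available), and `n = 100` is simply an upper bound larger than the number of values written.

```r
# Write interleaved (age, weight) pairs as 4-byte floats to a temp file
path <- tempfile(fileext = ".data")
vals <- c(25, 60.5, 31, 72.25, 47, 80)   # age, weight, age, weight, ...
writeBin(vals, path, size = 4)

# n is an upper bound; readBin stops at end of file
raw_vals <- readBin(path, what = "double", n = 100, size = 4)

# Fill row by row so each row is one (age, weight) pair
m  <- matrix(raw_vals, ncol = 2, byrow = TRUE,
             dimnames = list(NULL, c("age", "weight")))
df <- as.data.frame(m)

unlink(path)
```

`matrix(..., byrow = TRUE)` is equivalent to setting `dim` to `c(2, n/2)` and transposing, as in the reply above.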
Re: [R] reading in a tricky computer program output
,"i.005.003","i.005.004","i.005.005",
    "i.006.001","i.006.002","i.006.003","i.006.004","i.006.005",
    "i.007.001","i.007.002","i.007.003","i.007.004","i.007.005",
    "i.008.001","i.008.002","i.008.003","i.008.004","i.008.005",
    "i.009.001","i.009.002","i.009.003","i.009.004","i.009.005",
    "i.010.001","i.010.002","i.010.003","i.010.004","i.010.005",
    "i.006.006",
    "i.007.006","i.007.007",
    "i.008.006","i.008.007","i.008.008",
    "i.009.006","i.009.007","i.009.008","i.009.009",
    "i.010.006","i.010.007","i.010.008","i.010.009","i.010.010" )

One of the character vectors that matches the second 10-variable output is

    second.10 <- c(
    "i.002.001",
    "i.003.001","i.003.002",
    "i.004.001","i.004.002","i.004.003",
    "i.005.001","i.005.002","i.005.003","i.005.004",
    "i.006.001","i.006.002","i.006.003","i.006.004","i.006.005",
    "i.007.001","i.007.002","i.007.003","i.007.004","i.007.005",
    "i.008.001","i.008.002","i.008.003","i.008.004","i.008.005",
    "i.009.001","i.009.002","i.009.003","i.009.004","i.009.005",
    "i.010.001","i.010.002","i.010.003","i.010.004","i.010.005",
    "i.007.006",
    "i.008.006","i.008.007",
    "i.009.006","i.009.007","i.009.008",
    "i.010.006","i.010.007","i.010.008","i.010.009" )

and then I assign the character vector to the numeric vector by

    names <- first.10
    first.10 = numeric.vector
    combined.one <- cbind(names,first.10)
    container <- diag(10)
    for (i in 1:(10*10))
    {
        k <- as.numeric(substr(combined.one[i,1],7,9))
        l <- as.numeric(substr(combined.one[i,1],3,5))
        val <- as.numeric(combined.one[i,2])
        container[k,l] <- val
    }
    container <- t(container)

Is there any other neat way to do this? Any help would be appreciated.

TM

--
Jim Holtman
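
One possible loop-free variant of the poster's code, shown on a made-up 3 x 3 case: the two indices are parsed from the names in one step and the matrix is filled by indexing with a two-column matrix. The names `nm`, `vals`, `row_idx` and `col_idx` are illustrative, and the values are invented.

```r
# Hypothetical names and values for a 3-variable problem
nm   <- c("i.002.001", "i.003.001", "i.003.002")
vals <- c(0.5, 0.2, 0.7)
n    <- 3

# Characters 3-5 hold the first index, characters 7-9 the second
row_idx <- as.integer(substr(nm, 3, 5))
col_idx <- as.integer(substr(nm, 7, 9))

container <- diag(n)
# A two-column matrix index assigns one cell per (row, column) pair
container[cbind(row_idx, col_idx)] <- vals
```

This puts each value at [first index, second index], which is what the original loop produces after its final `t(container)`.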
Re: [R] saving a character vector
Is this what you want? It returns a character vector with the values:

    > generate.index <- function(n.item){
    +     .return <- character()  # initialize vector
    +     for (i in 1:n.item)
    +     {
    +         for (j in ((i+1):n.item))
    +         {
    +             # concatenate the results
    +             .return <- c(.return, paste("i",formatC(i,digits=2,flag="0"),".",formatC(j,digits=2,flag="0"),sep=""))
    +         }
    +     }
    +     .return
    + }
    > generate.index(10)
     [1] "i001.002" "i001.003" "i001.004" "i001.005" "i001.006" "i001.007"
     [7] "i001.008" "i001.009" "i001.010" "i002.003" "i002.004" "i002.005"
    [13] "i002.006" "i002.007" "i002.008" "i002.009" "i002.010" "i003.004"
    [19] "i003.005" "i003.006" "i003.007" "i003.008" "i003.009" "i003.010"
    [25] "i004.005" "i004.006" "i004.007" "i004.008" "i004.009" "i004.010"
    [31] "i005.006" "i005.007" "i005.008" "i005.009" "i005.010" "i006.007"
    [37] "i006.008" "i006.009" "i006.010" "i007.008" "i007.009" "i007.010"
    [43] "i008.009" "i008.010" "i009.010" "i010.011" "i010.010"

On 2/4/06, Taka Matzmoto [EMAIL PROTECTED] wrote:

Hi R users

I wrote a function that generates some character strings.

    generate.index<-function(n.item){
        for (i in 1:n.item)
        {
            for (j in ((i+1):n.item))
            {
                cat("i",formatC(i,digits=2,flag="0"),".",formatC(j,digits=2,flag="0"),"\n",sep="")
            }
        }
    }

I would like to save what appears on the screen when I run generate.index(10) as a character vector.

I used

    temp <- generate.index(10)

but it didn't work. Could you provide some advice on this issue?

Thanks in advance
TM

--
Jim Holtman
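
Note that the last two elements of the output above (i010.011 and i010.010) arise because `(i+1):n.item` counts downwards when i equals n.item. A hedged variant that avoids this is sketched below; it stops the outer loop at n.item - 1 and swaps `formatC` for `sprintf`, which is a substitution made here and not the original author's code. The name `generate_index` is illustrative.

```r
generate_index <- function(n_item) {
  out <- character()
  # Stop at n_item - 1 so that (i + 1):n_item never runs backwards;
  # seq_len() also handles n_item = 1 by looping zero times
  for (i in seq_len(n_item - 1)) {
    for (j in (i + 1):n_item) {
      out <- c(out, sprintf("i%03d.%03d", i, j))
    }
  }
  out
}

idx <- generate_index(10)
```

For 10 items this gives the 45 pairs with i < j, from "i001.002" to "i009.010".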
Re: [R] generating strings in a tricky order
Is this what you want?

    > n <- 5
    > result <- character()
    > for (i in 2:n){
    +     for (j in 1:(i - 1)){
    +         result <- c(result, sprintf("%03d:%03d", i, j))
    +     }
    + }
    > result
     [1] "002:001" "003:001" "003:002" "004:001" "004:002" "004:003" "005:001"
     [8] "005:002" "005:003" "005:004"

On 2/4/06, Taka Matzmoto [EMAIL PROTECTED] wrote:

Hi R users

I would like to generate some strings (a character vector) in a special order. If I have 5 variables, the elements of the created string vector should be

    002.001,
    003.001, 003.002,
    004.001, 004.002, 004.003,
    005.001, 005.002, 005.003, 005.004

I tried to come up with a for loop with two indexes (i and j), but I kept failing to generate that order of strings. The order of the elements in the character vector is very important.

Any advice or help would be appreciated.

Thanks in advance
TM

--
Jim Holtman
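
The same ordering can be produced without growing a vector inside a double loop. This is a sketch of one alternative, using the "." separator from the question instead of the ":" in the reply; `sprintf` recycles the scalar i against the vector of j values.

```r
n <- 5

# For each i, pair it with j = 1, ..., i - 1; unlist keeps the order
result <- unlist(lapply(2:n, function(i) {
  sprintf("%03d.%03d", i, seq_len(i - 1))
}))
```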
Re: [R] Subsetting a matrix without for-loop
use 'filter':

    > x <- matrix(1:100,10)
    > x
          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
     [1,]    1   11   21   31   41   51   61   71   81    91
     [2,]    2   12   22   32   42   52   62   72   82    92
     [3,]    3   13   23   33   43   53   63   73   83    93
     [4,]    4   14   24   34   44   54   64   74   84    94
     [5,]    5   15   25   35   45   55   65   75   85    95
     [6,]    6   16   26   36   46   56   66   76   86    96
     [7,]    7   17   27   37   47   57   67   77   87    97
     [8,]    8   18   28   38   48   58   68   78   88    98
     [9,]    9   19   29   39   49   59   69   79   89    99
    [10,]   10   20   30   40   50   60   70   80   90   100
    > (y <- apply(x, 2, filter, c(1,1,1)))
          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
     [1,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA
     [2,]    6   36   66   96  126  156  186  216  246   276
     [3,]    9   39   69   99  129  159  189  219  249   279
     [4,]   12   42   72  102  132  162  192  222  252   282
     [5,]   15   45   75  105  135  165  195  225  255   285
     [6,]   18   48   78  108  138  168  198  228  258   288
     [7,]   21   51   81  111  141  171  201  231  261   291
     [8,]   24   54   84  114  144  174  204  234  264   294
     [9,]   27   57   87  117  147  177  207  237  267   297
    [10,]   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA
    > y[-c(1, nrow(y)),]
         [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
    [1,]    6   36   66   96  126  156  186  216  246   276
    [2,]    9   39   69   99  129  159  189  219  249   279
    [3,]   12   42   72  102  132  162  192  222  252   282
    [4,]   15   45   75  105  135  165  195  225  255   285
    [5,]   18   48   78  108  138  168  198  228  258   288
    [6,]   21   51   81  111  141  171  201  231  261   291
    [7,]   24   54   84  114  144  174  204  234  264   294
    [8,]   27   57   87  117  147  177  207  237  267   297

On 1/30/06, Camarda, Carlo Giovanni [EMAIL PROTECTED] wrote:

Dear R-users,

I'm struggling in R to squeeze a matrix without using a for-loop. Although my case is a bit more complex, the following example should help you to understand what I would like to do, but without the slow for-loop.

Thanks in advance,
Carlo Giovanni Camarda

    A <- matrix(1:54, ncol=6)           # my original matrix
    A.new <- matrix(nrow=3, ncol=6)     # a new matrix which I'll fill
    # for-loop
    for(i in 1:nrow(A.new)){
        B <- A[i:(i+2), ]               # selecting the rows
        C <- apply(B,2,sum)             # summing by columns
        A.new[i,] <- C                  # inserting in the new matrix
    }

--
Jim Holtman
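
To check that the filter approach reproduces the loop from the question, the sketch below runs both on the poster's matrix A. `stats::filter` is spelled out in full as a precaution, since other packages may define their own `filter`; the name `roll3` is illustrative.

```r
A <- matrix(1:54, ncol = 6)

# Loop from the question: sums of rows i, i+1, i+2 for i = 1, 2, 3
A.new <- matrix(nrow = 3, ncol = 6)
for (i in 1:nrow(A.new)) {
  A.new[i, ] <- apply(A[i:(i + 2), ], 2, sum)
}

# Moving sum of three consecutive rows, column by column.
# The window is centred, so the first and last rows come back as NA.
roll3 <- apply(A, 2, function(col) as.numeric(stats::filter(col, rep(1, 3))))
roll3 <- roll3[-c(1, nrow(roll3)), ]
```

`roll3` holds all seven possible three-row sums of A; its first three rows correspond to the three rows filled by the loop.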
Re: [R] Integer bit size and the modulus operator
You have reached the maximum value that can be stored accurately in a floating point number. That is what the warning message is telling you. I get 21 warnings, and this says that at 8^20 I am now truncating digits in the variable. You only have about 53 bits of precision in a double-precision floating point number, and you exceed this at about 8^19.

    > a=1:40; 8^a %% 41
     [1]  8 23 20 37  9 31  2 16  5 40 33 18 21  4 32 10 39 25 36  1  8 23 20 37  9 31  2 16  5 40 33
    [32] 18 21  4 32 10  0  0  0  0
    There were 21 warnings (use warnings() to see them)
    > warnings()
    Warning messages:
    1: probable complete loss of accuracy in modulus
    2: probable complete loss of accuracy in modulus
    ...
    21: probable complete loss of accuracy in modulus
    > 8^35
    [1] 4.056482e+31
    > 8^36
    [1] 3.245186e+32
    > 8^19
    [1] 1.441152e+17
    > 8^19%%41
    [1] 36
    > 8^20
    [1] 1.152922e+18
    > 8^20%%41
    [1] 1
    Warning message:
    probable complete loss of accuracy in modulus

On 1/30/06, Ionut Florescu [EMAIL PROTECTED] wrote:

I am a statistician and I have come across an interesting problem in cryptography. I would like to use R since there are some statistical procedures that I need to use.

However, I run into a problem when using the modulus operator %%. I am using R 2.2.1, and when I calculate the modulus for large numbers (which I need for my problem) R gives me warnings. For instance, if one does:

    a=1:40; 8^a %% 41

one obtains zeros, which is not possible since 8 to any power is not a multiple of 41. In addition, when working with numbers larger than this and with the mod operator, R crashes randomly.

I believe this is because R stores large integers as real numbers, so there may be a loss of accuracy when applying the modulus operator and converting back to integers.

So my question is this: Is it possible to increase the size of memory used for storing integers? Say from 32 bits to 512 bits (the typical size of integers in cryptography).

Thank you, any help would be greatly appreciated.
Ionut Florescu

--
Jim Holtman
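
For small moduli such as 41, one way around the problem (not proposed in the original thread) is to never form the large power at all: reduce modulo m after every multiplication, using square-and-multiply. The function name `powmod` is hypothetical, and the sketch only stays exact while modulus^2 is below 2^53; it does not address 512-bit integers.

```r
# Modular exponentiation by repeated squaring.
# Every intermediate product is below modulus^2, so results are exact
# as long as modulus^2 < 2^53.
powmod <- function(base, exponent, modulus) {
  result <- 1
  base <- base %% modulus
  while (exponent > 0) {
    if (exponent %% 2 == 1) {
      result <- (result * base) %% modulus
    }
    base <- (base * base) %% modulus
    exponent <- exponent %/% 2
  }
  result
}

residues <- sapply(1:40, function(a) powmod(8, a, 41))
```

Because 8^20 leaves remainder 1 modulo 41, the residues repeat with period 20, and none of them is zero.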
Re: [R] Integer bit size and the modulus operator
The other thing that you have to be aware of is that 8^n is not necessarily computed as 8 multiplied by itself n times; the power function is probably using logs to compute this. Here is a sample of 8^(1:20). The value of 8^2 is 64.004 (not exactly an integer); roundoff errors are apparent in the other values.

    > 8^(1:20)
     [1] 8.e+00 6.4004e+01 5.1201e+02 4.0961e+03
     [5] 3.27680002e+04 2.62144002e+05 2.09715199e+06 1.67772159e+07
     [9] 1.34217728e+08 1.073741824001e+09 8.589934592005e+09 6.871947673603e+10
    [13] 5.497558138877e+11 4.398046511103e+12 3.5184372088832001e+13 2.8147497671065600e+14
    [17] 2.2517998136852482e+15 1.8014398509481984e+16 1.4411518807585588e+17 1.1529215046068471e+18

On 1/30/06, Ionut Florescu [EMAIL PROTECTED] wrote:

[snip]

--
Jim Holtman
Re: [R] Loops that last for ever...
I think this does what your loop is doing. It takes about 0.5 seconds.

    > system.time(
    +     result <- lapply(xs, function(.val){
    +         .sums <- filter(.val, rep(1,5))  # add 5 connected values together
    +         .sums[-c(1,2,length(.sums)-1, length(.sums))]
    +     })
    + )
    [1] 0.50 0.00 0.54 NA NA

On 1/30/06, Constantine Tsardounis [EMAIL PROTECTED] wrote:

Hello, good morning or evening!...

After studying some of the examples in the S-poetry document, I tried to implement some of the concepts in my R script, which uses looping constructs intensively. However, I did not manage any improvement.

My main problem is that I have a list of a lot of data, e.g.:

    > xs
    [[1]]
    [1][1000]
    [[2]]
    [1][840]
    ...
    [[50]]
    [1][945]

A script with loops inside loops (for example in a Monte-Carlo simulation) takes many minutes to complete.

Is there an easier way to perform functions on each of the [[i]]? Perhaps using apply? Or constructing a specific function? Or using the so-called vectorising tricks?

One example could be the following, which calculates the sums 1:5, 2:6, 3:7, ..., for each of xs[[i]]:

    xs <- lapply(1:500, function(x) rnorm(1000))
    totalsum <- list()
    sums <- list()
    first <- list()
    for(i in 1:length(xs)) {
        totalsum[i] <- sum(xs[[i]])
        for(j in 1:length(xs[[i]])) {
            if(j == 1) {
                sums[[i]] <- list()
            }
            if(j >= 5) {
                sums[[i]][j] <- sum(xs[[i]][(j-4):j])
            }
        }
    }

Of course the functions I actually call are more complicated, increasing the total calculation time to many minutes.

1. How could I optimize (or better, eliminate?) the above loop? Any other suggestions for my scripting habits?

Another problem that I am facing is that calculating a lot of lists (50), which contain results of various econometric tests of all the variables, in the form of

    example.list[[i]] <- expression

demands more than 50 lines at the beginning of the script that initiate the lists, e.g.

    example.list.1 <- list()
    example.list.2 <- list()
    ...
    example.list.50 <- list()

2. Is there a way to avoid that?

Thank you very very much in advance,
Constantine Tsardounis

--
Jim Holtman
Re: [R] Loops that last for ever...
My fingers were too fast: I left off the creation of the two lists (totals and sums) that you wanted.

    system.time(
    result <- lapply(xs, function(.val){
        .total <- sum(.val)  # total sum
        .sums <- filter(.val, rep(1,5))  # sum 5 consecutive values
        list(total=.total, sum=.sums[-c(1,2,length(.sums)-1, length(.sums))])
    })
    )
    # create lists of sums and totals
    total <- lapply(result, '[[', 'total')
    sums <- lapply(result, '[[', 'sum')

On 1/30/06, Constantine Tsardounis [EMAIL PROTECTED] wrote:

[snip]

--
Jim Holtman
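
A small runnable sketch of the same idea, on a reduced data set (3 vectors of 20 values, chosen here only to keep the check fast), with the loop from the question used to verify one of the results. The helper name `roll5` is illustrative. The last lines address the second question: a single list of 50 slots can be pre-allocated instead of creating 50 separately named lists.

```r
set.seed(1)
xs <- lapply(1:3, function(i) rnorm(20))

# Moving sum over 5 consecutive values. The window is centred, so the
# first two and last two positions are NA and are dropped by index.
roll5 <- function(v) {
  s <- as.numeric(stats::filter(v, rep(1, 5)))
  s[3:(length(s) - 2)]
}

sums   <- lapply(xs, roll5)
totals <- vapply(xs, sum, numeric(1))

# Loop version for the first vector, as in the question
check <- sapply(5:20, function(j) sum(xs[[1]][(j - 4):j]))

# One list with 50 empty slots, filled later as results[[i]] <- ...
results <- vector("list", 50)
```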
Re: [R] beginner Q: hashtable or dictionary?
use a 'list':

    > x <- list()
    > x[['test']] <- 64
    > x[['next one']] <- c(1,2,3,4)
    > x
    $test
    [1] 64

    $`next one`
    [1] 1 2 3 4

    > x[['test']]
    [1] 64

On 1/29/06, context grey [EMAIL PROTECTED] wrote:

Hi,

Is there something like a hashtable or (Python) dictionary in R/Splus?

(If not, is there a reason why it's not needed / a typical way to accomplish the same thing?)

Thank you

--
Jim Holtman
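
A named list is looked up by scanning its names. For larger key sets, an environment is a commonly used alternative in R (this goes beyond the original reply). The sketch below uses only base functions; the variable names are illustrative.

```r
h <- new.env(hash = TRUE)        # environment used as a key-value store

assign("test", 64, envir = h)    # insert with assign() ...
h[["next one"]] <- c(1, 2, 3, 4) # ... or with [[<-

has_test <- exists("test", envir = h, inherits = FALSE)
has_none <- exists("missing key", envir = h, inherits = FALSE)
val      <- get("test", envir = h)
keys     <- ls(h)                # all keys, sorted
```

Unlike lists, environments are modified in place, so passing `h` to a function and assigning into it changes the caller's copy as well.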
Re: [R] Regex question
Is this what you want?

    > result <- readLines('/tempxx.txt')
    > result
    [1] "*NEW RECORD" "*ID-001"     "*AB-text"    ""            "*NEW RECORD"
    [6] "*ID-002"     "*AB-text"
    > result <- result[grep('^.ID-', result)]  # select only ID lines
    > result
    [1] "*ID-001" "*ID-002"
    > sub('^.ID-', '', result)
    [1] "001" "002"

On 1/28/06, Andrej Kastrin [EMAIL PROTECTED] wrote:

Dear R useRs,

is there any simple, built-in function to match a specific regular expression in a data file and write the matches to a vector? I have the following text file:

    *NEW RECORD
    *ID-001
    *AB-text

    *NEW RECORD
    *ID-002
    *AB-text
    etc.

Now I have to match all ID fields and put them into a vector:

    001
    002
    etc.

I know that this is very simple with Perl or the R-Perl interface, but if possible, I want to do it 'the hard way'.

Cheers, Andrej

--
Jim Holtman
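
The same two steps can be tried without a file by working on an in-memory vector. In this sketch the leading asterisk is matched literally as `\\*` instead of with `.`, which is a slightly stricter pattern than the one in the reply.

```r
lines <- c("*NEW RECORD", "*ID-001", "*AB-text", "",
           "*NEW RECORD", "*ID-002", "*AB-text")

# value = TRUE returns the matching lines instead of their positions
id_lines <- grep("^\\*ID-", lines, value = TRUE)

# Strip the prefix, leaving only the identifier
ids <- sub("^\\*ID-", "", id_lines)
```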
Re: [R] DOS command using system
Why don't you use 'unlink':

    unlink('c:/Program Files/DOSPROGRAM/input.dat')

If you really want to use 'del', then you have to invoke the command processor, and quote the path because it contains a space:

    system('cmd /c del "c:\\Program Files\\DOSPROGRAM\\input.dat"')

On 1/26/06, Taka Matzmoto [EMAIL PROTECTED] wrote:

Hi R users

I have one question about using a DOS command through system().

I would like to delete a file that is located at C:\Program Files\DOSPROGRAM\input.dat

I can use the DOS command del at the DOS prompt like this

    C:\Documents and Settings> del C:\Program Files\DOSPROGRAM\input.dat

to delete the input.dat file.

When I try to do the same thing in R using the system command

    system('del C:\Program Files\DOSPROGRAM\input.dat')

or

    system("del C:\Program Files\DOSPROGRAM\input.dat")

or

    system(paste("del", "\"C:\\Program Files\\DOSPROGRAM\\input.dat\"", sep=" "))

none of the three system commands worked.

Could you help me to figure this out?

Thanks in advance
TM

--
Jim Holtman
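
A portable sketch of the `unlink` route, using a temporary file in place of the original path (which is an assumption made so that the example can run anywhere). `unlink` returns 0 on success, and `file.exists` confirms the result.

```r
# Create a throwaway file to delete
tmp <- file.path(tempdir(), "input.dat")
writeLines("dummy", tmp)

existed_before <- file.exists(tmp)
status         <- unlink(tmp)        # 0 indicates success
exists_after   <- file.exists(tmp)
```

Forward slashes (or `file.path`) avoid the need to double every backslash in Windows paths inside R strings.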
Re: [R] Data management problem: convert text string to matrix of 0's and 1's
Here is another way:

    > x <- scan('/tempxx.txt', what='', sep='\n', blank.lines.skip=F)
    Read 7 items
    > x
    [1] "icsrvepf" "fpevrsci" "ics"      "p"        ""         "f"        "ic"
    > x <- strsplit(x, '')  # break into single characters
    > template <- c(i=0, c=0, s=0, r=0, v=0, e=0, p=0, f=0)
    > mat <- lapply(x, function(.l){
    +     .result <- template  # initialize the result
    +     .result[.l] <- 1
    +     .result
    + })
    > do.call('rbind', mat)
         i c s r v e p f
    [1,] 1 1 1 1 1 1 1 1
    [2,] 1 1 1 1 1 1 1 1
    [3,] 1 1 1 0 0 0 0 0
    [4,] 0 0 0 0 0 0 1 0
    [5,] 0 0 0 0 0 0 0 0
    [6,] 0 0 0 0 0 0 0 1
    [7,] 1 1 0 0 0 0 0 0

On 1/26/06, Thomas Lumley [EMAIL PROTECTED] wrote:
> On Thu, 26 Jan 2006, Dale Steele wrote:
> > The data looks like:
> > --
> > icsrvepf
> > fpevrsci
> > ics
> > p
> >
> > f
> > ic
> > --
> > I would like to convert the above to a matrix of the form:
> >
> > i c s r v e p f
> > 1 1 1 1 1 1 1 1
> > 1 1 1 1 1 1 1 1
> > 1 1 1 0 0 0 0 0
> > 0 0 0 0 0 0 1 0
> > 0 0 0 0 0 0 0 0
> > 0 0 0 0 0 0 0 1
> > 1 1 0 0 0 0 0 0
>
> One possibility is to use grep()
>
>     > a
>     [1] "icsrvepf" "fpevrsci" "p"        ""         "f"        "ic"
>     > grep("i",a)
>     [1] 1 2 6
>
> so
>
>     > results<-matrix(0,nrow=length(a),ncol=length(behaviours))
>     > colnames(results)<-behaviours
>     > for(b in behaviours) results[grep(b,a),b]<-1
>     > results
>          i c s r v e p f
>     [1,] 1 1 1 1 1 1 1 1
>     [2,] 1 1 1 1 1 1 1 1
>     [3,] 0 0 0 0 0 0 1 0
>     [4,] 0 0 0 0 0 0 0 0
>     [5,] 0 0 0 0 0 0 0 1
>     [6,] 1 1 0 0 0 0 0 0
>
> -thomas

--
Jim Holtman
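Both approaches in this thread amount to testing each behaviour letter against each input string. A compact sketch of that idea follows; the `behaviours` vector is assumed to be the eight letters from the desired column header, and the data are typed in rather than read from a file.

```r
x <- c("icsrvepf", "fpevrsci", "ics", "p", "", "f", "ic")
behaviours <- c("i", "c", "s", "r", "v", "e", "p", "f")   # assumed column order

# Split every string into single characters, then flag which behaviours occur
chars <- strsplit(x, "")
mat <- t(vapply(chars,
                function(ch) as.integer(behaviours %in% ch),
                integer(length(behaviours))))
colnames(mat) <- behaviours
print(mat)
```

vapply() fixes the length and type of every row up front, so a malformed input line fails loudly instead of producing a ragged result.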
Re: [R] Number of replications of a term
?table

    > ids <- c("ID1", "ID2", "ID2", "ID3", "ID3", "ID3", "ID5")
    > x <- table(ids)
    > x
    ids
    ID1 ID2 ID3 ID5
      1   2   3   1
    > count <- x[ids]  # index using the names in the string
    > count
    ids
    ID1 ID2 ID2 ID3 ID3 ID3 ID5
      1   2   2   3   3   3   1

On 1/24/06, Laetitia Marisa [EMAIL PROTECTED] wrote:
> Hello,
>
> Is there a simple and fast function that returns a vector of the number
> of replications for each object of a vector? For example, I have a
> vector of IDs:
>
>     ids <- c("ID1", "ID2", "ID2", "ID3", "ID3", "ID3", "ID5")
>
> I want the function to return the following vector, where each term is
> the number of replicates for the given id:
>
>     c(1, 2, 2, 3, 3, 3, 1)
>
> Of course I have a vector of more than 40 000 IDs, and the function I
> wrote (it orders my data and checks on ID:Name of the data whether the
> next term is the same as the previous one, see below) is really slow
> (30 minutes for 44290 terms). But I don't have time right now to write a
> C function.
>
> Thanks a lot for your help,
> Laetitia.
>
> Here is the function I have written; maybe I have done something that is
> not optimized:
>
>     repVector <- function(obj){
>         # order IDName
>         ord <- gif.indexByIDName(obj)
>         ordobj <- obj[ord,]
>         nspots <- nrow(obj)
>         # vector of spot replicates number
>         spotrep <- rep(NA, nspots)
>         # function to get ID:Name for a given spot
>         spotidname <- function(ind){
>             paste(ordobj$genes[ind, c("ID","Name")], collapse=":")
>         }
>         spot <- 1
>         while( spot < nspots ){
>             i <- 1
>             while( spotidname(spot) == spotidname(spot + i) ){
>                 i <- i + 1
>             }
>             spotrep[spot : (spot + i-1)] <- i
>             spot <- spot + i
>             #cat("spot : ",spot,"\n")
>         }
>         obj$genes$spotrep <- spotrep[order(ord)]
>         obj
>     }

--
Jim Holtman
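The table lookup above returns a named table object. If a plain vector is wanted, either of the two variants below should do; this is a sketch using only base R, with the example IDs copied from the question.

```r
ids <- c("ID1", "ID2", "ID2", "ID3", "ID3", "ID3", "ID5")

# Variant 1: count once with table(), then look each id up by name
counts <- as.vector(table(ids)[ids])

# Variant 2: ave() applies length() within each group of identical ids
counts_ave <- ave(seq_along(ids), ids, FUN = length)

print(counts)
print(counts_ave)
```

Both variants are vectorised, so they avoid the element-by-element string comparison that made the original loop slow.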
Re: [R] Converting from a dataset to a single column
?unlist

    > temp <- data.frame(col1=c(5,10,14,56,7), col2=c(4,2,8,3,34), col3=c(28,4,52,34,67))
    > temp
      col1 col2 col3
    1    5    4   28
    2   10    2    4
    3   14    8   52
    4   56    3   34
    5    7   34   67
    > unlist(temp)
    col11 col12 col13 col14 col15 col21 col22 col23 col24 col25 col31 col32 col33
        5    10    14    56     7     4     2     8     3    34    28     4    52
    col34 col35
       34    67

On 1/23/06, r user [EMAIL PROTECTED] wrote:
> I have a dataset of 3 columns and 5 rows.
>
>     temp <- data.frame(col1=c(5,10,14,56,7), col2=c(4,2,8,3,34), col3=c(28,4,52,34,67))
>
> I wish to convert this to a single column, with column 1 on top and
> column 3 on bottom, i.e.
>
> 5 10 14 56 7 4 2 8 3 34 28 4 52 34 67
>
> Are there any functions that do this, and that will work well on much
> larger datasets (e.g. 1000 rows, 6000 columns)?

--
Jim Holtman
Re: [R] White Noise
Is this what you want by adding a random number to the values?

    > x <- matrix(1:9, ncol=3)
    > x
         [,1] [,2] [,3]
    [1,]    1    4    7
    [2,]    2    5    8
    [3,]    3    6    9
    > x + runif(9, -.1, .1)
              [,1]     [,2]     [,3]
    [1,] 0.9577315 3.927673 6.953332
    [2,] 2.0530684 5.072495 8.076258
    [3,] 2.9885848 5.987975 8.936416

On 1/22/06, Laura Quinn [EMAIL PROTECTED] wrote:
> I want to create a series of near-identical matrices by adding white
> noise to my starting matrix. Is there a function within R which will
> allow me to do this?
>
> Thank you
>
> Laura Quinn
> Institute of Atmospheric Science
> School of Earth and Environment
> University of Leeds

--
Jim Holtman
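The reply above draws uniform noise. "White noise" is often taken to mean Gaussian noise, so the sketch below swaps in rnorm() and builds a whole series of perturbed copies. The standard deviation of 0.05 and the count of five copies are arbitrary assumptions.

```r
set.seed(1)                      # only to make the example repeatable
x <- matrix(1:9, ncol = 3)

# Five near-identical copies of x, each with independent Gaussian noise.
# Adding a length-9 vector to a 3 x 3 matrix keeps the matrix shape.
noisy <- replicate(5,
                   x + rnorm(length(x), mean = 0, sd = 0.05),
                   simplify = FALSE)

print(noisy[[1]])
```

Keeping the copies in a list (simplify = FALSE) leaves each one a proper matrix; dropping that argument would instead stack them into a 3 x 3 x 5 array.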
Re: [R] Making a markov transition matrix
Is this what you want:

    set.seed(1001)

    # Raw data in long format --
    raw <- data.frame(name=c("f1","f1","f1","f1","f2","f2","f2","f2"),
                      year=c(83, 84, 85, 86, 83, 84, 85, 86),
                      state=sample(1:3, 8, replace=TRUE)
                      )

    # Shift to wide format --
    fixedup <- reshape(raw, timevar="year", idvar="name", v.names="state",
                       direction="wide")

    trans <- as.matrix(fixedup)
    result <- NULL
    for (i in 2:(ncol(trans) - 1)){
        result <- rbind(result, cbind(name=trans[,1], prev=trans[,i], next=trans[,i+1]))
    }
    result
    markov <- table(try$prev.state, try$new.state)

On 1/21/06, Ajay Narottam Shah [EMAIL PROTECTED] wrote:
> Folks,
>
> I am holding a dataset where firms are observed for a fixed (and small)
> set of years. The data is in long format - one record for one firm for
> one point in time. A state variable is observed (a factor).
>
> I wish to make a markov transition matrix about the time-series
> evolution of that state variable. The code below does this. But it's
> hardcoded to the specific years that I observe. How might one
> generalise this and make a general function which does this? :-)
>
> -ans.
>
>     set.seed(1001)
>
>     # Raw data in long format --
>     raw <- data.frame(name=c("f1","f1","f1","f1","f2","f2","f2","f2"),
>                       year=c(83, 84, 85, 86, 83, 84, 85, 86),
>                       state=sample(1:3, 8, replace=TRUE)
>                       )
>
>     # Shift to wide format --
>     fixedup <- reshape(raw, timevar="year", idvar="name", v.names="state",
>                        direction="wide")
>
>     # Now tediously build up records for an intermediate data structure
>     try <- rbind(
>                  data.frame(prev=fixedup$state.83, new=fixedup$state.84),
>                  data.frame(prev=fixedup$state.84, new=fixedup$state.85),
>                  data.frame(prev=fixedup$state.85, new=fixedup$state.86)
>                  )
>     # This is a bad method because it is hardcoded to the specific values
>     # of "year".
>     markov <- table(destination$prev.state, destination$new.state)
>
> --
> Ajay Shah
> http://www.mayin.org/ajayshah
> [EMAIL PROTECTED]
> http://ajayshahblog.blogspot.com
> *(:-? - wizard who doesn't know the answer.

--
Jim Holtman
Re: [R] Making a markov transition matrix
Ignore last reply. I sent the wrong script.

    > set.seed(1001)
    >
    > # Raw data in long format --
    > raw <- data.frame(name=c("f1","f1","f1","f1","f2","f2","f2","f2"),
    +                   year=c(83, 84, 85, 86, 83, 84, 85, 86),
    +                   state=sample(1:3, 8, replace=TRUE)
    +                   )
    >
    > # Shift to wide format --
    > fixedup <- reshape(raw, timevar="year", idvar="name", v.names="state",
    +                    direction="wide")
    >
    > trans <- as.matrix(fixedup)
    > result <- NULL
    > # loop through all the columns and build up the 'result'
    > for (i in 2:(ncol(trans) - 1)){
    +     result <- rbind(result, cbind(name=trans[,1], PREV=trans[,i], NEXT=trans[,i+1]))
    + }
    >
    > result
      name PREV NEXT
    1 "f1" "3"  "2"
    5 "f2" "2"  "3"
    1 "f1" "2"  "2"
    5 "f2" "3"  "1"
    1 "f1" "2"  "2"
    5 "f2" "1"  "1"
    >
    > (markov <- table(result[,"PREV"], result[,"NEXT"]))

        1 2 3
      1 1 0 0
      2 0 2 1
      3 1 1 0

--
Jim Holtman
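The question asked for a general function. One way to avoid the reshape step altogether is to sort the long data and pair each observation with the next one for the same firm. The sketch below is an assumption-laden illustration, not the method from this thread: `transition_counts` and its argument names are hypothetical, and the state values are typed in to match the `result` shown above rather than drawn with sample().

```r
# Hypothetical helper: count state transitions between consecutive
# time points within each id, working directly on long-format data.
transition_counts <- function(d, id = "name", time = "year", state = "state") {
  d <- d[order(d[[id]], d[[time]]), ]          # sort by firm, then time
  n <- nrow(d)
  same <- d[[id]][-1] == d[[id]][-n]           # TRUE where rows k and k+1 share an id
  lev <- sort(unique(d[[state]]))
  prev <- factor(d[[state]][-n][same], levels = lev)
  nxt  <- factor(d[[state]][-1][same], levels = lev)
  table(prev = prev, nxt = nxt)
}

raw <- data.frame(name  = c("f1", "f1", "f1", "f1", "f2", "f2", "f2", "f2"),
                  year  = c(83, 84, 85, 86, 83, 84, 85, 86),
                  state = c(3, 2, 2, 2, 2, 3, 1, 1))

counts <- transition_counts(raw)
print(counts)

# Row-normalise to get estimated transition probabilities
probs <- prop.table(counts, margin = 1)
print(round(probs, 2))
```

The sketch assumes every firm is observed at every time point; with gaps, consecutive rows would no longer be consecutive years and would need an extra check on the time column.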
Re: [R] Coercing a list to integer?
?unlist

    > y <- list(1,2,3,4,5)
    > y
    [[1]]
    [1] 1

    [[2]]
    [1] 2

    [[3]]
    [1] 3

    [[4]]
    [1] 4

    [[5]]
    [1] 5

    > unlist(y)
    [1] 1 2 3 4 5

On 1/18/06, Norman Goodacre [EMAIL PROTECTED] wrote:
> Dear group,
>
> I am nearly beside myself. After an entire night spent on a niggling
> little detail, I am no closer to the truth. I loaded an Excel file in
> .csv form into R. It apparently loads as a list, but not the kind of
> list you can use. Oh no, it converts into a list that cannot be
> converted into an integer, numeric, or vector, only a matrix, which is
> useless without integers.
>
> How can I get a list of the form [1] 1,2,3,4,5 into the form
> [1] 1 [2] 2 [3] 3 [4] 4 [5] 5?
>
> Depending on how you define a list, apparently, it goes one way or the
> other.
>
>     x <- list(1:5)
> means you have [1] 1,2,3,4,5
>
>     y <- list(1,2,3,4,5)
> means you have [1] 1 [2] 2 [3] 3 [4] 4 [5] 5
>
> Can anyone help? I would greatly appreciate it.
>
> Sincerely,
> Norman Goodacre

--
Jim Holtman
Re: [R] Saving data in an R package - how to maintain that a variable is a 'factor' when it is coded as 1, 2, 3...
colClasses

    > x <- read.table('clipboard', header=T, colClasses=c(rep('factor',4),rep('numeric',4)))
    > x
       Pig Evit Cu Litter Start Weight  Feed Time
    1 4601    1  1      1  26.5   26.5    NA    1
    2 4601    1  1      1  26.5   27.5  5.25    2
    3 4601    1  1      1  26.5   36.5 17.60    3
    4 4601    1  1      1  26.5   40.2 28.50    4
    > str(x)
    `data.frame':   4 obs. of  8 variables:
     $ Pig   : Factor w/ 1 level "4601": 1 1 1 1
     $ Evit  : Factor w/ 1 level "1": 1 1 1 1
     $ Cu    : Factor w/ 1 level "1": 1 1 1 1
     $ Litter: Factor w/ 1 level "1": 1 1 1 1
     $ Start : num  26.5 26.5 26.5 26.5
     $ Weight: num  26.5 27.6 36.5 40.3
     $ Feed  : num    NA  5.2 17.6 28.5
     $ Time  : num  1 2 3 4

On 1/13/06, Søren Højsgaard [EMAIL PROTECTED] wrote:
> I have a .txt file obtained by saving a data frame in which the first
> four columns are factors (but represented as 1,2,3 etc). The first four
> lines are
>
>     Pig Evit Cu Litter Start Weight Feed Time
>     4601 1 1 1 26.5 26.5 NA 1
>     4601 1 1 1 26.5 27.5 5.25 2
>     4601 1 1 1 26.5 36.5 17.6 3
>     4601 1 1 1 26.5 40.2 28.5 4
>
> I would like to include that data set in an R package. When I load the
> data from the package, the first four columns are read in as numeric
> variables. This is consistent with the documentation of read.table -
> but it is not what I want! I can of course change the coding of the
> variables, but there ought to be another way. Can anyone help me with
> that?
>
> Best regards
> Søren Højsgaard

--
Jim Holtman
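A self-contained sketch of the colClasses idea, reading the four quoted data lines from a string instead of the clipboard. The second half is an assumption about what might suit a package: saving the finished data frame in R's binary format, which keeps the factor class without any re-reading of text.

```r
txt <- "Pig Evit Cu Litter Start Weight Feed Time
4601 1 1 1 26.5 26.5 NA 1
4601 1 1 1 26.5 27.5 5.25 2
4601 1 1 1 26.5 36.5 17.6 3
4601 1 1 1 26.5 40.2 28.5 4"

# First four columns as factors, remaining four as numeric
pigs <- read.table(text = txt, header = TRUE,
                   colClasses = c(rep("factor", 4), rep("numeric", 4)))
str(pigs)

# Round trip through save()/load(): column classes are stored with the object
f <- tempfile(fileext = ".rda")
save(pigs, file = f)
e <- new.env()
load(f, envir = e)
str(e$pigs)
```

Whether the .rda route fits depends on how the package's data directory is organised, which the thread does not say.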
Re: [R] Datetimes differences
Is this what you want?

    > toto$ans <- difftime(x,y)
    > toto
                  T1          T2 ans
    1 12/31/03 23:49 1/1/04 0:58  69
    2    1/1/04 1:14 1/1/04 1:16   2
    3    1/1/04 0:02              NA

On 1/11/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
> I want to obtain datetime differences in minutes in another column,
> next to my datetimes. I have tried this:
>
>     T1 <- c("12/31/03 23:49","1/1/04 1:14","1/1/04 0:02")
>     T2 <- c("1/1/04 0:58","1/1/04 1:16","")
>     toto <- data.frame(T1,T2)
>     toto
>     y <- strptime(T1,"%m/%d/%y %H:%M")
>     x <- strptime(T2,"%m/%d/%y %H:%M")
>     difftime(x,y)
>
> but I don't know what to do in order to obtain something like this:
>
>     ans <- c(69,2,NA)
>     res <- data.frame(T1,T2,ans)
>     res
>
> What is to be done? Thanks.
>
> Florent Bonneu
> Laboratoire de Statistique et Probabilités
> bureau 148 bât. 1R2
> Université Toulouse 3
> 118 route de Narbonne - 31062 Toulouse cedex 9

--
Jim Holtman
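By default difftime() picks its own units, so the column happens to be in minutes here only because the differences are small. A sketch that pins the unit and stores plain numbers follows; the UTC time zone is an assumption made to keep daylight-saving shifts out of the example.

```r
T1 <- c("12/31/03 23:49", "1/1/04 1:14", "1/1/04 0:02")
T2 <- c("1/1/04 0:58", "1/1/04 1:16", "")

# Parse both columns; the empty string becomes NA
start <- as.POSIXct(strptime(T1, "%m/%d/%y %H:%M", tz = "UTC"))
end   <- as.POSIXct(strptime(T2, "%m/%d/%y %H:%M", tz = "UTC"))

# Force minutes and drop the difftime class to get an ordinary numeric column
res <- data.frame(T1, T2,
                  ans = as.numeric(difftime(end, start, units = "mins")))
print(res)
```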
Re: [R] Suggestion for big files [was: Re: A comment about R:]
If what you are reading in is numeric data, then it would require
(807 * 118519 * 8) bytes, about 760MB, just to store a single copy of the
object -- more memory than you have on your computer. If you were reading
it in, then the problem is the paging that was occurring.

You have to look at storing this in a database and working on a subset of
the data. Do you really need to have all 807 variables in memory at the
same time?

If you use 'scan', you could specify that you do not want some of the
variables read in, so it might make a more reasonably sized object.

On 1/5/06, François Pinard [EMAIL PROTECTED] wrote:
> [ronggui]
> > R is weak when handling large data files. I have a data file: 807
> > vars, 118519 obs., in CSV format. Stata can read it in 2 minutes, but
> > on my PC R almost cannot handle it. My PC's CPU is 1.7G; RAM 512M.
>
> Just (another) thought. I used to use SPSS, many, many years ago, on CDC
> machines, where the CPU had limited memory and no kind of paging
> architecture. Files did not need to be very large for being too large.
>
> SPSS had a feature that was then useful, about the capability of
> sampling a big dataset directly at file read time, quite before
> processing starts. Maybe something similar could help in R (that is,
> instead of reading the whole data in memory, _then_ sampling it.)
>
> One can read records from a file, up to a preset amount of them. If the
> file happens to contain more records than that preset number (the
> number of records in the whole file is not known beforehand), already
> read records may be dropped at random and replaced by other records
> coming from the file being read. If the random selection algorithm is
> properly chosen, it can be made so that all records in the original
> file have equal probability of being kept in the final subset.
>
> If such a sampling facility was built right within usual R reading
> routines (triggered by an extra argument, say), it could offer
> a compromise for processing large files, and also sometimes accelerate
> computations for big problems, even when memory is not at stake.
>
> --
> François Pinard http://pinard.progiciels-bpi.ca

--
Jim Holtman
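The read-time sampling described in the quoted message is commonly called reservoir sampling. Base R's reading routines have no such argument as far as this thread shows, so the sketch below is a hand-rolled illustration: `reservoir_sample` is a hypothetical helper, and a text connection of 1000 made-up records stands in for a large file.

```r
# Hypothetical helper: keep a uniform random sample of k lines from a
# connection, reading one line at a time and never holding more than k.
reservoir_sample <- function(con, k) {
  keep <- character(0)
  n <- 0L
  repeat {
    line <- readLines(con, n = 1L)
    if (length(line) == 0L) break      # end of input
    n <- n + 1L
    if (n <= k) {
      keep[n] <- line                  # fill the reservoir first
    } else {
      j <- sample.int(n, 1L)           # uniform position in 1..n
      if (j <= k) keep[j] <- line      # replace with probability k/n
    }
  }
  keep
}

all_records <- sprintf("record %d", 1:1000)   # stand-in for a big file
con <- textConnection(all_records)
set.seed(42)
picked <- reservoir_sample(con, 10)
close(con)
print(picked)
```

Reading line by line in R is slow, so for a real file one would probably read in chunks; the replacement rule would stay the same.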
Re: [R] Plotting the mean of data
Here is one way of doing it:

    > x  # your data
       subject week value
    1        1    1     4
    2        2    1     8
    3        3    1     3
    4        4    1     5
    5        1    2     5
    6        2    2     6
    7        3    2     2
    8        4    2     6
    9        1    3     3
    10       2    3     7
    11       3    3     3
    12       4    3     7
    > x.mean <- tapply(x$value, x$week, mean)  # take the mean by 'week'
    > x.mean  # resulting vector
       1    2    3
    5.00 4.75 5.00
    > # plot using 'names' to get the 'week'
    > plot(as.numeric(names(x.mean)), x.mean, xlab='week', ylab='mean', type='b')

On 1/2/06, Kare Edvardsen [EMAIL PROTECTED] wrote:
> Hi all!
>
> I've got a data structure like this:
>
>     subject week value
>     1 1 4
>     2 1 8
>     3 1 3
>     4 1 5
>     1 2 5
>     2 2 6
>     3 2 2
>     4 2 6
>     1 3 3
>     2 3 7
>     3 3 3
>     4 3 7
>
> I'd like to plot the mean of 'value' against week. Is there a direct
> way of doing this, or do I have to make a new structure with the
> calculated values and then plot it?
>
> All the best!
>
> --
> Kare Edvardsen [EMAIL PROTECTED]
> Norwegian Institute for Air Research (NILU)
> Polarmiljosenteret
> NO-9296 Tromso
> http://www.nilu.no

--
Jim Holtman
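An alternative sketch with aggregate(), which returns a data frame whose week column is already numeric, so no conversion from names is needed before plotting. The data are retyped from the question.

```r
x <- data.frame(subject = rep(1:4, times = 3),
                week    = rep(1:3, each = 4),
                value   = c(4, 8, 3, 5,  5, 6, 2, 6,  3, 7, 3, 7))

# One row per week, with the mean of 'value' in that week
week_means <- aggregate(value ~ week, data = x, FUN = mean)
print(week_means)
```

The result can be passed straight to plot(week_means$week, week_means$value, type = "b").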
Re: [R] Count or summary data
Here is one way and how to access the data:

    > x
          [,1] [,2]
     [1,]   -1 0.05
     [2,]    1 0.05
     [3,]    1 0.00
     [4,]    0 0.05
     [5,]   -1 0.00
     [6,]    0 0.10
     [7,]    1 0.10
     [8,]   -1 0.00
     [9,]   -1 0.10
    [10,]    0 0.05
    [11,]    0 0.10
    [12,]   -1 0.10
    [13,]    1 0.00
    [14,]   -1 0.05
    [15,]    1 0.00
    > tapply(x[,1], x[,2], table)
    $`0`

    -1  1
     2  3

    $`0.05`

    -1  0  1
     2  2  1

    $`0.1`

    -1  0  1
     2  2  1

    > y <- tapply(x[,1], x[,2], table)
    > y[["0"]]["-1"]
    -1
     2
    > y[["0.1"]]["0"]
    0
    2

On 12/30/05, Xiyan Lon [EMAIL PROTECTED] wrote:
> Dear all,
>
> I want to summarize and count my data, which looks something like this:
>
>     > te.Ce
>           [,1] [,2]
>      [1,]   -1 0.05
>      [2,]    1 0.05
>      [3,]    1 0.00
>      [4,]    0 0.05
>      [5,]   -1 0.00
>      [6,]    0 0.10
>      [7,]    1 0.10
>      [8,]   -1 0.00
>      [9,]   -1 0.10
>     [10,]    0 0.05
>     [11,]    0 0.10
>     [12,]   -1 0.10
>     [13,]    1 0.00
>     [14,]   -1 0.05
>     [15,]    1 0.00
>
> How could I count (summarize) all my data? I need a result like
>
>     for 0.05
>     -1  0  1
>      2  2  1
>
>     for 0.00
>     -1  0  1
>      2  0  3
>
>     for 0.10
>     -1  0  1
>      2  2  1
>
> I have tried with summary but I did not find what I need. Maybe
> someone could help me.
>
> Happy new year.
> Xiyan Lon

--
Jim Holtman
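The tapply() result drops combinations that never occur, whereas the requested output shows an explicit 0 for the value 0 at level 0.00. A two-way table() keeps those zeros. A short sketch with the data retyped from the question; the dimension names `level` and `value` are labels chosen for this example.

```r
x <- cbind(c(-1, 1, 1, 0, -1, 0, 1, -1, -1, 0, 0, -1, 1, -1, 1),
           c(0.05, 0.05, 0, 0.05, 0, 0.10, 0.10, 0,
             0.10, 0.05, 0.10, 0.10, 0, 0.05, 0))

# Rows are the levels in column 2, columns are the values in column 1
tab <- table(level = x[, 2], value = x[, 1])
print(tab)

# Counts for a single level, addressed by its printed name
print(tab["0.05", ])
```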
Re: [R] Axis/Ticks/Scale
Here is one way to do it with a smaller set of data, but the 'range' is
the same:

    > x <- c(1, 1000, 1000000)
    > y <- pretty(range(x))
    > y
    [1] 0e+00 2e+05 4e+05 6e+05 8e+05 1e+06
    > plot(x, 1:3, xaxt='n', xlab="X * 10^5")
    > axis(1, at=y, labels=y/100000)

On 12/28/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
> Dear All,
>
> Apologies for this simple question and thanks in advance for any help
> given.
>
> Suppose I wanted to plot 1 million observations and produce the command
>
>     plot(rnorm(1000000))
>
> The labels of the x axis are 0e+00, 2e+05 etc. These are clearly not
> very attractive (the plots are for a PhD thesis).
>
> I would like the axis labels to be 0,2,4,6,8,10 with a *10^5 on the
> right hand side.
>
> Is there a simple command for this?
>
> Best Wishes
> Roger

--
Jim Holtman
Re: [R] How to plot curves with more than 8 colors
With 255 colors it might be hard to tell neighbouring ones apart. Here is
an example of using colorRampPalette to create a set of colors. You can
experiment with as many different base colors as you like until you get
the contrast that you want:

    # use 5 colors to create a sequence (you can add more colors if you want to)
    f.c <- colorRampPalette(c('red','green','orange','blue','yellow'))(255)
    plot(0, type='n', ylim=c(0,255), xlim=c(0,1))
    for (i in 1:255) segments(0, i, 1, i, col=f.c[i])

On 12/27/05, Vincent Deng [EMAIL PROTECTED] wrote:
> Hi,
>
> Thanks for your kind reply. I think maybe I didn't specify the color
> codes properly. That is, the difference between each color is not sharp
> enough for me to identify them as different colors. So can you tell me
> how to specify the colors properly so that the difference between them
> can be seen clearly?
>
> Thanks again and again ...
>
> On 12/27/05, Uwe Ligges [EMAIL PROTECTED] wrote:
> > Vincent Deng wrote:
> > > Dear Uwe,
> > >
> > > Sorry, I did not describe my question clearly. I created a matrix
> > > to store color codes using the rgb function.
> > >
> > >     abc = rgb(6:36,0,0,maxColorValue = 255)
> > >
> > > And after running code like this
> > >
> > >     for (i in c(1:20))
> > >     {
> > >         points(...,...,col=abc[i])
> > >         lines(...,col=abc[i])
> > >     }
> > >
> > > R still used 8 colors of the abc color codes repeatedly to draw the
> > > diagram.
> > >
> > > Any help?
> >
> > No, it does not (in fact, all appears to be more or less black on my
> > screen ;-)).
> >
> > Another example:
> >
> >     plot(1:255, col=rgb(1:255,0,0,maxColorValue = 255))
> >
> > Uwe Ligges
> >
> > > Best Regards...
> > >
> > > On 12/27/05, Uwe Ligges [EMAIL PROTECTED] wrote:
> > > > Vincent Deng wrote:
> > > > > Hi,
> > > > >
> > > > > I'm a new hand at the R language. I have about 20 groups of
> > > > > data [x,y] and want to plot them on a graph. To do this, I
> > > > > wrote a for-loop as follows (some code is omitted for
> > > > > simplicity):
> > > > >
> > > > >     for (i in c(1:20))
> > > > >     {
> > > > >         points(...,...,col=i)
> > > > >         lines(...,col=i)
> > > > >     }
> > > > >
> > > > > The problem is that R only plots them with 8 colors,
> > > > > repeatedly. Could anyone help me solve this problem? Or is
> > > > > there any package providing a plot function without a color
> > > > > limit?
> > > >
> > > > After typing ?colors I get a nice help page that points me to
> > > > a lot of other functions that generate more than 8 colors. Maybe
> > > > your installation of R is broken and you cannot see this help
> > > > page? You certainly tried to get help on colors as well.
> > > >
> > > > There is no limit on the number of colors in the functions above;
> > > > simply specify the color you want to get. The only color limit
> > > > applies to the device, and for most devices and rgb colors this
> > > > is 256^3.
> > > >
> > > > Uwe Ligges
> > > >
> > > > > Best Regards...

--
Jim Holtman
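The repetition in the original loop comes from `col=i`: integer colors are looked up in the current palette() and recycled once the index runs past its end. A sketch of building one distinct color per group instead; rainbow() is just one of the generators in base R, and 20 groups is the figure from the question.

```r
n_groups <- 20

# One distinct color per group, evenly spaced around the hue circle
cols <- rainbow(n_groups)

# What integer indexing does instead: wrap around the current palette
n_default <- length(palette())
recycled <- palette()[(seq_len(n_groups) - 1) %% n_default + 1]

print(cols)
print(recycled)
```

In the plotting loop, `col=cols[i]` would then replace `col=i`. Twenty hues are distinct as values, though whether they are easy to tell apart by eye is a separate question.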
Re: [R] data frame
Here is one way. You can change it depending on what you want the offsets
to be:

    > s4 <- seq(length=10, from=1, by=5)
    > s4
     [1]  1  6 11 16 21 26 31 36 41 46
    > f.x <- function(vec, n) c(rep(0,n), vec)[1:length(vec)]
    > f.x(s4,2)
     [1]  0  0  1  6 11 16 21 26 31 36
    > df <- data.frame(s4=s4, s4.1=f.x(s4,2), s4.2=f.x(s4,4))
    > df
       s4 s4.1 s4.2
    1   1    0    0
    2   6    0    0
    3  11    1    0
    4  16    6    0
    5  21   11    1
    6  26   16    6
    7  31   21   11
    8  36   26   16
    9  41   31   21
    10 46   36   26
    > df$sum <- rowSums(df)
    > df
       s4 s4.1 s4.2 sum
    1   1    0    0   1
    2   6    0    0   6
    3  11    1    0  12
    4  16    6    0  22
    5  21   11    1  33
    6  26   16    6  48
    7  31   21   11  63
    8  36   26   16  78
    9  41   31   21  93
    10 46   36   26 108

On 12/22/05, Rhett Eckstein [EMAIL PROTECTED] wrote:
> Dear R users:
>
>     s4 <- seq(length=10, from=1, by=5)
>     s <- data.frame(s4,s4,s4)
>
> I would like to make some modifications to s, so that it has the
> following form. If it is possible, how should I do it? The last column
> is the sum of the previous three columns.
>
>        s4 s4.1 s4.2 sum
>     1   1             1
>     2   6             6
>     3  11    1       12
>     4  16    6       22
>     5  21   11    1  33
>     6  26   16    6  48
>     7  31   21   11  63
>     8  36   26   16  78
>     9  41   31   21  93
>     10 46   36   26 108
>
> Thanks for any help !!

--
Jim Holtman
Re: [R] reading long matrix
Here is a way of reading the data into a 'list'. You can convert the list
to an array of the proper dimensions.

    > input <- scan('/tempxx.txt.r', what='')
    Read 21 items
    > input
     [1] "SPECIES1"  "999001099" "900110109" "011101000" "901100101" "110100019"
     [7] "901110019" "SPECIES2"  "999000099" "900110119" "011101100" "901010101"
    [13] "110000019" "900000019" "SPECIES3"  "999001099" "900100109" "011100010"
    [19] "901100100" "110100019" "901110019"
    > # find the names
    > breaks <- grep("[[:alpha:]][[:alnum:]]+", input)
    > # determine the sizes
    > map <- cbind(breaks, diff(c(breaks, length(input)+1)))
    > out <- list()
    > # repeat for each data block
    > for (i in 1:nrow(map)){
    +     .set <- NULL
    +     for (j in 1:(map[i, 2] - 1)){
    +         .set <- rbind(.set, strsplit(input[map[i, 1] + j], '')[[1]])
    +     }
    +     out[[input[map[i, 1]]]] <- .set
    + }
    > out
    $SPECIES1
         [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
    [1,] "9"  "9"  "9"  "0"  "0"  "1"  "0"  "9"  "9"
    [2,] "9"  "0"  "0"  "1"  "1"  "0"  "1"  "0"  "9"
    [3,] "0"  "1"  "1"  "1"  "0"  "1"  "0"  "0"  "0"
    [4,] "9"  "0"  "1"  "1"  "0"  "0"  "1"  "0"  "1"
    [5,] "1"  "1"  "0"  "1"  "0"  "0"  "0"  "1"  "9"
    [6,] "9"  "0"  "1"  "1"  "1"  "0"  "0"  "1"  "9"

    $SPECIES2
         [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
    [1,] "9"  "9"  "9"  "0"  "0"  "0"  "0"  "9"  "9"
    [2,] "9"  "0"  "0"  "1"  "1"  "0"  "1"  "1"  "9"
    [3,] "0"  "1"  "1"  "1"  "0"  "1"  "1"  "0"  "0"
    [4,] "9"  "0"  "1"  "0"  "1"  "0"  "1"  "0"  "1"
    [5,] "1"  "1"  "0"  "0"  "0"  "0"  "0"  "1"  "9"
    [6,] "9"  "0"  "0"  "0"  "0"  "0"  "0"  "1"  "9"

    $SPECIES3
         [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
    [1,] "9"  "9"  "9"  "0"  "0"  "1"  "0"  "9"  "9"
    [2,] "9"  "0"  "0"  "1"  "0"  "0"  "1"  "0"  "9"
    [3,] "0"  "1"  "1"  "1"  "0"  "0"  "0"  "1"  "0"
    [4,] "9"  "0"  "1"  "1"  "0"  "0"  "1"  "0"  "0"
    [5,] "1"  "1"  "0"  "1"  "0"  "0"  "0"  "1"  "9"
    [6,] "9"  "0"  "1"  "1"  "1"  "0"  "0"  "1"  "9"

On 12/22/05, Colin Beale [EMAIL PROTECTED] wrote:
> Hi,
>
> I need some help finding a function to read a large text file into an
> array in R. The data are essentially presence / absence / na data for
> many species and come as a grid, with each species name (after two
> spaces) at the beginning of the matrix defining the map for that
> species. An excerpt could therefore be:
>
>       SPECIES1
>     999001099
>     900110109
>     011101000
>     901100101
>     110100019
>     901110019
>       SPECIES2
>     999000099
>     900110119
>     011101100
>     901010101
>     110000019
>     900000019
>       SPECIES3
>     999001099
>     900100109
>     011100010
>     901100100
>     110100019
>     901110019
>
> where 9 is actually na, 0 is absence and 1 presence. The final array
> I want to create should have dimensions that are the x and y
> coordinates and the number of species (known in advance). (In this
> example dim = c(9,6,3)). It would be sort of neat if the code could
> also read the species name into the appropriate names attribute, but
> this is a refinement that I could probably do if someone can help me
> read the data into R and into an array in the first place. I'm
> currently thinking a line by line approach using readLines might be
> the best option, but I've got a very long file - well over 100
> species, each a matrix of 70 x 100 datapoints - making this option
> rather time consuming, I expect, especially as the next dataset has
> 1300 species and a much larger grid...
>
> Any hints would be gratefully received.
>
> Colin Beale
> Macaulay Land Use Research Institute

--
Jim Holtman
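The question asked for a three-dimensional array with 9 recoded as NA and the species names attached. A vectorised sketch of that last step follows. It assumes every block has the same number of rows and columns and that each name line starts with a letter; the tiny two-species input is made up to keep the example short.

```r
# Made-up miniature input: two species, each a grid of 2 rows x 3 columns
input <- c("SPECIES1", "901", "110",
           "SPECIES2", "019", "900")

is_name  <- grepl("^[[:alpha:]]", input)     # header lines
block_id <- cumsum(is_name)                  # which species each line belongs to
blocks   <- split(input[!is_name], block_id[!is_name])
names(blocks) <- input[is_name]

n_rows <- length(blocks[[1]])                # grid rows (y)
n_cols <- nchar(blocks[[1]][1])              # grid columns (x)

# One integer matrix per species, transposed so that dim 1 is x and dim 2 is y
arr <- vapply(blocks, function(rows) {
  m <- do.call(rbind, lapply(strsplit(rows, ""), as.integer))
  m[m == 9L] <- NA                           # 9 codes a missing value
  t(m)
}, matrix(NA_integer_, n_cols, n_rows))
dimnames(arr) <- list(NULL, NULL, names(blocks))

print(dim(arr))
print(arr[, , "SPECIES1"])
```

For the excerpt in the question the same code would give dim = c(9, 6, 3). A species with no data rows would break the name alignment, which this sketch does not guard against.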
Re: [R] Help to find only one class and different class
try this:

    set.seed(1)
    # generate some test data
    x.1 <- data.frame(seg=sample(1:6,20,T), class=sample(c('good', 'poor'),20,T))
    x.1
    (x.sp <- split(x.1, x.1$seg))
    # test each segment for occurrence of class
    lapply(x.sp, function(.seg){
        if (all(.seg$class == 'good')) return('good')
        if (all(.seg$class == 'poor')) return('poor')
        return("good poor")
    })

On 12/20/05, Muhammad Subianto wrote:

Dear R users,

I have a problem for which I cannot find a solution. Perhaps someone could help me? I have a result from my classification, like this:

    > credit.toy
    [[1]]
        age married ownhouse income gender class
    1 20-30      no       no    low   male  good
    2 40-50      no      yes medium female  good

    [[2]]
        age married ownhouse income gender class
    1 20-30     yes      yes   high   male  poor
    2 20-30      no      yes   high   male  good
    3 20-30     yes       no    low female  poor
    4 60-70     yes      yes    low female  poor
    5 60-70      no      yes   high   male  poor

    [[3]]
        age married ownhouse income gender class
    1 30-40     yes       no   high   male  good
    2 20-30      no      yes medium female  good

    [[4]]
        age married ownhouse income gender class
    1 50-60     yes      yes    low female  poor
    2 40-50     yes       no medium   male  poor
    3 20-30      no       no   high female  poor

    [[5]]
        age married ownhouse income gender class
    1 40-50      no      yes    low female  good
    2 60-70      no      yes medium   male  poor
    3 30-40     yes       no   high female  poor

    [[6]]
        age married ownhouse income gender class
    1 30-40      no       no medium female  good
    2 50-60     yes      yes   high female  good
    3 30-40     yes       no   high female  good

    > credit.toy[[5]]$class
    [1] good poor poor
    Levels: good poor

How can I count which elements contain only one class and which contain different classes? I need a result something like:

    good class          : 1,3,6
    poor class          : 4
    good and poor class : 2,5

Thanks in advance.
Sincerely,
Muhammad Subianto

-- Jim Holtman, Cincinnati, OH
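A minimal sketch of the same idea applied directly to a list shaped like credit.toy, so that the element numbers come out grouped the way the question asks. The list below is a hypothetical stand-in that keeps only the class column; the names toy, label and groups are illustrative and not from the thread.

```r
# Hypothetical stand-in for credit.toy: six data frames, class column only.
mk <- function(cl) data.frame(class = factor(cl, levels = c("good", "poor")))
toy <- list(
  mk(c("good", "good")),
  mk(c("poor", "good", "poor", "poor", "poor")),
  mk(c("good", "good")),
  mk(c("poor", "poor", "poor")),
  mk(c("good", "poor", "poor")),
  mk(c("good", "good", "good"))
)

# Label each element by the distinct classes it contains.
label <- sapply(toy, function(d) {
  paste(sort(unique(as.character(d$class))), collapse = " and ")
})

# Group the element numbers by label.
groups <- split(seq_along(toy), label)
groups
```

Here split() does the counting: each name in groups is a class combination, and each value is the vector of list positions that have it.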
Re: [R] given a mid-month date, get the month-end date
Here is one way using POSIX (you can create a function to do this):

    > x <- as.POSIXlt('2005-12-16')  # a date
    > x
    [1] "2005-12-16"
    > dput(x)  # structure of the date
    structure(list(sec = 0, min = 0, hour = 0, mday = 16, mon = 11,
        year = 105, wday = 5, yday = 349, isdst = 0), .Names = c("sec",
        "min", "hour", "mday", "mon", "year", "wday", "yday", "isdst"),
        class = c("POSIXt", "POSIXlt"))
    > x$mday <- 1  # reset to first of the month
    > seq(x, by='month', length=2)[2]  # select 2nd number in the sequence
    [1] "2006-01-01 EST"

On 12/19/05, t c wrote:

I have a vector of dates. I wish to find the month-end date for each. Any suggestions? E.g., for 12/15/05 I want 12/31/05; for 10/15/1995 I want 10/31/1995, etc.

-- Jim Holtman, Cincinnati, OH
Re: [R] given a mid-month date, get the month-end date
Forgot you were asking for the end date, so just subtract a day:

    > seq(x, by='month', length=2)[2] - 24*3600
    [1] "2005-12-31 EST"

-- Jim Holtman, Cincinnati, OH
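Since the question concerns a whole vector of dates, here is a hedged sketch of the same "first of next month, minus one day" idea wrapped in a function. It uses the Date class instead of POSIXlt, so the subtraction is in whole days and does not depend on time zone or daylight-saving changes; month_end is a hypothetical helper name, and the ISO date strings are an assumption about the input format.

```r
# Hypothetical helper: month-end date for each element of a date vector.
month_end <- function(d) {
  d <- as.Date(d)
  ends <- lapply(seq_along(d), function(i) {
    first <- as.Date(format(d[i], "%Y-%m-01"))       # first day of that month
    seq(first, by = "month", length.out = 2)[2] - 1  # first of next month, minus 1 day
  })
  do.call(c, ends)                                   # back to a Date vector
}

ends <- month_end(c("2005-12-15", "1995-10-15", "2004-02-10"))
format(ends)
```

Dates written as 12/15/05 would first need as.Date(x, format = "%m/%d/%y").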
Re: [R] unable to force the vector format
Not too sure what you want the means of, but try

    ?colMeans
    ?rowMeans

On 12/14/05, Charles Plessy wrote:

Dear all,

I am so ashamed to pollute the list with a trivial question, but it has been a long time since I last used R, and I need a result in the next hour or two...

I have a table which I have loaded with read.table, and I want to take the mean of its columns.

    > slides <- read.table("slides.txt")
    > slides[1:5,]
                V1    V2     V3     V4     V5     V6     V7     V8
    1 PLB00090AA02 0.147  0.018  0.046  0.064 -0.018 -0.008 -0.063
    2 PLB00090BC08 0.171  0.011 -0.001  0.009  0.052  0.032 -0.065
    3 PLB00090CG02 0.029 -0.014 -0.042  0.006  0.024 -0.009 -0.043
    4 PLB00091AA08 0.033  0.050 -0.022 -0.002  0.038  0.015 -0.037
    5 PLB00091BE02 0.183  0.039  0.052 -0.014 -0.034 -0.037  0.037

but I cannot get the mean:

    > mean(slides[1,2:8])
       V2    V3    V4    V5     V6     V7     V8
    0.147 0.018 0.046 0.064 -0.018 -0.008 -0.063

Obviously, I fail to tell R that I am using a vector.

    > y <- c(1,2,3,4)
    > mean(y)
    [1] 2.5

but as.vector does not solve my problem:

    > lapply(as.vector(slides[1,2:8]),sum)
    $V2
    [1] 0.147
    $V3
    [1] 0.018
    $V4
    [1] 0.046
    $V5
    [1] 0.064
    $V6
    [1] -0.018
    $V7
    [1] -0.008
    $V8
    [1] -0.063

In the end, I would like to use lapply to fill a new column in the table with the means (and then extract the ones closest to zero...).

Once again, sorry for this mail, whose answer is probably trivial, but it would be an enormous help if somebody could send it to me!

-- Charles

-- Jim Holtman, Cincinnati, OH
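A small sketch of what the rowMeans pointer could look like in practice, on a hypothetical three-row, three-measurement excerpt of the slides table. A one-row slice of a data frame is still a data frame (a list of columns), which is presumably why mean() did not return a single number in the session quoted above; unlist() turns the slice into a plain numeric vector.

```r
# Hypothetical excerpt of the slides table (first three rows, V2 to V4 only).
slides <- data.frame(
  V1 = c("PLB00090AA02", "PLB00090BC08", "PLB00090CG02"),
  V2 = c(0.147, 0.171, 0.029),
  V3 = c(0.018, 0.011, -0.014),
  V4 = c(0.046, -0.001, -0.042)
)

# Mean of one row: flatten the one-row data frame first.
row1_mean <- mean(unlist(slides[1, 2:4]))

# Mean of every row at once, stored as a new column.
slides$rowmean <- rowMeans(slides[, 2:4])

# Rows ordered by how close their mean is to zero.
closest <- slides[order(abs(slides$rowmean)), ]
closest
```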
Re: [R] Manipulating matrices
    aggregate(x$varA, by=list(x$year, x$name), mean)

On 12/13/05, Sérgio Nunes wrote:

Hi,

I'm pretty new to R and I've been having some problems filtering data in matrices. I have the following initial dataset:

    || year | name | varA ||

I have multiple values for varA for the same year and the same name. With this as the input, I would like to obtain the following:

    || year | name | {varA mean} ||

where I have only one line for each year and name, with the mean of the values of varA in "varA mean". Is there a simple way to achieve this without using control structures (for or while loops)?

Thanks in advance for any help.

Best regards,
Sérgio Nunes

-- Jim Holtman, Cincinnati, OH
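For illustration, the aggregate() call above run on a small made-up data frame (the values are invented, only the column names follow the question). Naming the elements of the by list gives readable column names in the result.

```r
# Made-up data with repeated (year, name) combinations.
x <- data.frame(
  year = c(2004, 2004, 2005, 2005, 2005),
  name = c("a", "a", "a", "b", "b"),
  varA = c(1, 3, 5, 2, 4)
)

# One row per (year, name) with the mean of varA.
res <- aggregate(x$varA, by = list(year = x$year, name = x$name), FUN = mean)
names(res)[3] <- "varA.mean"
res
```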
Re: [R] append
Here is a way:

    x <- 1:10
    mySeq <- c(42,42)
    byTwo <- rep(seq(length(x) + 1), each=2)[1:length(x)]
    y <- lapply(split(x, byTwo), function(a){
        c(a, mySeq)
    })
    unlist(y)

On 12/10/05, Judy Chung wrote:

Dear R users:

    > append(1:5, 0:1, after=2)
    [1] 1 2 0 1 3 4 5

If I want to repeat the appended values after every 2 elements, like the following:

    [1] 1 2 0 1 3 4 0 1 5

how should I modify this? Thank you for any help.

-- Jim Holtman, Cincinnati, OH
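The split-and-append approach above adds the extra values after every group, including a trailing incomplete one. A possible variant that reproduces the exact output requested in the question (nothing appended after the lone 5) is sketched below; insert_every is a hypothetical helper name, not a base R function.

```r
# Hypothetical helper: insert `values` after every complete run of `every` elements.
insert_every <- function(x, values, every = 2) {
  grp <- (seq_along(x) - 1) %/% every          # 0 0 1 1 2 ... group ids
  pieces <- split(x, grp)
  out <- lapply(pieces, function(p) {
    if (length(p) == every) c(p, values) else p  # leave a short last group alone
  })
  unlist(out, use.names = FALSE)
}

res <- insert_every(1:5, 0:1)
res
```

Note that for an input whose length is a multiple of every, the values are also appended after the final group.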
Re: [R] What is wrong with this FOR-loop?
try this (you were trying to index with non-integer numbers):

    run_rows <- seq(0,1,0.05)
    run_cols <- seq(0.3,0.6,0.05)
    res <- matrix(NA,length(run_rows),length(run_cols))
    for(i in 1:length(run_rows)) {
        for(j in 1:length(run_cols)) {
            res[i,j] = run_rows[i] + run_cols[j]
        }
    }

On 12/5/05, Serguei Kaniovski wrote:

Hi,

I have a more complex example, but the problem boils down to this FOR-loop not filling in the res matrix:

    run_rows <- seq(0,1,0.05)
    run_cols <- seq(0.3,0.6,0.05)
    res <- matrix(NA,length(run_rows),length(run_cols))
    for(i in run_rows) {
        for(j in run_cols) {
            res[i,j]=i+j
            # neither the above, nor res[[i,j]]=i+j work, why?
        }
    }

Thank you,
Serguei

-- Jim Holtman, Cincinnati, OH
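For this particular computation the double loop can also be avoided altogether. The sketch below uses outer(), which applies a function to every pair of row and column values; keeping the values as dimnames is an illustrative choice, not something from the thread.

```r
run_rows <- seq(0, 1, 0.05)
run_cols <- seq(0.3, 0.6, 0.05)

# outer() evaluates "+" for every (row value, column value) pair.
res <- outer(run_rows, run_cols, "+")

# Keep the original values as labels; positions stay 1, 2, 3, ...
dimnames(res) <- list(run_rows, run_cols)
dim(res)
```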
Re: [R] what is best for scripting?
A lot is personal preference. I use Perl since it is very good at doing any of the preprocessing that I need to set up data for R, e.g., reading a text file, extracting the data and then setting it up in a format for R. I can then invoke R from the script. Here is a sample of some Perl code that actually calls some other Perl scripts and then invokes R. This script logs onto a UNIX system, extracts some data, preprocesses it with other Perl scripts, and then calls R with a set of commands with the appropriate substitutions:

    use Net::Telnet;
    use Net::FTP;
    my $t = Net::Telnet->new(Timeout => 900, Host => $ftpName);  # setup the host
    $t->login($user, $password);  # login
    print STDERR $t->cmd("acctcom $perfDir/pacct.$yesterday > /tmp/pacct.$yesterday.$hostname");
    #
    # create the FTP commands to xfer the data
    #
    my $ftp = Net::FTP->new($ftpName) or die "FTP new";
    $ftp->login($user, $password) or die 'login';
    $ftp->cwd($perfDir) or die 'cwd';
    ftpGet($ftp, "SRVL_serval_${yesterday}.longps", "longps.$hostname.$yesterday");
    ftpGet($ftp, "SRVL_serval_${yesterday}.sar", "sar.$hostname.$yesterday");
    ftpGet($ftp, "/tmp/pacct.$yesterday.$hostname", "pacct.$yesterday.$hostname");
    ftpGet($ftp, "/u01/WIData/audit/user.log", "user.log");
    ftpGet($ftp, "/u01/WIData/audit/userdet.log", "userdet.log");
    $ftp->quit();
    system("perl /perf/bin/sarextr.pl sar.$hostname.$yesterday");
    system("perl /perf/bin/driveuse.pl sar.$hostname.$yesterday");
    system("awk -f /perf/bin/longps.awk longps.$hostname.$yesterday > ps.$hostname.$yesterday");
    system("perl /perf/bin/psextrV2.pl ps.$hostname.$yesterday");
    system("perl /perf/bin/sumprocs.pl ps.$hostname.$yesterday.cpu");
    system("perl /perf/bin/userdet.pl userdet.log userdet.txt");
    system("perl /perf/bin/acct-extr.pl pacct.$yesterday.$hostname");
    open RBATCH, ">rcmds.txt" or die "RBATCH";
    #
    # create the R command file
    #
    print RBATCH <<EOF;
    setwd('/perf/data')
    load('/perf/data/.RData')  # restore environment
    my.stats('start')
    postscript(file="proc.$hostname.$yesterday.ps", width=10, height=7.5, horizontal=TRUE, family="Courier")
    plot.procs("ps.$hostname.$yesterday")
    plot.sar("sar.$hostname.$yesterday.srx")
    source('/perf/bin/Accounting Functions.R')
    acct.log <- acct.read.file("pacct.${yesterday}.${hostname}.pacct")
    acct.proc.cpu(acct.log, sar.overlay=c("sar.$hostname.$yesterday.srx",2))
    acct.cpu.polygon(acct.log[acct.log\$cmd=='wiqt', ])
    acct.count.polygon(acct.log, y.limit=500)
    plot.srx.file("sar.$hostname.$yesterday.srx", post.script=F, layout=c(5,5))
    dev.off()
    my.stats('done')
    EOF
    print STDERR $t->cmd("rm /tmp/pacct.$yesterday.$hostname");
    $t->close();
    close RBATCH;
    system("\"C:/Program Files/R/rw2010/bin/Rterm.exe\" --max-mem-size=512M --no-save < rcmds.txt > $hostname.$yesterday.batch");

On 12/2/05, Molins, Jordi wrote:

I am using R in Windows. I see that I will have to use batch processes with R. I will have to read and write text files, and run some R code; probably some external code too. I have never done scripting. Is there any document that explains simple steps with examples?

I have also heard that Python is a good scripting language. Is it worth the effort? (I do not have too much free time, so if I could do without, much better...) Does anybody have strong opinions on that? Past experiences?

Thank you!
Jordi

-- Jim Holtman, Cincinnati, OH
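For simple jobs, the read-process-write cycle can also be written entirely in R and run non-interactively, without a second scripting language. The sketch below is only an illustration under stated assumptions: the input file is created on the fly so the example is self-contained, and the file contents and column names are invented. In current R versions such a file can be run from a command prompt with Rscript; that tool is an assumption relative to the thread, which uses Rterm.exe.

```r
# Create a small input file so the sketch is self-contained.
in_file <- tempfile(fileext = ".txt")
writeLines(c("a,1", "b,2", "c,3"), in_file)

# Read the text file.
dat <- read.csv(in_file, header = FALSE, col.names = c("id", "value"))

# Process: a one-line summary.
result <- data.frame(n = nrow(dat), total = sum(dat$value))

# Write the result to another text file.
out_file <- tempfile(fileext = ".txt")
write.table(result, out_file, row.names = FALSE, quote = FALSE)

# Read it back to show the round trip.
summary_back <- read.table(out_file, header = TRUE)
summary_back
```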
Re: [R] Fw: Re: Is there anything like a write.fwf() or possibility to print adata.frame without rownames?
Why don't you just use 'sprintf' and set the format up to what you want:

    > x <- head(iris)
    > cat(sprintf("%6.1f%6.1f%6.1f%6.1f %-20s\n", x[,1], x[,2], x[,3], x[,4], x[,5]), sep='')
       5.1   3.5   1.4   0.2 setosa
       4.9   3.0   1.4   0.2 setosa
       4.7   3.2   1.3   0.2 setosa
       4.6   3.1   1.5   0.2 setosa
       5.0   3.6   1.4   0.2 setosa
       5.4   3.9   1.7   0.4 setosa

On 11/22/05, Gabor Grothendieck wrote:

On 11/22/05, Gorjanc Gregor wrote:

Petr Pikal: Hi, can you please explain why you need it? What do you want to do with the exported file? I wonder what type of software cannot accept any reasonable delimiter and requires fwf files. The only workaround I can imagine is to convert all columns to character and add leading spaces to each item that is shorter than the longest item in its column, so that all items in a column have equal length, and then use write.table(tab, "file.txt", sep=" ", row.names=F) as suggested. But I still wonder why?

Gorjanc Gregor: OK, I did not want to be too specific, but here it goes. I am using some special software for variance component estimation and prediction in genetics. The programs are VCE and PEST (http://w3.tzv.fal.de/%7Eeg/) and they both read data in FW (fixed width) format. For those programs you can only give data in such a format, and it is really tedious to do so, but that is the way it is.

Petr Pikal: If you used e.g. write.table(tab, "file.xls", sep="\t", row.names=F) you can open it directly in a spreadsheet program just by clicking on it, and everything will be properly aligned.

Gorjanc Gregor: I am fully aware of this, but I do need FW format. I will try with sprintf(), but this looks very tough for me.

Petr Pikal wrote: Hi, did you try something like write.table(tab, "file.txt", sep="\t", row.names=F), which writes to a tab-separated file? Petr

Gorjanc Gregor: Thanks, but I do not want a tab-delimited file. I need spaces between columns.

Ronggui: write.table(tab, "file.txt", sep=" ", row.names=F). Can it do what you want?

Gorjanc Gregor: Thanks, but this does not work either. For example, I get something like this below

    26 1 42 DA DA lipa Monika
    26 1 42 DA DA lipa Monika
    27 1 41 DA DA smreka Monika
    27 1 41 DA DA smreka Monika

and you can see that there is a problem when the values in a column do not all have the same length. I need to get

    26 1 42 DA DA lipa   Monika
    26 1 42 DA DA lipa   Monika
    27 1 41 DA DA smreka Monika
    27 1 41 DA DA smreka Monika

i.e. columns should be properly aligned.

Gabor Grothendieck: Try this:

    > irish <- head(iris)
    > write(t(apply(irish, 2, format)), file = "", ncol = ncol(irish))
    5.1 3.5 1.4 0.2 setosa
    4.9 3.0 1.4 0.2 setosa
    4.7 3.2 1.3 0.2 setosa
    4.6 3.1 1.5 0.2 setosa
    5.0 3.6 1.4 0.2 setosa
    5.4 3.9 1.7 0.4 setosa

-- Jim Holtman, Cincinnati, OH
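A rough sketch of a reusable fixed-width writer along the lines discussed in this thread. write_fwf here is a hypothetical helper defined on the spot, not a function from base R, and the column widths are an assumption the caller has to supply to match what the target program expects. Each column is converted to character and right-justified to its width with formatC().

```r
# Hypothetical helper: write a data frame with one fixed width per column.
write_fwf <- function(df, file, widths) {
  stopifnot(length(widths) == ncol(df))
  cols <- Map(function(col, w) formatC(as.character(col), width = w),
              df, widths)
  lines <- do.call(paste0, unname(cols))   # glue the padded columns together
  writeLines(lines, con = file)
  invisible(lines)
}

tab <- data.frame(id = c(26, 27), tree = c("lipa", "smreka"),
                  stringsAsFactors = FALSE)
out_file <- tempfile(fileext = ".txt")
lines <- write_fwf(tab, out_file, widths = c(4, 8))
lines
```

Passing a negative width to formatC() would left-justify a column instead; values longer than the chosen width are not truncated, so the widths need to be at least as large as the longest entry.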
Re: [R] modify boxplot
Without modifying the code, you can take the output from 'boxplot(..., plot=F)', change the values in the 'stats' object within the value returned, and then pass it to 'bxp':

    x <- boxplot(yourdata, ..., plot=F)
    x$stats[5,] <- quantile(yourdata, .9)
    bxp(x)

On 11/21/05, alessandro carletti wrote:

Hi everybody,

I'm trying to modify the boxplot just to set the upper whisker to the 90th percentile value, but I still couldn't find the solution. Can anyone help me?

Thanks,
Alessandro Carletti

-- Jim Holtman, Cincinnati, OH
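A worked version of the boxplot/bxp recipe with two made-up groups, so that the 90th percentile is computed per group (row 5 of stats holds the upper whisker end, one column per box). The data are simulated for illustration, and the null PDF device is used only so the sketch runs without opening a window.

```r
set.seed(42)
dat <- list(a = rnorm(100), b = rexp(100))   # simulated example data

# Compute the box statistics without drawing.
bp <- boxplot(dat, plot = FALSE)

# Row 5 of stats is the upper whisker end; replace it per group.
bp$stats[5, ] <- sapply(dat, quantile, probs = 0.9)

pdf(file = NULL)   # null device for this sketch; omit to plot on screen
bxp(bp)
dev.off()
```

The points stored in bp$out are left untouched, so observations between the new whisker end and the old one are simply not drawn.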
Re: [R] Converting numeric variable to time
    > x.f <- function(x) x %/% 100 + (x %% 100) / 60
    > x.f(c(727, 1134))
    [1]  7.45000 11.56667

I assume for the elapsed time you can just subtract the numbers.

On 11/17/05, Ravi Varadhan wrote:

Hi,

I have a data set where times of sample collection are (for some unknown reason) stored as numerical values such as 727, 1134, etc., to represent the times 7:27, 11:34, etc. How can I recover these times from the numerical values? Also, can I obtain the elapsed time between two such times?

Any help is greatly appreciated.

Thanks,
Ravi.

-- Jim Holtman, Cincinnati, OH
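Note that the subtraction has to be done on the converted values, not on the raw numbers (1134 - 727 is 407, which is not a time difference). A short sketch in minutes, with hypothetical helper names:

```r
# Hypothetical helpers for times stored as hhmm numbers (727 means 7:27).
hhmm_to_minutes <- function(x) (x %/% 100) * 60 + x %% 100
hhmm_label <- function(x) sprintf("%d:%02d", x %/% 100, x %% 100)

hhmm_label(c(727, 1134))

# Elapsed time between 7:27 and 11:34, in minutes.
elapsed <- hhmm_to_minutes(1134) - hhmm_to_minutes(727)
elapsed
```

The 247 minutes correspond to 4 hours 7 minutes, which agrees with the difference of the decimal hours above (11.56667 - 7.45).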
Re: [R] change some levels of a factor column in data frame according to a condition
try this:

    # create data
    x.by <- data.frame(crit1=rep(c(1,2),c(10,10)), crit2=sample(letters[1:4],20,T),
        val=runif(20))
    levels(x.by$crit2) <- c(levels(x.by$crit2), 'small')  # add 'small' to the levels
    y <- by(x.by, x.by$crit1, function(.grp){
        .small <- order(.grp$val)  # find the smallest values
        .grp$crit2[.small[1:min(2,length(.small))]] <- 'small'  # make sure we don't exceed vector
        .grp
    })
    do.call('rbind', y)  # put it back together

On 11/14/05, Gesmann, Markus wrote:

Dear R-users,

I am looking for an elegant way to change some levels of a factor column in a data frame according to a condition. Let's look at the following data frame:

    > data.frame(crit1=gl(2,5), crit2=factor(letters[1:10]), x=rnorm(10))
       crit1 crit2           x
    1      1     a -1.06957692
    2      1     b  0.24368402
    3      1     c -0.24958322
    4      1     d -1.37577955
    5      1     e -0.01713288
    6      2     f -1.25203573
    7      2     g -1.94348533
    8      2     h -0.16041719
    9      2     i -1.91572616
    10     2     j -0.20256478

Now I would like to find, for each level in crit1, the two smallest values of x and change the levels of crit2 to "small", so the result would look like this:

       crit1 crit2           x
    1      1 small -1.06957692
    2      1     b  0.24368402
    3      1     c -0.24958322
    4      1 small -1.37577955
    5      1     e -0.01713288
    6      2     f -1.25203573
    7      2 small -1.94348533
    8      2     h -0.16041719
    9      2 small -1.91572616
    10     2     j -0.20256478

Thank you for advice!
Markus Gesmann

-- Jim Holtman, Cincinnati, OH
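An alternative sketch that keeps the rows in their original order, using ave() to rank x within each crit1 group. It is run on the exact values quoted in the question; the name is_small is illustrative.

```r
d <- data.frame(
  crit1 = gl(2, 5),
  crit2 = factor(letters[1:10]),
  x = c(-1.06957692, 0.24368402, -0.24958322, -1.37577955, -0.01713288,
        -1.25203573, -1.94348533, -0.16041719, -1.91572616, -0.20256478)
)

# "small" must exist as a level before it can be assigned.
levels(d$crit2) <- c(levels(d$crit2), "small")

# Rank x within each crit1 group; the two lowest ranks are the two smallest values.
is_small <- ave(d$x, d$crit1,
                FUN = function(v) rank(v, ties.method = "first")) <= 2
d$crit2[is_small] <- "small"
d
```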
Re: [R] computation on a table
This will work if you are using matrices (if you have data frames, convert them to matrices):

    > table1
      q1 q3 q4 q8 q9
    A  5  2  0  1  3
    B  2  0  2  4  4
    > table2
      q1 q2 q3 q4 q5 q6 q7 q8 q9
    C 10  7  4  2  6  9  3  1  2
    > index <- match(colnames(table2), colnames(table1), nomatch=0)
    > t(t(table1[,index]) / table2[index != 0, drop=FALSE])
       q1  q3 q4 q8  q9
    A 0.5 0.5  0  1 1.5
    B 0.2 0.0  1  4 2.0

On 11/12/05, Claus Atzenbeck wrote:

Hello,

I have a table (1) of the form

      q1 q3 q4 q8 q9
    A  5  2  0  1  3
    B  2  0  2  4  4

I have another table (2):

      q1 q2 q3 q4 q5 q6 q7 q8 q9
    C 10  7  4  2  6  9  3  1  2

I would like to divide the numbers in table (1) by the number in the corresponding column of table (2):

        q1  q3  q4  q8  q9
    A 5/10 2/4 0/2 1/1 3/2
    B 2/10 0/4 2/2 4/1 4/2

The result would look like this:

       q1  q3 q4 q8  q9
    A 0.5 0.5  0  1 1.5
    B 0.2   0  1  4   2

BACKGROUND:

I have a data frame with measured times for answering questions. I want to know what PERCENTAGE of the answers are wrong, caused by reason A or B.

This gives me the subset of false answers. The table looks like table (1):

    fail <- subset(questions, type=="wrong")
    fail$qid <- factor(fail$qid)
    failtab <- table(fail$failtype, fail$qid)

The following gives me information about how often a specific question was asked. This is similar to table (2) above.

    count <- table(questions$failtype, questions$qid)
    count <- colSums(count)

One solution would be to delete the line that calls factor(...) on the subset and calculate failtab/count. However, then I would have to get rid of all columns of the table that have '0' in all rows.

Thanks for any hint.
Claus

-- Jim Holtman, Cincinnati, OH
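Because both tables carry question names as column names, the matching can also be done by name. The sketch below rebuilds the two tables from the thread as matrices and uses sweep() to divide each column of table1 by the matching entry of table2; this is an alternative formulation, not the code posted above.

```r
table1 <- matrix(c(5, 2, 0, 1, 3,
                   2, 0, 2, 4, 4),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("A", "B"), c("q1", "q3", "q4", "q8", "q9")))
table2 <- matrix(c(10, 7, 4, 2, 6, 9, 3, 1, 2), nrow = 1,
                 dimnames = list("C", paste0("q", 1:9)))

# Pick the denominators by column name, then divide column-wise.
denom <- table2["C", colnames(table1)]
ratio <- sweep(table1, 2, denom, "/")
ratio
```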
Re: [R] dataframe without repetition
    > toto[!duplicated(apply(toto,1,paste, collapse='.')),]
       id dpt
    1 id1  13
    3 id2  34
    4 id3  30

On 11/9/05, Bruno Cutayar wrote:

Hello,

with a data.frame like this:

    > toto <- data.frame(id=c("id1","id1","id2","id3","id3","id3"), dpt=c(13,13,34,30,30,30))
    > toto
       id dpt
    1 id1  13
    2 id1  13
    3 id2  34
    4 id3  30
    5 id3  30
    6 id3  30

what is the most efficient way to obtain:

       id dpt
    1 id1  13
    2 id2  34
    3 id3  30

?

Thanks in advance for your reply,
Bruno

-- Jim Holtman, Cincinnati, OH
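Base R also has data frame methods for unique() and duplicated(), so the rows do not need to be pasted into strings first. A minimal sketch on the data from the question; resetting the row names is optional and only reproduces the numbering asked for.

```r
toto <- data.frame(id  = c("id1", "id1", "id2", "id3", "id3", "id3"),
                   dpt = c(13, 13, 34, 30, 30, 30))

dedup <- unique(toto)      # same rows as toto[!duplicated(toto), ]
rownames(dedup) <- NULL    # renumber rows 1, 2, 3
dedup
```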
Re: [R] Using split and sapply to return entire lines
Not the most straightforward way, but it works:

    # row with the maximum length for each month
    y <- lapply(split(seq(x$month), x$month), function(.x){
        .max <- which.max(x$length[.x])
        x[.x[.max],]
    })
    do.call('rbind', y)

    # mean of every column for each month
    y <- lapply(split(seq(x$month), x$month), function(.x){
        data.frame(month=x$month[.x[1]], length=mean(x$length[.x]),
            ratio=mean(x$ratio[.x]), monthly1=mean(x$monthly1[.x]),
            monthly2=mean(x$monthly2[.x]))
    })
    do.call('rbind', y)

On 11/8/05, Todd A. Gibson wrote:

Hello,

I have a data manipulation problem that I can easily resolve by using perl or python to pre-process the data, but I would prefer to do it directly in R. Given, for example:

      month length ratio monthly1 monthly2
    1   Jan     23   0.1        9        6
    2   Jan     45   0.2        9        6
    3   Jan     16   0.3        9        6
    4   Feb     14   0.2        1        9
    5   Mar     98   0.4        2        2
    6   Mar     02   0.6        2        2

(FWIW, monthly1 and monthly2 are unchanged for each month.)

I understand how to do aggregations on single fields using split and sapply, but how can I get entire lines? For example, for the maximum of data$length grouped by data$month, I would like to get back some form of:

    2 Jan 45 0.2 9 6
    4 Feb 14 0.2 1 9
    5 Mar 98 0.4 2 2

For the mean, I would like to average all columns:

    Jan 28 0.2 9 6
    Feb 14 0.2 1 9
    Mar 50 0.5 2 2

Thank you,
-TAG
Todd A. Gibson

-- Jim Holtman, Cincinnati, OH
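A compact variant of the two steps above, run on the data from the question: splitting the data frame itself (rather than its row indices) for the maximum rows, and aggregate() for the column means. This is an alternative sketch rather than the posted solution; the groups come out in alphabetical order of month.

```r
x <- data.frame(
  month    = c("Jan", "Jan", "Jan", "Feb", "Mar", "Mar"),
  length   = c(23, 45, 16, 14, 98, 2),
  ratio    = c(0.1, 0.2, 0.3, 0.2, 0.4, 0.6),
  monthly1 = c(9, 9, 9, 1, 2, 2),
  monthly2 = c(6, 6, 6, 9, 2, 2)
)

# Entire row holding the maximum length within each month.
max_rows <- do.call(rbind, lapply(split(x, x$month), function(g) {
  g[which.max(g$length), ]
}))

# Mean of every numeric column within each month.
means <- aggregate(x[, -1], by = list(month = x$month), FUN = mean)

max_rows
means
```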
Re: [R] Anything like associative arrays in R?
Is this what you want?

    x <- list()
    for (i in c('test', 'some', 'more')){
        for(j in c('lv1', 'lv2', 'lv3')){
            x[[i]][[j]] <- runif(10)
        }
    }
    x
    x[['some']][['lv2']]

On 11/2/05, kj wrote:

Let me preface my question by stressing that I am much less interested in the answer than in learning a way I could have *found the answer myself*. (As helpful as the participants in this list are, I have far too many R-related questions to resolve by posting here, and as I've written before, in my experience the R documentation has not been very helpful, but I remain hopeful that I may have managed to miss some crucial document.)

The task I want to accomplish is very simple: to define and sequentially initialize M x N variables *programmatically*, according to two different categories, containing N and M values, respectively. In languages with associative arrays, the typical way to do this is to define a 2-d associative array; e.g. in Perl one could do

    for $i ( 'foo', 'bar', 'baz' ) {
        for $j ( 'eenie', 'meenie', 'minie', 'moe' ) {
            $table{ $i }{ $j } = read_table( "path/to/data/${i}_${j}.dat" );
        }
    }

How does one do this in R? In particular, what's the equivalent of the above in R? Most importantly, how could I have found out this answer from the R docs?

Many thanks in advance,
kj

-- Jim Holtman, Cincinnati, OH
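For comparison with the Perl loop in the question, here is a sketch of the same nested structure using named lists, which can be indexed by string much like an associative array. The file name string stands in for the real read.table() call so that the example runs without any data files; the path pattern is the one from the question, nothing more.

```r
tables <- list()
for (i in c("foo", "bar", "baz")) {
  tables[[i]] <- list()                       # inner "hash" for this key
  for (j in c("eenie", "meenie", "minie", "moe")) {
    # Placeholder for read.table(file_name); the files are not available here.
    file_name <- paste0(i, "_", j, ".dat")
    tables[[i]][[j]] <- file_name
  }
}

names(tables)                 # outer keys
names(tables[["bar"]])        # inner keys
tables[["bar"]][["minie"]]    # lookup by two string keys
```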
Re: [R] Matrix operations please help
This will give you a matrix with the row/column names of the respective values. Not sure what you meant by a list, but you can convert the matrix to a list.

    > m2
      a b  c  d  e
    A 1 5  9 13 17
    B 2 6 10 14 18
    C 3 7 11 15 19
    D 4 8 12 16 20
    > cbind(row=rownames(m2)[row(m2)[lower.tri(m2)]],
    +       col=colnames(m2)[col(m2)[lower.tri(m2)]],
    +       value=m2[lower.tri(m2)])
         row col value
    [1,] "B" "a" "2"
    [2,] "C" "a" "3"
    [3,] "D" "a" "4"
    [4,] "C" "b" "7"
    [5,] "D" "b" "8"
    [6,] "D" "c" "12"

On 10/31/05, Srinivas Iyyer wrote:

Dear Group,

I am a novice R programmer with little statistical background. I am a molecular biologist by training. I generated a correlation matrix (157 x 157) for 157 variables. I want to select only the unique values (the values on one side of the diagonal). I want these unique correlation values in a list. How can I do this? Could anyone help me, please?

Thank you.
sr

-- Jim Holtman, Cincinnati, OH
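The same lower.tri() idea applied to an actual correlation matrix, as a sketch with simulated data (5 variables instead of 157). Collecting the result in a data frame rather than with cbind() keeps the correlations numeric instead of converting them to character.

```r
set.seed(1)
dat <- matrix(rnorm(50), ncol = 5,
              dimnames = list(NULL, paste0("v", 1:5)))   # simulated variables
m <- cor(dat)

keep <- lower.tri(m)   # TRUE strictly below the diagonal
cor_pairs <- data.frame(
  row   = rownames(m)[row(m)[keep]],
  col   = colnames(m)[col(m)[keep]],
  value = m[keep]
)
cor_pairs
```

For n variables this yields n(n-1)/2 rows, so 10 here and 12246 for the 157 variables in the question.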
Re: [R] splitting a character field in R
    > x <- 'dfabcxy'
    > strsplit(x, 'abc')
    [[1]]
    [1] "df" "xy"

On 10/28/05, Manuel wrote:

Dear R users,

I have a dataframe with one character field, and I would like to create two new fields (columns) in my dataset by splitting the existing character field into two using an existing substring. This is something that in SAS I could solve e.g. by combining substr (which I am aware exists in R) and index to determine the position of the pattern within the string.

E.g. if my dataframe is

    A B
    1 dgabcrt
    2 fgrtabc
    3 sabcuuu

then by splitting on the substring abc I would get

    A B       B1   B2
    1 dgabcrt dg   rt
    2 fgrtabc fgrt
    3 sabcuuu s    uuu

Do you know how to do this basic string (dataframe) manipulation in R?

Saludos,
Manuel

-- Jim Holtman, Cincinnati, OH
Re: [R] splitting a character field in R
    > x <- c('dfabcxt','wwabc','abcyy','xyz')
    > (y <- strsplit(x, 'abc'))
    [[1]]
    [1] "df" "xt"

    [[2]]
    [1] "ww"

    [[3]]
    [1] ""   "yy"

    [[4]]
    [1] "xyz"

    > sapply(y, "[", 1)  # first column
    [1] "df"  "ww"  ""    "xyz"
    > sapply(y, "[", 2)  # second column
    [1] "xt" NA   "yy" NA

On 10/28/05, Manuel wrote:

Hi Jim,

Thanks for your post. I was aware of strsplit, but really could not find out how I could use it. I tried it as in your example:

    > A <- c(1,2,3)
    > B <- c("dgabcrt","fgrtabc","sabcuuu")
    > C <- strsplit(B,"abc")
    > C
    [[1]]
    [1] "dg" "rt"

    [[2]]
    [1] "fgrt"

    [[3]]
    [1] "s"   "uuu"

This looks promising, but here C is a list with three elements. How do I create the two vectors I need from here, that is (dg, fgrt, s) and (rt, , uuu), or how do I get access to the substrings rt or uuu?

Greetings,
Manuel

-- Jim Holtman, Cincinnati, OH
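Putting the strsplit() and sapply() steps together on the data frame from the question gives the two new columns directly. This sketch fills a missing second piece with an empty string instead of NA, which is an assumption based on the blank cell in the desired output; fixed = TRUE treats "abc" as a literal string rather than a regular expression.

```r
d <- data.frame(A = 1:3,
                B = c("dgabcrt", "fgrtabc", "sabcuuu"),
                stringsAsFactors = FALSE)

parts <- strsplit(d$B, "abc", fixed = TRUE)

d$B1 <- sapply(parts, `[`, 1)                                       # text before "abc"
d$B2 <- sapply(parts, function(p) if (length(p) >= 2) p[2] else "") # text after, or ""
d
```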
Re: [R] Repeating lines in a data frame
Try this:

> x
   year species length count
1  1998       1    150     1
2  1998       2    200     1
3  1998       3    250     2
4  1999       1    150     3
5  1999       2    200     4
6  1999       3    250     5
7  2000       1    150     1
8  2000       2    200     1
9  2000       3    250     1
10 2001       1    150     2
11 2001       2    200     3
12 2001       3    250     1
13 2002       1    150     1
14 2002       2    200     2
15 2002       3    250     3
> y <- unlist(lapply(seq(nrow(x)), function(.row) rep(.row, x$count[.row])))  # replicate the row numbers
> y
 [1]  1  2  3  3  4  4  4  5  5  5  5  6  6  6  6  6  7  8  9 10 10 11 11 11 12 13 14 14 15 15 15
> result <- x[y,]      # pick out the rows
> result$count <- 1    # set the count to 1
> result
     year species length count
1    1998       1    150     1
2    1998       2    200     1
3    1998       3    250     1
3.1  1998       3    250     1
4    1999       1    150     1
4.1  1999       1    150     1
4.2  1999       1    150     1
5    1999       2    200     1
5.1  1999       2    200     1
5.2  1999       2    200     1
5.3  1999       2    200     1
6    1999       3    250     1
6.1  1999       3    250     1
6.2  1999       3    250     1
6.3  1999       3    250     1
6.4  1999       3    250     1
7    2000       1    150     1
8    2000       2    200     1
9    2000       3    250     1
10   2001       1    150     1
10.1 2001       1    150     1
11   2001       2    200     1
11.1 2001       2    200     1
11.2 2001       2    200     1
12   2001       3    250     1
13   2002       1    150     1
14   2002       2    200     1
14.1 2002       2    200     1
15   2002       3    250     1
15.1 2002       3    250     1
15.2 2002       3    250     1

On 10/18/05, Guenther, Cameron [EMAIL PROTECTED] wrote:

Hello,

I have a much larger dataset that is similar in form to:

year species length count
1998       1    150     1
1998       2    200     1
1998       3    250     2
1999       1    150     3
1999       2    200     4
1999       3    250     5
2000       1    150     1
2000       2    200     1
2000       3    250     1
2001       1    150     2
2001       2    200     3
2001       3    250     1
2002       1    150     1
2002       2    200     2
2002       3    250     3

What I want is one line of data for each individual in every year x species x length group combination, so that every line has a count of 1. I would like the output to be:

Year species length count
1998       1    150     1
1998       2    200     1
1998       3    250     1
1998       3    250     1
1999       1    150     1
1999       1    150     1
1999       1    150     1
1999       2    200     1
. . .

Can anyone help me with a for statement or a function that can accomplish this?

Thanks

Cameron Guenther
Associate Research Scientist
FWC/FWRI, Marine Fisheries Research
100 8th Avenue S.E.
St. Petersburg, FL 33701
(727)896-8626 Ext. 4305
[EMAIL PROTECTED]

-- 
Jim Holtman
Cincinnati, OH
+1 513 247 0281
What is the problem you are trying to solve?
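The row replication in this thread can probably be written without lapply, because rep() accepts a vector of repeat counts. The sketch below uses a small made-up data frame in the same shape as the posted one; the values and the final rownames step are illustrative assumptions, not part of the original reply.

```r
# Toy data in the same shape as the posted example (values are illustrative).
x <- data.frame(year    = c(1998, 1998, 1999),
                species = c(1, 2, 1),
                length  = c(150, 200, 150),
                count   = c(1, 2, 3))

# rep() with a 'times' vector repeats row index i exactly count[i] times.
idx <- rep(seq_len(nrow(x)), times = x$count)

result <- x[idx, ]        # pick out the replicated rows
result$count <- 1         # every line now stands for one individual
rownames(result) <- NULL  # optional: drop the 2.1, 3.1 style row names
result
```

This should give the same rows as the lapply version; it only avoids building the index one row at a time.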
Re: [R] aggregate slow with many rows - alternative?
Here is the way that I would do it: use 'lapply' to process the list and create a matrix. It takes less than 1 second:

> dat <- data.frame(D=sample(32000:33000, 33000, TRUE),
+     Fid=sample(1:10, 33000, TRUE), A=sample(1:5, 33000, TRUE))
> system.time({
+     result <- lapply(split(seq(nrow(dat)), dat$D), function(.d){  # split by first level
+         lapply(split(.d, dat$Fid[.d]), function(.f){  # now by the second
+             # create the sum and count
+             c(D=dat$D[.f[1]], Fid=dat$Fid[.f[1]], sum=sum(dat$A[.f]), cnt=length(.f))
+         })
+     })
+     mat <- do.call('rbind', lapply(result, function(x) do.call('rbind', x)))
+ })
[1] 0.66 0.00 0.73   NA   NA
> mat[1:20,]
       D Fid sum cnt
1  32000   1   8   3
2  32000   2  11   4
3  32000   3  11   3
4  32000   4   2   1
5  32000   5   8   2
6  32000   6   4   2
7  32000   7  21   6
8  32000   8  13   3
9  32000   9  12   4
10 32000  10  10   3
1  32001   1  12   4
2  32001   2   2   1
3  32001   3  10   4
4  32001   4  12   3
5  32001   5  10   3
6  32001   6   8   2
7  32001   7  22   7
8  32001   8   3   2
9  32001   9   7   3
10 32001  10   3   2

On 10/14/05, TEMPL Matthias [EMAIL PROTECTED] wrote:

Hi,

Yesterday I analysed data with 16 rows and 10 columns. Aggregation would be impossible with a data frame format, but after converting it to a matrix with *numeric* entries (check that the variables are of class numeric!) the computation needs only 7 seconds on a Pentium III. I'm sad to say that this is still slow in comparison with proc summary in SAS (less than one second), but the code is much more elegant in R!

Best,
Matthias

Hi,

I use the code below to aggregate / count my test data. It works fine, but the problem is with my real data (33'000 rows), where the function is really slow (nothing happened in half an hour).

Does anybody know of other functions that I could use?

Thanks,
Hans-Peter

--
dat <- data.frame(
  Datum     = c( 32586, 32587, 32587, 32625, 32656, 32656, 32656, 32672, 32672, 32699 ),
  FischerID = c( 58395, 58395, 58395, 88434, 89953, 89953, 89953, 64395, 62896, 62870 ),
  Anzahl    = c( 2, 2, 1, 1, 2, 1, 7, 1, 1, 2 ) )
f <- function(x) data.frame( Datum = x[1,1], FischerID = x[1,2], Anzahl = sum( x[,3] ), Cnt = dim( x )[1] )
t.a <- do.call(rbind, by(dat, dat[,1:2], f))   # slow for 33'000 rows
t.a <- t.a[order( t.a[,1], t.a[,2] ),]

# show data
dat
t.a

-- 
Jim Holtman
Cincinnati, OH
+1 513 247 0281
What is the problem you are trying to solve?
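As a further alternative to the nested lapply in this thread, the sum and count per (D, Fid) pair can be sketched with a single grouping key and tapply. This is only one possible approach; the simulated data mirror the reply's example, and the names key, first, and res are made up for illustration.

```r
set.seed(1)  # simulated data, same shape as the example in the reply
dat <- data.frame(D   = sample(32000:33000, 33000, TRUE),
                  Fid = sample(1:10, 33000, TRUE),
                  A   = sample(1:5, 33000, TRUE))

# One key per (D, Fid) pair; drop = TRUE skips combinations that never occur.
key   <- interaction(dat$D, dat$Fid, drop = TRUE)
first <- !duplicated(key)               # first row of every group
lab   <- as.character(key[first])       # group labels, in order of appearance

res <- data.frame(D   = dat$D[first],
                  Fid = dat$Fid[first],
                  sum = as.vector(tapply(dat$A, key, sum)[lab]),
                  cnt = as.vector(table(key)[lab]))
res <- res[order(res$D, res$Fid), ]     # sort as in the original question
head(res)
```

The grouping work is done once in interaction(), so no data frame is built per group, which is usually where by() and aggregate() spend their time.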
Re: [R] color for points
If you know explicitly that there are just 60 points, you can use:

plot(csr(), col=c(rep('blue',20), rep('red',20), rep('green',20)))

On 10/8/05, Sam R. Smith [EMAIL PROTECTED] wrote:

Hi,

I have the following code to randomly generate the points:

csr <- function(n=60){
  x=runif(n)
  y=runif(n)
  f=cbind(x,y)
}
plot(csr())

I wonder how to make the first twenty points BLUE, the second twenty points RED, and the last twenty points GREEN?

Thanks,
Sam

-- 
Jim Holtman
Cincinnati, OH
+1 513 247 0281
What is the problem you are trying to solve?
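The colour vector from this thread can be shortened with the each argument of rep(), which also adapts if the number of points changes. A minimal sketch, assuming the number of points is divisible by three; pts and cols are illustrative names.

```r
# Same generator as in the question, written to return the matrix visibly.
csr <- function(n = 60) cbind(x = runif(n), y = runif(n))

pts <- csr(60)

# 'each' repeats every colour in turn: 20 blue, then 20 red, then 20 green.
cols <- rep(c("blue", "red", "green"), each = nrow(pts) / 3)

plot(pts, col = cols)
```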
Re: [R] Interpolation in time
Is this what you want?

> yr <- c(rep(2000,14))
> doy <- c(16:29)
> dat <- c(3.2,NA,NA,NA,NA,NA,NA,5.1,NA,NA,NA,NA,NA,4.6)
> ta <- cbind(yr,doy,dat)
> ta
        yr doy dat
 [1,] 2000  16 3.2
 [2,] 2000  17  NA
 [3,] 2000  18  NA
 [4,] 2000  19  NA
 [5,] 2000  20  NA
 [6,] 2000  21  NA
 [7,] 2000  22  NA
 [8,] 2000  23 5.1
 [9,] 2000  24  NA
[10,] 2000  25  NA
[11,] 2000  26  NA
[12,] 2000  27  NA
[13,] 2000  28  NA
[14,] 2000  29 4.6
> good <- !is.na(ta[,'dat'])
> x.f <- approxfun(ta[good,'doy'], ta[good,'dat'], rule=2)
> ta[!good, 'dat'] <- x.f(ta[!good, 'doy'])
> ta
        yr doy      dat
 [1,] 2000  16 3.200000
 [2,] 2000  17 3.471429
 [3,] 2000  18 3.742857
 [4,] 2000  19 4.014286
 [5,] 2000  20 4.285714
 [6,] 2000  21 4.557143
 [7,] 2000  22 4.828571
 [8,] 2000  23 5.100000
 [9,] 2000  24 5.016667
[10,] 2000  25 4.933333
[11,] 2000  26 4.850000
[12,] 2000  27 4.766667
[13,] 2000  28 4.683333
[14,] 2000  29 4.600000

On 10/6/05, Anette Nørgaard [EMAIL PROTECTED] wrote:

Can anybody help me write code for the following data example that fills in all NA values by linear interpolation between the two closest values? Doy is day of year (%j).

Code example:

yr <- c(rep(2000,14))
doy <- c(16:29)
dat <- c(3.2,NA,NA,NA,NA,NA,NA,5.1,NA,NA,NA,NA,NA,4.6)
ta <- cbind(yr,doy,dat)
ta
        yr doy dat
 [1,] 2000  16 3.2
 [2,] 2000  17  NA
 [3,] 2000  18  NA
 [4,] 2000  19  NA
 [5,] 2000  20  NA
 [6,] 2000  21  NA
 [7,] 2000  22  NA
 [8,] 2000  23 5.1
 [9,] 2000  24  NA
[10,] 2000  25  NA
[11,] 2000  26  NA
[12,] 2000  27  NA
[13,] 2000  28  NA
[14,] 2000  29 4.6

Anette Norgaard

-- 
Jim Holtman
Cincinnati, OH
+1 513 247 0281
What is the problem you are trying to solve?
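The same interpolation can also be sketched with approx(), which evaluates at given points directly instead of returning a function. This is a variation on the reply, not a correction of it; filled is an illustrative name.

```r
doy <- 16:29
dat <- c(3.2, NA, NA, NA, NA, NA, NA, 5.1, NA, NA, NA, NA, NA, 4.6)

good <- !is.na(dat)

# Interpolate linearly between known points and evaluate at every day.
# rule = 2 would hold the nearest known value outside the observed range.
filled <- approx(doy[good], dat[good], xout = doy, rule = 2)$y
filled
```

Known values are returned unchanged, so the whole column can be overwritten with filled in one step.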
Re: [R] newbie questions - looping through hierarchial datafille
out where inventories/stratums/plots/trees finished and started so I could create summary statistics for each of them. For example, how many plots in a stratum? How many trees in a plot?

An example of the SAS code I would use follows (not checked for errors!!!). If anybody could give me some idea of what the right approach in R would be for a similar analysis, it would be greatly appreciated.

regards
Andrew

Data datafile;
infile 'test.txt';
input @1 tag $1. @@;
retain inventory stratum plot tree leader;
if tag = 'A' then input @3 inventory $.;
if tag = 'X' then input @3 stratum_no $. total $. yearest $. ;
if tag = 'P' then input @3 plot_no $. age $. slope $. species $;
if tag = 'T' then input @3 tree_no $. frequency ;
if tag = 'L' then input @3 leader_no $ diameter height ;
if tag = 'F' then input @3 start $ finish $ feature $;
if tag = 'F' then output;
run;
proc sort data = datafile;
by inventory stratum_no plot_no tree_no leader_no;
* calculate mean dbh in each plot
data dbh set datafile;
by inventory stratum_no plot_no tree_no leader_no
if first.leader_no then output;
proc summary data = diameter;
by inventory stratum plot tree;
var diameter;
output out = mean mean=;
run;

A BENALLA_1
X 1 10 YE=1985
P 1 20.25 slope=14 SPP:P.RAD
T 1 25
L 0 28.5 21.3528
F 0 21.3528 SFNSW_DIC:P
F 21.3528 100 SFNSW_DIC:P
T 2 25
L 0 32 23.1
F 0 6.5 SFNSW_DIC:A
F 6.5 23.1 SFNSW_DIC:C
F 23.1 100 SFNSW_DIC:C
T 3 25
L 0 39.5 22.2407
F 0 4.7 SFNSW_DIC:A
F 4.7 6.7 SFNSW_DIC:C
P 2 20.25 slope=13 SPP:P.RAD
T 1 25
L 0 38 22.1474
F 0 1 SFNSW_DIC:G
F 1 2.3 SFNSW_DIC:A
T 1001 25
L 0 38 22.1474
F 0 1 SFNSW_DIC:G
F 1 2.3 SFNSW_DIC:A
T 2 25
L 0 32.5 21.7386
F 0 2 SFNSW_DIC:A
F 2 3.3 SFNSW_DIC:G
F 3.3 10.4 SFNSW_DIC:C
X 2 10 YE=1985
P 1 20.25 slope=14 SPP:P.RAD
T 1 25
L 0 28.5 21.3528
F 0 21.3528 SFNSW_DIC:P
F 21.3528 100 SFNSW_DIC:P
T 2 25
L 0 32 23.1
F 0 6.5 SFNSW_DIC:A
F 6.5 23.1 SFNSW_DIC:C
F 23.1 100 SFNSW_DIC:C
T 3 25
L 0 39.5 22.2407
F 0 4.7 SFNSW_DIC:A
F 4.7 6.7 SFNSW_DIC:C
P 2 20.25 slope=13 SPP:P.RAD
T 1 25
L 0 38 22.1474
F 0 1 SFNSW_DIC:G
F 1 2.3 SFNSW_DIC:A
T 1001 25
L 0 38 22.1474
F 0 1 SFNSW_DIC:G
F 1 2.3 SFNSW_DIC:A
T 2 25
L 0 32.5 21.7386
F 0 2 SFNSW_DIC:A
F 2 3.3 SFNSW_DIC:G
F 3.3 10.4 SFNSW_DIC:C

-- 
Jim Holtman
Cincinnati, OH
+1 513 247 0281
What is the problem you are trying to solve?
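One way to find where strata, plots, and trees start and finish in a tagged file like this is to take a running count of each tag with cumsum. The following is a rough sketch under several assumptions: records are one per line, the tag is the first character, and the leader ('L') fields are leader number, diameter, height, as the SAS input statements suggest. The handful of records is copied from the posted data; in practice recs would come from readLines().

```r
# A few records from the posted file (in practice: recs <- readLines("test.txt")).
recs <- c("A BENALLA_1",
          "X 1 10 YE=1985",
          "P 1 20.25 slope=14 SPP:P.RAD",
          "T 1 25", "L 0 28.5 21.3528",
          "T 2 25", "L 0 32 23.1",
          "P 2 20.25 slope=13 SPP:P.RAD",
          "T 1 25", "L 0 38 22.1474")

tag <- substr(recs, 1, 1)

# Running counters: each record inherits the most recent stratum and plot.
stratum <- cumsum(tag == "X")
plot_id <- cumsum(tag == "P")   # counts plots across the whole file

# How many trees in each plot?
trees_per_plot <- table(plot_id[tag == "T"])

# Mean diameter per plot, taken from the third field of the leader records.
is_L     <- tag == "L"
diameter <- as.numeric(sapply(strsplit(recs[is_L], " +"), `[`, 3))
mean_dbh <- tapply(diameter, plot_id[is_L], mean)

trees_per_plot
mean_dbh
```

The same counters can be combined, for example table(stratum[tag == "P"]) for the number of plots per stratum.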
Re: [R] missing handling
Use 'which(..., arr.ind=T)':

> x.1
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    6   10    3    4   10    7    9    8    4    10
 [2,]    8    7    4    7    4    8    3   NA    3     4
 [3,]    7    7   10   10    3    5    3    2    2     2
 [4,]    3    4    5   10   10    2    6    9    4     5
 [5,]    3    5    9    5    6   NA    3   NA    6     7
 [6,]    9    6   10    5   10    4    2   10   NA     5
 [7,]    5    2    5   10    3    7    6    4    6     8
 [8,]    2    6    1    8    9    2    7    8    3     8
 [9,]    9    1    4    9    8   10    2   NA    1     7
[10,]    2    4    8    7   NA    4    3   NA    5     5
> x.4
 [1] 5.5 5.5 5.0 7.5 8.0 5.0 3.0 8.0 4.0 6.0
> Med <- apply(x.1, 2, median, na.rm=T)  # get median
> Ind <- which(is.na(x.1), arr.ind=T)    # determine which are NA
> x.1[Ind] <- Med[Ind[,'col']]           # replace with median
> x.1
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    6   10    3    4   10    7    9    8    4    10
 [2,]    8    7    4    7    4    8    3    8    3     4
 [3,]    7    7   10   10    3    5    3    2    2     2
 [4,]    3    4    5   10   10    2    6    9    4     5
 [5,]    3    5    9    5    6    5    3    8    6     7
 [6,]    9    6   10    5   10    4    2   10    4     5
 [7,]    5    2    5   10    3    7    6    4    6     8
 [8,]    2    6    1    8    9    2    7    8    3     8
 [9,]    9    1    4    9    8   10    2    8    1     7
[10,]    2    4    8    7    8    4    3    8    5     5

On 9/27/05, Weiwei Shi [EMAIL PROTECTED] wrote:

Hi,

I have the following code to replace missing values with the median, assuming missing values only occur in continuous variables:

trn1 <- read.table('trn1.fv', header=F, na.string='.', sep='|')

# median
m.trn1 <- sapply(1:ncol(trn1), function(i) median(trn1[,i], na.rm=T))

# replace
trn2 <- trn1
for (each in 1:nrow(trn1)){
  index.missing = which(is.na(trn1[each,]))
  trn2[each,] <- replace(trn1[each,], index.missing, m.trn1[index.missing])
}

Can anyone suggest ways to improve it, since replacing 10 rows takes 1.5 sec:

> system.time(for (each in 1:10){index.missing=which(is.na(trn1[each,])); trn2[each,] <- replace(trn1[each,], index.missing, m.trn1[index.missing]);})
[1] 1.53 0.00 1.53 0.00 0.00

Another general question: are there packages in R for handling missing values?

Thanks,

--
Weiwei Shi, Ph.D

Did you always know?
No, I did not. But I believed...
---Matrix III

-- 
Jim Holtman
Cincinnati, OH
+1 513 247 0281
What is the problem you are trying to solve?
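The matrix indexing in the reply needs a numeric matrix. For a data frame read with read.table, which may mix numeric and non-numeric columns, a column-wise version is one option. This is a sketch on made-up data; trn1 here is a tiny stand-in for the poster's file, and the column names are invented.

```r
# Small stand-in for the poster's data: two numeric columns and one text column.
trn1 <- data.frame(a = c(1, NA, 3, 4),
                   b = c(NA, 2, 2, 10),
                   g = c("u", "v", NA, "u"))

is_num <- sapply(trn1, is.numeric)

# Work one column at a time: replace NA by that column's median.
trn2 <- trn1
trn2[is_num] <- lapply(trn1[is_num], function(col) {
  col[is.na(col)] <- median(col, na.rm = TRUE)
  col
})
trn2
```

Looping over columns rather than rows avoids the repeated row-wise data frame assignment that made the original loop slow.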
Re: [R] Automatic creation of file names
Actually it is easier to use 'sprintf' in this situation:

> sprintf("file%04d", 1:10)
 [1] "file0001" "file0002" "file0003" "file0004" "file0005" "file0006" "file0007" "file0008"
 [9] "file0009" "file0010"

On 9/22/05, Ted Harding [EMAIL PROTECTED] wrote:

On 22-Sep-05 Mike Prager wrote:

Walter --

P.S. The advantage of using formatC over pasting the digits (1:1000) directly is that when one uses leading zeroes, as in the formatC example shown, the resulting filenames will sort into proper order.

...MHP

You can use paste() with something like

formatC(number,digits=0,wid=3,flag="0")

(where number is your loop index) to generate the filenames.

on 9/22/2005 10:21 AM Leite,Walter said the following:

I have a question about how to save to the hard drive the one thousand datasets I generated in a simulation.

For this precise question, the replies for filename creation, though useful, have been slightly off-target.

Walter presumably wants the filenames to be in sortable order corresponding to the numerical order of creation, i.e. like

file0001 file0002 ... file1000

The precise formatC specification required for this would be

formatC(n,digits=0,wid=4,format="d",flag="0")

so that

formatC(1,digits=0,wid=4,format="d",flag="0")
# [1] "0001" -> file0001
formatC(999,digits=0,wid=4,format="d",flag="0")
# [1] "0999" -> file0999
formatC(1000,digits=0,wid=4,format="d",flag="0")
# [1] "1000" -> file1000

The suggestions with wid=3 would give

formatC(999,digits=0,wid=3,format="d",flag="0")
# [1] "999" -> file999
formatC(1000,digits=0,wid=3,format="d",flag="0")
# [1] "1000" -> file1000

which are now in the wrong order (since file1000 sorts alphabetically prior to file999).

Also, if format="d" is not specified we get things like

formatC(100,digits=0,wid=3,flag="0")
# [1] "1e+02" -> file1e+02

which, while a valid filename, is on its head for sorting (since now the exponent sorts fastest!).

Best wishes to all,
Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 22-Sep-05 Time: 17:51:36
-- XFMail --

-- 
Jim Holtman
Cincinnati, OH
+1 513 247 0281
What is the problem you are trying to solve?
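To tie the zero-padded names from this thread back to the original question of saving simulated datasets, here is a small sketch. The padding width is derived from the number of files so the names keep sorting correctly; the .csv extension, the "sim" subdirectory, and the toy data frames are all assumptions made for illustration.

```r
n <- 1000                                # number of simulated datasets
width <- nchar(as.character(n))          # 4 digits are enough for 1..1000

# Build the format string, e.g. "file%04d.csv", then all names at once.
fmt    <- paste0("file%0", width, "d.csv")
fnames <- sprintf(fmt, 1:n)

# Write the first few (toy) datasets into a temporary directory.
out_dir <- file.path(tempdir(), "sim")
dir.create(out_dir, showWarnings = FALSE)
for (i in 1:3) {
  sim_data <- data.frame(x = rnorm(5))   # placeholder for a real simulation
  write.csv(sim_data, file.path(out_dir, fnames[i]), row.names = FALSE)
}

head(fnames, 3)
```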
Re: [R] column-binary data
Each card column had 12 rows, so as binary it comes in as 12 bits. The question is whether this arrives as a 16-bit integer or as a string of 12 bits that I have to extract from. Either case is not that difficult to handle.

On 9/16/05, Ted Harding [EMAIL PROTECTED] wrote:

On 16-Sep-05 David Barron wrote:

I have a number of datasets that are in multipunch column-binary format. Does anyone have any advice on how to read this into R? Thanks.

David

Do you mean something like the old HOLLERITH PUNCHED CARD BINARY FORMAT?

1011100110110
0100011001001
010100111001100010011
0010100010101100100101001
0001000110011100011101011
01000100111010010101001110001
01001011010100111010100101101

(here 1 = hole in card, binary representation of 7-bit ASCII encoding, high-order bit on top).

If so, or if you precisely describe the binary format you have, then the above or similar should be easy to get into R.

Best wishes,
Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 16-Sep-05 Time: 19:56:01
-- XFMail --

-- 
Jim Holtman
Cincinnati, OH
+1 513 247 0281
What is the problem you are trying to solve?
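For the first case mentioned in this thread (each card column stored as a 16-bit integer), a rough sketch follows. The actual file layout was never described, so everything here is an assumption: little-endian byte order, unsigned values, and punch row k stored in bit k-1 of the integer. The two test values are fabricated in memory only so that the example is self-contained.

```r
# Two made-up card columns written as 16-bit integers into a raw vector.
# With a real file, pass a connection opened with file(path, "rb") instead.
raw_bytes <- writeBin(c(5L, 2049L), raw(), size = 2, endian = "little")

# Read them back as unsigned 16-bit integers (assumed layout).
cols <- readBin(raw_bytes, what = "integer", n = 2, size = 2,
                signed = FALSE, endian = "little")

# One row per card column, one 0/1 entry per punch row (low-order bit first).
punches <- t(sapply(cols, function(w) as.integer(intToBits(w))[1:12]))
punches
```

If the data instead arrive as a packed stream of 12-bit groups, the same intToBits idea applies after reading the bytes with what = "raw" and regrouping the bits.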
Re: [R] [newbie] Want to perform some simple regressions.
Try 'nls'. Here is your data applied to it. It looks like you had an 'exact' fit:

> x.1[1:10,]
   V1        V2
1   0  2.205955
2   1  8.150580
3   2 15.851324
4   3 22.442796
5   4 29.358580
6   5 36.460605
7   6 43.751692
8   7 51.223688
9   8 58.866102
10  9 66.668220
> x.p <- nls(V2 ~ (a*V1+b)*log(c*V1+d), x.1, start=list(a=1,b=1,c=1,d=1))
> x.p
Nonlinear regression model
  model:  V2 ~ (a * V1 + b) * log(c * V1 + d)
   data:  x.1
       a        b        c        d
1.994722 6.807986 1.495003 1.301922
 residual sum-of-squares:  1.006867

On 8/28/05, Thomas Baruchel [EMAIL PROTECTED] wrote:

On Sun, Aug 28, 2005 at 09:48:15AM +0200, Thomas Baruchel wrote:

Is R the right choice? Please, could you show me step by step how you would proceed on this example (data below) in order to let me

I forgot my data :-(

0 2.205954909440447
1 8.150580118785099
2 15.851323727378597
3 22.442795956953574
4 29.358579800271354
5 36.46060528847214
6 43.7516923268591
7 51.223688311610026
8 58.86610205087116
9 66.66821956399055
10 74.61990268453171
11 82.71184423952718
12 90.93560520053082
13 99.28356700194489
14 107.74885489906521
15 116.3252559311549
16 125.00714110112291
17 133.78939523822717
18 142.6673553086964
19 151.63675679510055
20 160.69368733376777
21 169.834546691509
22 179.05601219606618
23 188.35500882314003
24 197.72868324657364
25 207.17438125936408
26 216.68962806440814
27 226.2721110130965
28 235.9196644372003
29 245.63025627606442
30 255.40197624835042
31 265.23302535689197
32 275.12170654792556
33 285.06641637317705
34 295.0656375259694
35 305.1179321414606
36 315.2219357669857
37 325.3763519217964
38 335.5799471767038
39 345.8315466936063
40 356.13003017290697
41 366.4743281636434
42 376.8634186969678
43 387.2963242085816
44 397.77210871999046
45 408.2898752521091
46 418.8487634479048
47 429.44794738349896
48 440.08663354951693
49 450.76405898653184
50 461.479489560246
51 472.2322183636179
52 483.02156423451737
53 493.84687037869463
54 504.707503088911
55 515.6028505520102
56 526.5323217365377
57 537.4953453542455
58 548.4913688894654
59 559.5198576909147
60 570.5802941210067
61 581.6721767581994
62 592.7950196483222
63 603.9483516011882
64 615.1317155291274
65 626.3446678243708
66 637.586724806
67 648.8576269992603
68 660.1568089487967
69 671.4839283904737
70 682.838600952985
71 694.2204526835204
72 705.6291196304554
73 717.0642474479981
74 728.5254910213728
75 740.0125141112243
76 751.5249890160294
77 763.062596251391
78 774.6250242451752
79 786.2119690475241
80 797.8231340548524
81 809.4582297469931
82 821.1169734367211
83 832.7990890309349
84 844.5043068028273
85 856.2323631744205

Regards,

--
Thomas Baruchel

-- 
Jim Holtman
Convergys
+1 513 723 2929
What is the problem you are trying to solve?
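For readers who want to try the same nls call without retyping the posted data, the sketch below fits the same model form to simulated data. The data are not Thomas's: they are generated from rounded versions of the estimates reported in the reply, with a little added noise, and the starting values are deliberately chosen near those values. All of that is an assumption made so the example is self-contained.

```r
set.seed(42)
V1 <- 0:85

# Simulated response from the model (a*V1 + b) * log(c*V1 + d), plus small noise.
# nls is not meant for exact, zero-residual artificial data, hence the noise.
V2 <- (2 * V1 + 6.8) * log(1.5 * V1 + 1.3) + rnorm(length(V1), sd = 0.1)
sim <- data.frame(V1, V2)

fit <- nls(V2 ~ (a * V1 + b) * log(c * V1 + d), data = sim,
           start = list(a = 2, b = 7, c = 1.5, d = 1.5))

coef(fit)              # estimated a, b, c, d
sum(residuals(fit)^2)  # residual sum of squares
```

With real data, poor starting values can make nls fail with a singular gradient, so it is worth plotting the curve for the starting values against the data first.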
Re: [R] writing to a fixed format (fortran) file
One way of creating fixed-width output is to use 'sprintf' to create the string and then write the resulting data out. I used your input and just selected columns 7-11 as an example. You will have to supply whatever field widths you want. I create a matrix, null out the row and column names, and then write the matrix out without quotes.

HTH

> x.1
   V1 V2 V3   V4   V5   V6   V7   V8   V9 V10 V11 V12   V13
1   1  1  1 19.5 2.42 0.02 5.81  9.7  0.4 102 4.8 320   4.8
2   2  1  1  0.0 0.00 0.00 0.00  4.7 -4.0 178 5.4 301   0.2
3   3  1  1  8.2 1.64 0.08 6.93  6.9 -3.6 275 2.7  84 -11.1
4   4  1  1  0.0 0.00 0.00 0.00 20.6 -4.8 221 5.6 327 -10.4
5   5  1  1  0.0 0.00 0.00 0.00 11.6  8.2 168 4.3 269   6.8
6   6  1  1  0.0 0.00 0.00 0.00 18.7 16.9 155 5.6 287   8.2
7   7  1  1  0.0 0.00 0.00 0.00  7.0  2.1 195 2.7  22   0.1
8   8  1  1  0.0 0.00 0.00 0.00 17.6  6.5 281 2.0 146   1.5
9   9  1  1 41.2 1.54 0.82 6.96 12.2  7.8 268 5.5 356   4.5
10 10  1  1  0.0 0.00 0.00 0.00 14.6 -1.4 250 3.6 344   6.4
11 11  1  1  0.0 0.00 0.00 0.00 14.5 -3.7 300 0.0   0 -16.9
12 12  1  1  0.0 0.00 0.00 0.00  8.8 -2.6 308 0.0   0   2.9
13 13  1  1  0.0 0.00 0.00 0.00  6.4  1.6 226 3.3 335   3.8
> # field widths are 6,6,8,5,7. You also control the number of decimals that appear
> x.2 <- sprintf("%6.2f%6.1f%8.1f%5.0f%7.1f", x.1[,7], x.1[,8], x.1[,9],
+     x.1[,10], x.1[,11])
> x.2 <- as.matrix(x.2)  # convert to a character matrix
> dimnames(x.2) <- list(rep('', nrow(x.2)), '')  # blank row and column names
> noquote(x.2)

   5.81   9.7     0.4  102    4.8
   0.00   4.7    -4.0  178    5.4
   6.93   6.9    -3.6  275    2.7
   0.00  20.6    -4.8  221    5.6
   0.00  11.6     8.2  168    4.3
   0.00  18.7    16.9  155    5.6
   0.00   7.0     2.1  195    2.7
   0.00  17.6     6.5  281    2.0
   6.96  12.2     7.8  268    5.5
   0.00  14.6    -1.4  250    3.6
   0.00  14.5    -3.7  300    0.0
   0.00   8.8    -2.6  308    0.0
   0.00   6.4     1.6  226    3.3

On 8/27/05, Jean Eid [EMAIL PROTECTED] wrote:

why not write.table with sep="\t"

On Sat, 27 Aug 2005, Duncan Golicher wrote:

Could anyone help with what should be a simple task? I have data as a fixed format (fortran) table. I have no trouble getting it into R using read.table. Each column is separated by a space, including the first column that begins with a space, and aligned. It reads into R as if separated by tabs. However, I want to manipulate two columns of data and then write the results out in exactly the same fortran format for use in another program. It should be simple, but I've tried a variety of experiments with print, cat and format, none of which have come close.

Here is a sample of the data.

1 11 19.5 2.42 0.02 5.81 9.7 0.4 102. 4.8 320. 4.8
2 11 0.0 0.00 0.00 0.00 4.7 -4.0 178. 5.4 301. 0.2
3 11 8.2 1.64 0.08 6.93 6.9 -3.6 275. 2.7 84. -11.1
4 11 0.0 0.00 0.00 0.00 20.6 -4.8 221. 5.6 327. -10.4
5 11 0.0 0.00 0.00 0.00 11.6 8.2 168. 4.3 269. 6.8
6 11 0.0 0.00 0.00 0.00 18.7 16.9 155. 5.6 287. 8.2
7 11 0.0 0.00 0.00 0.00 7.0 2.1 195. 2.7 22. 0.1
8 11 0.0 0.00 0.00 0.00 17.6 6.5 281. 2.0 146. 1.5
9 11 41.2 1.54 0.82 6.96 12.2 7.8 268. 5.5 356. 4.5
10 11 0.0 0.00 0.00 0.00 14.6 -1.4 250. 3.6 344. 6.4
11 11 0.0 0.00 0.00 0.00 14.5 -3.7 300. 0.00. -16.9
12 11 0.0 0.00 0.00 0.00 8.8 -2.6 308. 0.00. 2.9
13 11 0.0 0.00 0.00 0.00 6.4 1.6 226. 3.3 335. 3.8

--
Dr Duncan Golicher
Ecologia y Sistematica Terrestre
Conservación de la Biodiversidad
El Colegio de la Frontera Sur
San Cristobal de Las Casas, Chiapas, Mexico
Email: [EMAIL PROTECTED]

-- 
Jim Holtman
Convergys
+1 513 723 2929
What is the problem you are trying to solve?
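The reply in this thread prints the formatted strings; to actually get them into a file for the Fortran program, writeLines is probably the simplest route. A minimal sketch, using two rows of columns V7 to V11 from the posted data and a temporary file name that is purely illustrative:

```r
# Two rows of the posted data (columns V7..V11 only).
x.1 <- data.frame(V7  = c(5.81, 0.00),
                  V8  = c(9.7, 4.7),
                  V9  = c(0.4, -4.0),
                  V10 = c(102, 178),
                  V11 = c(4.8, 5.4))

# Field widths 6,6,8,5,7: comparable to Fortran F6.2, F6.1, F8.1, F5.0, F7.1
# (except that %5.0f prints no trailing decimal point).
rec <- sprintf("%6.2f%6.1f%8.1f%5.0f%7.1f",
               x.1$V7, x.1$V8, x.1$V9, x.1$V10, x.1$V11)

out_file <- tempfile(fileext = ".dat")   # illustrative output path
writeLines(rec, out_file)                # one fixed-width record per line

rec
```

Every record has the same length (6 + 6 + 8 + 5 + 7 = 32 characters), which is what a fixed-format reader relies on.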
Re: [R] Re-sort list of vectors
Not that I like loops, but here is a quick and dirty way of doing it:

Result <- list()
for (i in names(x)){
    for (j in names(x[[i]])){
        Result[[j]][[i]] <- x[[i]][[j]]
    }
}

On 8/15/05, Liaw, Andy [EMAIL PROTECTED] wrote:

You could try using one of the sparse representations of matrices in the SparseM or Matrix packages. Both packages have vignettes.

Andy

From: Jan Hummel

Thanks a lot! But unfortunately I will not know the dimensions of both lists. Further, the lists may be (partly) disjoint, as in:

x <- list("1"=c(a=1, b=2, c=3), "2"=c(d=4, b=5, e=6))

And last but not least, I really have to have access to the names of the named list items.

The problem I ran into is that unlist() merges the names together, as you can see in your example: V1, V2 and V3. Because the names are later interpreted as identifiers in db queries, I'm really interested in getting something like

list(a=c("1"=1), b=c("1"=2, "2"=5), c=c("1"=3), d=c("1"=4), e=c("1"=6))

for the above input. With the result in this form I can extract both names from the two sets as well as the corresponding value between both items.

One option would be to build a matrix, but this matrix would have many NAs. So I prefer lists of lists.

Any ideas?

cheers
Jan

-----Original Message-----
From: Liaw, Andy [mailto:[EMAIL PROTECTED]
Sent: Monday, 15 August 2005 17:31
To: Jan Hummel; r-help@stat.math.ethz.ch
Subject: RE: [R] Re-sort list of vectors

If all vectors in the list have the same length, why not use a matrix? Then you'd just transpose the matrix if you need to. If you really have to have it as a list, here's one possibility:

> x <- list("1"=c(a=1, b=2, c=3), "2"=c(a=4, b=5, c=6))
> x
$`1`
a b c
1 2 3

$`2`
a b c
4 5 6

> as.list(as.data.frame(t(matrix(unlist(x), nrow=3))))
$V1
[1] 1 4

$V2
[1] 2 5

$V3
[1] 3 6

Andy

From: Jan Hummel

Hi. Can anyone suggest a simple way to re-sort in R a list of vectors of the following form?

input

$`1`
a b c
1 2 3

$`2`
a b c
4 5 6

Output should be something like:

a
1 1
2 4
b
1 2
2 5
c
1 3
2 6

I've been futzing with mapply(), outer(), split(), rbind() and so on but haven't found an elegant solution.

Thanks,
Jan.

P.S. E-mailed CCs of posted replies appreciated.

-- 
Jim Holtman
Convergys
+1 513 723 2929
What is the problem you are trying to solve?
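A loop-free variant of the re-sorting in this thread can be sketched with split(): flatten the values once, label each with its outer list name, and split by the inner name. The input is Jan's partly disjoint example; outer_nm, inner_nm, and res are illustrative names, and the inner names come out in sorted order rather than order of appearance.

```r
x <- list("1" = c(a = 1, b = 2, c = 3),
          "2" = c(d = 4, b = 5, e = 6))

inner_nm <- unlist(lapply(x, names), use.names = FALSE)  # a b c d b e
outer_nm <- rep(names(x), lengths(x))                    # 1 1 1 2 2 2
vals     <- unlist(x, use.names = FALSE)                 # 1 2 3 4 5 6

# Name each value by the list element it came from, then group by inner name.
res <- split(setNames(vals, outer_nm), inner_nm)
res
```

Here res$b is a vector with names "1" and "2", and res$d has only the name "2", so both sets of identifiers stay accessible without building a matrix full of NAs.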
Re: [R] a question about data manipulation
Use 'split':

> x.1 <- data.frame(COL1=1:50, COL2=50:1, id=sample(1:4,50,T))
> x.2 <- split(x.1, x.1$id)
> str(x.2)
List of 4
 $ 1:`data.frame':      10 obs. of  3 variables:
  ..$ COL1: int [1:10] 5 10 11 12 22 24 27 34 38 47
  ..$ COL2: int [1:10] 46 41 40 39 29 27 24 17 13 4
  ..$ id  : int [1:10] 1 1 1 1 1 1 1 1 1 1
 $ 2:`data.frame':      13 obs. of  3 variables:
  ..$ COL1: int [1:13] 1 2 14 16 19 25 26 28 30 31 ...
  ..$ COL2: int [1:13] 50 49 37 35 32 26 25 23 21 20 ...
  ..$ id  : int [1:13] 2 2 2 2 2 2 2 2 2 2 ...
 $ 3:`data.frame':      14 obs. of  3 variables:
  ..$ COL1: int [1:14] 3 8 9 13 17 23 32 36 39 42 ...
  ..$ COL2: int [1:14] 48 43 42 38 34 28 19 15 12 9 ...
  ..$ id  : int [1:14] 3 3 3 3 3 3 3 3 3 3 ...
 $ 4:`data.frame':      13 obs. of  3 variables:
  ..$ COL1: int [1:13] 4 6 7 15 18 20 21 29 35 37 ...
  ..$ COL2: int [1:13] 47 45 44 36 33 31 30 22 16 14 ...
  ..$ id  : int [1:13] 4 4 4 4 4 4 4 4 4 4 ...
> names(x.2)
[1] "1" "2" "3" "4"
> x.2[['1']]
   COL1 COL2 id
5     5   46  1
10   10   41  1
11   11   40  1
12   12   39  1
22   22   29  1
24   24   27  1
27   27   24  1
34   34   17  1
38   38   13  1
47   47    4  1
> x.2[['3']]
   COL1 COL2 id
3     3   48  3
8     8   43  3
9     9   42  3
13   13   38  3
17   17   34  3
23   23   28  3
32   32   19  3
36   36   15  3
39   39   12  3
42   42    9  3
44   44    7  3
45   45    6  3
49   49    2  3
50   50    1  3

On 8/2/05, qi zhang [EMAIL PROTECTED] wrote:

Dear R-user,

I have a simple question, but I just can't figure out an easy way to handle it.

My imported data x looks like this:

   COL1 COL2 id
1    12   49  1
2    70  120  1
3    58  124  1
51   14   13  2
52   88  100  2
53   90  134  2

I want to change the format of the data: I want to group the data into different parts according to id, so that x[1] refers to the information about the first id. I use the command:

list(list(N=2,n=c(100,150),matrix(c(x[x$id==1,][,1],x[x$id==1,][,2]),nr=2,nc=3)),
     list(N=2,n=c(100,150),matrix(c(x[x$id==2,][,1],x[x$id==2,][,2]),nr=2,nc=3)))

so the data becomes:

[[1]]
[[1]]$N
[1] 2

[[1]]$n
[1] 100 150

[[1]][[3]]
     [,1] [,2] [,3]
[1,]   12   58  120
[2,]   70   49  124


[[2]]
[[2]]$N
[1] 2

[[2]]$n
[1] 100 150

[[2]][[3]]
     [,1] [,2] [,3]
[1,]   14   90  100
[2,]   88   13  134

This is the format I want, but the problem is that in my data id runs not just from 1 to 2 but from 1 to 100, so my code is not efficient. Could you help me find an efficient way? Thanks.

Qi Zhang
PhD student, University of Cincinnati

-- 
Jim Holtman
Convergys
+1 513 723 2929
What is the problem you are trying to solve?
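To get from split() to the exact nested structure asked for in the question, the per-id pieces can be passed through lapply. This sketch reuses the six rows from the question; N = 2 and n = c(100, 150) are simply the constants that appear in the poster's command, the element name m is invented, and the 2 x 3 matrix layout assumes exactly three rows per id, as in the example.

```r
x <- data.frame(COL1 = c(12, 70, 58, 14, 88, 90),
                COL2 = c(49, 120, 124, 13, 100, 134),
                id   = c(1, 1, 1, 2, 2, 2))

# One list element per id, built the same way as in the poster's command.
groups <- lapply(split(x, x$id), function(d) {
  list(N = 2,
       n = c(100, 150),
       m = matrix(c(d$COL1, d$COL2), nrow = 2, ncol = 3))
})

groups[["1"]]$m
```

The same call works unchanged for 100 ids, since split() creates one piece per distinct id value.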
Re: [R] elegant solution to transform vector into percentages?
Use 'cut':

> store <- c(1,1.4,3,1.1,0.3,0.6,4,5)
> x.1 <- cut(store, breaks=c(-Inf,.8,1.2,Inf))
> table(x.1)/length(x.1)*100
x.1
(-Inf,0.8]  (0.8,1.2]  (1.2,Inf]
        25         25         50

On 7/26/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

Hi,

I am looking for an elegant way to transform a vector into percentages of values that meet certain criteria.

store <- c(1,1.4,3,1.1,0.3,0.6,4,5)

# now I want to get the percentages of values
# that fall into the categories <=M , >M & <=N , >N
# let
M <- .8
N <- 1.2
# In my real example I have many more of these cutoff-points

# What I did is:

out <- matrix(NA,1,3)

out[1,1] <- ( (sum(store<=M)) /length(store) )*100
out[1,2] <- ( (sum(store>M & store<=N )) /length(store) )*100
out[1,3] <- ( (sum(store>N)) /length(store) )*100

colnames(out) <- c("percent<=M","percent>M & <=N","percent>N")
out

But this gets very tedious if I have many cutoff-points. Does anybody know a more elegant way to do this task?

Thanks so much.

Cheers,
Jens

-- 
Jim Holtman
What is the problem you are trying to solve?
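With many cutoff points, the 'cut' approach from this thread only needs the cutoffs collected in one vector. A small sketch of that generalisation, using prop.table instead of dividing by the length; cutoffs and pct are illustrative names.

```r
store   <- c(1, 1.4, 3, 1.1, 0.3, 0.6, 4, 5)
cutoffs <- c(0.8, 1.2)   # add as many cutoff points as needed, in increasing order

# Intervals are right-closed by default: (-Inf,0.8], (0.8,1.2], (1.2,Inf]
bins <- cut(store, breaks = c(-Inf, cutoffs, Inf))

pct <- 100 * prop.table(table(bins))
pct
```

For this vector the three percentages are 25, 25, and 50, matching the hand-coded version in the question.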