Re: [R] Bug (?) in read.fwf
Have you tried as.is=TRUE?

On Nov 8, 2007 6:20 AM, [EMAIL PROTECTED] wrote:

Hi, I'm trying to use read.fwf

temp = read.fwf("Raw data.txt", widths = c(11, 21, 10, rep(16, 6)), skip = 2, n = 2, stringsAsFactors = FALSE, strip.white = TRUE)

but no matter what I do the strings are turned into factors. I believe it's the n = 2 parameter that causes the problem, as it seems to work without it. Am I missing something?

Thanks in advance,
David Jessop
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
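A small sketch of the as.is=TRUE suggestion, for anyone who wants to try it. The file contents and the two-column layout are made up for illustration (they are not David's data); the point is only that as.is is passed through read.fwf to read.table, so text columns should stay character even when n is given.

```r
# Hypothetical fixed-width file: a 5-character text field, then a 5-character number
tf <- tempfile()
writeLines(c("alpha   12",
             "beta     7",
             "gamma  100"), tf)

# Read only the first two records; as.is = TRUE asks read.table (called by
# read.fwf) not to convert character columns to factors
tmp <- read.fwf(tf, widths = c(5, 5), n = 2, as.is = TRUE, strip.white = TRUE)

str(tmp)   # V1 should be chr, V2 int
```

If the strings still come back as factors with this call, that would point more firmly at the n argument, as David suspects.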
Re: [R] pattern matching accross multiple matrices
You are putting your results back into A, which might change things as you execute. This might be a faster way:

result <- matrix(NA, dim(A)[1], dim(A)[2])
# now compute the cases
result[(A == 1) & (D == 1) & (P == 1)] <- "Case1"
result[(A == -1) & (D == -1) & (P == -1)] <- "Case2"
...

On Nov 8, 2007 12:27 PM, Martin Tomko [EMAIL PROTECTED] wrote:

Hi all,
I have a set of patterns which can occur in a series of (3) matrices. I want to identify those and create a fourth one with the identifiers of the cases. Something like:

for (i in 1:l) {
  for (j in 1:w) {
    A[A[i,j]==1 & D[i,j]==1 & P[i,j]==1] <- "Case1";
    A[A[i,j]==-1 & D[i,j]==-1 & P[i,j]==-1] <- "Case2";
    etc
  }
}

The code seems to run, but is very slow. Could anyone please suggest a better approach? I was thinking that the 3 matrices could be stacked in a cube, and the column of the cube searched for a pattern, but am not sure how to do that...

Thanks
Martin
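To make the vectorised suggestion concrete, here is a minimal sketch on made-up 2 x 2 matrices (the values, and the names cube, key and lookup, are assumptions for illustration). It also shows one possible reading of Martin's "cube" idea: stack the matrices in a 3-d array and look up each cell's pattern in a named vector.

```r
# Made-up example data; A, D and P must have the same dimensions
A <- matrix(c(1, -1, 1, 0), 2, 2)
D <- matrix(c(1, -1, 0, 0), 2, 2)
P <- matrix(c(1, -1, 1, 0), 2, 2)

# Logical indexing: no loops, and the inputs are never overwritten
result <- matrix(NA_character_, nrow(A), ncol(A))
result[A == 1 & D == 1 & P == 1] <- "Case1"
result[A == -1 & D == -1 & P == -1] <- "Case2"

# "Cube" variant: stack the matrices and turn each cell's 3 values into a key
cube <- array(c(A, D, P), dim = c(dim(A), 3))
key <- apply(cube, c(1, 2), paste, collapse = ",")
lookup <- c("1,1,1" = "Case1", "-1,-1,-1" = "Case2")
result2 <- matrix(unname(lookup[as.vector(key)]), nrow(A), ncol(A))

result
```

The logical-indexing form is probably the faster of the two; the lookup form may be easier to maintain when there are many cases, since each new case is just one more entry in lookup.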
Re: [R] Calculate percentages in a table of data
This will do the calculations and the plot:

> x <- scan(textConnection("255 0 255 0 255 255 255 0 255 0
+ 255 255 255 255 0 255 255 0 255 0
+ 255 255 255 255 255 255 255 0 255 0
+ 255 255 255 255 0 255 255 0 255 0
+ 255 255 0 255 255 255 255 0 255 0
+ 255 255 255 0 255 0 0 255 0 255"), what=0)
Read 60 items
> x <- matrix(x, ncol=10, byrow=TRUE)
> num.zeros <- apply(x, 1, function(z) sum(z == 0))
> num.zeros * 10
[1] 40 30 20 30 30 40
> plot(num.zeros * 10, type='o')

On Nov 8, 2007 1:54 PM, Luca Penasa [EMAIL PROTECTED] wrote:

Hi everybody,
I'm a newbie, but I hope someone can help me with this work. I'll try to explain what I need to do as well as I can, but my English is not good.

I've imported a big table of data; the table is something like this:

255 0 255 0 255 255 255 0 255 0
255 255 255 255 0 255 255 0 255 0
255 255 255 255 255 255 255 0 255 0
255 255 255 255 0 255 255 0 255 0
255 255 0 255 255 255 255 0 255 0
255 255 255 0 255 0 0 255 0 255

For every row I need to calculate the number of cells with 255 and the number of cells with 0. From these values I would like to obtain the percentage of 0s present in the row. After that I want to plot the data in a graph showing the variation of this percentage along the rows. What I want to obtain is an array of this type:

40 30 20 30 30 40

Can someone give me a hint on how to obtain this? Maybe somebody can suggest the functions I could use. What software do you suggest for plotting the data? I was thinking of gnuplot, so I can plot the graph in SVG format.

Please help me,
thank you

Luca Penasa, geology student, University of Padua, Italy
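A slightly more general variant of the calculation, as a sketch: Jim's "* 10" works because each row has exactly 10 cells, whereas taking the row mean of the logical matrix x == 0 gives the percentage for any number of columns. The data are the six rows from Luca's post; the name pct.zero is just illustrative.

```r
x <- matrix(c(255,   0, 255,   0, 255, 255, 255,   0, 255,   0,
              255, 255, 255, 255,   0, 255, 255,   0, 255,   0,
              255, 255, 255, 255, 255, 255, 255,   0, 255,   0,
              255, 255, 255, 255,   0, 255, 255,   0, 255,   0,
              255, 255,   0, 255, 255, 255, 255,   0, 255,   0,
              255, 255, 255,   0, 255,   0,   0, 255,   0, 255),
            ncol = 10, byrow = TRUE)

# x == 0 is a logical matrix; the mean of a logical row is the fraction of TRUEs
pct.zero <- rowMeans(x == 0) * 100
pct.zero

# plot(pct.zero, type = "o") draws the profile along the rows; wrapping the
# call between svg("zeros.svg") and dev.off() should write SVG directly
# where R was built with cairo support
```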
Re: [R] How to create an array of list?
I think something like this is what you are after. This will create 7 pairs of lists with the parameters that I think you want. I don't have the data (if you want to send it to me, I may be able to test it), so you will have to test it yourself.

# create a list for the results
result <- vector('list', 7)
for (n in 1:7){
    # initialize the pair of lists in the result
    result[[n]] <- vector('list', 2)
    for (ii in 1:2){
        sublist <- vector('list', 3)  # for the parameters
        for (jj in 1:3){
            if (cc[n, ii, jj] == 0) sublist[[jj]] <- levels(MyModel[, jj])
            else sublist[[jj]] <- cc[n, ii, jj]
        }
        names(sublist) <- names(MyModel)
        result[[n]][[ii]] <- sublist
    }
}
str(result)  # see what it looks like

On Nov 8, 2007 6:31 PM, Gang Chen [EMAIL PROTECTED] wrote:

Thanks again for the response! For example, I want to run the following:

contrast(fit.lme,
    list(Trust="U", Sex=levels(Model$Sex), Freq=levels(Model$Freq)),
    list(Trust="T", Sex=levels(Model$Sex), Freq=levels(Model$Freq)))

The 2nd and 3rd arguments are two lists that I'm trying to construct based on the data frame 'Model'. Of course I could provide the two lists explicitly, as in the above command. However, for general usage, I would like to build the two lists from the user's input. That is how the issue of creating an array of lists came about. In the example I provided, it would run 7 separate contrasts like the one shown above, each of which contains 2 lists, and each list has 3 named components (Trust, Sex, and Freq), each of which has a different number of elements (depending on the contrast specification). And that is why I wanted to have an array of 7 X 2 X 3. Hope this is clearer. Any better solutions?

Thanks,
Gang

On Nov 8, 2007, at 6:18 PM, jim holtman wrote:

I am still not sure what you expect as output. Can you provide an example of what you think that you need? What is it that you are trying to construct? How do you then plan to use them?
Re: [R] How to create an array of list?
I am still not sure what you expect as output. Can you provide an example of what you think that you need? What is it that you are trying to construct? How do you then plan to use them? There might be other ways of going about it if we knew what the intent was: what is the structure that you are trying to create? The code that you have is probably having problems with the number of elements in the replacement, so to see what the alternatives are, can you give an explicit example of what you would like as an outcome and then how you intend to use it?

On Nov 8, 2007 5:19 PM, Gang Chen [EMAIL PROTECTED] wrote:

Thanks for the response! I want to create those lists so that I could use them as arguments in a function ('contrast' in the contrast package). Any suggestions?

Thanks,
Gang

On Nov 8, 2007, at 5:12 PM, jim holtman wrote:

Can you tell us what you want to do, and not how you want to do it? Without the data it is hard to see. Some of your indexing probably does not have the correct number of parameters when trying to do the replacement. An explanation of what you expect the output to be would be useful in determining what the script might look like.

On Nov 8, 2007 4:51 PM, Gang Chen [EMAIL PROTECTED] wrote:

I have trouble creating an array of lists. For example, I want to do something like this:

clist <- array(data=NA, dim=c(7, 2, 3));
for (n in 1:7) {
  for (ii in 1:2) {
    for (jj in 1:3) {
      if (cc[n, ii, jj] == 0) {
        clist[n, ii, ][[jj]] <- list(levels(MyModel[,colnames(MyModel)[jj]]));
      } else {
        clist[n, ii, ][[jj]] <- cc[n, ii, jj];
      }
      names(clist[n, ii, ][[jj]]) <- colnames(MyModel)[jj];
    }
  }
}

but I get an error:

Error in `*tmp*`[n, ii, ] : incorrect number of dimensions

Is it because each list has a different number of components?

The two variables involved in the loop, character array cc and data frame MyModel, are shown below:

> cc
, , 1

     [,1] [,2]
[1,] "U"  "T"
[2,] "0"  "0"
[3,] "0"  "0"
[4,] "0"  "0"
[5,] "U"  "T"
[6,] "U"  "T"
[7,] "U"  "T"

, , 2

     [,1] [,2]
[1,] "0"  "0"
[2,] "M"  "F"
[3,] "0"  "0"
[4,] "0"  "0"
[5,] "0"  "0"
[6,] "0"  "0"
[7,] "0"  "0"

, , 3

     [,1] [,2]
[1,] "0"  "0"
[2,] "0"  "0"
[3,] "Lo" "Hi"
[4,] "No" "Hi"
[5,] "Hi" "Hi"
[6,] "Lo" "Lo"
[7,] "No" "No"

> MyModel
   Trust Sex Freq
1      T   F   Hi
2      T   F   Hi
3      T   F   Hi
4      T   F   Hi
5      T   F   Hi
6      T   F   Hi
7      T   F   Hi
8      T   F   Hi
9      T   F   Lo
10     T   F   Lo
11     T   F   Lo
12     T   F   Lo
13     T   F   Lo
14     T   F   Lo
15     T   F   Lo
16     T   F   Lo
17     T   F   No
18     T   F   No
19     T   F   No
20     T   F   No
21     T   F   No
22     T   F   No
23     T   F   No
24     T   F   No
25     T   M   Hi
26     T   M   Hi
27     T   M   Hi
28     T   M   Hi
29     T   M   Hi
30     T   M   Hi
31     T   M   Hi
32     T   M   Hi
33     T   M   Lo
34     T   M   Lo
35     T   M   Lo
36     T   M   Lo
37     T   M   Lo
38     T   M   Lo
39     T   M   Lo
40     T   M   Lo
41     T   M   No
42     T   M   No
43     T   M   No
44     T   M   No
45     T   M   No
46     T   M   No
47     T   M   No
48     T   M   No
49     U   F   Hi
50     U   F   Hi
51     U   F   Hi
52     U   F   Hi
53     U   F   Hi
54     U   F   Hi
55     U   F   Hi
56     U   F   Hi
57     U   F   Lo
58     U   F   Lo
59     U   F   Lo
60     U   F   Lo
61     U   F   Lo
62     U   F   Lo
63     U   F   Lo
64     U   F   Lo
65     U   F   No
66     U   F   No
67     U   F   No
68     U   F   No
69     U   F   No
70     U   F   No
71     U   F   No
72     U   F   No
73     U   M   Hi
74     U   M   Hi
75     U   M   Hi
76     U   M   Hi
77     U   M   Hi
78     U   M   Hi
79     U   M   Hi
80     U   M   Hi
81     U   M   Lo
82     U   M   Lo
83     U   M   Lo
84     U   M   Lo
85     U   M   Lo
86     U   M   Lo
87     U   M   Lo
88     U   M   Lo
89     U   M   No
90     U   M   No
91     U   M   No
92     U   M   No
93     U   M   No
94     U   M   No
95     U   M   No
96     U   M   No

Thanks,
Gang
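For readers who really do want the 7 x 2 x 3 shape Gang describes, one option (a sketch, not something proposed in the thread) is a list with a dim attribute. array(data=NA, ...) creates a logical array whose cells each hold a single value, which is one likely reason the original loop fails; a list array can hold a vector of any length per cell, addressed with [[n, ii, jj]]. The MyModel and cc objects below are cut-down, made-up stand-ins for Gang's data.

```r
# Made-up stand-ins: 2 contrasts instead of 7, and a tiny model frame
MyModel <- data.frame(Trust = factor(c("T", "U")),
                      Sex   = factor(c("F", "M")),
                      Freq  = factor(c("Hi", "Lo")))
cc <- array("0", dim = c(2, 2, 3))
cc[1, , 1] <- c("U", "T")          # contrast 1 compares Trust U against T

# A list with dimensions: each cell can hold a vector of any length
clist <- vector("list", length(cc))
dim(clist) <- dim(cc)

for (n in seq_len(dim(cc)[1])) {
  for (ii in 1:2) {
    for (jj in 1:3) {
      clist[[n, ii, jj]] <- if (cc[n, ii, jj] == "0") {
        levels(MyModel[[jj]])      # "0" means: use all levels of that factor
      } else {
        cc[n, ii, jj]
      }
    }
  }
}

# Pull out one argument list and name its components
args1 <- setNames(clist[1, 1, ], names(MyModel))
str(args1)
```

Jim's nested-list version in this thread gives the same information; the dimensioned list only changes how cells are addressed.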
Re: [R] Problem with R version 2.6.0
Have you tried using 'setwd'? I have no problem with changing directories and executing scripts. Can you provide an example of the script that you are trying to execute? How does it crash? Does it do it only when you 'source' it? More information is needed.

On Nov 9, 2007 10:21 AM, Dimitri Liakhovitski [EMAIL PROTECTED] wrote:

I just installed R 2.6.0 (had R 2.5 before). Here is my problem. Usually, when I work with R, I first go to File -> Change dir and browse to a folder that sits OUTSIDE of the folder C:\Program Files\R\R-2.6.0 and then create my script there (and open and re-open it there). I never had any problems with R 2.4 or R 2.5. However, after I installed R 2.6.0, R crashes every time I try to open a script if I work outside the R folder. Interestingly, there are no problems when I work in the folder C:\Program Files\R\R-2.6.0 (and create my new folders and subfolders there). Any advice?

Dimitri
Re: [R] How to more efficently read in a big matrix
If they are all numeric, then read it in with:

x <- scan('yourfile', what=0)  # assuming blank separators

This will create a single vector of the values. It comes in in row order, if that is what your data file has, so you could just add dimensions with

dim(x) <- c(487, 238305)

The rows and columns are then transposed, but if you have enough memory you can transpose them, or just leave the data as is and change your processing to reorder the rows/cols. This should let you read it in the fastest manner and then play with it.

On Nov 9, 2007 11:52 PM, affy snp [EMAIL PROTECTED] wrote:

Hi Jim,
Thanks a lot! I am currently running it on my laptop but without any success. I could upload it to a server which has 8Gb of memory, and it might be better to go from there. Actually, I could have the whole file split in two parts, one with the 2nd column to the 95th column, the other one with the rest of the columns. However, I need all rows for the two parts. The file is in txt format and around 480Mb, so very large. Yes, it is of numeric values. I appreciate it!

Allen

On Nov 9, 2007 11:46 PM, jim holtman [EMAIL PROTECTED] wrote:

If they are all numeric, you can use 'scan' to read them in. With that amount of data, you will need almost 1GB to contain the single object. If you want to do any processing, you will probably need a machine with at least 3-4GB of physical memory, preferably with a 64-bit version of R. What type of computer are you using? Do you really need all the data in at once, or can you process it in smaller batches (e.g., 20,000 rows at a time)? A little more detail on what you actually want to do with the data would be useful, since it does create a very large object. BTW, how large is the file you are reading and what is its format? Have you considered a database with this amount of data?

On Nov 9, 2007 11:39 PM, affy snp [EMAIL PROTECTED] wrote:

Dear list,
I need to read in a big table with 487 columns and 238,305 rows (row names and column names are supplied). Is there code to read in the table in a fast way? I tried read.table() but it seems that it takes forever :(

Thanks a lot!
Best,
Allen
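A toy version of the scan-then-dim trick, to show why the dimensions come out transposed. The 2 x 3 file is made up for illustration: scan() returns the numbers in file (row) order, while R fills a matrix column by column, so each file row becomes a column until the matrix is transposed.

```r
# Made-up numeric file: 2 rows, 3 columns
tf <- tempfile()
writeLines(c("1 2 3",
             "4 5 6"), tf)

x <- scan(tf, what = 0, quiet = TRUE)   # plain numeric vector: 1 2 3 4 5 6

# Set dim to (number of file columns, number of file rows) ...
dim(x) <- c(3, 2)
# ... and transpose to recover the layout of the file
m <- t(x)
m
```

For the big file, the t() step temporarily needs a second copy of the data, which is the memory concern Jim raises; matrix(x, ncol = 3, byrow = TRUE) gives the same result in one call.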
Re: [R] How to more efficently read in a big matrix
Your data is mixed: numeric and characters/factors. You can use skip=1 to skip the header line, but it looks like the rest is mixed. In your example there are only 5 columns; are you just showing the first 5 columns? If the pattern is as you show, then you would have a scan like:

scan('yourfile', what=list('', 0, '', 0, ''))

You can extend the 'what' to the number of columns that you have; e.g.

what=c(rep(c(list(''), list(0)), times=243), list(''))

On Nov 10, 2007 12:29 AM, affy snp [EMAIL PROTECTED] wrote:

Hi Jim,
I tried scan() first and got

> x <- scan(file="243_47mel_withnormal_expression_log2.txt", what=0)
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  scan() expected 'a real', got 'probe_set'

So I guess it requires the file to be numeric. But I do have row names and a header. The real file looks like this (I am listing the header and first 4 rows of the file):

probe_set      WM_806_Signal_A  WM_806_call  WM_1716_Signal_A  WM_1716_call
SNP_A-1909444  1.59             B            1.48              B
SNP_A-2237149  2.24             B            1.87              B
SNP_A-2118217  2.04             AB           1.70              AB
SNP_A-1866065  1.80             NoCall       1.39              A

So how can I get rid of the header and row.names to use scan()?

Thanks!
Allen

On Nov 10, 2007 12:18 AM, jim holtman [EMAIL PROTECTED] wrote:

Here is an example of reading in a file of 3M numbers (11MB text file) on my laptop:

> system.time(x <- scan('/tempyy', what=0))
Read 3000000 items
   user  system elapsed
   6.22    0.16    6.53
> str(x)
 num [1:3000000] 1 2 3 4 5 6 7 8 9 10 ...
> gc()
          used (Mb) gc trigger (Mb) max used (Mb)
Ncells  169954  4.6     350000  9.4   350000  9.4
Vcells 3102277 23.7    7803840 59.6  7200206 55.0
> object.size(x)
[1] 24000024

This took about 7 seconds. You have about 40X more data, so it should be interesting to see how it scales up. The object size is 24MB, so 40X more is about 1GB.

On Nov 9, 2007 11:52 PM, affy snp [EMAIL PROTECTED] wrote:

Hi Jim,
Thanks a lot! I am currently running it on my laptop but without any success. I could upload it to a server which has 8Gb of memory, and it might be better to go from there.
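A runnable miniature of the mixed-column scan() described in this thread. The file below is a made-up two-sample cut-down of Allen's layout, and the names n.pairs, cols and signal are illustrative: a character id first, then alternating numeric signal and character call columns, with skip=1 to drop the header.

```r
# Made-up file in the same layout: id, then (signal, call) pairs
tf <- tempfile()
writeLines(c("probe_set s1_Signal s1_call s2_Signal s2_call",
             "SNP_1 1.59 B 1.48 B",
             "SNP_2 2.04 AB 1.70 AB"), tf)

n.pairs <- 2   # would be 243 for a 487-column file of this shape
what <- c(list(""), rep(list(0, ""), times = n.pairs))

# scan() returns one vector per column when 'what' is a list
cols <- scan(tf, what = what, skip = 1, quiet = TRUE)

# Bind the numeric (even-numbered) columns into a matrix keyed by probe id
signal <- do.call(cbind, cols[seq(2, length(cols), by = 2)])
rownames(signal) <- cols[[1]]
signal
```

The call columns stay available as cols[[3]], cols[[5]] and so on, should they be needed later.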
Re: [R] How to more efficently read in a big matrix
It sounds like the data is not all numeric; you have a 'factor' in your read statement. It also sounds like some of your lines may be incomplete in the number of columns, since you are trying to read in a 'B' as a numeric. So if you have a character column, then one way of doing it is:

x <- scan('yourfile', what=c(list(''), rep(list(0), 486)))

This will read the first column in as character and the other 486 as numeric.

On Nov 10, 2007 12:19 AM, affy snp [EMAIL PROTECTED] wrote:

Thanks Jim. I tried:

> A <- read.table(file="243_47mel_withnormal_expression_log2.txt",
+ header=TRUE, row.names=1, colClasses=c('factor', rep('numeric',486)))

by specifying colClasses, but it did not work. The error message I got is:

> A <- read.table(file="243_47mel_withnormal_expression_log2.txt", header=TRUE, row.names=1, colClasses=c('factor', rep('numeric',486)))
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  scan() expected 'a real', got 'B'

Let me try what you suggested. Thanks!

Allen

On Nov 10, 2007 12:07 AM, jim holtman [EMAIL PROTECTED] wrote:

If they are all numeric, then read it in with:

x <- scan('yourfile', what=0)  # assuming blank separators

This will create a single vector of the values. It comes in in row order, if that is what your data file has, so you could just add dimensions with

dim(x) <- c(487, 238305)

The rows and columns are then transposed, but if you have enough memory you can transpose them, or just leave the data as is and change your processing to reorder the rows/cols. This should let you read it in the fastest manner and then play with it.

On Nov 9, 2007 11:52 PM, affy snp [EMAIL PROTECTED] wrote:

Hi Jim,
Thanks a lot! I am currently running it on my laptop but without any success. I could upload it to a server which has 8Gb of memory, and it might be better to go from there. Actually, I could have the whole file split in two parts, one with the 2nd column to the 95th column, the other one with the rest of the columns. However, I need all rows for the two parts.
Re: [R] how to emerge two tables by taking the ave.
Here is the way to read the data and convert it. Your data was a data frame with the first column being the id:

> x <- read.table(textConnection("id b1 b2 b3
+ a1 2 4 6
+ a2 1 2 NA
+ a3 4 6 NA"), header=TRUE)
> y <- read.table(textConnection("id b1 b2 b3
+ a1 NA 4 4
+ a2 2 2 NA
+ a3 1 2 2"), header=TRUE)
> # look at what x & y are:
> str(x)
'data.frame':   3 obs. of  4 variables:
 $ id: Factor w/ 3 levels "a1","a2","a3": 1 2 3
 $ b1: int  2 1 4
 $ b2: int  4 2 6
 $ b3: int  6 NA NA
> str(y)
'data.frame':   3 obs. of  4 variables:
 $ id: Factor w/ 3 levels "a1","a2","a3": 1 2 3
 $ b1: int  NA 2 1
 $ b2: int  4 2 2
 $ b3: int  4 NA 2
> # to convert to matrix, get rid of first column
> x <- as.matrix(x[,-1])
> y <- as.matrix(y[,-1])
> z <- mapply(function(a,b) mean(c(a,b), na.rm=TRUE), x, y)
> dim(z) <- dim(x)
> z
     [,1] [,2] [,3]
[1,]  2.0    4    5
[2,]  1.5    2  NaN
[3,]  2.5    4    2
> is.na(z) <- is.nan(z)
> z
     [,1] [,2] [,3]
[1,]  2.0    4    5
[2,]  1.5    2   NA
[3,]  2.5    4    2

On Nov 11, 2007 10:47 PM, affy snp [EMAIL PROTECTED] wrote:

Hi, Jim.
I created two txt files as:

x.txt
id b1 b2 b3
a1 2 4 6
a2 1 2 NA
a3 4 6 NA

y.txt
id b1 b2 b3
a1 NA 4 4
a2 2 2 NA
a3 1 2 2

I tried it one more time but got a different z:

> x <- read.table(file="x.txt", header=TRUE, row.names=1, na.strings="NA")
Warning message:
In read.table(file = "x.txt", header = TRUE, row.names = 1, na.strings = "NA") :
  incomplete final line found by readTableHeader on 'x.txt'
> x
   b1 b2 b3
a1  2  4  6
a2  1  2 NA
a3  4  6 NA
> y <- read.table(file="y.txt", header=TRUE, row.names=1, na.strings="NA")
Warning message:
In read.table(file = "y.txt", header = TRUE, row.names = 1, na.strings = "NA") :
  incomplete final line found by readTableHeader on 'y.txt'
> y
   b1 b2 b3
a1 NA  4  4
a2  2  2 NA
a3  1  2  2
> z <- mapply(function(a,b) mean(c(a,b), na.rm=TRUE), x, y)
> dim(z) <- dim(x)
Error in dim(z) <- dim(x) :
  dims [product 9] do not match the length of object [3]
> z <- mapply(function(a,b) mean(c(a,b), na.rm=TRUE), x, y)
> z
  b1   b2   b3
2.00 3.33 4.00

Allen

On Nov 11, 2007 10:41 PM, jim holtman [EMAIL PROTECTED] wrote:

What did your text files look like?
It would appear that there was not a line feed on the last line of the file. Also, what does 'str' of x and y show? It appears that one is a data frame and one is a matrix. That might be causing some of the problems.

On Nov 11, 2007 10:30 PM, affy snp [EMAIL PROTECTED] wrote:

Hi Jim,
Thanks a lot! I am wondering why I ended up getting the result as follows:

> x <- read.table(file="x.txt", header=TRUE, row.names=1, na.strings="NA")
Warning message:
In read.table(file = "x.txt", header = TRUE, row.names = 1, na.strings = "NA") :
  incomplete final line found by readTableHeader on 'x.txt'
> x
   b1 b2 b3
a1  2  4  6
a2  1  2 NA
a3  4  6 NA
> y <- as.matrix(read.table(file="y.txt", header=TRUE, row.names=1, na.strings="NA"))
Warning message:
In read.table(file = "y.txt", header = TRUE, row.names = 1, na.strings = "NA") :
  incomplete final line found by readTableHeader on 'y.txt'
> y
   b1 b2 b3
a1 NA  4  4
a2  2  2 NA
a3  1  2  2
> z <- mapply(function(a,b) mean(c(a,b), na.rm=TRUE), x, y)
> z
  b1   b2   b3 <NA> <NA> <NA> <NA> <NA>
2.33 3.50 3.50 2.75 3.50 4.00 2.75 4.00
<NA>
4.00
> dim(z) <- dim(x)
> z
     [,1] [,2] [,3]
[1,] 2.33 2.75 2.75
[2,] 3.50 3.50 4.00
[3,] 3.50 4.00 4.00
> is.na(z) <- is.nan(z)
> z
     [,1] [,2] [,3]
[1,] 2.33 2.75 2.75
[2,] 3.50 3.50 4.00
[3,] 3.50 4.00 4.00

Allen

On Nov 11, 2007 5:27 PM, jim holtman [EMAIL PROTECTED] wrote:

Here is one way of doing it:

> x
     [,1] [,2] [,3]
[1,]    2    4    6
[2,]    1    2   NA
[3,]    4    6   NA
> y
     [,1] [,2] [,3]
[1,]   NA    4    4
[2,]    2    2   NA
[3,]    1    2    2
> z <- mapply(function(a,b) mean(c(a,b), na.rm=TRUE), x, y)
> dim(z) <- dim(x)
> z
     [,1] [,2] [,3]
[1,]  2.0    4    5
[2,]  1.5    2  NaN
[3,]  2.5    4    2
> # to change it to NA
> is.na(z) <- is.nan(z)
> z
     [,1] [,2] [,3]
[1,]  2.0    4    5
[2,]  1.5    2   NA
[3,]  2.5    4    2

On Nov 11, 2007 4:52 PM, affy snp [EMAIL PROTECTED] wrote:

Dear list,
I am new to R and very inexperienced. Sorry for the trouble. I have two txt files and want to merge them by taking the average. More specifically, for example, the txt file1, with row names and column names, consists of 238000 rows and 196 columns. Each column corresponds to a sample.
The data is mixed with numeric or NA. So what I plan to do is: (1) Take the 1st column from txt file 1
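The differing results in this thread come from what mapply iterates over: elements for a matrix, but whole columns for a data frame, hence Allen's three values. As a sketch of an alternative that avoids the per-element function call (likely to matter with 238000 rows), both matrices can be flattened into the two columns of one matrix and averaged with rowMeans. The data are the 3 x 3 example from the thread; with files read by read.table, as.matrix() would be applied to each data frame first.

```r
x <- matrix(c(2, 1, 4,   4, 2, 6,   6, NA, NA), nrow = 3)
y <- matrix(c(NA, 2, 1,  4, 2, 2,   4, NA, 2),  nrow = 3)

# One row per cell, one column per input; NAs are ignored within each pair
z <- rowMeans(cbind(c(x), c(y)), na.rm = TRUE)
dim(z) <- dim(x)

# A cell that is NA in both inputs comes out as NaN; turn it back into NA
z[is.nan(z)] <- NA
z
```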
Re: [R] help in long loops
What happens if you have multiple matches in the comparison between content_feat and ob_feat? Why don't you just iterate through content_feat and use 'match' to find the corresponding match in ob_feat? This should speed it up. Also, why are you using 'as.matrix' when the values in the 'if' statement are objects of size 1? Are any of the objects data frames? If so, convert them to matrices for efficiency.

On Nov 12, 2007 12:09 PM, Mahmudul Haque [EMAIL PROTECTED] wrote:

Hi,
please help me out in the following case. It seems to be stuck somewhere (7 hours have already passed). What I want is to combine 4 matrices into one matrix of the desired length.

final_matrix <- function(ob_feat, content_feat, link_feat, link_feat_transformed) {
  complete_feat <- matrix(rep(-1), nrow=11402, ncol=278)
  for (i in 1:8944) {
    q <- c(0)
    for (j in 1:11402) {
      if (as.matrix(content_feat[i,2]) == as.matrix(ob_feat[j,2])) {
        complete_feat[i,1] = as.matrix(ob_feat[j,2])
        complete_feat[i,2:97] = as.matrix(content_feat[i,3:98])
        complete_feat[i,98:99] = as.matrix(ob_feat[j,3:4])
        complete_feat[i,100:140] = as.matrix(link_feat[j,3:43])
        complete_feat[i,141:278] = as.matrix(link_feat_transformed[j,3:140])
        q <- 1
      }
      if (q == 1) break;
    }
  }
  for (i in 8945:11402) {
    complete_feat[i,1] = as.matrix(ob_feat[i,2])
    complete_feat[i,98:99] = as.matrix(ob_feat[i,3:4])
    complete_feat[i,100:140] = as.matrix(link_feat[i,3:43])
    complete_feat[i,141:278] = as.matrix(link_feat_transformed[i,3:140])
  }
  list(complete_feat=complete_feat)
}

Kind regards,
Mahmudul Haque
Re: [R] How to pool a group of samples and take the ave.
Try something like this:

myAvg <- rowMeans(A[,48:243])
B <- A[1:47,] / myAvg

On Nov 12, 2007 1:37 PM, affy snp [EMAIL PROTECTED] wrote:

Dear list,

Hi! I have a table A with 238304 rows and 243 columns (representing samples). First, I would like to pool the samples from the 48th column to the 243rd column, take the average across them, and make a single column, say the reference column. Second, I want to divide each of the first 47 columns in table A by the reference column and end up with a new table B with 238304 rows and 47 columns. Is there any simple code that could do something like

reference_column <- (A[,48]+A[,49]+...A[,243])/196

and

B <- A[,1:47]/reference_column?

Thank you very much for your help!

Best, Allen

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
Re: [R] How to pool a group of samples and take the ave.
Was supposed to be:

B <- A[,1:47] / myAvg

On Nov 12, 2007 3:10 PM, jim holtman [EMAIL PROTECTED] wrote:

Try something like this:

myAvg <- rowMeans(A[,48:243])
B <- A[1:47,] / myAvg

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
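A small sketch of why the corrected form works: dividing a matrix by a vector whose length equals the number of rows recycles the vector down each column, so every row is divided by its own reference value. The 5 x 6 matrix below is a made-up stand-in for the 238304 x 243 table, and 'ref_cols' and 'keep_cols' are hypothetical names.

```r
# Hypothetical stand-in for table A (5 rows, 6 sample columns).
set.seed(1)
A <- matrix(runif(5 * 6, min = 1, max = 10), nrow = 5, ncol = 6)

ref_cols  <- 3:6   # plays the role of columns 48:243
keep_cols <- 1:2   # plays the role of columns 1:47

# One reference value per row: the mean over the pooled columns.
myAvg <- rowMeans(A[, ref_cols])

# Column-wise recycling divides row i of every kept column by myAvg[i].
B <- A[, keep_cols] / myAvg
dim(B)
```

If the pooled columns contain NA values, rowMeans(..., na.rm = TRUE) may be what is wanted; that depends on the data.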
Re: [R] time plotting problem
Now your first data point is 9/26/09; is it supposed to be 9/26/06? On Nov 12, 2007 1:47 PM, John Kane [EMAIL PROTECTED] wrote: I am completely misunderstanding how to handle dates. I want to plot a couple of data series against some dates. Simple example 1 below works fine. Unfortunately I have multiple observations per day (no time breakdowns) and observations across years. (example 2 very simplistic version ) Can anyone suggest a quick fix or point me to something to read? I thought that zoo might do it but I seem to be missing something there too. Any suggestions gratefully recieved. Example 1 consecutive dates same year. = x - days 9/26/09 9/27/06 9/28/06 9/29/06 9/29/06 9/29/06 10/1/06 10/1/06 10/2/06 10/3/06 mydata - read.table(textConnection(x), header=TRUE, as.is=TRUE); mydata mydates - as.Date(mydata[,1], %m/%d/%y); mydates mynums - rnorm(10) plot(mydates, mynums) Example 2 (things go blooy!) non-consecutive dates different years. = x - days 9/26/09 9/27/06 9/28/06 9/29/06 9/29/06 9/29/06 10/1/07 # - year changes 10/1/07 10/2/07 10/3/07 mydata - read.table(textConnection(x), header=TRUE, as.is=TRUE); mydata mydates - as.Date(mydata[,1], %m/%d/%y); mydates mynums - rnorm(10) plot(mydates, mynums) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
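To see why the year of the first data point matters, here is a minimal sketch of how two-digit years are parsed with the format used in the question. The three dates are taken from the example; the variable names are only for illustration.

```r
days <- c("9/26/09", "9/27/06", "10/1/07")

# With %y, two-digit years 00 to 68 are read as 2000 to 2068.
mydates <- as.Date(days, format = "%m/%d/%y")

years <- format(mydates, "%Y")
years
range(mydates)
```

If the "09" is a typo for "06", the stray 2009 date stretches the x-axis by about two years, which could account for the odd-looking plot.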
Re: [R] update matrix with subset of it where only row names match
Here is one way of doing it that uses the row and column names: # create test data mat1 - matrix(0, nrow=10, ncol=3) dimnames(mat1) - list(paste('row', 1:10, sep=''), LETTERS[1:3]) mat2 - matrix(1:3, ncol=1, dimnames=list(c('row3', 'row7', 'row5'), B)) mat2 B row3 1 row7 2 row5 3 # create indexing matrix indx - cbind(match(rownames(mat2), rownames(mat1)), match(colnames(mat2), colnames(mat1))) indx [,1] [,2] [1,]32 [2,]72 [3,]52 mat1[indx] - mat2 mat1 A B C row1 0 0 0 row2 0 0 0 row3 0 1 0 row4 0 0 0 row5 0 3 0 row6 0 0 0 row7 0 2 0 row8 0 0 0 row9 0 0 0 row10 0 0 0 On Nov 12, 2007 4:54 PM, Martin Waller [EMAIL PROTECTED] wrote: I guess this has a simple solution: I have matrix 'mat1' which has row and column names, e.g.: A B C row10 0 0 row20 0 0 rown0 0 0 I have a another matrix 'mat2', essentially a subset of 'mat1' where the rownames are all in 'mat1' e.g.: B row35 row86 row54 7 I want to insert the values of matrix mat2 for column B (in reality it could be some or all of column names A, B or C, etc.) (same name in both matrices if that matters - rownames of mat2 guaranteed to be in mat1) into matrix mat1 where the rownames match, so final desired result is: matrix mat1: A B C row10 0 0 row20 0 0 row30 5 0 ... row80 6 0 ... row54 0 7 0 .. rown0 0 0 My solution was (along the lines of): mat1[rownames(mat2)%in%rownames(mat1),B]=mat2[,B] Is there a better way? It doesn't 'feel' right? Thanks - hope I explained it right (its late and I had a little drink about an hour ago,etc). Martin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? 
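As an alternative to the indexing matrix, and not the method posted above, a matrix with dimnames can also be indexed directly by character vectors. This sketch reuses the test data from the reply and assumes that every row and column name of mat2 is present in mat1.

```r
mat1 <- matrix(0, nrow = 10, ncol = 3,
               dimnames = list(paste0("row", 1:10), LETTERS[1:3]))
mat2 <- matrix(1:3, ncol = 1,
               dimnames = list(c("row3", "row7", "row5"), "B"))

# Character indices are looked up by name, so the values land in the
# matching rows whatever order mat2 lists them in.
mat1[rownames(mat2), colnames(mat2)] <- mat2
mat1
```

This form also works when mat2 has several columns, because it assigns the whole named block at once.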
Re: [R] Cleaning database: grep()? apply()?
Here is how to whittle it down for the first two parts of your question. I am not exactly sure what you are after in the third part. Is it that you want specific DATEs or do you want the ratio of the DATE[max]/DATE[min]?

> x <- read.table(textConnection("CODE NAME DATE DATA1
+ 4813 'ADVANCED TELECOM' 1987 0.013
+ 3845 'ADVANCED THERAPEUTIC SYS LTD' 1987 10.1
+ 3845 'ADVANCED THERAPEUTIC SYS LTD' 1989 2.463
+ 3845 'ADVANCED THERAPEUTIC SYS LTD' 1988 1.563
+ 2836 'ADVANCED TISSUE SCI -CL A' 1987 0.847
+ 2836 'ADVANCED TISSUE SCI -CL A' 1989 0.872
+ 2836 'ADVANCED TISSUE SCI -CL A' 1988 0.529"), header=TRUE)
> # matches on things to delete
> delete_indx <- grep("-CL A$|-OLD$|-ADS$", x$NAME)
> # delete them
> x <- x[-delete_indx,]
> x
  CODE                         NAME DATE  DATA1
1 4813             ADVANCED TELECOM 1987  0.013
2 3845 ADVANCED THERAPEUTIC SYS LTD 1987 10.100
3 3845 ADVANCED THERAPEUTIC SYS LTD 1989  2.463
4 3845 ADVANCED THERAPEUTIC SYS LTD 1988  1.563
> # I assume you want to use NAME to check for ranges of data
> date_range <- tapply(x$DATE, x$NAME, function(dates) diff(range(dates)))
> date_range
            ADVANCED TELECOM ADVANCED THERAPEUTIC SYS LTD
                           0                            2
   ADVANCED TISSUE SCI -CL A
                          NA
> # delete ones with less than 3 years
> names_to_delete <- names(date_range[date_range < 2])
> # delete those entries
> x <- x[!(x$NAME %in% names_to_delete),]
> x
  CODE                         NAME DATE  DATA1
2 3845 ADVANCED THERAPEUTIC SYS LTD 1987 10.100
3 3845 ADVANCED THERAPEUTIC SYS LTD 1989  2.463
4 3845 ADVANCED THERAPEUTIC SYS LTD 1988  1.563

On Nov 13, 2007 2:34 PM, Jonas Malmros [EMAIL PROTECTED] wrote:

Dear R users, I have a huge database and I need to adjust it somewhat.
Here is a very little cut out from database: CODENAME DATE DATA1 4813ADVANCED TELECOM19870.013 3845ADVANCED THERAPEUTIC SYS LTD198710.1 3845ADVANCED THERAPEUTIC SYS LTD19892.463 3845ADVANCED THERAPEUTIC SYS LTD19881.563 2836ADVANCED TISSUE SCI -CL A 19870.847 2836ADVANCED TISSUE SCI -CL A 1989 0.872 2836ADVANCED TISSUE SCI -CL A 1988 0.529 What I need is: 1) Delete all cases containing -CL A (and also -OLD, -ADS, etc) at the end 2) Delete all cases that have less than 3 years of data 3) For each remaining case compute ratio DATA1(1989) / DATA1(1987) [and then ratios involving other data variables] and output this into new database consisting of CODE, NAME, RATIOs. Maybe someone can suggest an effective way to do these things? I imagine the first one would involve grep(), and 2 and 3 would involve apply family of functions, but I cannot get my mind around the actual code to perform this adjustments. I am new to R, I do write code but usually it consists of for-functions and plotting. I would much appreciate your help. Thank you in advance! -- Jonas Malmros Stockholm University Stockholm, Sweden __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
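The third part of the question, the ratio DATA1(1989) / DATA1(1987) per company, is not answered in the reply. One possible sketch, assuming each CODE/NAME pair has at most one row per year, is to split out the two years and merge them. The data frame below is a shortened copy of the sample data, and the suffixes and the name 'ratios' are arbitrary choices. The name filter uses grepl() so that it still works when nothing matches.

```r
x <- data.frame(
  CODE  = c(4813, 3845, 3845, 3845, 2836),
  NAME  = c("ADVANCED TELECOM",
            rep("ADVANCED THERAPEUTIC SYS LTD", 3),
            "ADVANCED TISSUE SCI -CL A"),
  DATE  = c(1987, 1987, 1989, 1988, 1987),
  DATA1 = c(0.013, 10.1, 2.463, 1.563, 0.847),
  stringsAsFactors = FALSE
)

# Step 1: drop names ending in -CL A, -OLD or -ADS.
x <- x[!grepl("-(CL A|OLD|ADS)$", x$NAME), ]

# Step 3: pair the 1987 and 1989 rows of each company and divide.
first <- x[x$DATE == 1987, c("CODE", "NAME", "DATA1")]
last  <- x[x$DATE == 1989, c("CODE", "NAME", "DATA1")]
ratios <- merge(first, last, by = c("CODE", "NAME"),
                suffixes = c(".1987", ".1989"))
ratios$RATIO <- ratios$DATA1.1989 / ratios$DATA1.1987
ratios[, c("CODE", "NAME", "RATIO")]
```

Because merge() keeps only companies present in both years, companies missing either year drop out without a separate check.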
Re: [R] update matrix with subset of it where only row names match
Let's take a look at your solution:

> mat1 <- matrix(0, nrow=10, ncol=3)
> dimnames(mat1) <- list(paste('row', 1:10, sep=''), LETTERS[1:3])
> mat2 <- matrix(1:3, ncol=1, dimnames=list(c('row3', 'row7', 'row5'), "B"))
> mat2
     B
row3 1
row7 2
row5 3
> mat1[rownames(mat2)%in%rownames(mat1),"B"]=mat2[,"B"]
Error in mat1[rownames(mat2) %in% rownames(mat1), "B"] = mat2[, "B"] :
  number of items to replace is not a multiple of replacement length
> rownames(mat2)%in%rownames(mat1)
[1] TRUE TRUE TRUE
> mat2[,"B"]
row3 row7 row5
   1    2    3

I got an error using your statement with %in%. This is because it produces a vector of 3 TRUE values, as you can see above. That logical vector is recycled to cover all 10 rows of the matrix, so every row is selected, and you get the error message because 3 values cannot be spread evenly over 10 rows. What you want to provide is the index of the rows to replace. What you would need in this case is the following statement:

mat1[match(rownames(mat2), rownames(mat1)),"B"]=mat2[,"B"]

Now your solution would have to be changed every time you wanted a different column replaced. My solution determines which of the column names match in the two objects.

In R, there are a number of ways of doing things. As to which is 'better', it all depends. In most cases it is probably a matter of 'style' or what a person is used to. 'Better' does come into play when you are talking about performance, and there might be a factor of 10X, 100X or 1000X depending on how you use some statements. I happen to like to break things down into simple steps, so that if I have to go back later, I think I might be able to understand it again. If you are coming from a C/Java background, then one of the hard things to get your mind around is to think in terms of 'vectorized' operations, and also the difference in some of the ways that you create/manipulate data structures in R vs. some other languages.
HTH On Nov 13, 2007 4:44 PM, Martin Waller [EMAIL PROTECTED] wrote: jim holtman wrote: Here is one way of doing it that uses the row and column names: # create test data mat1 - matrix(0, nrow=10, ncol=3) dimnames(mat1) - list(paste('row', 1:10, sep=''), LETTERS[1:3]) mat2 - matrix(1:3, ncol=1, dimnames=list(c('row3', 'row7', 'row5'), B)) mat2 B row3 1 row7 2 row5 3 # create indexing matrix indx - cbind(match(rownames(mat2), rownames(mat1)), match(colnames(mat2), colnames(mat1))) indx [,1] [,2] [1,]32 [2,]72 [3,]52 mat1[indx] - mat2 mat1 A B C row1 0 0 0 row2 0 0 0 row3 0 1 0 row4 0 0 0 row5 0 3 0 row6 0 0 0 row7 0 2 0 row8 0 0 0 row9 0 0 0 row10 0 0 0 On Nov 12, 2007 4:54 PM, Martin Waller [EMAIL PROTECTED] wrote: snip OK - I see that, and thanks for your response, but (and excuse my ignorance, less than 2 months in R...) can you help me to see why this is 'better' (whatever that means, if anything)? From a newbie (at least my) POV, it seems less clear than my original solution. Again, please bear in mind I am relatively new so please be patient if I'm not seeing something that's obvious to yourself. I have a genuine desire to learn. Martin __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
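The difference between %in% and 'match' discussed above can be seen on the row names alone. This is only a sketch; 'big' and 'small' stand in for rownames(mat1) and rownames(mat2).

```r
big   <- paste0("row", 1:10)
small <- c("row3", "row7", "row5")

# %in% answers "is each element of 'small' somewhere in 'big'?"
flags <- small %in% big

# match answers "at which position of 'big' is each element of 'small'?"
pos <- match(small, big)

# Turning the test around gives positions, but sorted by 'big', so the
# values of mat2 would land in the wrong rows if the orders differ.
unordered <- which(big %in% small)

flags
pos
unordered
```

Only 'pos' keeps the order of 'small', which is why it is the safe row index for the assignment.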
Re: [R] problems with splinefun()
Exactly what values do you want right away? You can do:

result <- spline(.)

and then reference 'result$x' and 'result$y'. Can you be more specific about your request and provide an example of what you are currently doing (with data) and what you expect the results to be?

On Nov 14, 2007 5:31 AM, david csongor [EMAIL PROTECTED] wrote:

I am working with the function splinefun() ... When plugging in the variables, I get the function's code printed, as though I had only entered 'splinefun'. The only way to get the values is by spline(xxx,yyy, n=length(xxx)/10, ties = mean)$x and spline(xxx,yyy, n=length(xxx)/10, ties = mean)$y. I'm just wondering if there is something wrong with the package or if I'm doing something wrong... Is there a way to get the values right away? Has this happened to anyone? Thanks in advance for the ever great help one gets asking stuff this way!!! /David

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
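The behaviour described in the question is expected: splinefun() returns a function, so printing its result shows code rather than numbers. A minimal sketch with made-up data (the values of 'xxx' and 'yyy' are assumptions):

```r
xxx <- 1:10
yyy <- sin(xxx)

# splinefun() builds an interpolating function; call it to get values.
f <- splinefun(xxx, yyy)
f(c(2.5, 7.25))

# spline() evaluates directly and returns a list with components x and y.
s <- spline(xxx, yyy, n = 20)
str(s)
```

So 'f' is the thing to keep when values are needed at arbitrary points later, while spline() suits a one-off evaluation on a grid.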
Re: [R] How to get row numbers of a subset of rows
Here is a way of doing it using 'rle': x - read.table(textConnection( SNPChromosome PhysicalPosition + 1 SNP_A-1909444 1 7924293 + 2 SNP_A-2237149 1 8173763 + 3 SNP_A-4303947 1 8191853 + 4 SNP_A-2236359 1 8323433 + 5 SNP_A-2205441 1 8393263 + 6 SNP_A-1909445 1 7924293 + 7 SNP_A-2237146 2 8173763 + 8 SNP_A-4303946 2 8191853 + 9 SNP_A-2236357 2 8323433 + 10 SNP_A-2205442 2 8393263), header=TRUE) # use rle to get the 'runs' y - rle(x$Chromosome) # create dataframe with start/ends and values start - head(cumsum(c(1, y$lengths)), -1) index - data.frame(values=y$values, start=start, end=start + y$lengths - 1) index values start end 1 1 1 6 2 2 7 10 On Nov 14, 2007 10:56 AM, affy snp [EMAIL PROTECTED] wrote: Hello list, I read in a txt file using B-read.table(file=data.snp,header=TRUE,row.names=NULL) by specifying the row.names=NULL so that the rows are numbered. Below is an example after how the table looks like using B[1:10,1:3] SNPChromosome PhysicalPosition 1 SNP_A-1909444 1 7924293 2 SNP_A-2237149 1 8173763 3 SNP_A-4303947 1 8191853 4 SNP_A-2236359 1 8323433 5 SNP_A-2205441 1 8393263 6 SNP_A-1909445 1 7924293 7 SNP_A-2237146 2 8173763 8 SNP_A-4303946 2 8191853 9 SNP_A-2236357 2 8323433 10 SNP_A-2205442 2 8393263 I am wondering if there is a way to return the start and end row numbers for a subset of rows. For example, If I specify B[,2]=1, I would like to get start=1 and end=6 if B[,2]=2, then start=7 and end=10 Is there any way in R to quickly do this? Thanks a bunch! Allen __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? 
Re: [R] How to get row numbers of a subset of rows
That works for the specific value of '1', but you would have to repeat it for other values in the column. If you had 100 different ranges in that column, what would you do? Here is another solution using 'range' on the same data: tapply(seq_len(nrow(x)), x$Chromosome, range) $`1` [1] 1 6 $`2` [1] 7 10 On Nov 14, 2007 12:04 PM, Bert Gunter [EMAIL PROTECTED] wrote: Am I missing something? ... Why not: range(seq(nrow(B))[B[,2]==1] ) ?? ## note: == not = Alternatively, and easily generalized (to start with a frame which is a subset of the original and any subset of rows, contiguous or not) range(as.numeric(row.names(B)[B[,2]==1])) Again, am I missing something that makes this obvious solution impossible? (Wouldn't be the first time.) Bert Gunter Genentech Nonclinical Statistics -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of jim holtman Sent: Wednesday, November 14, 2007 8:39 AM To: affy snp Cc: r-help@r-project.org Subject: Re: [R] How to get row numbers of a subset of rows Here is a way of doing it using 'rle': x - read.table(textConnection( SNPChromosome PhysicalPosition + 1 SNP_A-1909444 1 7924293 + 2 SNP_A-2237149 1 8173763 + 3 SNP_A-4303947 1 8191853 + 4 SNP_A-2236359 1 8323433 + 5 SNP_A-2205441 1 8393263 + 6 SNP_A-1909445 1 7924293 + 7 SNP_A-2237146 2 8173763 + 8 SNP_A-4303946 2 8191853 + 9 SNP_A-2236357 2 8323433 + 10 SNP_A-2205442 2 8393263), header=TRUE) # use rle to get the 'runs' y - rle(x$Chromosome) # create dataframe with start/ends and values start - head(cumsum(c(1, y$lengths)), -1) index - data.frame(values=y$values, start=start, end=start + y$lengths - 1) index values start end 1 1 1 6 2 2 7 10 On Nov 14, 2007 10:56 AM, affy snp [EMAIL PROTECTED] wrote: Hello list, I read in a txt file using B-read.table(file=data.snp,header=TRUE,row.names=NULL) by specifying the row.names=NULL so that the rows are numbered. 
Below is an example after how the table looks like using B[1:10,1:3] SNPChromosome PhysicalPosition 1 SNP_A-1909444 1 7924293 2 SNP_A-2237149 1 8173763 3 SNP_A-4303947 1 8191853 4 SNP_A-2236359 1 8323433 5 SNP_A-2205441 1 8393263 6 SNP_A-1909445 1 7924293 7 SNP_A-2237146 2 8173763 8 SNP_A-4303946 2 8191853 9 SNP_A-2236357 2 8323433 10 SNP_A-2205442 2 8393263 I am wondering if there is a way to return the start and end row numbers for a subset of rows. For example, If I specify B[,2]=1, I would like to get start=1 and end=6 if B[,2]=2, then start=7 and end=10 Is there any way in R to quickly do this? Thanks a bunch! Allen __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
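The two approaches in this thread can be compared side by side on the Chromosome column alone. This is a sketch on the ten sample values; the names 'by_run' and 'by_value' are made up.

```r
chrom <- c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2)

# rle: one start/end pair per contiguous run of equal values.
runs  <- rle(chrom)
end   <- cumsum(runs$lengths)
start <- end - runs$lengths + 1
by_run <- data.frame(values = runs$values, start = start, end = end)

# tapply + range: first and last row per distinct value.
by_value <- tapply(seq_along(chrom), chrom, range)

by_run
by_value
```

The results agree here because each chromosome occupies a single block. If a value appeared in two separate blocks, rle would report two runs, while range would span from the first block to the last.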
Re: [R] enumeration variable by groups
Here is a way to do it:

> x <- scan(textConnection("1 48 1 45 2 50 2 42 1 41 2 51 1 52 1 43 2 52"), what=0L)
Read 18 items
> x <- matrix(x, ncol=2, byrow=TRUE)
> colnames(x) <- c('gender', 'score')
> x
     gender score
[1,]      1    48
[2,]      1    45
[3,]      2    50
[4,]      2    42
[5,]      1    41
[6,]      2    51
[7,]      1    52
[8,]      1    43
[9,]      2    52
> # split out categories
> y <- split(seq_len(nrow(x)), x[,1])
> # combine into new matrix
> x.new <- do.call('rbind', lapply(y, function(.row) cbind(x[.row,], index=seq_along(.row))))
> x.new
     gender score index
[1,]      1    48     1
[2,]      1    45     2
[3,]      1    41     3
[4,]      1    52     4
[5,]      1    43     5
[6,]      2    50     1
[7,]      2    42     2
[8,]      2    51     3
[9,]      2    52     4

On Nov 14, 2007 12:58 PM, lamack lamack [EMAIL PROTECTED] wrote:

Dear all, How can I create an enumeration variable by groups? I have:

gender score
1 48
1 45
2 50
2 42
1 41
2 51
1 52
1 43
2 52

and I would like to get:

gender score index
1 48 1
1 45 2
1 41 3
1 52 4
1 43 5
2 50 1
2 42 2
2 51 3
2 52 4

best regards

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
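A shorter alternative to split/rbind, offered as a sketch and not as the method posted above, is ave() with seq_along, which numbers the rows within each group while leaving them in place.

```r
gender <- c(1, 1, 2, 2, 1, 2, 1, 1, 2)
score  <- c(48, 45, 50, 42, 41, 51, 52, 43, 52)

# ave() applies seq_along within each gender group and puts the results
# back in the original row positions.
index <- ave(seq_along(gender), gender, FUN = seq_along)

result <- data.frame(gender, score, index)
# Optional: sort by group to match the layout asked for in the question.
result <- result[order(result$gender), ]
result
```

Sorting is only needed for display; the index is already correct before the order() step.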
Re: [R] Romoving elements from a vector. Looking for the opposite of c(), New user
You can also check out the 'set' operations: setdiff, intersect, union. On Nov 15, 2007 12:08 PM, John Kane [EMAIL PROTECTED] wrote: I think you've read Thomas's request in reverse. and what he want is: x[!x %in% z] Thanks for the %in% approach BTW. --- Charilaos Skiadas [EMAIL PROTECTED] wrote: On Nov 15, 2007, at 9:15 AM, Thomas Fr��jd wrote: Hi I have three vectors say x, y, z. One of them, x contains observations on a variable. To x I want to append all observations from y and remove all from z. For appending c() is easily used x - c(x,y) But how do I remove all observations in z from x? You can say I am looking for the opposite of c(). If you are looking for the opposite of c, provided you want to remove the first part of things, then perhaps this would work: z-c(x,y) z[-(1:length(x))] However, if you wanted to remove all appearances of elements of x from c(x,y), regardless of whether those elements appear in the x part of in the y part, I think you would want: z[!z %in% x] Probably there are other ways. Welcome to R! Best regards Haris Skiadas Department of Mathematics and Computer Science Hanover College __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
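One difference between the %in% idiom and setdiff() is worth keeping in mind: setdiff() also drops duplicates from what remains. A small sketch with made-up vectors:

```r
x <- c(1, 1, 2, 3)
y <- c(4, 5)
z <- c(2, 5)

x <- c(x, y)             # append y: 1 1 2 3 4 5

kept <- x[!x %in% z]     # removes every 2 and 5, keeps repeated values
uniq <- setdiff(x, z)    # same removal, but the result is de-duplicated

kept
uniq
```

For observations on a variable, where repeated values are meaningful, the %in% form is probably the safer choice.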
Re: [R] Problems working with large data
A little more information might be useful. If your matrix is numeric, then a single copy will require about 250MB of memory. What type of system are you on, and how much memory do you have? When you say you are having problems, what are they? Is it a problem reading the data in? Are you getting allocation errors? Is your system paging? If you have 2GB of memory, you should be fine, depending on how many copies of the data you have.

On Nov 15, 2007 10:53 AM, [EMAIL PROTECTED] wrote:

Hi, I'm working with a numeric matrix with 55 columns and 581012 rows and I'm having problems with memory allocation using some of the functions in R: for example lda, rda (library MASS), princomp (package mva) and mvnorm.etest (energy package). I've read the tips on using less memory in help(read.table) and managed to use some of these functions, but haven't been able to work with mvnorm.etest. I would like to know a better way to solve this problem, as well as how to do it faster. Best regards, Pedro Marques

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
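The "about 250MB" figure follows from 8 bytes per double-precision value. The arithmetic, plus a check with object.size() on a smaller matrix with the same number of columns, can be sketched as:

```r
rows <- 581012
cols <- 55

bytes <- rows * cols * 8        # 8 bytes per numeric (double) element
mb    <- bytes / 1024^2
mb

# The same rule on a 1000-row matrix; the small excess is header overhead.
small <- as.numeric(object.size(matrix(0, nrow = 1000, ncol = cols)))
small
```

Functions that copy or centre the data internally need a multiple of this, so several hundred MB of working space for one call is plausible.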
Re: [R] read complicated file
-198 175 -17912 -47 27 186 -18030 0 -25 -91 164 117 -155188 149 -28 24 5 20 -31 52 -78 45 -133-63 -77 75 -183 130 -119-47 -8 -40 64 209 166 48 -65 -244111 110 -106-248-21 -1732 -38 111 30 -174257 59 -180-73 -278-124-22 107 164 73 160 -136-37 119 -10 100 -4 0 182 152 35 256 70 148 -9 -4 0 49 128 -44 21 36 143 -114-59 -1107 -40 -80 -70 99 27 -27 184 293 257 -83 44 101 65 -68 -167158 94 -39 130 59 -34934 47 -10870 141 55 138 -20 -83 81 -15 74 -107140 -280107 -32583 125 -64 200 -122123 -280 21 ... The first bit up to END can be skipped. That's the first 90 lines. Then I need to do something like this: while data still exist in the file { skip 3 lines scan 81 values into temp scan 82nd value, which is 11, 12, 21, 22. Depending on value, temp is added to one of these vars } The data are written in clumps. Each clump has 3 lines with info I don't need. Then it has 81 values which are the actual data I want to read into some variable temp Then the 82nd value tells me which of 4 variables to add temp onto. Any tips on how to approach this using scan() greatly appreciated. I know I can use skip as an argument to scan. Thanks very much for any help! Bill __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] overlapping intervals
Here is one way of doing it: c.i - read.table(textConnection( 17130612 17587118 + 17712302 18221688 + 21225764 21387314 + 25012714 30748348 + 33852816 34480192 + 36012944 36209144 + 36252300 36280276 + 36737468 36971144 + 43693832 43878548)) d.i - read.table(textConnection( 17712302 18100404 + 21203780 21387314 + 25012714 30748348 + 33852816 34384588 + 34794536 35996440)) closeAllConnections() # setup data.frame for comparing x - rbind(data.frame(t=c.i$V1, oper=1, type='c'), + data.frame(t=c.i$V2, oper=-1, type='c'), + data.frame(t=d.i$V1,oper=1, type='d'), + data.frame(t=d.i$V2, oper=-1, type='d')) # put in time order x - x[order(x$t),] # determine overlaps x$over - cumsum(x$oper) x t oper type over 1 171306121c1 10 17587118 -1c0 2 177123021c1 19 177123021d2 24 18100404 -1d1 11 18221688 -1c0 20 212037801d1 3 212257641c2 12 21387314 -1c1 25 21387314 -1d0 4 250127141c1 21 250127141d2 13 30748348 -1c1 26 30748348 -1d0 5 338528161c1 22 338528161d2 27 34384588 -1d1 14 34480192 -1c0 23 347945361d1 28 35996440 -1d0 6 360129441c1 15 36209144 -1c0 7 362523001c1 16 36280276 -1c0 8 367374681c1 17 36971144 -1c0 9 436938321c1 18 43878548 -1c0 # assuming that c d don't overlap themselves, oper=2 indicate an overlap overlap - which(x$over == 2) # print overlaps for (i in overlap){ + print(x[i + c(-1,0,1,2),]) + } t oper type over 2 177123021c1 19 177123021d2 24 18100404 -1d1 11 18221688 -1c0 t oper type over 20 212037801d1 3 212257641c2 12 21387314 -1c1 25 21387314 -1d0 t oper type over 4 250127141c1 21 250127141d2 13 30748348 -1c1 26 30748348 -1d0 t oper type over 5 338528161c1 22 338528161d2 27 34384588 -1d1 14 34480192 -1c0 On Feb 1, 2008 4:03 PM, mohamed nur anisah [EMAIL PROTECTED] wrote: hi!! Below I have 4 columns vector of c and d which are unequal in length.These c and d have 2 columns each where these 2 columns represent an interval values. How am I going to get an overlapping over these interval values?? Please help me sort this problem!! Thanks in advance.. 
c d 17130612 17587118 17712302 18100404 17712302 18221688 21203780 21387314 21225764 21387314 25012714 30748348 25012714 30748348 33852816 34384588 33852816 34480192 34794536 35996440 36012944 36209144 36252300 36280276 36737468 36971144 43693832 43878548 - [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
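For a modest number of intervals, a direct pairwise test is an alternative to the sorted-event approach in the reply. This is a sketch, not the posted method: it uses the first three intervals of each set, treats intervals as closed, and the column names 'start' and 'end' are assumptions.

```r
c.i <- data.frame(start = c(17130612, 17712302, 21225764),
                  end   = c(17587118, 18221688, 21387314))
d.i <- data.frame(start = c(17712302, 21203780, 34794536),
                  end   = c(18100404, 21387314, 35996440))

# [s1, e1] and [s2, e2] overlap exactly when s1 <= e2 and s2 <= e1.
hits <- outer(c.i$start, d.i$end, "<=") & outer(c.i$end, d.i$start, ">=")

# Row index = interval in c.i, column index = interval in d.i.
pairs <- which(hits, arr.ind = TRUE)

# The shared stretch of each overlapping pair.
overlap <- data.frame(
  start = pmax(c.i$start[pairs[, "row"]], d.i$start[pairs[, "col"]]),
  end   = pmin(c.i$end[pairs[, "row"]],   d.i$end[pairs[, "col"]])
)
pairs
overlap
```

The outer() matrices grow with the product of the two lengths, so for very long interval lists the sorted approach above scales better.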
Re: [R] Plotting 3 vectors on one graph.
What you want to do is use 'plot' for the initial vector and then 'lines' to add the other two. You will have to set the range of the y-axis in the initial call to plot so that all three vectors fit. The sequence would probably look like this:

plot(a, ylim=range(a, b, c), col='black')
lines(b, col='red')
lines(c, col='green')

On Feb 1, 2008 7:06 PM, cvandy [EMAIL PROTECTED] wrote:

I'm an R newbie and am trying to plot 3 vectors, say a, b, c. I have downloaded 3 R manuals and searched your forum. There are plenty of X vs Y examples, but I cannot find how to plot 3 or more vectors on one graph. I'm sure I overlooked something. Thanks for any help. CHV

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
Re: [R] Ignore error t.test in a loop
?try

e.g.,

for(i in x) try(t.test(...))

On Feb 2, 2008 7:43 AM, My Coyne [EMAIL PROTECTED] wrote:

Hi, I placed a t.test in a loop and would like to continue processing the loop even when t.test encounters an error. How do I do that? For example, in one iteration the data is completely constant and t.test gives an error, and the entire program terminates. I would like to write the information out to a file, and the loop should continue. Thanks, My D. Coyne

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
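A slightly fuller sketch of the same idea uses tryCatch(), so that the error message can be collected and written to a file as the question asks. The three toy groups and the temporary log file are assumptions for illustration.

```r
# Hypothetical data: group "b" is constant, which makes t.test() fail.
groups <- list(a = c(1.2, 2.3, 3.1, 4.8),
               b = c(5, 5, 5, 5),
               c = c(0.4, 1.9, 2.2, 3.7))

pvals <- rep(NA_real_, length(groups))
names(pvals) <- names(groups)
errors <- character(0)

for (nm in names(groups)) {
  # Return the condition object instead of stopping the loop.
  res <- tryCatch(t.test(groups[[nm]]), error = function(e) e)
  if (inherits(res, "error")) {
    errors <- c(errors, paste(nm, conditionMessage(res), sep = ": "))
  } else {
    pvals[nm] <- res$p.value
  }
}

# Write the collected messages out; a temporary file is used here.
logfile <- tempfile(fileext = ".txt")
writeLines(errors, logfile)
pvals
```

Groups that failed keep NA in 'pvals', so the results stay aligned with the inputs.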
Re: [R] calculation fraction/ratio
Is this what you want?

  x <- read.table(textConnection("Index A
  1 1
  1 2
  1 3
  2 4
  2 3
  3 7
  3 9
  3 3
  3 1"), header=TRUE)
  data.frame(x, value=ave(x$A, x$Index, FUN=function(z) z / sum(z)))

    Index A     value
  1     1 1 0.1666667
  2     1 2 0.3333333
  3     1 3 0.5000000
  4     2 4 0.5714286
  5     2 3 0.4285714
  6     3 7 0.3500000
  7     3 9 0.4500000
  8     3 3 0.1500000
  9     3 1 0.0500000

On Feb 1, 2008 6:19 PM, YIHSU CHEN [EMAIL PROTECTED] wrote:
> Dear R users,
> I wonder if there is a quick way to calculate the ratio/fraction within a list/data frame. For example, I have a data frame with two fields, Index and A, and I would like to know the fractions of the A's within the same Index. That is, for Index = 1 the three fractions will be 1/(1+2+3) = 0.17, 2/(1+2+3) = 0.33 and 3/(1+2+3) = 0.5. Likewise for Index = 2 and Index = 3. I would then generate a new vector of 0.17, 0.33, 0.5, ... etc.
>
>   Index A
>   1 1
>   1 2
>   1 3
>   2 4
>   2 3
>   3 7
>   3 9
>   3 3
>   3 1
>
> Thank you so much,
> Yihsu

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
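A small self-contained check of the ave() approach, using the poster's data typed in directly. The prop.table() variant and the column name 'value2' are my additions, not part of the original reply; within each group the fractions should add up to 1.

```r
x <- data.frame(Index = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
                A     = c(1, 2, 3, 4, 3, 7, 9, 3, 1))

# fraction of A within each Index group, as in the reply
x$value <- ave(x$A, x$Index, FUN = function(z) z / sum(z))

# same thing with prop.table(), which divides a vector by its sum
x$value2 <- ave(x$A, x$Index, FUN = prop.table)

# sanity check: the fractions sum to 1 in every group
group_totals <- tapply(x$value, x$Index, sum)
```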
[R] Call for papers for CMG'08
UseR,

I am on the conference committee for CMG'08 (Computer Measurement Group, www.cmg.org). At the last several conferences I have presented papers and workshops on the use of R in computer performance analysis and visualization. I am sending this out to see if anyone would be interested in submitting a paper for our next conference, which will be held in Las Vegas in December 2008. If so, you can check the website for details, or ask me.

I am especially interested in papers on the visualization of data (this will be one of the hot tracks at the conference), and particularly those using R, since this is an opportunity to introduce the power of R to an audience whose job it is to analyze data.

If you don't want to submit a paper but have some ideas or experiences using R for computer performance work, please let me know; I might be able to use some of the material to show how other organizations are using R in this context.

Please feel free to contact me with any questions or comments.

Jim Holtman
[EMAIL PROTECTED]
+1 513 646 9390

What is the problem you are trying to solve?
Re: [R] precision in seq
See FAQ 7.31 ("Why doesn't R think these numbers are equal?").

On 2/4/08, Eric Elguero [EMAIL PROTECTED] wrote:
> Hi everybody,
> this is a warning more than a question. I noticed that seq produces approximate results:
>
>   seq(0,1,0.05)[19]==0.9
>   [1] TRUE
>   seq(0,1,0.05)[20]==0.95
>   [1] FALSE
>   seq(0,1,0.05)[21]==1
>   [1] TRUE
>   seq(0,1,0.05)[20]-0.95
>   [1] 1.110223024625157e-16
>
> I do not understand why 0.9 and 1 are correct (within some tolerance, or strictly exact?) and 0.95 is not. This one works:
>
>   ((0:20)/20)[20]==0.95
>   [1] TRUE
>
> Eric Elguero

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
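For reference, the usual remedy from that FAQ entry is to compare within a tolerance rather than with '=='. A short sketch (the explicit tolerance 1e-8 is just an illustrative choice):

```r
s <- seq(0, 1, 0.05)

# exact comparison: the answer depends on how each value rounds in binary
exact <- s[20] == 0.95

# comparison within a tolerance: what you usually want for doubles
near1 <- isTRUE(all.equal(s[20], 0.95))
near2 <- abs(s[20] - 0.95) < 1e-8
```

all.equal() returns a description of the difference rather than FALSE when the values differ, which is why it is wrapped in isTRUE().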
Re: [R] counting identical data in a column
Is this what you want?

  x <- read.table(textConnection("chrN start end
  1 chr1 11122333 11122633
  2 chr1 11122333 11122633
  3 chr3 11122333 11122633
  8 chr3 111273334 111273634
  7 chr2 12122334 12122634
  4 chr1 21122377 21122677
  5 chr2 33122355 33122655
  6 chr2 33122355 33122655"), header=TRUE)
  x$count <- ave(x$start, x$start, FUN=length)
  x

    chrN     start       end count
  1 chr1  11122333  11122633     3
  2 chr1  11122333  11122633     3
  3 chr3  11122333  11122633     3
  8 chr3 111273334 111273634     1
  7 chr2  12122334  12122634     1
  4 chr1  21122377  21122677     1
  5 chr2  33122355  33122655     2
  6 chr2  33122355  33122655     2

On 2/4/08, joseph [EMAIL PROTECTED] wrote:
> Hi Peter,
> I have the following data frame with chromosome name, start and end positions:
>
>     chrN     start       end
>   1 chr1  11122333  11122633
>   2 chr1  11122333  11122633
>   3 chr3  11122333  11122633
>   8 chr3 111273334 111273634
>   7 chr2  12122334  12122634
>   4 chr1  21122377  21122677
>   5 chr2  33122355  33122655
>   6 chr2  33122355  33122655
>
> I would like to count the positions that have the same start and add a new column with the count; the new data frame should look like this:
>
>     chrN     start       end count
>   1 chr1  11122333  11122633     3
>   2 chr1  11122333  11122633     3
>   3 chr3  11122333  11122633     3
>   8 chr3 111273334 111273634     1
>   7 chr2  12122334  12122634     1
>   4 chr1  21122377  21122677     1
>   5 chr2  33122355  33122655     2
>   6 chr2  33122355  33122655     2
>
> Can you please show me how to achieve this?
> Thanks,
> Joseph

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
Re: [R] precision in seq
If you want 0, 0.05, 0.1, ..., 0.95, 1.00 then think about encoding the values as characters:

  sprintf("%.2f", seq(0, 1, 0.05))
   [1] "0.00" "0.05" "0.10" "0.15" "0.20" "0.25" "0.30" "0.35" "0.40" "0.45" "0.50" "0.55"
  [13] "0.60" "0.65" "0.70" "0.75" "0.80" "0.85" "0.90" "0.95" "1.00"

Then you won't have the problem of dealing with floating point numbers, and you can still convert the character strings back to numeric later for processing. Character strings will give you the exact matches that you were expecting (but won't get) with floating point.

On 2/5/08, Eric Elguero [EMAIL PROTECTED] wrote:
> Thank you to all who answered.
>
>   0+0.05+
>   + 0.05+0.05+0.05+0.05+0.05+0.05+
>   + 0.05+0.05+0.05+0.05+0.05+0.05+
>   + 0.05+0.05+0.05+0.05+0.05+0.05 - 0.95
>   [1] 3.330669e-16
>   seq(0,1,0.05)[20] - 0.95
>   [1] 1.110223e-16
>   0+19*0.05 - 0.95
>   [1] 1.110223e-16
>
> So this is the way seq calculates. I would have guessed that addition was more accurate than multiplication, but that is not the case. This one, however, bothers me:
>
>   19/20-0.95
>   [1] 0
>
> I noticed this problem when I tried to extract rows of a matrix according to whether values of some vector were in the set (0, 0.05, ..., 0.95, 1), with something like
>
>   x %in% seq(0,1,0.05)
>
> Now I understand that I should not use this construction unless x is of type integer. Would you agree?
> Eric Elguero

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
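To make the membership test concrete, here is a sketch of two ways to ask "is x on the 0.05 grid?" without relying on exact floating point equality. The first is the character encoding described in the reply; the second, the helper 'on_grid' with its 'tol' argument, is a hypothetical alternative of mine that compares against the nearest grid point within a tolerance.

```r
x <- c(0.95, 0.30, 0.33, 19/20, 1.05)   # illustrative values

# 1. compare the two-decimal character representations
by_string <- sprintf("%.2f", x) %in% sprintf("%.2f", seq(0, 1, 0.05))

# 2. compare with the nearest grid point, within a tolerance
on_grid <- function(x, step = 0.05, lo = 0, hi = 1, tol = 1e-8) {
  k <- round((x - lo) / step)                 # index of the nearest grid point
  abs(x - (lo + k * step)) < tol &            # close enough to that point ...
    k >= 0 & k <= round((hi - lo) / step)     # ... and the point is inside [lo, hi]
}
by_tolerance <- on_grid(x)
```

Note that the string version rounds first, so a value such as 0.951 would also count as a match; the tolerance version would reject it. Which behaviour you want depends on where the numbers come from.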
Re: [R] Vector loop
I am not too sure exactly what you want to do with the loop. Here is one that prints out the values:

  x <- 1:10
  for (i in x) print(i)
  [1] 1
  [1] 2
  [1] 3
  [1] 4
  [1] 5
  [1] 6
  [1] 7
  [1] 8
  [1] 9
  [1] 10

On 2/5/08, mohamed nur anisah [EMAIL PROTECTED] wrote:
> Hi,
> I'm in the process of learning to program with for loops. How do I make a loop over a vector of length 10 whose elements are 1,2,3,4,5,6,7,8,9,10? Any suggestions welcome!
> Many thanks.
> Cheers,
> Anisah

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
Re: [R] error message from apply()
The error message was coming from the call to colMeans, where 'x' was not a matrix; it was the vector that 'apply' passes to the function for each column. Did you intend to use 'mean' instead, like in this example:

  data2_1 <- matrix(c(0.9584190, 0.2710325, -0.9495618, -0.1301772, -0.7539687,
      0.5344464, -0.8205933, 0.1581723, -0.5351588, 0.04448065, 0.9936430,
      0.2278786, -0.8160700, -0.3314779, -0.4047975, 0.1168152, -0.7458182,
      -0.2231588, -0.5051651, -0.74871174, 0.9450363, 0.4797723, -0.9033313,
      -0.5825065, 0.8523742, 0.7402795, -0.7134312, -0.8162558, 0.6345438,
      -0.05704138), 3, 10)
  num <- apply(data2_1, 2, function(x) {
      sum(x > (mean(x, na.rm = TRUE) + 1*sd(x, na.rm = TRUE)), na.rm = TRUE)
  })
  num
   [1] 0 1 1 1 0 0 1 1 0 0

On Feb 5, 2008 8:43 PM, Ng Stanley [EMAIL PROTECTED] wrote:
> Hi,
> I keep getting this error message. Please help.
>
>   Error in colMeans(x, na.rm = TRUE) :
>     'x' must be an array of at least two dimensions
>
> The code is:
>
>   data2_1 <- matrix(c(0.9584190, 0.2710325, -0.9495618, -0.1301772, -0.7539687,
>       0.5344464, -0.8205933, 0.1581723, -0.5351588, 0.04448065, 0.9936430,
>       0.2278786, -0.8160700, -0.3314779, -0.4047975, 0.1168152, -0.7458182,
>       -0.2231588, -0.5051651, -0.74871174, 0.9450363, 0.4797723, -0.9033313,
>       -0.5825065, 0.8523742, 0.7402795, -0.7134312, -0.8162558, 0.6345438,
>       -0.05704138), 3, 10)
>   num <- apply(data2_1, 2, function(x) {
>       sum(x > (colMeans(x, na.rm = TRUE) + 1*sd(x, na.rm = TRUE)), na.rm = TRUE)
>   })

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
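To see why colMeans() fails inside apply(), it helps to look at what the function actually receives. A small sketch with a made-up 3x3 matrix 'm' (not the poster's data): each column arrives as a plain vector with no 'dim' attribute, so mean() is the right tool there.

```r
# Hypothetical matrix, filled column by column
m <- matrix(c(1, 2, 9,
              4, 4, 4,
              0, 5, 11), nrow = 3)

# each column is handed to the function as a vector without dimensions
no_dim <- apply(m, 2, function(x) is.null(dim(x)))

# per-column count of values above that column's mean + 1 sd
count_high <- apply(m, 2, function(x) {
  sum(x > mean(x, na.rm = TRUE) + sd(x, na.rm = TRUE), na.rm = TRUE)
})
```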
Re: [R] how to plot an user-defined function
Your function 'll' only returns a single value when passed a vector:

  x <- seq(0,2,.1)
  ll(x)
  [1] -7.571559

'plot' expects to pass a vector to the function and have it return a vector of the same length; e.g.,

  sin(x)
   [1] 0.00000000 0.09983342 0.19866933 0.29552021 0.38941834 0.47942554 0.56464247 0.64421769 0.71735609
  [10] 0.78332691 0.84147098 0.89120736 0.93203909 0.96355819 0.98544973 0.99749499 0.99957360 0.99166481
  [19] 0.97384763 0.94630009 0.90929743

So you either have to rewrite your function so that it is vectorized over tau, or use a loop that evaluates the function at each individual point and then plot the results.

On Feb 5, 2008 7:06 PM, John Smith [EMAIL PROTECTED] wrote:
> Dear R-users,
> Suppose I have defined a likelihood function as ll(tau). How can I plot this likelihood function by passing it to plot? I want to do it like this:
>
>   ll <- function(tau) {
>       w <- 1 / (s^2 + tau^2)
>       mu <- sum(theta * w) / sum(w)
>       -1/2*sum((theta-mu)^2*w -log(w))
>   }
>   plot(ll, 0, 2)
>
> But I get the following error:
>
>   Error in xy.coords(x, y, xlabel, ylabel, log) :
>     'x' and 'y' lengths differ
>   In addition: Warning messages:
>   1: In s^2 + tau^2 : longer object length is not a multiple of shorter object length
>   2: In theta * w : longer object length is not a multiple of shorter object length
>   3: In (theta - mu)^2 * w : longer object length is not a multiple of shorter object length
>
> Thanks

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
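One way to do the "evaluate at each point" option without rewriting 'll' is to wrap it with Vectorize(), which calls the function once per element of tau. A sketch, assuming made-up values for 'theta' and 's' (the post does not give them) and the hypothetical name 'll_vec':

```r
# Hypothetical data; not from the original post
theta <- c(0.5, 1.2, -0.3, 0.8)
s     <- c(0.4, 0.6, 0.5, 0.7)

ll <- function(tau) {
  w  <- 1 / (s^2 + tau^2)
  mu <- sum(theta * w) / sum(w)
  -1/2 * sum((theta - mu)^2 * w - log(w))
}

ll_vec <- Vectorize(ll)        # evaluates ll() separately for each tau

tau  <- seq(0, 2, 0.1)
vals <- ll_vec(tau)            # one value per element of tau
plot(tau, vals, type = "l", xlab = "tau", ylab = "ll(tau)")
```

With the wrapper in place, plot(ll_vec, 0, 2) should also work, since the function now returns a vector of the same length as its input. sapply(tau, ll) is an equivalent way to get 'vals'.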
Re: [R] error message from apply()
Your matrix has only 3 rows, so when you call 'apply(data2_1, 2, ...)' each column passed to the function has length 3, while 'thr' has length 10:

  str(data2_1)
   num [1:3, 1:10] 0.958 0.271 -0.950 -0.130 -0.754 ...
  str(thr)
   num [1:10] 1.060 0.528 0.104 0.925 -0.256 ...

That is why you get the message about a size mismatch: inside the function, 'x > thr' compares a vector of length 3 with one of length 10. With the 3x3 matrix both have length 3, so there is no warning, but each element of a column is then compared with a different column's threshold.

On Feb 5, 2008 10:21 PM, Ng Stanley [EMAIL PROTECTED] wrote:
> Replacing colMeans by mean removed the warning messages. Thanks.
> However, when I precompute thr and pass it to function(x), the error returns. Using the shorter data2_1 doesn't give any warnings. What is happening?
>
>   data2_1 <- matrix(c(0.9584190, 0.2710325, -0.9495618, -0.1301772, -0.7539687,
>       0.5344464, -0.8205933, 0.1581723, -0.5351588, 0.04448065, 0.9936430,
>       0.2278786, -0.8160700, -0.3314779, -0.4047975, 0.1168152, -0.7458182,
>       -0.2231588, -0.5051651, -0.74871174, 0.9450363, 0.4797723, -0.9033313,
>       -0.5825065, 0.8523742, 0.7402795, -0.7134312, -0.8162558, 0.6345438,
>       -0.05704138), 3, 10)
>   # data2_1 <- matrix(c(0.9584190, 0.2710325, -0.9495618, -0.1301772,
>   #     -0.7539687, 0.5344464, -0.8205933, 0.1581723, -0.5351588), 3, 3)
>   thr <- colMeans(data2_1, na.rm = TRUE) + sd(data2_1, na.rm = TRUE)
>   num <- apply(data2_1, 2, function(x) { sum(x > (thr), na.rm = TRUE) })

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
Re: [R] inserting text lines in a dat frame
Try this and see if it is what you want:

  x <- read.table(textConnection("V1 V2 V3
  1 chr1 11255 55
  2 chr1 11320 29
  3 chr1 11400 45
  4 chr2 21680 35
  5 chr2 21750 84
  6 chr2 21820 29
  7 chr2 31890 46
  8 chr3 32100 29
  9 chr3 52380 29
  10 chr3 66450 46"), header=TRUE)
  # the two browser lines go at the top; this call creates the file
  cat("browser position chr1:1-1\nbrowser hide all\n", file="tempxx.txt")
  # for each chromosome, write its two header lines followed by its rows
  result <- lapply(split(x, x$V1), function(.chro){
      chrom <- as.character(.chro$V1[1])
      cat(sprintf("track type=wiggle_0 name=sample description=%s_sample visibility=full\nvariableStep chrom=%s span=1\n",
          chrom, chrom), file="tempxx.txt", append=TRUE)
      # quote=FALSE keeps the chromosome names unquoted in the output
      write.table(.chro, sep="\t", file="tempxx.txt", append=TRUE,
          quote=FALSE, col.names=FALSE, row.names=FALSE)
  })

On Feb 5, 2008 11:22 PM, joseph [EMAIL PROTECTED] wrote:
> Hi Jim,
> I am trying to prepare a bed file to load as a custom track on the UCSC genome browser. I have a data frame that looks like the one below.
>
>   x
>        V1    V2 V3
>   1  chr1 11255 55
>   2  chr1 11320 29
>   3  chr1 11400 45
>   4  chr2 21680 35
>   5  chr2 21750 84
>   6  chr2 21820 29
>   7  chr2 31890 46
>   8  chr3 32100 29
>   9  chr3 52380 29
>   10 chr3 66450 46
>
> I would like to insert the following 4 lines at the beginning:
>
>   browser position chr1:1-1
>   browser hide all
>   track type=wiggle_0 name=sample description=chr1_sample visibility=full
>   variableStep chrom=chr1 span=1
>
> and then insert 2 lines before each chromosome:
>
>   track type=wiggle_0 name=sample description=chr2_sample visibility=full
>   variableStep chrom=chr2 span=1
>
> The final result should be a tab delimited file that looks like this:
>
>   browser position chr1:1-1
>   browser hide all
>   track type=wiggle_0 name=sample description=chr1_sample visibility=full
>   variableStep chrom=chr1 span=1
>   chr1 11255 55
>   chr1 11320 29
>   chr1 11400 45
>   track type=wiggle_0 name=sample description=chr2_sample visibility=full
>   variableStep chrom=chr2 span=1
>   chr2 21680 35
>   chr2 21750 84
>   chr2 21820 29
>   track type=wiggle_0 name=sample description=chr3_sample visibility=full
>   variableStep chrom=chr3 span=1
>   chr3 32100 29
>   chr3 32170 45
>   chr3 32240 45
>
> Any kind of help or guidance will be much appreciated.
> Joseph

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
Re: [R] error message from apply()
Is 'thr' supposed to be the mean plus the sd of all the values in data2_1? If so, then

  thr <- mean(data2_1, na.rm=TRUE) + sd(data2_1, na.rm=TRUE)

I am not exactly sure what problem you are trying to solve. You just have to make sure that the object you create by precomputing has the right structure for what you want to do with it.

On Feb 6, 2008 12:56 AM, Stanley Ng [EMAIL PROTECTED] wrote:
> Now I understand why the 3 by 3 data2_1 works and the 3x10 data2_1 does not. How can I precompute thr and pass it safely to function(x) for the column operation?

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
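If the intent is instead one threshold per column (column mean plus column sd), a sketch of how a precomputed 'thr' could be lined up with the columns. The 3x3 matrix 'm' is made up, and 'num1'/'num2' are illustrative names. Note that apply(m, 2, sd) is used for the column sds here, because in current versions of R sd() applied to a matrix returns a single value for all entries rather than one per column.

```r
# Hypothetical matrix, filled column by column
m <- matrix(c(1, 2, 9,
              4, 4, 4,
              0, 5, 11), nrow = 3)

# one threshold per column: column mean + column sd
thr <- colMeans(m, na.rm = TRUE) + apply(m, 2, sd, na.rm = TRUE)

# option 1: loop over column indices so column j meets thr[j]
num1 <- sapply(seq_len(ncol(m)), function(j) sum(m[, j] > thr[j], na.rm = TRUE))

# option 2: sweep() compares every column with its own threshold at once
num2 <- colSums(sweep(m, 2, thr, FUN = ">"), na.rm = TRUE)
```

The point in both cases is that the function sees the column index (or sweep() does the matching), so a length-10 'thr' is never compared against a length-3 column.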
Re: [R] matrix loop
What exactly are you intending the loop to do? Why do you have the 'as.matrix' in the middle of the loop? Where was 'y' defined? Does this do what you want?

  outer(1:5, 1:10, "+")
       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
  [1,]    2    3    4    5    6    7    8    9   10    11
  [2,]    3    4    5    6    7    8    9   10   11    12
  [3,]    4    5    6    7    8    9   10   11   12    13
  [4,]    5    6    7    8    9   10   11   12   13    14
  [5,]    6    7    8    9   10   11   12   13   14    15

On Feb 6, 2008 7:52 PM, mohamed nur anisah [EMAIL PROTECTED] wrote:
> Dear list,
> I'm trying to fill a (5x10) matrix in a loop and my code is below. Could anybody help me figure out why my loop is not working? Thanks in advance!
>
>   m <- 1:5
>   n <- 1:10
>   for(i in 1:length(m)) {
>       for(j in 1:length(n)) {
>           y[i,j]=sum(i,j)
>           y <- as.matrix(y[i,j])
>       }
>   }
>
> Cheers,
> Anisah

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
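For comparison with outer(), here is a sketch of how the explicit loop could be written so that it runs: 'y' is allocated before the loop and is never overwritten inside it. This is my reading of what the poster was after, not code from the thread.

```r
m <- 1:5
n <- 1:10

# allocate the result once, with the final dimensions
y <- matrix(NA_integer_, nrow = length(m), ncol = length(n))

for (i in seq_along(m)) {
  for (j in seq_along(n)) {
    y[i, j] <- m[i] + n[j]    # fill one cell; leave the rest of y alone
  }
}

same_as_outer <- all(y == outer(m, n, "+"))   # TRUE
```

The original loop fails for two reasons: 'y' does not exist when y[i,j] is first assigned, and the as.matrix() line would replace the whole matrix with a single cell on every pass.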
Re: [R] Subsetting a data.frame degenerates at one column?
Try:

  input[, targets, drop=FALSE]

See ?"[" for an explanation.

On 2/8/08, Allen S. Rout [EMAIL PROTECTED] wrote:
> Greetings. At the moment, I'm applying R to some AIX 'nmon' output, trying to get a handle on some disk performance metrics. In case anyone's interested:
>
>   http://docs.osg.ufl.edu/tsm/pdf/
>
> Some of them are more edifying than others. (ahem)
>
> I'm trying to develop a somewhat general framework for plotting these measures, in the hopes that it's of some use to people other than me. One obstacle I encounter is that, when I select one column out of a data.frame, the result is no longer a data.frame.
>
> So, say I've got, in data frame 'input':
>
>         disk1 disk2 disk3 disk4
>   T0000     0     1     0     4
>   T0001     0     1     0     5
>   T0002     0     1     0     5
>   T0003     0     2     0     4
>   T0004     0     2     0     3
>   T0005     0     1     0     3
>   T0006     0     0     0     3
>
> and somewhere I've noted a list
>
>   targets <- c('disk2','disk3')
>
> I can say
>
>   input[,targets]
>         disk2 disk3
>   T0000     1     0
>   T0001     1     0
>   T0002     1     0
>   T0003     2     0
>   T0004     2     0
>   T0005     1     0
>   T0006     0     0
>
> but if
>
>   targets <- c('disk2')
>   input[,targets]
>   [1] 1 1 1 2 2 1 0
>
> Ick. I've been reading through the indexing and data.frame docs, and remain unsatisfied so far. Where is my thinking going wrong?
>
> - Allen S. Rout

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
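A short sketch of the difference, using a cut-down copy of the poster's data (row names omitted). The third form, list-style indexing with a single subscript, is an addition of mine; it also keeps the data frame structure.

```r
input <- data.frame(disk1 = 0,
                    disk2 = c(1, 1, 1, 2, 2, 1, 0),
                    disk3 = 0,
                    disk4 = c(4, 5, 5, 4, 3, 3, 3))
targets <- "disk2"

dropped <- input[, targets]                 # a plain numeric vector
kept    <- input[, targets, drop = FALSE]   # still a one-column data frame
as_list <- input[targets]                   # list-style indexing never drops
```

For a general plotting framework, using drop=FALSE (or the single-subscript form) everywhere means the code downstream can always assume it is handed a data frame, whatever the length of 'targets'.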
Re: [R] FW: merge multiple csv files
I don't have your data, but something like this is close:

  # read the files into a list for easier processing
  allFiles <- Sys.glob("sample*.csv")
  results <- lapply(allFiles, function(.file){
      # extract the number from the file name
      num <- as.integer(sub("^.*?([[:digit:]]+).*", "\\1", .file, perl=TRUE))
      .in <- read.table(.file, skip=5, sep=",")  # skip the 5 descriptive rows
      .in$obs <- num
      .in
  })
  # combine into a single data frame
  result <- do.call(rbind, results)
  # now do your processing for the average
  z <- split(result, result[,1])  # split by first column
  do.call(rbind, lapply(z, function(.avg){
      data.frame(x=.avg[1,1], y=mean(.avg[,2]))
  }))

On 2/8/08, Gator Connection [EMAIL PROTECTED] wrote:
> Dear list:
> I have a folder that contains more than 50 csv files labelled sequentially, like sample01.csv to sample50.csv. In each file the first 5 rows describe the data collected (useful, but not needed in the merged data). The data in each file then start at row 6 and have 2 variables, x and y. In order to know which file an observation is from, I'd like to have a new variable, location; for example, if the data are from file sample11.csv, then the location for that observation is 11.
> Another difficulty is that two observations might actually be repeats; for example, sample05.csv might contain (4, 10) and (4, 12). I'd like to average these into (4, 11).
> Any suggestions are welcome.
> Jack

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
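Since the real files are not available, here is a self-contained sketch that first writes two small files in a temporary directory and then runs the read, label, combine and average steps on them. The file contents, the column names 'x', 'y' and 'location', and the assumption that the data rows have no header line are all made up for illustration. Unlike the reply, this version averages repeats within each file (by x and location) using aggregate().

```r
# set up two hypothetical files: 5 descriptive rows, then x,y pairs
dir <- tempfile("samples")
dir.create(dir)
header <- paste("descriptive line", 1:5)
writeLines(c(header, "4,10", "4,12", "7,3"), file.path(dir, "sample05.csv"))
writeLines(c(header, "1,2", "2,8"),          file.path(dir, "sample11.csv"))

files <- Sys.glob(file.path(dir, "sample*.csv"))

pieces <- lapply(files, function(f) {
  d <- read.csv(f, skip = 5, header = FALSE, col.names = c("x", "y"))
  # location = the number embedded in the file name, e.g. 11 for sample11.csv
  d$location <- as.integer(sub("^sample([0-9]+)\\.csv$", "\\1", basename(f)))
  d
})
all_obs <- do.call(rbind, pieces)

# average y over repeated x values within the same file
averaged <- aggregate(y ~ x + location, data = all_obs, FUN = mean)
```

In this toy data the pair (4, 10) and (4, 12) from sample05.csv collapses to a single row with y = 11, which is the behaviour asked for in the question.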
Re: [R] how to extract characters from a character string
This should do it for you:

  x
  [1] "32°35.421 N"
  sub("^.*?([[:digit:].]+) N", "\\1", x, perl=TRUE)
  [1] "35.421"

On 2/8/08, Weidong Gu [EMAIL PROTECTED] wrote:
> Hi,
> I ran into a problem when I compiled a dataset with UTM coordinates. For calculating distances between sites, I need to reformat the coordinates from, for example, 32°35.421 N to 35.421, i.e. I need to delete all the digits before the ° symbol, and the space and N at the end of the string. What functions should I use?
> Thanks in advance.
> Weidong Gu
> Department of Medicine
> University of Alabama, Birmingham

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
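The same idea applied to a whole vector, with the result converted to numeric so it can be used in distance calculations. The coordinate values are made up, and the pattern is a slightly different one from the reply: it assumes the format "degrees, a non-numeric separator, minutes, optional space, hemisphere letter", which is my guess at the layout from the single example given.

```r
# hypothetical coordinates in the same layout as the example
coords <- c("32\u00b035.421 N", "33\u00b002.118 N")

# leading digits (degrees), a non-numeric separator, then the minutes
pattern <- "^[0-9]+[^0-9.]+([0-9.]+) *[NSEW]$"
minutes <- as.numeric(sub(pattern, "\\1", coords))
```

sub() returns character strings, so the as.numeric() step is needed before doing any arithmetic. Strings that do not fit the pattern are returned unchanged by sub() and would become NA (with a warning) on conversion, which is a useful flag for malformed entries.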
Re: [R] Can I index a dataframe with a reference from/to a second dataframe?
, 1L, 8L, 7L, 9L, 23L, 10L, 28L, 11L, 12L, 31L, 30L, 17L, 16L, 4L, 5L, 3L, 25L, 22L, 20L, 24L, 21L, 26L, 27L, 19L, 2L, 18L, 32L, 33L), .Label = c(Abies balsamea, Acer pensylvanicum, Acer rubrum, Acer saccharum, Acer spicatum, Amelanchier, Betula alleghaniensis, Betula papyrifera, Cornus alternifolia, Cornus canadensis, Diervilla lonicera, Dirca palustris, Fagus grandifolia, Fraxinus americana, Fraxinus nigra, Lonicera canadensis, Ostrya virginiana, Picea glauca, Picea mariana, Pinus resinosa, Pinus strobus, Populus tremuloides, Prunus serotina, Prunus virginiana, Quercus rubra, Ribes , Sorbus americana, Thuja occidentalis, Tilia americana, Tsuga canadensis, Ulmus americana, Viburnum acerifolium, Viburnum lantanoides), class = factor), Cname = structure(c(7L, 27L, 30L, 2L, 3L, 6L, 31L, 33L, 1L, 8L, 10L, 15L, 11L, 21L, 4L, 14L, 20L, 17L, 18L, 23L, 25L, 24L, 29L, 26L, 12L, 16L, 13L, 5L, 9L, 28L, 32L, 22L, 19L), .Label = c(Alternate-leaved Dogwood, American Basswood, American Beech, American Elm, American Mountain-ash, Balsam Fir, Black Ash, Black Cherry, Black Spruce, Bunchberry, Bush Honeysuckle, Choke Cherry, Currant, Eastern Hemlock, Eastern White Cedar, Eastern White Pine, Fly Honeysuckle, Hard Maple, Hobblebush, Ironwood, Leatherwood, Maple-leaved Viburnum, Mountain Maple, Northern Red Oak, Red Maple, Red Pine, Serviceberry, Striped Maple, Trembling Aspen, White Ash, White Birch, White Spruce, Yellow Birch), class = factor)), .Names = c(spp, spp.orig, OPL, form, Type, keep, Sname, Cname), row.names = c(1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34), class = data.frame) Thanks, DaveT. 
Silviculture Data Analyst
Ontario Forest Research Institute
Ontario Ministry of Natural Resources
[EMAIL PROTECTED]
http://ofri.mnr.gov.on.ca

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
Re: [R] Vector Size
How much memory do you have on your system, and what type of system is it? The archives have information on generating a sequence like this without holding it all in memory at once. BTW, your matrix will require 1GB to store a single copy, so you will probably need at least 2-3X that (2-3GB) to create it and do something with it.

On Feb 8, 2008 7:28 PM, Oscar A [EMAIL PROTECTED] wrote:

Hello everybody!! I'm from Colombia (South America) and I'm new to R. I've been trying to generate all of the possible 6-number combinations of the numbers from 1 to 53. I've used the following commands:

    datos <- c(1:53)
    M <- matrix(data=(combn(datos,6,FUN=NULL,simplify=TRUE)),nrow=22957480,ncol=6,byrow=TRUE)

Once the commands are executed, the program shows the following:

    Error: CANNOT ALLOCATE A VECTOR OF SIZE 525.5 Mb

How can I fix this problem?

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
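For the archive, here is a minimal sketch of the "one combination at a time" idea mentioned above. The helper name next_comb is hypothetical (it is not from any package); it steps a sorted combination to its lexicographic successor, so only one row is in memory at any moment. The demo uses 3 out of 6 rather than 6 out of 53 to keep the run short.

```r
# Advance a sorted k-combination of 1..n to its lexicographic successor.
# Returns NULL once the last combination has been passed.
next_comb <- function(cur, n) {
  k <- length(cur)
  i <- k
  # find the rightmost position that can still be incremented
  while (i >= 1 && cur[i] == n - k + i) i <- i - 1
  if (i < 1) return(NULL)
  cur[i] <- cur[i] + 1L
  # reset everything to the right of position i to the smallest valid values
  if (i < k) cur[(i + 1):k] <- cur[i] + seq_len(k - i)
  cur
}

# Walk through all 3-number combinations of 1..6 without storing them
count <- 0
cur <- 1:3
while (!is.null(cur)) {
  count <- count + 1        # process 'cur' here instead of just counting
  cur <- next_comb(cur, 6L)
}
count
```

The same loop with cur <- 1:6 and n = 53L would visit all 22957480 combinations; it will be slow in pure R, but it never allocates the full matrix.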
Re: [R] error in the function
A quick look at it shows that you would be trying to access y[n+1] on the last pass of the inner loop. That index is greater than the number of entries in 'y', so you get an NA, and NA is not legal in a comparison inside 'if'.

On Feb 9, 2008 6:07 PM, mohamed nur anisah [EMAIL PROTECTED] wrote:

Dear lists,

I want to find the non-overlapping interval values with this code:

    mysetdiff=function(x,y){
      m=length(x)
      n=length(y)
      bx = logical(m)
      by = logical(n)
      for(i in 1:m){
        for(j in 1:n){
          if(x[i]=y[j+1]){
            bx[i] = T
            by[j] = T
            NA= NA
          }
        }
      }
      sx = x[!bx]
      sy = y[!by]
      s=c(sx,sy)
      return(s)
    }

Below is my dataset. When I called my function with mysetdiff(f,e), an error occurred:

    Error in if (x[i] = y[j + 1]) { : missing value where TRUE/FALSE needed.

How can I fix my function so that I get the values of my non-overlapping intervals? Any suggestions?? Thanks a bunch!!

    e
     [1]  17130612  17712302  21225764  25012714  33852816  36012944  36252300
     [8]  36737468  43693832  44148616  45318876  45852632  53258208  58530988
    [15]  60437872  72516480  79673224  93128744  94269896  95868704  99651504
    [22] 113688560 131101008 132955984 135891280 141318144 148257888 156158176
    [29] 157797616 162055856 168221296 173125232 176267104 182826240 183742528
    [36] 184401728 190671888 196639616  17587118  18221688  21387314  30748348
    [43]  34480192  36209144  36280276  36971144  43878548  44496056  45740012
    [50]  46752088  53700056  58603536  60691012  72757696  80077728  93181480
    [57]  94474624  97418088 106596368 120128352 132462320 132980744 135998880
    [64] 142259520 151591840 156920960 157838176 162743136 168466848 173167936
    [71] 176338384 182930096 184149776 185735712 190910576

    f
     [1]  17712302  21203780  25012714  33852816  34794536  36012944  37891284
     [8]  43693832  44148616  45852632  53289188  61573112  63664928  72516480
    [15]  79673224  94474624  95868704  99651504 113688560 125159688 127388568
    [22] 131101008 154599216 176267104 181504912 182562720 182826240 183742528
    [29] 196841904  18100404  21387314  30748348  34384588  35996440  36252300
    [36]  37942556  43878548  44496056  46752088  53700056  62637560  63969952
    [43]  72757696  80077728  94617360  97144032 106596368 120128352 127220456
    [50] 127504536 132462320 154717312 176338384 181836032 182687824 182930096
    [57] 184149776

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
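A small sketch of the failure described in the reply, on a toy vector. The comparison operator in the posted function did not survive in the archive, so the '<=' below is only an assumption for illustration. The point is that indexing one past the end yields NA, and that stopping the inner index at n - 1 keeps y[j + 1] inside the vector.

```r
y <- c(10, 20, 30)
n <- length(y)

# Indexing one past the end returns NA rather than raising an error
past_end <- y[n + 1]

# if() cannot decide on NA, so the comparison below fails;
# tryCatch captures the error message instead of stopping the script
res <- tryCatch(if (5 <= y[n + 1]) "yes" else "no",
                error = function(e) conditionMessage(e))

# Stopping the inner index at n - 1 keeps y[j + 1] inside the vector
hits <- logical(n - 1)
for (j in seq_len(n - 1)) {
  hits[j] <- 15 <= y[j + 1]
}
hits
```

Whether the loop should stop at n - 1 or the comparison should use y[j] instead depends on what the intervals mean, which the post does not say.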
Re: [R] Length problem
You were asking for the length of coppie[1], the first element of coppie, which is of course 1. Did you mean to say length(coppie)? length(data[,4]) asks how many elements are in that column, which seems to be 5. Also, your statement

    coppie <- c(data[4:length(data)])

seems strange. What did you intend to do?

On 2/11/08, Paolo Grillo [EMAIL PROTECTED] wrote:

Hi all,

I have this problem. In my .dta database, called data, I have five rows:

    data <- read.dta("C:\\2_CO_mmobile_ALL_Rid.dta")

    # From this database I would like to create another
    coppie <- c(data[4:length(data)])

but I find this:

    # Length of original data
    length(data[,4])
    5    RIGHT!!

    # Length of new data
    length(coppie[1])
    1    WHY??

Thank you all for your help,
Paolo Grillo

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
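To make the point above concrete, here is a short sketch on a made-up five-row data frame (the column names are hypothetical, not from the .dta file). It shows that length() of a data frame counts columns, and that single brackets on a list return a one-element list while double brackets return the column itself.

```r
# Toy stand-in for the imported data: 5 rows, 5 columns
df <- data.frame(a = 1:5, b = letters[1:5], c = 6:10, d = 11:15, e = 16:20)

n_cols <- length(df)        # length of a data frame is its number of columns
n_rows <- length(df[, 4])   # df[, 4] is a plain vector with one entry per row

sub <- c(df[4:length(df)])  # a list holding columns 4 and 5

len_single <- length(sub[1])    # single brackets: a list with one element
len_double <- length(sub[[1]])  # double brackets: the column vector itself

c(n_cols, n_rows, len_single, len_double)
```

If the goal was simply "the data frame from column 4 onward", df[, 4:ncol(df)] keeps it as a data frame, and nrow() gives the row count directly.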
Re: [R] help with bwplot
Not the most straightforward way, but I think it gets the job done: x - read.table(textConnection(Ageclass Scale MeanSex 1 21-40BP 40.26667 female 2 41-60BP 34.10714 female 3 61-79BP 37.3 female 4 21-40GH 30.25000 female 5 41-60GH 39.00926 female 6 61-79GH 49.3 female 7 21-40MH 56.5 female 8 41-60MH 62.42857 female 9 61-79MH 72.72727 female 1021-40PF 25.86111 female 1141-60PF 42.42063 female 1261-79PF 52.17172 female 1321-40RE 38.09524 female 1441-60RE 42.85714 female 1561-79RE 42.42424 female 1621-40RP 20.0 female 1741-60RP 25.89286 female 1861-79RP 15.90909 female 1921-40SF 51.7 female 2041-60SF 63.9 female 2161-79SF 57.95455 female 2221-40VT 32.1 female 2341-60VT 36.96429 female 2461-79VT 33.18182 female 2521-40BP 35.0 male 2641-60BP 37.75000 male 2761-79BP 36.0 male 2821-40GH 42.16667 male 2941-60GH 41.89062 male 3061-79GH 41.4 male 3121-40MH 72.0 male 3241-60MH 66.60417 male 3361-79MH 75.2 male 3421-40PF 41.85185 male 3541-60PF 55.31250 male 3661-79PF 47.0 male 3721-40RE 37.03704 male 3841-60RE 54.16667 male 3961-79RE 46.7 male 4021-40RP 27.8 male 4141-60RP 28.12500 male 4261-79RP 20.0 male 4321-40SF 61.1 male 4441-60SF 66.40625 male 4561-79SF 60.0 male 4621-40VT 38.9 male 4741-60VT 30.93750 male 4861-79VT 42.0 male), header=TRUE) # setup the plot for the max range plot(0, type='n', ylim=range(x$Mean), xlim=range(as.integer(x$Scale)), xaxt='n', ylab=Mean, xlab=Scale) # plot the axis axis(1, at=seq_along(levels(x$Scale)), labels=levels(x$Scale)) # split the data x.s - split(x, list(x$Ageclass, x$Sex)) # plot the data invisible(lapply(seq_along(x.s), function(.grp){ lines(as.integer(x.s[[.grp]]$Scale), x.s[[.grp]]$Mean, col=.grp, type='o') })) legend('topleft', legend=names(x.s), lwd=3, col=seq_along(x.s)) On Feb 12, 2008 12:21 PM, Tom Cohen [EMAIL PROTECTED] wrote: Dear list, I have following data set, which I want to plot the Scale variable on the x-axis and Mean´on the y-axis for each Ageclass and for each sex. 
The Mean value of each Ageclass for each sex would be connected by a line. Totally, there should be 6 lines, from which three present the Mean values of each Ageclass for respective sex. Are there any easy ways to do this in R? Ageclass Scale MeanSex 1 21-40BP 40.26667 female 2 41-60BP 34.10714 female 3 61-79BP 37.3 female 4 21-40GH 30.25000 female 5 41-60GH 39.00926 female 6 61-79GH 49.3 female 7 21-40MH 56.5 female 8 41-60MH 62.42857 female 9 61-79MH 72.72727 female 1021-40PF 25.86111 female 1141-60PF 42.42063 female 1261-79PF 52.17172 female 1321-40RE 38.09524 female 1441-60RE 42.85714 female 1561-79RE 42.42424 female 1621-40RP 20.0 female 1741-60RP 25.89286 female 1861-79RP 15.90909 female 1921-40SF 51.7 female 2041-60SF 63.9 female 2161-79SF 57.95455 female 2221-40VT 32.1 female 2341-60VT 36.96429 female 2461-79VT 33.18182 female 2521-40BP 35.0 male 2641-60BP 37.75000 male 2761-79BP 36.0 male 2821-40GH 42.16667 male 2941-60GH 41.89062 male 3061-79GH 41.4 male 3121-40MH 72.0 male 3241-60MH 66.60417 male 3361-79MH 75.2 male 3421-40PF 41.85185 male 3541-60PF 55.31250 male 3661-79PF 47.0 male 3721-40RE 37.03704 male 3841-60RE 54.16667 male 3961-79RE 46.7 male 4021-40RP 27.8 male 4141-60RP 28.12500 male 4261-79RP 20.0 male 4321-40SF 61.1 male 4441-60SF 66.40625 male 4561-79SF 60.0 male 4621-40VT 38.9 male 4741-60VT 30.93750 male 4861-79VT 42.0 male Thanks for any help, Tom - Går det långsamt? Skaffa dig en snabbare bredbandsuppkoppling. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What
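Since the subject line mentions lattice, here is a hedged alternative sketch using xyplot from the lattice package. The data below are made up (only the column layout follows the post), and this draws one panel per sex with three lines each rather than six lines on a single set of axes.

```r
library(lattice)

# Hypothetical toy data in the same layout as the post (values are made up)
d <- expand.grid(Ageclass = c("21-40", "41-60", "61-79"),
                 Scale = c("BP", "GH", "MH"),
                 Sex = c("female", "male"))
d$Mean <- seq(20, by = 3, length.out = nrow(d))

# One panel per sex, one connected line per age class
p <- xyplot(Mean ~ Scale | Sex, data = d, groups = Ageclass,
            type = "o", auto.key = list(columns = 3))
print(p)
```

Dropping the "| Sex" part and using groups = interaction(Ageclass, Sex) should put all six lines in one panel, if that is closer to what was asked.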
Re: [R] how to specify modes of certain fields in read.table
If you want to use colClasses, then do:

    read.table(..., colClasses=rep('numeric', 50))

On Feb 12, 2008 5:40 PM, Weidong Gu [EMAIL PROTECTED] wrote:

I have a data file with 50 columns. Among them, there are two coordinates, X and Y:

    X            Y
    641673.78807 3607080.78438
    641436.56207 3607108.30543
    641165.28042 3607136.82957
    640879.58373 3607116.20568

When I use read.table, it rounds X and Y to at most 8 digits, as in:

    641673.8 3607081
    641436.6 3607108
    641165.3 3607137
    640879.6 3607116
    640683.5 3607105

My question is how to specify these two columns in read.table. Maybe colClasses helps, but I have 50 columns...

Thanks,
Weidong Gu
Department of Medicine
University of Alabama, Birmingham
1900 University Blvd., Birmingham, Alabama 35294
Email: [EMAIL PROTECTED]
PH: (205)-975-9053

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
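If only a couple of the 50 columns need a fixed class, a named colClasses vector avoids spelling out all of them; unnamed columns are then guessed as usual. The sketch below assumes a reasonably recent R (the text= argument and named colClasses are not available in very old versions) and uses a three-column toy file with a made-up 'site' column.

```r
txt <- "X Y site
641673.78807 3607080.78438 A
641436.56207 3607108.30543 B"

# Only X and Y are pinned to numeric; 'site' is left to read.table's guess
d <- read.table(text = txt, header = TRUE,
                colClasses = c(X = "numeric", Y = "numeric"))

sapply(d, class)
```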
Re: [R] how to specify modes of certain fields in read.table
It is just printing them out with that many significant digits; the numbers are stored with about 15 digits. If you want to see more, use 'options':

    > x <- scan(textConnection("641673.78807
    + 3607080.78438
    + 641436.56207
    + 3607108.30543
    + 641165.28042
    + 3607136.82957
    + 640879.58373
    + 3607116.20568
    + "), what=0)
    Read 8 items
    > x
    [1]  641673.8 3607080.8  641436.6 3607108.3  641165.3 3607136.8  640879.6 3607116.2
    > options(digits=20)
    > x
    [1]  641673.78807 3607080.78438  641436.56207 3607108.30543  641165.28042 3607136.82957  640879.58373
    [8] 3607116.20568

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
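A quick way to convince yourself that nothing was lost on input, without changing the global option: compare the stored value directly, or format it with an explicit number of decimals. This is just a sketch using two of the posted coordinates.

```r
x <- c(641673.78807, 3607080.78438)

# The default print shows 7 significant digits, but the stored value is intact
stored_ok <- x[1] == 641673.78807

# Formatting with five decimals shows the digits that print() hides
shown <- sprintf("%.5f", x)
shown

# print() also takes a per-call digits argument, leaving options() untouched
print(x, digits = 12)
```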
Re: [R] conflict within packages
You can use

    package::getNames()

to reference the one that you want.

On Feb 12, 2008 3:45 PM, Elizabeth Purdom [EMAIL PROTECTED] wrote:

Hi,

I am trying to use two contributed packages, both of which have a function 'getNames'. So if I load them both they obviously conflict. Currently I manually detach one package and then reload the other to be able to use one function right after another. Is there anything else I can do?

Best,
Elizabeth

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
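The same mechanism can be tried without installing anything. In this sketch a local function deliberately hides stats::median, standing in for the second package's getNames; the '::' form still reaches the version you name.

```r
# A local definition that masks stats::median, like a second package would
median <- function(x) "masked"

masked_result   <- median(c(1, 3, 10))         # finds the local version first
explicit_result <- stats::median(c(1, 3, 10))  # '::' picks the stats version

rm(median)  # remove the mask again
c(masked_result, explicit_result)
```

With two real packages loaded, the call would look the same: name the package in front of '::' for each call, and the order in which the packages were attached no longer matters.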
Re: [R] Reorder data frame columns by negating list of names
Try this:

    > x <- read.table(textConnection(" a b c d e f g h
    + 1 1 6 11 16 21 26 31 36
    + 2 2 7 12 17 22 27 32 37
    + 3 3 8 13 18 23 28 33 38
    + 4 4 9 14 19 24 29 34 39
    + 5 5 10 15 20 25 30 35 40"), header=TRUE)
    > # initial columns
    > init.cols <- c('b', 'd', 'h')
    > # now get the remaining
    > remaining <- setdiff(colnames(x), init.cols)
    > x[,c(init.cols, remaining)]
       b  d  h a  c  e  f  g
    1  6 16 36 1 11 21 26 31
    2  7 17 37 2 12 22 27 32
    3  8 18 38 3 13 23 28 33
    4  9 19 39 4 14 24 29 34
    5 10 20 40 5 15 25 30 35

On Feb 12, 2008 12:19 PM, Thompson, David (MNR) [EMAIL PROTECTED] wrote:

Hello,

I would like to reorder columns in a data frame by their names as demonstrated below. Take this data frame:

    > xxx <- data.frame(matrix(1:40, ncol=8))
    > names(xxx) <- letters[1:8]
    > xxx
      a  b  c  d  e  f  g  h
    1 1  6 11 16 21 26 31 36
    2 2  7 12 17 22 27 32 37
    3 3  8 13 18 23 28 33 38
    4 4  9 14 19 24 29 34 39
    5 5 10 15 20 25 30 35 40

and reorder the columns like this:

    > xxx[,c( c('b', 'd', 'h'), c('a', 'c', 'e', 'f', 'g') )]
       b  d  h a  c  e  f  g
    1  6 16 36 1 11 21 26 31
    2  7 17 37 2 12 22 27 32
    3  8 18 38 3 13 23 28 33
    4  9 19 39 4 14 24 29 34
    5 10 20 40 5 15 25 30 35

where I only have to name the columns that I'm interested in moving to the first few positions, something like:

    > xxx[,c( c('b', 'd', 'h'), -c('b', 'd', 'h') )]
    Error in -c("b", "d", "h") : invalid argument to unary operator

Suggestions? And thank you,
DaveT.

Silviculture Data Analyst
Ontario Forest Research Institute
Ontario Ministry of Natural Resources
[EMAIL PROTECTED]
http://ofri.mnr.gov.on.ca

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
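If this comes up often, the setdiff idea can be wrapped in a small helper. The name move_first is hypothetical; the sketch just checks that the requested names exist and then puts them in front, leaving the rest in their original order. (The unary minus in the question fails because negation only works on numeric indices, not on names.)

```r
# Move the named columns to the front, keeping the others in their old order
move_first <- function(df, first) {
  stopifnot(all(first %in% names(df)))
  df[, c(first, setdiff(names(df), first)), drop = FALSE]
}

xxx <- data.frame(matrix(1:40, ncol = 8))
names(xxx) <- letters[1:8]

reordered <- move_first(xxx, c("b", "d", "h"))
names(reordered)
```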
Re: [R] summary statistics
Here is one way of doing it: (no exactly sure if 'mode' makes sense with your data) x - read.table(textConnection(RM mgl + 1 215 0.9285714 + 2 215 0.7352941 + 3 215 1.6455696 + 4 215 0.600 + 5 sc 1.833 + 6 sc 0.833 + 7 sc 2.5438596 + 8 sc 0.250 + 9 202NA + 10 202 0.550 + 11 202 0.8148148 + 12 202 1.667 + 13 198 0.5038760 + 14 198 0.3823529 + 15 198 0.760 + 16 198 0.480 + 17 hc 3.1818182 + 18 hc 3.7254902 + 19 hc 4.375 + 20 hc 2.6415094 + 21 190 0.350 + 22 190 0.440 + 23 190 0.650 + 24 190 0.500 + 25 bc 9.000 + 26 bc 5.000 + 27 bc 4.000 + 28 bc 3.200 + 29 185 0.7386364 + 30 185 0.500 + 31 185 1.1538462 + 32 185 0.600 + 33 179 1.8181818 + 34 179 1.198 + 35 179 2.500 + 36 179 2.000 + 37 148 2.083 + 38 148 2.333 + 39 148 3.100 + 40 148 2.2142857 + 41 119 2.444 + 42 119 2.3275862 + 43 119 4.7142857 + 44 119 1.7692308 + 45 61 2.889 + 46 61 3.250 + 47 61 4.750 + 48 61 2.6337449), header=TRUE) # compute the stats x.stats - by(x, x$RM, function(.rm){ + c(mean=mean(.rm$mgl, na.rm=TRUE), median=median(.rm$mgl, na.rm=TRUE)) + }) do.call(rbind, x.stats) meanmedian 119 2.8138868 2.3860153 148 2.4327381 2.2738095 179 1.8790455 1.9090909 185 0.7481206 0.6693182 190 0.485 0.470 198 0.5315572 0.4919380 202 1.0104938 0.8148148 215 0.9773588 0.8319327 61 3.3806584 3.069 bc 5.300 4.500 hc 3.4809545 3.4536542 sc 1.3651316 1.333 On Feb 12, 2008 11:57 AM, stephen sefick [EMAIL PROTECTED] wrote: below is my data frame. I would like to compute summary statistics for mgl for each river mile (mean, median, mode). My apologies in advance- I would like to get something like the SAS print out of PROC Univariate. I have performed an ANOVA and a tukey LSD and I would just like the summary statistics. 
thanks stephen RM mgl 1 215 0.9285714 2 215 0.7352941 3 215 1.6455696 4 215 0.600 5 sc 1.833 6 sc 0.833 7 sc 2.5438596 8 sc 0.250 9 202NA 10 202 0.550 11 202 0.8148148 12 202 1.667 13 198 0.5038760 14 198 0.3823529 15 198 0.760 16 198 0.480 17 hc 3.1818182 18 hc 3.7254902 19 hc 4.375 20 hc 2.6415094 21 190 0.350 22 190 0.440 23 190 0.650 24 190 0.500 25 bc 9.000 26 bc 5.000 27 bc 4.000 28 bc 3.200 29 185 0.7386364 30 185 0.500 31 185 1.1538462 32 185 0.600 33 179 1.8181818 34 179 1.198 35 179 2.500 36 179 2.000 37 148 2.083 38 148 2.333 39 148 3.100 40 148 2.2142857 41 119 2.444 42 119 2.3275862 43 119 4.7142857 44 119 1.7692308 45 61 2.889 46 61 3.250 47 61 4.750 48 61 2.6337449 -- Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
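Base R has no function for the statistical mode (the built-in mode() reports the storage type), so here is a hedged sketch with a hypothetical helper, stat_mode, that returns the most frequent value and takes the smallest one on ties. The data frame is a made-up miniature with the same two columns as the post.

```r
# Most frequent non-missing value; on ties, the smallest value wins
stat_mode <- function(v) {
  v <- v[!is.na(v)]
  tab <- table(v)
  as.numeric(names(tab)[which.max(tab)])
}

# Toy data (values are made up, layout follows the post)
d <- data.frame(RM  = c("a", "a", "a", "b", "b", "b"),
                mgl = c(1, 1, 4, 2, NA, 6))

# One row of statistics per river mile
res <- do.call(rbind, lapply(split(d$mgl, d$RM), function(v) {
  c(mean   = mean(v, na.rm = TRUE),
    median = median(v, na.rm = TRUE),
    mode   = stat_mode(v))
}))
res
```

With continuous measurements like mgl, every value is usually unique, so the mode is rarely informative; that is probably what the caveat in the reply was hinting at.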
Re: [R] indices of rows containing one or more elements 0
Is this what you are after?

    > test <- matrix(c(0,2,0,1,3,5), 3,2)
    > (x <- which(test > 0, arr.ind=TRUE))
         row col
    [1,]   2   1
    [2,]   1   2
    [3,]   2   2
    [4,]   3   2
    > unique(x[, 'row'])
    [1] 2 1 3

On Feb 12, 2008 9:40 PM, Ng Stanley [EMAIL PROTECTED] wrote:

Hi,

Given

    test <- matrix(c(0,2,0,1,3,5), 3,2)

    test[test>0]
    [1] 2 1 3 5

These are the values > 0.

    which(test>0)
    [1] 2 4 5 6

These are the array indices of those values > 0.

    which(apply(test>0, 1, all))
    [1] 2

This gives the row whose elements are all > 0.

I can't seem to get the indices of rows containing one or more elements > 0.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
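A slightly more direct variant of the same idea, offered as a sketch: replace 'all' with 'any' in the apply call from the question, or count the positive entries per row with rowSums. Both return the row indices in ascending order.

```r
test <- matrix(c(0, 2, 0, 1, 3, 5), 3, 2)

# Rows with at least one element > 0
rows_any <- which(apply(test > 0, 1, any))

# Same result: count the positive entries in each row
rows_sum <- which(rowSums(test > 0) > 0)

rows_any
```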
Re: [R] regular expression for na.strings / read.table
Here is one way of doing it: # read the file in as lines, do the convert and then re-read x - readLines(textConnection( X1 X.789 LNM. X78 X56 X89 X56.1 X100 + 1 2 700 AUW 78 56 8956 100 + 2 3 400 TOC 78 56 8956 10 + 3 4 389 RMN 78 56 8956 *89 + 4 5 400 LNM 78 56 *45256 100 + 5 6 200 UTC 78 *40 8956 100 + 6 7 100 GAT 78 56856 *100 + 7 879 *LNM 78 56956 100 + 8 989 TCG 78 56 80056 *100 + 9 10 78* LNM 78 56 8956 100)) x.c - gsub(\\*[[:alnum:]]*|[[:alnum:]]*\\*, NA, x) x.new - read.table(textConnection(x.c), header=TRUE) closeAllConnections() x.new X1 X.789 LNM. X78 X56 X89 X56.1 X100 1 2 700 AUW 78 56 8956 100 2 3 400 TOC 78 56 8956 10 3 4 389 RMN 78 56 8956 NA 4 5 400 LNM 78 56 NA56 100 5 6 200 UTC 78 NA 8956 100 6 7 100 GAT 78 56 856 NA 7 879 NA 78 56 956 100 8 989 TCG 78 56 80056 NA 9 10NA LNM 78 56 8956 100 On Feb 12, 2008 9:30 AM, [EMAIL PROTECTED] wrote: Dear all, I am working with a csv file. Some data of the file are not valid and they are marked with a star '*'. For example : *789. I have attached with this email a example file (test.txt) that looks like the data I have to work with. I see 2 possibilities ..thast I cannot manage anyway in R: 1-first easiest solution: Read the data with read.csv in R, and define as na strings all cells containing a star (*). Something which would looks like this ... DATA-read.csv(test.txt,na.strings=list(length(grep(\\*,DATA,value=T))==0)) DATA X1 X.789 LNM. X78 X56 X89 X56.1 X100 1 2 700 AUW 78 56 8956 100 2 3 400 TOC 78 56 8956 10 3 4 389 RMN 78 56 8956 *89 4 5 400 LNM 78 56 *45256 100 5 6 200 UTC 78 *40 8956 100 6 7 100 GAT 78 56856 *100 7 879 *LNM 78 56956 100 8 989 TCG 78 56 80056 *100 9 10 78* LNM 78 56 8956 100 ...but which would work (Stars are still there)! Do anyone knows how to do that ? 
2-Second solution: - first read the file with DATA-read.csv(test.txt) - then replace all fields containing a * with NA in applying the following function to the object DATA: DATA_cleaned-apply(DATA,c(1,2),function(x){if(length(grep(\\*,x,value=TRUE))==1){x-NA}}) DATA_cleaned X1 X.789 LNM. X78 X56 X89 X56.1 X100 [1,] NULL NULL NULL NULL NULL NULL NULL NULL [2,] NULL NULL NULL NULL NULL NULL NULL NULL [3,] NULL NULL NULL NULL NULL NULL NULL NA [4,] NULL NULL NULL NULL NULL NA NULL NULL [5,] NULL NULL NULL NULL NA NULL NULL NULL [6,] NULL NULL NULL NULL NULL NULL NULL NA [7,] NULL NULL NA NULL NULL NULL NULL NULL [8,] NULL NULL NULL NULL NULL NULL NULL NA [9,] NULL NANULL NULL NULL NULL NULL NULL stars have deaseper, but all the rest too ! The pb comes from the fact that if a field does not contain any *, the command if(length(grep(\\*,x,value=T))==1) return NULL instead of FALSE ! I you have any idea, please let me know ! Many thanks, Jessica Jessica Gervais Mail: [EMAIL PROTECTED] Resource Centre for Environmental Technologies, Public Research Centre Henri Tudor, Technoport Schlassgoart, 66 rue de Luxembourg, P.O. BOX 144, L-4002 Esch-sur-Alzette, Luxembourg (See attached file: test.txt) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
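On the second approach in the question: the NULLs appear because an 'if' without an 'else' returns NULL whenever its condition is FALSE, so every cell without a star becomes NULL. A hedged alternative sketch is to read everything as character, blank out the starred cells column by column, and then let type.convert restore the numeric columns. The three-column file below is made up for illustration.

```r
txt <- "id,val,code
1,700,AUW
2,*89,TOC
3,400,*LNM"

# Read everything as text so the starred cells survive the import
raw <- read.csv(text = txt, colClasses = "character")

# Set any cell containing a literal '*' to NA, one column at a time
raw[] <- lapply(raw, function(col) {
  col[grepl("*", col, fixed = TRUE)] <- NA
  col
})

# Convert each column back to its natural type (integer, character, ...)
clean <- raw
clean[] <- lapply(raw, type.convert, as.is = TRUE)
clean
```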
Re: [R] Matching Problem
Here is one way of doing it:

    > MyData <- c("Test1","Test2","I(Test1^2)","I(Test2^3)","I(Test1.Test2^2)")
    > x <- gsub("^(.*\\(|)([^^)]*|.*).*", "\\2", MyData)
    > x
    [1] "Test1"       "Test2"       "Test1"       "Test2"       "Test1.Test2"
    > unique(x)
    [1] "Test1"       "Test2"       "Test1.Test2"

On Feb 12, 2008 5:44 AM, Tom.O [EMAIL PROTECTED] wrote:

Hi,

I have this vector of strings:

    MyData <- c("Test1","Test2","I(Test1^2)","I(Test2^3)","I(Test1.Test2^2)")

where I want to extract only the text after I( and before ^, so that the string returned only contains c("Test1","Test2","Test1.Test2").

I am not very skilled in the use of matching patterns, so bear with me, but I believe I should use

    gsub('^.\\(', '', MyData)

for removing the I( and

    gsub("\\^.+", '', MyData)

for the end. But there's got to be a more elegant way that does the trick in one go. So I would appreciate it if anyone could give me some advice.

Thanks,
Tom

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
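An alternative sketch with a narrower pattern, in case it is easier to read: match the literal "I(", capture everything up to the first "^", and replace the whole string with the captured part. Strings that do not start with "I(" are left unchanged because sub() returns its input when nothing matches.

```r
MyData <- c("Test1", "Test2", "I(Test1^2)", "I(Test2^3)", "I(Test1.Test2^2)")

# "^I\\("    literal I( at the start
# "([^^]+)"  capture one or more characters that are not '^'
# "\\^.*$"   the '^' and everything after it
core <- sub("^I\\(([^^]+)\\^.*$", "\\1", MyData)

unique(core)
```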
Re: [R] shaded area graph and extra plot
Use 'xlim=c(1993,2008)' in your second plot to set up the same range.

On Feb 12, 2008 10:39 AM, Luis Ridao Cruz [EMAIL PROTECTED] wrote:

R-help,

I'm using the code below to plot a shaded area graph. At the same time I want to plot a second series on the y-axis (from par(new=T) on), but as the two series have different x-axis ranges (first 1994:2007 and second 1996:2007) the corresponding x's do not match. How can this be sorted out?

Thanks in advance

    #
    plot.new()
    plot.window(xlim=c(1993,2008), xaxs="i", ylim=c(0,400), yaxs="i")
    x=1994:2007
    xx = c(1994, x, 2007)
    yy1 = c(0, indexSp[,"Xhat5Sp"]+indexSp[,"seA"], 0 )
    yy2 = c(0, indexSp[,"Xhat5Sp"]-indexSp[,"seA"], 0 )
    polygon(xx, yy1, col="grey", lty=0)
    polygon(xx, yy2, col="white", lty=0)
    lines(x, indexSp[,"Xhat5Sp"], type="l")
    axis(1)
    axis(2)
    par(new=T)
    plot(1996:2007, c(0,0,indexSu[,"Xhat5Su"]), type="p", col=2, lwd=2, cex=1,ann=T,axes=F)
    axis(4)
    #

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
Re: [R] rolling sum (like in Rmetrics package)
Have you tried 'filter'?

    > x <- 1:20
    > filter(x,filter=rep(1,5))
    Time Series:
    Start = 1
    End = 20
    Frequency = 1
     [1] NA NA 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 NA NA

On 2/13/08, joshv [EMAIL PROTECTED] wrote:

Hello,

I'm new to R and would like to know how to create a vector of rolling sums. (I have seen the Rmetrics package and the rollMean function, and I would like to do the same thing except Sum instead of Mean.) I imagine someone has done this, I just can't find it anywhere.

Example:

    x <- somevector   # where x is 'n' entries long
    # what I would like to do is:
    x1 <- x[1:20]
    output1 <- sum(x1)
    x2 <- x[2:21]
    output2 <- sum(x2)
    x3 <- ...
    output <- c(output1, output2, ...)

Thanks,
JV

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
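The example above uses a centred window of 5, which pads both ends with NA. For the layout in the question (window of 20, first result is sum(x[1:20]), no padding), a cumulative-sum difference is one option. The helper name roll_sum is hypothetical, and the sketch assumes length(x) >= k and no missing values.

```r
# Sum of each run of k consecutive values: sum(x[1:k]), sum(x[2:(k+1)]), ...
roll_sum <- function(x, k) {
  cs <- c(0, cumsum(x))
  cs[(k + 1):length(cs)] - cs[1:(length(cs) - k)]
}

x <- 1:25
rs <- roll_sum(x, 20)
rs

# The same numbers from filter(), using a one-sided (trailing) window
via_filter <- as.numeric(stats::filter(x, rep(1, 20), sides = 1))[20:25]
```

Writing stats::filter explicitly also avoids surprises if another attached package defines its own filter.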
Re: [R] write output in a custom format
Here is a start. You basically have to iterate through your data and use 'cat' to write it out:

    particle <- list(dose=c(1,100.0,0),pos=data.frame(x=c(0,1,0,1),y=c(0,1,0,1)))
    output <- file("/tempxx.txt", "w")
    cat(particle$dose, "\n", file=output, sep=" ")
    for (i in 1:nrow(particle$pos)){
        cat(particle$pos$x[i], particle$pos$y[i], "\n", file=output, sep=" ")
    }
    cat("#\n", file=output, sep=" ")
    close(output)

Here is what the file looks like:

    1 100 0
    0 0
    1 1
    0 0
    1 1
    #

On 2/14/08, baptiste Auguié [EMAIL PROTECTED] wrote:

Hi,

I need to create a text file in the following format,

    1 100.0 0
    0 0
    1 1
    0 0
    1 1
    #
    1 100.0 0
    0 0
    0 1
    1 0
    1 1
    ...

where # is part of the format and not an R comment. Each block (delimited by #) consists of a first line with three values, call it dose, and a list of (x,y) coordinates which are a matrix or data.frame:

    particle <- list(dose=c(1,100.0,0),pos=data.frame(x=c(0,1,0,1),y=c(0,1,0,1)))
    print(particle)

I'd like to establish a connection to a file and append to it a particle block in the format above, or even write the whole file at once.

Because different lines have a different number of elements, I couldn't get write.table to work in this case, and my attempts at sink(), dump(), writeLines(), writeChar() all turn into really dirty solutions. I have this feeling I'm overlooking a simple solution.

Any help welcome,

baptiste

_
Baptiste Auguié
Physics Department
University of Exeter
Stocker Road, Exeter, Devon, EX4 4QL, UK
Phone: +44 1392 264187
http://newton.ex.ac.uk/research/emag
http://projects.ex.ac.uk/atto

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
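For many particles, one possible variation is to build each block as a character vector and hand everything to a single writeLines call, so there is no per-row cat. This is only a sketch: particle_lines is a hypothetical helper, the output goes to a temporary file, and numbers are written the way paste() formats them (100.0 comes out as 100).

```r
# Turn one particle into its text block: dose line, "x y" lines, then "#"
particle_lines <- function(p) {
  c(paste(p$dose, collapse = " "),   # dose line
    paste(p$pos$x, p$pos$y),         # one "x y" line per coordinate pair
    "#")                             # block terminator
}

particle <- list(dose = c(1, 100.0, 0),
                 pos = data.frame(x = c(0, 1, 0, 1), y = c(0, 1, 0, 1)))
particles <- list(particle, particle)   # stand-in for a long list of particles

out_file <- tempfile(fileext = ".txt")
writeLines(unlist(lapply(particles, particle_lines)), out_file)

written <- readLines(out_file)
written
```

If the dose must appear as "100.0", formatC or sprintf on p$dose would give control over the decimals.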
Re: [R] Replacing columns in a data frame using a previous condition
Is this what you want to do?

    > x <- data.frame(a=1:10, b=1:10, c=1:10, d=1:10)
    > z <- cbind(c=11:20, d=11:20)
    > z
           c  d
     [1,] 11 11
     [2,] 12 12
     [3,] 13 13
     [4,] 14 14
     [5,] 15 15
     [6,] 16 16
     [7,] 17 17
     [8,] 18 18
     [9,] 19 19
    [10,] 20 20
    > x[,colnames(z)] <- z[, colnames(z)]
    > x
        a  b  c  d
    1   1  1 11 11
    2   2  2 12 12
    3   3  3 13 13
    4   4  4 14 14
    5   5  5 15 15
    6   6  6 16 16
    7   7  7 17 17
    8   8  8 18 18
    9   9  9 19 19
    10 10 10 20 20

On 2/14/08, Jorge Iván Vélez [EMAIL PROTECTED] wrote:

Dear R-list,

I'm working with a data frame whose dimensions are

    > dim(GERU)
    [1] 3468 318

and which looks like

    > GERU[1:10,1:10]
           ped ind par1 par2 sex sta rs7696470 rs7696470.1 rs1032896 rs1032896.1
    1  USA5854   2    0    0   2   1         4           4         1           1
    2  USA5854   3    1    2   1   1         4           4         1           1
    3  USA5854   4    1    2   2   2         1           4         1           3
    4  USA5854   5    1    2   1   2         4           2         2           1
    5  USA5855   1    0    0   1   1         0           0         0           0
    6  USA5855   2    0    0   2   2         1           0         0           0
    7  USA5855   3    1    2   1   2         0           2         0           0
    8  USA5855   4    1    2   1   1         2           0         2           1
    9  USA5855   5    1    2   1   2         0           1         0           0
    10 USA5856   1    0    0   1   1         3           3         3           3

What I would like to do is:

1. Identify which columns (from 6 to 318) have more than 4 categories (I solved that). In GERU these would be rs7696470 and rs7696470.1.
2. Using the columns from step 1, replace the entries equal to 2 with 3. For example, rs7696470 would be 4,4,1,4,0,1,0,3,0,3 and so on.
3. Once the entries are replaced, I need to write the columns back into GERU.

Here is what I've done:

    # Function to identify columns with 3 or more categories
    tx=function(x) ifelse(dim(table(x))>4,1,0)

    # Identifying the columns
    M4=apply(GUPN[,-c(1:6)],2,tx)
    names(which(MR==1))   # Step 1
     [1] rs335322     rs335322.1   rs186750     rs186750.1   rs1565901    rs1565901.1  rs1565902
     [8] rs1565902.1  rs11131334   rs11131334.1 rs1948616    rs1948616.1  rs4484334    rs4484334.1
    [15] rs1497921    rs1497921.1  rs1391320    rs1391320.1  rs1497913    rs1497913.1  rs996208
    [22] rs996208.1

    # Step 2
    REPLACE=GUPN[,names(which(AR==1))]
    RES=apply(REPLACE,2,function(x) ifelse(x==2,3,x))
    RES[1:10,1:5]
       rs335322 rs335322.1 rs186750 rs186750.1 rs1565901
    1         1          3        3          3         3
    2         1          1        3          3         3
    3         3          3        1          3         3
    4         1          3        3          3         3
    5         0          0        0          0         0
    6         0          0        0          0         0
    7         0          0        0          0         0
    8         0          0        0          0         0
    9         0          0        0          0         0
    10        1          3        3          3         1

Now, the problem I have is replacing the columns in GERU by the columns in RES (step 3). At the end the dimension of the new data set should be 3468x318.

Any help would be greatly appreciated.

Thank you so much,
Jorge
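The three steps in the question can also be done without building a separate RES matrix, by assigning straight back into the selected columns of the data frame. The following is only a sketch on made-up data: the data frame geru, its column names, and the marker column range are assumptions for illustration, not the poster's real GERU.

```r
# Toy stand-in for GERU: two ID columns plus three marker columns.
geru <- data.frame(ped = c("A", "A", "B", "B", "B"),
                   ind = 1:5,
                   m1 = c(4, 2, 1, 0, 3),   # five distinct values
                   m2 = c(1, 2, 1, 2, 1),   # two distinct values
                   m3 = c(0, 1, 2, 3, 4))   # five distinct values

marker_cols <- 3:ncol(geru)

# Step 1: names of marker columns with more than 4 distinct values.
n_levels <- sapply(geru[marker_cols], function(col) length(unique(col)))
many <- names(n_levels)[n_levels > 4]

# Steps 2 and 3: recode 2 -> 3 and write the result back into the same columns.
geru[many] <- lapply(geru[many], function(col) ifelse(col == 2, 3, col))
```

Because the assignment targets columns by name, the data frame keeps its original dimensions and the untouched columns are left exactly as they were.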
Re: [R] write output in a custom format
There is nothing wrong with a loop for handling this case. Most of your time is probably going to be spent writing out the files. If you don't want 'for' loops, you can use 'lapply', but I am not sure what type of performance improvement you will see. You are having to make decisions on each particle on how to write it.

You can also use awk/perl as you indicated, but you would have to write the data out for those programs. You might take a test run and see. I would guess that by the time you format it for awk and then run awk, you could have done the whole thing in R. But it is your choice and there are plenty of tools to choose from.

On 2/14/08, baptiste Auguié [EMAIL PROTECTED] wrote:

Thanks for the input! It does work fine, however I'll have to do another loop to repeat this whole process quite a few times (10^3, 10^4 particles maybe), so I was hoping for a solution without a loop. Maybe I could reshape all the values into a big array, dump it to a file and replace some values using system(awk...). I just don't really know how to format the data, having a different number of values for some lines. Would that be a sensible thing to do?

thanks,

baptiste
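For the many-particle case discussed in this thread, one way to avoid an explicit inner loop is to build all the lines of a block as a character vector and hand everything to writeLines in a single call. This is only a sketch: the helper name format_particle, the list particles, and the temporary output file are assumptions for illustration and do not come from the thread.

```r
# Hypothetical helper: turn one particle into the lines of its block.
format_particle <- function(p) {
  c(paste(p$dose, collapse = " "),   # first line: the three dose values
    paste(p$pos$x, p$pos$y),         # one "x y" line per coordinate pair
    "#")                             # block delimiter required by the format
}

particle <- list(dose = c(1, 100.0, 0),
                 pos = data.frame(x = c(0, 1, 0, 1), y = c(0, 1, 0, 1)))
particles <- list(particle, particle)   # stand-in for 10^3 to 10^4 particles

# Build every line first, then write once.
block_lines <- unlist(lapply(particles, format_particle))
out_file <- tempfile(fileext = ".txt")  # assumed output location
writeLines(block_lines, out_file)
```

Whether this is noticeably faster than the 'cat' loop would have to be measured; the main gain is that the formatting logic for one particle lives in one small function.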
Re: [R] Retrieving data frames from a for loop
Use a 'list' to capture the data within the loop:

    > result <- vector('list', 20)  # preallocate
    > tab <- data.frame(x=1:20)
    > for (i in 1:20) {
    +
    +     g <- sample(rep(LETTERS[1:2], each=10))
    +     result[[i]] <- data.frame(tab, g)
    +
    + }
    > # you can now access the combinations like this:
    > result[[1]]
        x g
    1   1 B
    2   2 A
    3   3 B
    4   4 B
    5   5 B
    6   6 B
    7   7 A
    8   8 B
    9   9 A
    10 10 B
    11 11 A
    12 12 B
    13 13 B
    14 14 A
    15 15 A
    16 16 A
    17 17 A
    18 18 B
    19 19 A
    20 20 A
    > result[[5]]
        x g
    1   1 B
    2   2 A
    3   3 B
    4   4 B
    5   5 A
    6   6 A
    7   7 B
    8   8 A
    9   9 B
    10 10 A
    11 11 B
    12 12 A
    13 13 B
    14 14 B
    15 15 B
    16 16 A
    17 17 A
    18 18 A
    19 19 A
    20 20 B

On Thu, Feb 14, 2008 at 6:42 PM, Judith Flores [EMAIL PROTECTED] wrote:

Dear R-helpers,

I need to retrieve the data frames generated in a for loop. What I have looks something like this, where tab is a pre-existing data frame:

    for (i in 1:20) {
        g <- sample(rep(LETTERS[1:2], each=10))
        combination <- data.frame(tab, g)
    }

I tried to name every single combination doing this:

    assign(paste('combination', i), combination)

without success. I need to retrieve every combination separately.

Thank you once again for your help.
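The same list can be built with lapply, which takes care of the preallocation, and the elements can be given names so that each combination is retrievable by label. A minimal sketch follows; the seed and the "combination" naming scheme are assumptions chosen for illustration.

```r
tab <- data.frame(x = 1:20)
set.seed(1)  # only so that the shuffles are repeatable

# lapply() returns a list with one data frame per iteration.
result <- lapply(1:20, function(i) {
  g <- sample(rep(LETTERS[1:2], each = 10))  # random A/B assignment
  data.frame(tab, g)
})

# Optional: label the elements so they can be fetched by name.
names(result) <- paste0("combination", seq_along(result))
```

After this, result$combination5 and result[[5]] refer to the same data frame.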
Re: [R] how to specify the location of tick mark on x axies
I think what you want for your last statement is:

    lines(pts, y2)

This uses the positions of the tick marks (the bar midpoints returned by barplot) to plot your line.

On Feb 16, 2008 6:53 AM, Xin [EMAIL PROTECTED] wrote:

hi, I did a barplot. My data are:

    y1 <- c(13, 20, 22, 19, 10, 16, 8, 4, 3, 5, 7, 4, 0, 4, 4, 2, 4, 2, 2, 5, 1)
    y2 <- c(13, 23.29568698, 18.1385593, 14.97159795, 12.57640037, 10.65752306,
            9.079421331, 7.7625489, 6.653641903, 5.714125735, 4.914645265,
            4.232117758, 3.647980094, 3.147064034, 2.716830439, 2.346823055,
            2.02826436, 1.753747752, 1.516997668, 1.31267921, 1.136244845)
    x <- c(0, 1, 6, 11, 16, 21, 26, 31, 36, 41, 46, 51, 56, 61, 66, 71, 76, 81, 86, 91, 96)
    pts = barplot(y1, ylim=c(0,40), axes=TRUE, names.arg=x, border=TRUE, col="white")
    # x axis with tickmarks exactly at the middle of the bars
    axis(side=1, at=pts, labels=F, tick=T)

Then I want to add a line to the barplot. I used

    lines(x, y2)

But the data points of the line are plotted at the beginning of each category on the x axis. I want them plotted at the middle of each category. Can you help?

Xin

----- Original Message -----
From: jim holtman [EMAIL PROTECTED]
To: Xin [EMAIL PROTECTED]
Sent: Saturday, February 16, 2008 11:43 AM
Subject: Re: [R] how to specify the location of tick mark on x axies

Can you provide an example of what you are doing and what you want?

On Feb 16, 2008 6:14 AM, Xin [EMAIL PROTECTED] wrote:

Dear: I want to plot a barplot and let each bar be in the middle of its x axis category. Do you have this experience? Many Thanks!

Xin
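The point of the reply above is that barplot returns the x coordinates of the bar midpoints, and those are the coordinates any overlay has to use. A short sketch with a shortened version of the posted data; drawing to a null PDF device is an assumption made here so that the example runs without a display.

```r
y1 <- c(13, 20, 22, 19, 10)            # bar heights (first five posted values)
y2 <- c(13, 23.3, 18.1, 15.0, 12.6)    # fitted values, rounded
x  <- c(0, 1, 6, 11, 16)               # category labels

pdf(NULL)  # null device: nothing is written to disk
pts <- barplot(y1, ylim = c(0, 40), names.arg = x, col = "white")
lines(pts, y2)             # bar midpoints, not x, are the x coordinates
points(pts, y2, pch = 16)
dev.off()

pts  # midpoints in the plot's own coordinate system
```

The labels in x are only text under the bars; the plot's horizontal scale is defined by the bar widths and spacing, which is why lines(x, y2) ends up in the wrong place.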
Re: [R] predicting memory usage
If this is numeric, then for just storing one copy, you will require 86000 * 2500 * 8 = 1.7GB of memory. You should have 3-4X that if you want to analyze it, so you might need about 6GB of physical memory and a 64-bit version of R.

Is there some other alternative? Do you need all the values at once, or can you use a database to access the portions you want?

On 2/18/08, Federico Calboli [EMAIL PROTECTED] wrote:

Hi All,

is there a way of predicting memory usage? I need to build an array of 86000 by 2500 numbers (or I might create a list of 2 by 2500 arrays 43000 long). How much memory should I expect to use/need?

Cheers,

Fede

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG
Tel +44 (0)20 75941602 Fax +44 (0)20 75943193
f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com
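The arithmetic in the reply can be checked directly, and object.size on a small matrix confirms the 8 bytes per numeric value. This is a rough sketch; the small test matrix is an assumption chosen so that the check itself needs almost no memory.

```r
n_rows <- 86000
n_cols <- 2500
bytes  <- n_rows * n_cols * 8        # 8 bytes per double
gb     <- bytes / 1e9                # decimal gigabytes, about 1.72
gib    <- bytes / 2^30               # binary gibibytes, about 1.60

# Sanity check on a matrix 10000 times smaller: object.size() reports the
# data plus a small fixed header.
small <- matrix(0, nrow = 860, ncol = 25)
small_bytes <- as.numeric(object.size(small))   # close to 860 * 25 * 8
```

An integer matrix would need half of this (4 bytes per value), which may matter if the data are counts or codes.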
Re: [R] Huge number
If you want to compute (157+221)! then sum up the logs:

    > a <- 1:(157+221)
    > sum(log10(a))
    [1] 811.8165

This is about 6.55e811, which exceeds the range of floating point numbers (1.797693e+308). You might check out the Brobdingnag package.

On Feb 18, 2008 6:23 PM, Hyojin Lee [EMAIL PROTECTED] wrote:

Hi,

I'm trying to calculate p-values to find out which genes are differentially expressed between situation A and situation B. I got this data (this is a part of the data) from a whole organism, and each number is an expression value (that is, gene 'a' is 13 in situation A, and it becomes 30 in situation B).

To find the probability, I'm going to use the Audic - Claverie method (The significance of digital gene expression profiles, 1997). But to use the equation p(x|y), I have to calculate (x+y)! first, and I can't calculate (157+221)! or (666+1387)! in R. That's probably a problem with handling huge numbers. How could I calculate the p-value for this data with R?

                A       B
    Total 5874641 6295980
    a          13      30
    b          36      39
    c           0       5
    d          40      61
    e          16      20
    f          13      11
    g           3       3
    h           9       5
    i          12      35
    j         157     221
    k          17      39
    l           6      17
    m         666    1387
    n           2       5

The significance of digital gene expression profiles. Audic S, Claverie JM.
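Base R also has lfactorial and lchoose, which return the natural logarithm of a factorial or binomial coefficient without ever forming the huge number, so ratios of factorials can be handled on the log scale. A small sketch; the variable names are chosen here for illustration only.

```r
# log10 of 378! without overflow.
log10_fact <- lfactorial(157 + 221) / log(10)
exponent <- floor(log10_fact)              # 811
mantissa <- 10^(log10_fact - exponent)     # about 6.55, i.e. 378! is roughly 6.55e811

# A ratio such as (x+y)! / (x! y!) stays manageable on the log scale.
x <- 666
y <- 1387
log_ratio <- lfactorial(x + y) - lfactorial(x) - lfactorial(y)
same_thing <- lchoose(x + y, x)            # identical quantity via lchoose()
```

A probability built from such terms can be assembled as a sum of logs and only exponentiated at the very end, once the result is back in floating point range.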
Re: [R] Interpolation between 2 vectors
Check out the 'approx' function.

On Feb 19, 2008 12:44 PM, Dani Valverde [EMAIL PROTECTED] wrote:

Hello,

I have two vectors, one with 13112 points and the other one with 10909. I wonder if there is a way to interpolate the data so the shorter vector has the same number of points as the longer one.

Best,

Dani

--
Daniel Valverde Saubí
Grup de Biologia Molecular de Llevats
Facultat de Veterinària de la Universitat Autònoma de Barcelona
Edifici V, Campus UAB, 08193 Cerdanyola del Vallès, SPAIN
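A minimal sketch of how approx could be used for this, assuming the shorter vector is sampled at evenly spaced positions; the sine data below are made up purely as a stand-in for the real measurements.

```r
short <- sin(seq(0, 2 * pi, length.out = 10909))  # toy stand-in data
n_long <- 13112

# Treat 1..length(short) as the x positions and ask approx() for n_long
# evenly spaced points over the same range (linear interpolation).
resampled <- approx(x = seq_along(short), y = short, n = n_long)$y
```

If the two vectors have their own x coordinates, passing the longer vector's coordinates through the xout argument instead of n would evaluate the interpolation at exactly those positions.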
Re: [R] Problem Using the %in% command
With the format you have, we have to split out the genes separated by commas and then do 'table'. Here is one way of doing it:

    > x <- readLines(textConnection("Function x
    + Function1 gene5, gene19, gene22, gene23
    + Function2 gene1, gene7, gene19
    + Function3 gene2, gene3, gene7, gene23"))
    > closeAllConnections()
    > # funny data; split it up. get rid of header
    > x <- x[-1]
    > # split on blanks
    > x.b <- strsplit(x, "[[:blank:]]+")
    > # recombine into a 'long' format
    > x.c <- lapply(x.b, function(z) cbind(z[1], unlist(strsplit(z[-1], ","))))
    > x.c <- do.call(rbind, x.c)
    > table(list(x.c[,1], x.c[,2]))
               .2
    .1          gene1 gene19 gene2 gene22 gene23 gene3 gene5 gene7
      Function1     0      1     0      1      1     0     1     0
      Function2     1      1     0      0      0     0     0     1
      Function3     0      0     1      0      1     1     0     1

On 2/20/08, Paul Christoph Schröder [EMAIL PROTECTED] wrote:

I'm sorry if I didn't write it the right way. I'm just starting in the world of R and it's not that easy at the beginning. I wrote it again with code and comments. I hope it is understandable now. Do you think I should post it again in this shape?

    func_gen <- read.delim(file, header=T)
    # contains functions (rows) and genes (column); func_gen is a data.frame
    # It looks like this:
    # Function   x
    # Function1  gene5, gene19, gene22, gene23
    # Function2  gene1, gene7, gene19
    # Function3  gene2, gene3, gene7, gene23
    # Duplicates of genes exist between different functions. This is why the
    # read.delim command was used instead of the read.table command,
    # because of the "duplicate 'row.names' are not allowed" error.

    all_genes
    # contains all genes from the above data frame; all_genes is a data.frame
    # It looks like this:
    # Genes
    # gene1
    # gene2
    # gene3
    # gene5
    # gene7
    # gene19
    # gene 22
    # gene 23

    func_gen[,2] %in% all_genes
    # this should result in a true-false matrix
    # Like this:
    # Function   gene1 gene2 gene3 gene5 gene7 gene19 gene22 gene23
    # Function1  F     F     F     T     F     T      T      T
    # Function2  T     F     F     F     T     T      F      F
    # Function3  F     T     T     F     T     F      F      T
    # and instead I obtain a true-false matrix with only FALSE values.

Thanks in advance!

Paul

--
Paul C. Schröder
PhD-Student
Division of Proteomics, Genomics Bioinformatics
Center for Applied Medicine (CIMA), University of Navarra
Avda. Pio XII, 55, E-31008 Pamplona, Spain
Tel: +34 948 194700, ext 5023
email: [EMAIL PROTECTED]

jim holtman wrote:

It is hard to give a solution if we don't have the problem statement, or an example of the data structures you are using.

On Feb 20, 2008 6:57 AM, Paul Christoph Schröder [EMAIL PROTECTED] wrote:

Hello all!

I have the following problem with the %in% command:

1) I have a data frame that consists of functions (rows) and genes (columns). The whole thing has been loaded with the read.delim command because of gene duplications between the different rows.

2) Now, there is another data frame that contains all the genes (only the genes and without duplicates) from all the functions of the above data frame.

What I want to do now is to use the %in% command to obtain a TRUE-FALSE data frame. This should be a data frame where, for every function, some genes are TRUE and some are FALSE depending on whether or not they were in the specific function when matched against the "all genes" data frame.

The main problem I have is the way the genes are stored in the first data frame. I used the unlist command to separate them at the commas ",". But every time I do the match between the first and second data frame it returns FALSE for every gene in every function.

Can anyone please give me a hint on how to handle the problem?

Thank you very much in advance!

Paul
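The original attempt returns only FALSE because each element of func_gen[,2] is one long string such as "gene5, gene19, gene22, gene23", which never equals a single gene name. Splitting first and then testing each gene set with %in% gives the logical matrix the poster asked for. This is a sketch on data typed in by hand, with all_genes assumed to be a plain character vector rather than a data frame.

```r
func_gen <- data.frame(
  Function = c("Function1", "Function2", "Function3"),
  x = c("gene5, gene19, gene22, gene23",
        "gene1, gene7, gene19",
        "gene2, gene3, gene7, gene23"),
  stringsAsFactors = FALSE)
all_genes <- c("gene1", "gene2", "gene3", "gene5",
               "gene7", "gene19", "gene22", "gene23")

# Split each comma-separated string into a vector of gene names.
gene_sets <- strsplit(func_gen$x, ",[[:blank:]]*")

# One row per function, one column per gene: is the gene in that set?
membership <- t(sapply(gene_sets, function(genes) all_genes %in% genes))
dimnames(membership) <- list(func_gen$Function, all_genes)
membership
```

Note the direction of the test: all_genes %in% genes asks, for every known gene, whether it occurs in the current function, which is what produces one column per gene.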
Re: [R] variable syntax problem
Exactly what do you mean by additional text? Have you tried paste?

On 2/21/08, Paul Hammer [EMAIL PROTECTED] wrote:

dear members,

i would like to write a variable in a plot title (main=) but i don't know the right syntax :( ... i tried a lot of different ways without success. here my example:

    y=30
    z=33
    for (i in 10:length(tissue)) {
        png(filename = tissues[i], width = 1024, height = 768, pointsize = 12, bg = "white")
        gene.graph("ENSG0115252", rma.affy, gps=list(1:3, y:z), type="mean-int",
                   gp.col=c("red", "blue"), by.order=TRUE, scale.to.gene=FALSE,
                   use.symbol=TRUE, use.mt=FALSE,
                   main="PDE1A (red=prostate, blue=tissues[i])",
                   ylab="intensity / probeset", exon.y=1, exon.height=1,
                   exon.bg.col="#c3c3c3", exon.bg.border.col="black", show.introns=TRUE)
        y=y-3
        z=z-3
        dev.off()
    }

when i write main=tissues[i] the value is written right. but i would like to have an additional text...

thanks
paul
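The paste suggestion would look roughly like this: the fixed text and the variable are glued into one string, which is then passed as main. The tissue names below are made up for illustration; in the poster's loop the same paste call would go directly into the main= argument.

```r
tissues <- c("liver", "kidney")   # hypothetical tissue names

# paste() is vectorised, so this builds one title per tissue.
titles <- paste("PDE1A (red=prostate, blue=", tissues, ")", sep = "")
titles
```

Inside the loop this becomes main = paste("PDE1A (red=prostate, blue=", tissues[i], ")", sep = ""). Anything placed inside the quotes, such as tissues[i] in the original call, is treated as literal text and never evaluated.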
Re: [R] variable syntax problem
?assign

On 2/21/08, Paul Hammer [EMAIL PROTECTED] wrote:

thank you jim, that was what i meant :)

now i would like to name a variable after the value of another variable... example:

    for (i in 10:length(tissue)) {
        PSA_SI_tissues[i] = splicing.index(rma.affy, "ENSG0142515", tissue,
                                           c("prostate", tissues[i]), vector.out=FALSE)
    }

with paste it does not work

    for (i in 10:length(tissue)) {
        paste("PSA_SI_", tissues[i]) = splicing.index(rma.affy, "ENSG0142515", tissue,
                                                      c("prostate", tissues[i]), vector.out=FALSE)
    }

any suggestions?

thanks
paul
Re: [R] Unable to create/index a zoo irregular timeseries
You need to convert to POSIXct, since a POSIXlt object is stored as a list of 9 components rather than as one value per timestamp. So do the following:

    miedate <- as.POSIXct(strptime(as.character(pressione[,1]), format="%d-%m-%Y %H:%M:%S"))

There is a newsletter (I forget the issue) that you might want to refer to on using 'dates'.

On 2/21/08, vittorio [EMAIL PROTECTED] wrote:

In the text file pressione2008.csv I have the following

    Data,MAX,MIN,Note
    07-01-2008 08:00:00, 135, 90, Eccessi feste, inizio dieta
    07-01-2008 18:00:00, 135, 85,
    08-01-2008 08:00:00, 125, 75,

which is a collection of blood pressure data at different times of the day. I would like to build an irregular time series with MIN and MAX blood pressure, but being a real newbie with zoo I obtain the following:

    > library(zoo)
    > pressione <- data.frame(read.csv("pressione2008.csv"))
    > miedate <- strptime(as.character(pressione[,1]), format="%d-%m-%Y %H:%M:%S")
    > miedate
    [1] "2008-01-07 08:00:00" "2008-01-07 18:00:00" "2008-01-08 08:00:00"
    > str(miedate)
     POSIXlt[1:9], format: "2008-01-07 08:00:00" "2008-01-07 18:00:00" ...
    > ts <- as.zoo(matrix(pressione[,2:3],ncol=2), miedate)
    > ts
    Error in Ops.POSIXt(freq, d) : '*' not defined for "POSIXt" objects
    > ts <- zoo(matrix(pressione[,2:3],ncol=2), miedate)
    Error in order(x, ..., na.last = na.last, decreasing = decreasing) :
      unimplemented type 'list' in 'orderVector1'
    In addition: Warning message:
    In zoo(matrix(pressione[, 2:3], ncol = 2), miedate) :
      some methods for zoo objects do not work if the index entries in 'order.by' are not unique

Please help

Ciao
Vittorio
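The difference between the two date-time classes can be seen without zoo. The sketch below uses the three timestamps from the posted file; fixing the time zone to UTC is an assumption made here so that the result does not depend on the local machine.

```r
stamps <- c("07-01-2008 08:00:00", "07-01-2008 18:00:00", "08-01-2008 08:00:00")

lt <- strptime(stamps, format = "%d-%m-%Y %H:%M:%S", tz = "UTC")  # POSIXlt
ct <- as.POSIXct(lt)                                               # POSIXct

is.list(unclass(lt))     # TRUE: a list of components (sec, min, hour, ...)
is.numeric(unclass(ct))  # TRUE: seconds since 1970-01-01, one per timestamp
order(ct)                # an atomic vector can be ordered, so it works as an index
```

This is why the order() call inside zoo fails on the strptime result (type 'list') but accepts the as.POSIXct version.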
Re: [R] variable syntax problem
Also consider using a 'list' to store the results.

On 2/21/08, Paul Hammer [EMAIL PROTECTED] wrote:

thank you jim, that was what i meant :)

now i would like to name a variable after the value of another variable... example:

    for (i in 10:length(tissue)) {
        PSA_SI_tissues[i] = splicing.index(rma.affy, "ENSG0142515", tissue,
                                           c("prostate", tissues[i]), vector.out=FALSE)
    }

with paste it does not work

    for (i in 10:length(tissue)) {
        paste("PSA_SI_", tissues[i]) = splicing.index(rma.affy, "ENSG0142515", tissue,
                                                      c("prostate", tissues[i]), vector.out=FALSE)
    }

any suggestions?

thanks
paul
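The two suggestions made in this thread, assign and a list, can be compared side by side. In this sketch fake_index is a made-up placeholder standing in for the poster's splicing.index call, and the tissue names are hypothetical.

```r
tissues <- c("liver", "kidney", "lung")          # hypothetical tissue names
fake_index <- function(tissue) nchar(tissue)     # placeholder for splicing.index()

# Option 1: assign() creates a variable whose name is built from a string.
for (tis in tissues) {
  assign(paste("PSA_SI_", tis, sep = ""), fake_index(tis))
}
PSA_SI_liver            # the variable now exists under its generated name
get("PSA_SI_kidney")    # get() looks a variable up from a string

# Option 2: a named list keeps all results in one object.
psa_si <- list()
for (tis in tissues) {
  psa_si[[tis]] <- fake_index(tis)
}
psa_si[["lung"]]
```

The list version is usually easier to work with afterwards, because the results can be looped over with lapply or sapply instead of being reconstructed from variable names.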
Re: [R] Save a group of matrix
Look at using a list to store the data, something like this:

    > results <- list()
    > for (year in 2002:2008){
    +     results[[as.character(year)]] <- matrix(year,10,10)
    + }
    > results
    $`2002`
          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
     [1,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002
     [2,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002
     [3,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002
     [4,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002
     [5,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002
     [6,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002
     [7,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002
     [8,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002
     [9,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002
    [10,] 2002 2002 2002 2002 2002 2002 2002 2002 2002  2002

    $`2003`
          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
     [1,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003
     [2,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003
     [3,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003
     [4,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003
     [5,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003
     [6,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003
     [7,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003
     [8,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003
     [9,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003
    [10,] 2003 2003 2003 2003 2003 2003 2003 2003 2003  2003

    $`2004`

On 2/21/08, Alfonso Pérez Rodríguez [EMAIL PROTECTED] wrote:

It seems that it is not possible to send R files in the messages, so I resend the message with the script included.

Hello, I'm creating a loop to work with vegan, to get a species abundance curve. Here I send the script I've created and also an Excel file to show what it can do. I have a database with 20 years; each year we have sampled 19 strata, and in each stratum we have carried out some sampling. With the script that I've sent I've managed to calculate the species abundance curve for each stratum, but only for one year. I want to be able to do this for the 20 years sampled, but separately, obtaining one independent matrix for each year, and I don't know how to do it. I'm sure it's very simple but I've not found the way to do it. If someone can help me I would be very grateful, thank you.

SCRIPT

    library(reshape)
    library(vegan)
    Input="D:/R/Analisis aprendizaje/Input"
    setwd(Input)
    Data=read.table("PruebasRNA3.csv",header=T,sep=";",dec=".")
    Estr=unique(Data$ESTRATO)
    LEstr=length(Estr)
    Results= matrix(nrow=20, ncol=LEstr)
    Results[is.na(Results)]=0
    for(i in 1:LEstr) {
        Datasel=Data[Data$ESTRATO==Estr[i],]
        SubData=data.frame(Datasel$PESCA, Datasel$Sp, Datasel$Numero)
        TransData <- reshape(SubData, v.names="Datasel.Numero", idvar="Datasel.PESCA",
                             timevar="Datasel.Sp", direction="wide")
        TransData[is.na(TransData)] <- 0
        SAC=specaccum(TransData,"random",permutations=100)
        # str(SAC): with this function I can see the structure of my data and ask for
        # the columns I am interested in, which in this case would be 3 to 5
        # (sites, richness and sd)
        Pesc=length(SAC$richness)
        for (j in 1:Pesc) {
            Results[j,i]=SAC$richness[j]
        }
    }
    Results
    write.table(Results,file="D:/R/Analisis aprendizaje/Output/Results.txt")

Alfonso Pérez Rodríguez
Instituto de Investigaciones Marinas
C/ Eduardo Cabello nº 6, C.P. 36208 Vigo (España)
Tlf.- 986231930 Extensión 241
e-mail: [EMAIL PROTECTED]
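Applied to the per-year problem, the list idea could look like the following: wrap the existing per-stratum computation in a function, split the data by year, and let lapply return one matrix per year. Everything in this sketch is an assumption for illustration: the toy survey data, its column names, and summarise_year, which is a trivial placeholder for the specaccum loop in the posted script.

```r
# Toy stand-in for the survey data; the column names are assumptions.
survey <- data.frame(year    = rep(2002:2004, each = 4),
                     stratum = rep(c("s1", "s2"), times = 6),
                     count   = 1:12)

# Placeholder for the per-year computation (the specaccum loop in the thread).
summarise_year <- function(d) {
  as.matrix(tapply(d$count, d$stratum, sum))
}

# split() yields one data frame per year; lapply() returns a named list.
by_year <- lapply(split(survey, survey$year), summarise_year)
names(by_year)        # one element per year, named by the year
by_year[["2003"]]     # the matrix for a single year
```

Each element could then be written out separately, for example with one write.table call per name in names(by_year).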
Re: [R] How to get names of a list into df:s?
Here is one way of doing it:

    > lapply(names(g), function(z) cbind(x=g[[z]], var1=z))
    [[1]]
      x var1
    1 1    a
    2 2    a
    3 3    a

    [[2]]
      x var1
    1 4    b
    2 5    b
    3 6    b

    [[3]]
      x var1
    1 7    c
    2 8    c
    3 9    c

On Thu, Feb 21, 2008 at 1:22 PM, Lauri Nikkinen [EMAIL PROTECTED] wrote:

R users,

I have a simple lapply question.

    g <- list(a=1:3, b=4:6, c=7:9)
    g <- lapply(g, function(x) as.data.frame(x))
    lapply(g, function(x) cbind(x, var1 = rep(names(g), each=nrow(x))[1:nrow(x)]))

I get

    $a
      x var1
    1 1    a
    2 2    a
    3 3    a

    $b
      x var1
    1 4    a
    2 5    a
    3 6    a

    $c
      x var1
    1 7    a
    2 8    a
    3 9    a

And I would like to have

    $a
      x var1
    1 1    a
    2 2    a
    3 3    a

    $b
      x var1
    1 4    b
    2 5    b
    3 6    b

    $c
      x var1
    1 7    c
    2 8    c
    3 9    c

How should I modify my lapply clause to achieve this?

Best regards,
Lauri

    > sessionInfo()
    R version 2.6.1 (2007-11-26)
    i386-apple-darwin8.10.1

    locale:
    C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base
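The answer above loses the element names ($a, $b, $c become [[1]], [[2]], [[3]]) because it iterates over names(g). As an alternative sketch, Map walks the values and their names in parallel and keeps the names of its first argument; the variable name labelled is chosen here for illustration.

```r
g <- list(a = 1:3, b = 4:6, c = 7:9)

# Map() pairs each element with its own name and preserves the list names.
labelled <- Map(function(values, name) data.frame(x = values, var1 = name),
                g, names(g))
labelled$b   # rows 4, 5, 6, each labelled "b"
```

The original attempt failed because rep(names(g), each=nrow(x))[1:nrow(x)] always takes the first few entries of the repeated name vector, which are all "a".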
Re: [R] Problem with cut
One way of finding out is to look at the code for cut.default. Here is the result of tracing through it, at the point where it determines the cuts for 12 equal spacings:

    D(2)
     [1] 149.804 166.170 182.536 198.902 215.268 231.634 248.000 264.366 280.732 297.098 313.464 329.830
    [13] 346.196

As you can see, one of the breakpoints is at 329.830; that is why 330 is in the (330,346] category. The statements in the function that do this are:

    if (length(breaks) == 1) {
        if (is.na(breaks) | breaks < 2)
            stop("invalid number of intervals")
        nb <- as.integer(breaks + 1)
        dx <- diff(rx <- range(x, na.rm = TRUE))
        if (dx == 0) dx <- abs(rx[1])
        breaks <- seq.int(rx[1] - dx/1000, rx[2] + dx/1000, length.out = nb)
    }

You can see there is a small fudge factor applied to both ends to make sure all the data is included. Because the breaks are then spaced evenly over this slightly widened range, the interior breakpoints shift too, and the labels are rounded for display. That is what causes the perceived problem.

On Fri, Feb 22, 2008 at 8:21 AM, Jussi Lehto wrote:

Hi All, I might have misunderstood how cut works, but the following behaviour surprises me.

    vv <- seq(150, 346, by= 4)
    cc <- cut(vv, 12)
    cc[vv == 330]

Results

    [1] (330,346]

I would have expected 330 to fall into the (313,330] category. Can you please advise what I am doing wrong?

Many Thanks, Jussi Lehto
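The breakpoint arithmetic from the quoted cut.default code can be replayed by hand. This is only a sketch of that 2008-era formula as quoted above; current versions of cut.default may compute the breaks differently, so the code below deliberately does not call cut() itself.

```r
vv <- seq(150, 346, by = 4)

# Same steps as the quoted snippet, for 12 intervals (13 breakpoints).
rx <- range(vv)
dx <- diff(rx)                      # 196
breaks <- seq(rx[1] - dx / 1000,    # 149.804
              rx[2] + dx / 1000,    # 346.196
              length.out = 13)

print(round(breaks, 3))

# 330 lies just above the 12th breakpoint (329.83), so it belongs to the
# last interval, whose label is rounded to "(330,346]" with default digits.
print(findInterval(330, breaks))
```

If exact class limits matter, passing an explicit vector of breaks to cut() instead of a number of intervals avoids the widened range altogether.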
Re: [R] Problem with cut
You can also get more detail on where the intervals are with 'dig.lab':

    cc <- cut(vv, 12, dig.lab=6)
    str(cc)
     Factor w/ 12 levels "(149.804,166.17]",..: 1 1 1 1 1 2 2 2 2 3 ...
    cc
     [1] (149.804,166.17]  (149.804,166.17]  (149.804,166.17]  (149.804,166.17]  (149.804,166.17]
     [6] (166.17,182.536]  (166.17,182.536]  (166.17,182.536]  (166.17,182.536]  (182.536,198.902]
    [11] (182.536,198.902] (182.536,198.902] (182.536,198.902] (198.902,215.268] (198.902,215.268]
    [16] (198.902,215.268] (198.902,215.268] (215.268,231.634] (215.268,231.634] (215.268,231.634]
    [21] (215.268,231.634] (231.634,248]     (231.634,248]     (231.634,248]     (231.634,248]
    [26] (248,264.366]     (248,264.366]     (248,264.366]     (248,264.366]     (264.366,280.732]
    [31] (264.366,280.732] (264.366,280.732] (264.366,280.732] (280.732,297.098] (280.732,297.098]
    [36] (280.732,297.098] (280.732,297.098] (297.098,313.464] (297.098,313.464] (297.098,313.464]
    [41] (297.098,313.464] (313.464,329.83]  (313.464,329.83]  (313.464,329.83]  (313.464,329.83]
    [46] (329.83,346.196]  (329.83,346.196]  (329.83,346.196]  (329.83,346.196]  (329.83,346.196]
    12 Levels: (149.804,166.17] (166.17,182.536] (182.536,198.902] (198.902,215.268] ... (329.83,346.196]

On Fri, Feb 22, 2008 at 10:59 AM, Henrique Dallazuanna wrote:

It is to show the categories which contain '330'.

On 22/02/2008, Heinz Tuechler wrote:

At 15:22 22.02.2008, Henrique Dallazuanna wrote:

Try this:

    grep("330", levels(cc), value=T)

Could you please explain in a little more detail how this answers the original question ("I would have expected 330 to fall into (313,330] category. Can you please advice what do I do wrong?")? Thank you, Heinz

On 22/02/2008, Jussi Lehto wrote:

Hi All, I might have misunderstood how cut works, but the following behaviour surprises me.

    vv <- seq(150, 346, by= 4)
    cc <- cut(vv, 12)
    cc[vv == 330]

Results

    [1] (330,346]

I would have expected 330 to fall into the (313,330] category. Can you please advise what I am doing wrong?

Many Thanks, Jussi Lehto

-- Jim Holtman, Cincinnati, OH
Re: [R] Corrected : Efficient writing of calculation involving each element of 2 data frames
Take a look at the 'embed' function. With it you can create a matrix with the data shifted by one position in each column, so that every row holds one window. You would want to do embed(your.data, 100).

On Fri, Feb 22, 2008 at 4:15 PM, Vikas N Kumar wrote:

Hi, I have 2 data.frames, each with the same number of rows (approximately 3 or more entries). They also have the same number of columns, let's say 2. One column has the date, the other column has a double precision number. Let the column names be V1, V2. Now I want to calculate the correlation of the 2 sets of data over the last 100 days, for every day available in the data.frames. My code looks like this:

    # Let df1 and df2 be the 2 data frames with the required data
    ## begin code snippet
    my_corr <- c();
    for ( i_start in 100:nrow(df1))
      my_corr[i_start-99] <- cor(x=df1[(i_start-99):i_start,"V2"], y=df2[(i_start-99):i_start,"V2"])
    ## end of code snippet

This runs very slowly, and takes more than an hour to run if I have to calculate the correlation between 10 data sets, leaving me with 45 runs of this snippet, or taking more than 30 minutes to run. Is there an efficient way to write this piece of code so that it runs faster? If I do something similar in Excel, it is much faster. But I have to use R, since this is part of a bigger program. Any help will be appreciated.

Thanks and Regards, Vikas
-- http://www.vikaskumar.org/

-- Jim Holtman, Cincinnati, OH
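To make the embed() suggestion concrete, here is a minimal sketch of a 100-day rolling correlation. The simulated vectors x and y stand in for df1$V2 and df2$V2, and the row-wise correlation formula is my own addition; the thread itself only names embed().

```r
set.seed(1)
n <- 300   # made-up series length
w <- 100   # window length, as in the question
x <- rnorm(n)
y <- rnorm(n)

# Each row of embed(x, w) is one window of w consecutive values
# (most recent value first); there are n - w + 1 rows.
ex <- embed(x, w)
ey <- embed(y, w)

# Row-wise Pearson correlation, computed with matrix operations only.
cx <- ex - rowMeans(ex)
cy <- ey - rowMeans(ey)
roll_cor <- rowSums(cx * cy) / sqrt(rowSums(cx^2) * rowSums(cy^2))

# Reference: the explicit loop from the question, for comparison.
loop_cor <- sapply(w:n, function(i) cor(x[(i - w + 1):i], y[(i - w + 1):i]))
print(all.equal(roll_cor, loop_cor))
```

The window order inside a row is reversed relative to the original series, but since both series are reversed in the same way the correlation is unaffected. Note that the embedded matrices hold w times as many numbers as the input, so memory use grows with the window length.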
Re: [R] Fixed effects
help.search('fixed effect') produces these matches. Does one of them do what you want?

    Help files with alias or concept or title matching 'fixed effect' using fuzzy matching:

    fixef(lme4)                  Extract Fixed Effects
    lmer(lme4)                   Fit (Generalized) Linear Mixed-Effects Models
    fixed.effects(nlme)          Extract Fixed Effects
    fixed.effects.lmList(nlme)   Extract lmList Fixed Effects
    lme(nlme)                    Linear Mixed-Effects Models
    lmeStruct(nlme)              Linear Mixed-Effects Structure
    nlme(nlme)                   Nonlinear Mixed-Effects Models
    nlmeStruct(nlme)             Nonlinear Mixed-Effects Structure

On Fri, Feb 22, 2008 at 7:42 PM, Petros Andreou wrote:

Hello everyone! I would really appreciate it if someone could tell me where to find the command in R to run a fixed effects regression. What format should my data have? I have looked through the manual and could not find anything.

Thank you in advance, Petros

-- Jim Holtman, Cincinnati, OH
Re: [R] Aranda-Ordaz
I have no idea if it is helpful, but a quick Google search turned up:

    LINKINF
    # Two S-PLUS functions to compute influence diagnostics ...
    # The assumed logit link is embedded within the Aranda-Ordaz parametric
    # family of link functions.
    # Written by John Yick and Andy H. Lee ...
    phase.hpcc.jp/mirrors/stat/S/linkinf

On Sat, Feb 23, 2008 at 10:11 AM, o ha wang wrote:

Hi all, Does anyone know R code or SAS code for the Aranda-Ordaz link family?

thanks, xiao yue

-- Jim Holtman, Cincinnati, OH
Re: [R] color area between two time-series via polygon()?
I think you have to change one statement in your program, so that the x coordinates are the times of both series rather than the times of one and the values of the other:

    xx <- cbind(time(z[,1]), rev(time(z[,2])))

On Sun, Feb 24, 2008 at 7:44 PM, Jan wrote:

Hi all, I would like to color the area between two time series. I tried it by using the polygon() function, but it keeps drawing lines between the beginning and end points. Is there another, more appropriate function, or how could I close the polygon at the end and the beginning of the time series (e.g., by drawing a straight line)? The following doesn't plot a polygon between the two time series:

    z <- ts(matrix(rnorm(200), 100), start=c(1961, 1), frequency=12)
    plot(z, plot.type="single", lty=1:2)
    xx <- cbind(time(z[,1]), rev(z[,2]))
    yy <- cbind(as.vector(z[,1]), rev(as.vector(z[,2])))
    polygon(xx, yy, col="gray", border = "red")

I would like to make it look like this (but then for time series):

    n <- 100
    xx <- c(0:n, n:0)
    yy <- c(c(0,cumsum(stats::rnorm(n))), rev(c(0,cumsum(stats::rnorm(n)))))
    plot(xx, yy, type="n", xlab="Time", ylab="Distance")
    polygon(xx, yy, col="gray", border = "red")

Thanks for your help, Jan

-- Jim Holtman, Cincinnati, OH
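Putting the corrected statement back into the original script gives something like the following. This is a sketch under my own simplifications: the coordinates are built with c() instead of cbind(), and time() is taken once and reused, which is equivalent here because both series share the same time index.

```r
set.seed(42)
z <- ts(matrix(rnorm(200), 100), start = c(1961, 1), frequency = 12)

# x coordinates: forward along the first series, then backwards along
# the second, so the outline closes on itself.
tt <- as.numeric(time(z))
xx <- c(tt, rev(tt))
yy <- c(as.vector(z[, 1]), rev(as.vector(z[, 2])))

plot(z, plot.type = "single", lty = 1:2)
polygon(xx, yy, col = "gray", border = "red")
```

The essential point is that xx and yy have the same length and that the second half of each runs in reverse order.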
Re: [R] Graph Axis
You have to convert your dates to the Date class:

    x <- read.table("/tempxx.txt", header=TRUE, as.is=TRUE)
    x$Date <- as.Date(x$Date, "%d/%m/%Y")
    plot(x$Date, x$Rate, type='l')

On 2/25/08, Khadija Mohammedali wrote:

Hi, I have data of exchange rates and time, and am trying to draw a graph that will show the rates on the y axis and dates on the x axis. I am using the following code:

    plot(rate, type='l', xlab='Date', ylab='Rate', main='£ to Euro rate over 5 years')

This gives me the graph I want, although I want to display the dates on the x axis, even if it's just 2002, 2003, ... 2008. Attached is my data. Hope you can help.

-- Jim Holtman, Cincinnati, OH
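Since the attached data file is not part of the archive, here is a self-contained sketch of the same conversion. The three rows of data are made up purely for illustration; only the column names Date and Rate and the day/month/year format come from the reply above.

```r
# Made-up stand-in for the attached file: dates as day/month/year text.
x <- data.frame(Date = c("02/01/2002", "15/06/2004", "25/02/2008"),
                Rate = c(1.62, 1.50, 1.33),
                stringsAsFactors = FALSE)

# Convert the text to the Date class so plot() draws a date axis.
x$Date <- as.Date(x$Date, format = "%d/%m/%Y")

plot(x$Date, x$Rate, type = "l", xlab = "Date", ylab = "Rate")
```

Once the column has class Date, the default axis labels are chosen from the dates themselves instead of from the row positions.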
Re: [R] Graph Axis
Plot the x-axis with one less data point, dropping the first date, which has no return:

    plot(x$Date[-1], returns)

On Mon, Feb 25, 2008 at 5:39 PM, Khadija Mohammedali wrote:

Hi Jim, Thank you for your quick response. This worked great. I am having the same problem again. I have moved on to calculating returns from rates and want to plot returns on the y axis and again dates on the x axis. The code I am using to calculate returns is as follows:

    rate <- x$Rate
    returns <- (diff(log(rate)))

If I do:

    plot(returns, type="l")

I get the graph I want, but I am having problems with the x axis again. Modifying the code below doesn't work, as I now have one less element in returns. Hope I am making sense. Your help is much appreciated.

Date: Mon, 25 Feb 2008 16:32:44 -0500
Subject: Re: [R] Graph Axis
CC: r-help@r-project.org

You have to convert your dates to the Date class:

    x <- read.table("/tempxx.txt", header=TRUE, as.is=TRUE)
    x$Date <- as.Date(x$Date, "%d/%m/%Y")
    plot(x$Date, x$Rate, type='l')

-- Jim Holtman, Cincinnati, OH
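The length mismatch is easy to see on a tiny example. The rates and dates below are invented for illustration; the point is only that diff() returns one value fewer than its input, so the first date has to be dropped.

```r
# Invented rates on four consecutive days.
dates <- as.Date("2008-02-20") + 0:3
rate  <- c(1.60, 1.62, 1.59, 1.61)

# Log returns: one value per pair of neighbouring days.
returns <- diff(log(rate))

# Each return belongs to the later of its two days, hence dates[-1].
plot(dates[-1], returns, type = "l", xlab = "Date", ylab = "Log return")
```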
Re: [R] combining 40,000 with 40,000 data frame (different tact)
?rbind

On 2/26/08, stephen sefick wrote:

I have not been able to find anything to do what I want, so I am going to tack to the left. I have two continuous time series for two years with the same fourteen variables. I would like to simply append the second year to the first. They both have the same column headings etc. Just like taping two pieces of paper together for a long number series.

Thanks, Stephen

-- Jim Holtman, Cincinnati, OH
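A minimal sketch of what rbind does for two data frames with identical column headings. The two small frames are invented stand-ins for the two years of data; the real ones would have fourteen variables and about 40,000 rows each.

```r
# Invented stand-ins for the two years of measurements.
year1 <- data.frame(time = 1:3, temp = c(10.1, 10.4, 9.8))
year2 <- data.frame(time = 4:6, temp = c(11.0, 10.7, 10.2))

# rbind() stacks the rows of the second frame under the first,
# matching columns by name.
both <- rbind(year1, year2)
print(both)
```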
Re: [R] numeric format
Those are parameters to 'print'; what you want is something like:

    x <- data.frame(a=runif(10))
    print(x)
                 a
    1  0.713705394
    2  0.715496609
    3  0.629578524
    4  0.184360667
    5  0.456639418
    6  0.008667156
    7  0.260985437
    8  0.270915631
    9  0.689128652
    10 0.302484280
    print(x, scientific=F, digits=4)
              a
    1  0.713705
    2  0.715497
    3  0.629579
    4  0.184361
    5  0.456639
    6  0.008667
    7  0.260985
    8  0.270916
    9  0.689129
    10 0.302484

On 2/26/08, cvandy wrote:

Hi! I'm an R newbie and this should be a trivial problem, but I can't make it work and cannot find what I'm doing wrong in the literature. I entered the command:

    table <- data.frame(x, scientific=F, digits=4)
    table

This prints a column of x with 16 useless decimal places after the decimal point. Also, it prints an unwanted index number (1-20) in the left column. How do I get rid of the index column, and how do I control the number of decimal places? Thanks in advance.

CHV

-- Jim Holtman, Cincinnati, OH
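The second half of the question (dropping the index column) is not covered in the reply. A sketch of one way to do it, using the row.names argument of the data frame print method; the two values are taken from the output shown above.

```r
x <- data.frame(a = c(0.713705394, 0.008667156))

# digits controls significant digits; row.names = FALSE suppresses
# the index column on the left.
print(x, digits = 4, row.names = FALSE)

# The same text captured as a character vector, e.g. for a report.
out <- capture.output(print(x, digits = 4, row.names = FALSE))
```

Note that digits is a number of significant digits, not of decimal places, which is why the values above are shown with six decimals: the smallest one needs them to show four significant digits. For a fixed number of decimals, round() or format(..., nsmall = ) on the column is an alternative.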
Re: [R] plot y1 and y2 on one graph
This should do what you want:

    x <- 1:10
    y1 <- x + runif(10)*2
    y2 <- seq(0, 50, length.out=10) + rnorm(10)*10
    plot(y1~x, bty='c')
    par(new=TRUE)  # plot on the same graph
    plot(y2~x, col='red', axes=FALSE, bty='c', xlab='', ylab='')
    axis(4, col.axis='red', col='red')
    mtext("y2", 4, col='red', line=-2)

On Wed, Feb 27, 2008 at 5:05 PM, milton ruser wrote:

Dear all, I have code like this:

    x <- 1:10
    y1 <- x + runif(10)*2
    y2 <- seq(0, 50, length.out=10) + rnorm(10)*10
    par(mfrow=c(1,2))
    plot(y1~x)
    plot(y2~x)

Now I would like to plot y1 and y2 on the same graph, each with its own scale (y1 on the left and y2 on the right side). Any help is welcome.

Kind regards, Miltinho, Brazil

-- Jim Holtman, Cincinnati, OH
Re: [R] write.csv +RMySQL request
?capture.output

    myoutput <- capture.output(write.csv(...))

On Thu, Feb 28, 2008 at 7:34 PM, Tristan Casey wrote:

Hello, I am relatively new to R and learning its ins and outs. As part of a website I am building, I need to read and write csv files directly from an SQL database. Basically I want to convert R variables (dataframes) into CSV format, store them as another R variable (as a properly formatted text string suitable for csv reading) and then send this to one row in a database. The SQL part is fine; the problem arises because I cannot capture the output of write.csv! It prints to the terminal when file="" is used, but I also want to store it. Does anyone have any ideas? Thanks in advance!

-- Jim Holtman, Cincinnati, OH
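A small end-to-end sketch of the capture.output approach. The data frame is invented; collapsing the captured lines into one string with paste() and reading it back with read.csv(text = ) are my additions, meant only to show that the round trip works before the string goes into a database field.

```r
df <- data.frame(id = 1:3, name = c("a", "b", "c"))

# write.csv() with no file argument prints to the console;
# capture.output() collects those lines in a character vector.
csv_lines <- capture.output(write.csv(df, row.names = FALSE))

# One string, suitable for storing in a single text field.
csv_string <- paste(csv_lines, collapse = "\n")

# Reading the string back recovers the data.
back <- read.csv(text = csv_string)
print(back)
```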
Re: [R] Getting multiple tables when using table(dataframe) to tabulate data
Is this what you want?

    tapply(x$count, list(x$delta_ts, x$status), sum)
               ASSIGNED CLOSED NEW RESOLVED
    2008-02-21        2     NA   2        0
    2008-02-22        0      0   6        1
    2008-02-23        2      1  12        0
    2008-02-24        7      4  16        2
    2008-02-25        2      6  22        5
    2008-02-26        6      8  38        3
    2008-02-27       NA      3  56        5
    2008-02-28       NA      3  56        5

On Thu, Feb 28, 2008 at 8:22 PM, obradoa wrote:

I am having a hard time tabulating data in a dataframe and getting a single table for an answer. I am trying to tabulate all counts for a given status on a given date. I have a data frame such as:

         delta_ts   status count
    1  2008-02-27   CLOSED     3
    2  2008-02-27      NEW    56
    3  2008-02-27 RESOLVED     5
    4  2008-02-21 ASSIGNED     1
    5  2008-02-21 ASSIGNED     1
    6  2008-02-21      NEW     2
    7  2008-02-21 RESOLVED     0
    8  2008-02-22 ASSIGNED     0
    9  2008-02-22   CLOSED     0
    10 2008-02-22      NEW     6
    11 2008-02-22 RESOLVED     1
    12 2008-02-23 ASSIGNED     2
    13 2008-02-23   CLOSED     1
    14 2008-02-23      NEW    12
    15 2008-02-23 RESOLVED     0
    16 2008-02-24 ASSIGNED     7
    17 2008-02-24   CLOSED     4
    18 2008-02-24      NEW    16
    19 2008-02-24 RESOLVED     2
    20 2008-02-25 ASSIGNED     2
    21 2008-02-25   CLOSED     6
    22 2008-02-25      NEW    22
    23 2008-02-25 RESOLVED     5
    24 2008-02-26 ASSIGNED     6
    25 2008-02-26   CLOSED     8
    26 2008-02-26      NEW    38
    27 2008-02-26 RESOLVED     3
    28 2008-02-28   CLOSED     3
    29 2008-02-28      NEW    56
    30 2008-02-28 RESOLVED     5

When I do table on that frame I get a long list that looks like this:

    table(data)
    , , count = 0

                status
    delta_ts     ASSIGNED CLOSED NEW RESOLVED
      2008-02-21        0      0   0        1
      2008-02-22        1      1   0        0
      2008-02-23        0      0   0        1
      2008-02-24        0      0   0        0
      2008-02-25        0      0   0        0
      2008-02-26        0      0   0        0
      2008-02-27        0      0   0        0
      2008-02-28        0      0   0        0

and so on all the way up to

    , , count = 56

                status
    delta_ts     ASSIGNED CLOSED NEW RESOLVED
      2008-02-21        0      0   0        0
      2008-02-22        0      0   0        0
      2008-02-23        0      0   0        0
      2008-02-24        0      0   0        0
      2008-02-25        0      0   0        0
      2008-02-26        0      0   0        0
      2008-02-27        0      0   1        0
      2008-02-28        0      0   1        0

What I actually want is for my counts to be properly tabulated in one single table that looks something like this:

    delta_ts   ASSIGNED CLOSED NEW RESOLVED
    2008-02-21        2      5   9       15

and so on... Any ideas what I am doing wrong? Thanks!

-- Jim Holtman, Cincinnati, OH
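The tapply() result above contains NA wherever a date/status combination never occurs. If zeros are preferred there, xtabs() is one alternative; that suggestion is mine, not from the thread. The sketch uses only the first few rows of the posted data (dates 2008-02-21 and 2008-02-22).

```r
# Rows 4 to 11 of the posted data frame.
x <- data.frame(
  delta_ts = c("2008-02-21", "2008-02-21", "2008-02-21", "2008-02-21",
               "2008-02-22", "2008-02-22", "2008-02-22", "2008-02-22"),
  status   = c("ASSIGNED", "ASSIGNED", "NEW", "RESOLVED",
               "ASSIGNED", "CLOSED", "NEW", "RESOLVED"),
  count    = c(1, 1, 2, 0, 0, 0, 6, 1)
)

# tapply() sums count within each date/status cell; empty cells are NA.
by_tapply <- tapply(x$count, list(x$delta_ts, x$status), sum)

# xtabs() with count on the left-hand side sums it too, but fills
# empty cells with 0.
by_xtabs <- xtabs(count ~ delta_ts + status, data = x)

print(by_tapply)
print(by_xtabs)
```

The original table(data) call produced many slices because table() cross-classifies every column, including count, instead of summing it.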
Re: [R] Large loops hang?
If you really want to do a loop, then preallocate your storage. You were dynamically reallocating (and copying) the result vector each time through the loop, or thereabouts:

    system.time({
        res <- numeric(10)
        for (i in 1:10) {
            x <- rnorm(2)
            res[i] <- x[2] - x[1]
        }
    })
       user  system elapsed
       2.75    0.02    3.10

On Thu, Feb 28, 2008 at 3:30 PM, Minimax wrote:

Dear useRs, Suppose we have this loop:

    res <- c()
    for (i in 1:10) {
        x <- rnorm(2)
        res <- c(res, x[2]-x[1])
    }

and this loop for 10^5 cases runs for about, say, 5 minutes. When I add one zero (10^6), this loop does not finish overnight and probably hangs. This occurs regardless of the statistic calculated in such a simulation, always above 10^5 iterations. Nested loops do not help. Any suggestions for collecting larger amounts of Monte Carlo data?

Regards, Minimax

-- Jim Holtman, Cincinnati, OH
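For this particular simulation the loop can be avoided altogether. The sketch below compares the preallocated loop with a vectorised version that draws all the random numbers in one call; the value of n is chosen by me for illustration, and the vectorised variant is my addition rather than part of the reply.

```r
n <- 1e5   # illustrative number of Monte Carlo cases

# Preallocated loop, as recommended above.
set.seed(1)
res <- numeric(n)
for (i in seq_len(n)) {
  x <- rnorm(2)
  res[i] <- x[2] - x[1]
}

# Vectorised: draw 2 * n values at once, one pair per row.
set.seed(1)
m <- matrix(rnorm(2 * n), ncol = 2, byrow = TRUE)
res_vec <- m[, 2] - m[, 1]

print(all.equal(res, res_vec))
```

Growing a vector with c(res, ...) copies everything accumulated so far on every iteration, so the total work grows roughly with the square of the number of cases; that is why adding one zero made the original loop so much slower.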
Re: [R] Fwd: Re: How to create following chart for visualizing multivariate time series
Try something like this:

    require(grDevices) # for colours
    x <- y <- seq(-4*pi, 4*pi, len=27)
    r <- sqrt(outer(x^2, y^2, "+"))
    image(x, y, r, col=gray((0:32)/32))
    colors <- colorRampPalette(c('red', 'yellow', 'blue'))  # create your color spectrum
    image(x, y, r, col=colors(100))

On Thu, Feb 28, 2008 at 9:28 PM, Megh Dal wrote:

I used the ?image function to do that, like below:

    require(grDevices) # for colours
    x <- y <- seq(-4*pi, 4*pi, len=27)
    r <- sqrt(outer(x^2, y^2, "+"))
    image(x, y, r, col=gray((0:32)/32))

However, my next problem is to add a color palette for the color description [as shown in the following link]. If anyone here can tell me how to do that, it would be good for me.

Regards,

Megh Dal wrote:

Hi all, Can anyone here please tell me whether it is possible to produce in R a chart like the one displayed at http://www.datawolf.blogspot.com/ for visualizing multivariate time series? If possible, how?

Regards,

-- Jim Holtman, Cincinnati, OH
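If by "color palette for color description" a legend (colour key) next to the image is meant, filled.contour() draws one automatically. This is only a guess at what was wanted, and the sketch reuses the palette from the reply under a different name (pal) to avoid masking the base function colors().

```r
x <- y <- seq(-4 * pi, 4 * pi, length.out = 27)
r <- sqrt(outer(x^2, y^2, "+"))

# A function returning n interpolated colours from red via yellow to blue.
pal <- colorRampPalette(c("red", "yellow", "blue"))

# filled.contour() draws the surface plus a colour key on the right.
filled.contour(x, y, r, color.palette = pal, nlevels = 20)
```

Unlike image(), filled.contour() interpolates between grid points, so the picture looks smoother than the cell-by-cell rendering.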
Re: [R] setwd on other computer?
I can do it under Windows for network-mounted files that are on some other system, e.g.:

    setwd("p:/APPS")

On 2/29/08, Paul Hammer wrote:

Hi members, is it possible to set the working directory (e.g. via setwd()) on a computer other than the one on which R has been started?

thanks, paul

-- Jim Holtman, Cincinnati, OH
Re: [R] while loop syntax help
Does this give the answer that you want?

    x <- c(5,5,7,6,5,4,3)
    result <- NULL
    for (i in 1:(length(x) - 2)){
        if ((x[i + 1] < x[i]) && (x[i + 2] < x[i])) result <- c(result, i)
    }
    result
    [1] 3 4 5

On 2/29/08, zack holden wrote:

Dear list, I'm trying to write my first looping function in R. After many hours of searching help files and previous posts, I'm at my wits' end. Please forgive my programming ignorance; any help is greatly appreciated. I need to sort through a vector (x) and identify the point at which 2 successive values become smaller than the previous value. I've written a while statement that I think should work. It should basically say: if value 1 > value 2 and also > value 3, then return the row of value 1; else, go to the next value. However, the output is NULL, no matter how I've modified the syntax. Thanks in advance for any help.

Zack

    x <- c(5,5,7,6,5,4,3)
    x <- data.frame(x)
    y <- length(x)-2
    counter <- 1
    output = c()
    while(counter <= y) {
        counter1 <- counter+1
        counter2 <- counter+2
        if(x[counter,1] > x[counter1,1] || x[counter1,1] > x[counter2,1]) {
            output = x[counter, ]
        } else {
            counter = counter+1
        }
        counter = y
    }

-- Jim Holtman, Cincinnati, OH
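The same test can be written without an explicit loop by comparing shifted copies of the vector. This vectorised form is my own sketch; it applies the condition used in the reply (both of the next two values smaller than the current one).

```r
x <- c(5, 5, 7, 6, 5, 4, 3)

# Positions that still have two successors.
i <- seq_len(length(x) - 2)

# TRUE where both following values are smaller than the current one;
# which() turns the logical vector into positions.
result <- which(x[i + 1] < x[i] & x[i + 2] < x[i])
print(result)   # 3 4 5

# The first such position, if only the starting point is wanted.
first <- result[1]
```

One reason the original while loop returned nothing useful: after x is converted to a data frame, length(x) is the number of columns (1), so y is -1 and the loop body never runs.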
Re: [R] can the matrix size limit be increased?
You only have 1 row in your matrix, so what you are seeing printed is not an empty matrix but the column headers. If you print the transpose you get:

    > head(t(tst))
         [,1]
    [1,]    1
    [2,]    2
    [3,]    3
    [4,]    4
    [5,]    5
    [6,]    6

By default R prints only a limited number of values (see getOption("max.print")). Your data is there; you just have to wait for the headers to print out, or at least make a matrix with more than one row.

On Fri, Feb 29, 2008 at 2:25 PM, Robert Leach [EMAIL PROTECTED] wrote:

Hi there,

I'm brand new to R, so let me know if this question is not appropriate for this list. I've been reading through the documentation and have tried a number of things, but am pretty much stuck so far. Here's the session info:

    > sessionInfo()
    R version 2.6.2 (2008-02-08)
    i386-apple-darwin8.10.1

    locale:
    C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    loaded via a namespace (and not attached):
    [1] rcompgen_0.1-17

So I seem to be hitting a limit on matrix size. First I read my data into a list and it's OK:

    > mz <- scan("data.column3.txt", list(0))
    Read 158991 records
    > mz
    [[1]]
     [1] 0.000000e+00 0.000000e+00 1.003393e+01 3.651888e+00 0.000000e+00
     [6] 0.000000e+00 3.067042e+00 1.277249e+00 1.984366e+00 3.644203e+01
    [11] 1.172925e+02 1.933753e+02 2.020940e+02 1.570501e+02 8.990829e+01
    ...

But when I try to put it into a matrix like this, I don't get an error, but the matrix appears empty:

    > MZ = matrix(mz[[1]], nrow=1)
    > MZ
         [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
         [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18]
         [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26]
    ...

When I did a subset of my data, it was fine. I did a manual binary search and determined the cutoff to be 100,000 elements. So if I do just 99,999 elements, it looks as I would expect:

    > mz <- scan("data.column3.txt", list(0), 99999)
    Read 99999 records
    > tst <- matrix(mz[[1]], nrow=1)
    > tst
         [,1] [,2]     [,3]     [,4] [,5] [,6]     [,7]     [,8]     [,9]    [,10]
    [1,]    0    0 10.03393 3.651888    0    0 3.067042 1.277249 1.984366 36.44203
            [,11]    [,12]    [,13]    [,14]   [,15]    [,16]    [,17]    [,18]
    [1,] 117.2925 193.3753 202.0940 157.0501 89.9083 26.44127 17.05373 53.40315
            [,19]    [,20]    [,21]    [,22]    [,23]    [,24]    [,25]    [,26]
    [1,] 65.20086 37.33463 17.71247 27.37268 41.83289 48.46916 58.94969 76.05099
    ...

If I do 100,000, I get the same empty appearance. I've assumed that there must be a limitation on the number of elements in a matrix. Is that right? If so, how do I increase the maximum number of elements? I tried another machine's installation of R and it apparently doesn't have a 99,999 element limit. I've tried using:

    R --max-mem-size=2G
    R --max-vsize=20
    R --max-nsize=20
    R --max-vsize=20 --max-nsize=20 --max-ppsize=20
    R --max-vsize=10M

I still end up with the empty-looking matrix when I try these. How do I get my installation to work like the installation on the other computer, where I was able to have larger matrices?

Oh yeah, I also tried this, just to rule out problems with my data:

    > tst <- matrix(seq(1,158991), nrow=1, ncol=158991)
    > tst
         [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
         [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24]
         [,25] [,26] [,27] [,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35] [,36]
         [,37] [,38]
    ...

Thanks,
Rob

Robert W. Leach
Scientific Programmer
Center for Computational Research
Center of Excellence in Bioinformatics
University at Buffalo
http://www.ccr.buffalo.edu/

-- Jim Holtman
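A minimal sketch of how one might confirm that nothing is missing without printing every column. It uses a synthetic one-row matrix like the seq(1, 158991) test in the post above; the object name is only illustrative, and the commented-out options() value is an arbitrary choice, not a recommendation.

```r
# Synthetic stand-in for the poster's data: one row, 158991 columns.
tst <- matrix(seq(1, 158991), nrow = 1)

# dim() and length() report the true size regardless of what print() shows.
dim(tst)
length(tst)

# Inspect a few values instead of printing the whole object.
tst[1, 1:6]
head(t(tst))

# The number of entries print() will show is governed by this option.
getOption("max.print")

# It can be raised for the session if a full dump is really wanted:
# options(max.print = 200000)
```

Checking dim() first is usually quicker than raising max.print, since a full dump of a matrix this wide is hard to read anyway.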
Re: [R] Newbie: Incorrect number of dimensions
It would be helpful if you provided commented, minimal, self-contained, reproducible code. What does str(all_differ) say? That will tell you the structure of the object that you are trying to work with.

On Sat, Mar 1, 2008 at 3:35 AM, Keizer_71 [EMAIL PROTECTED] wrote:

    > dim(data.sub)
    [1] 1 140
    > # extracting all differentially expressed genes
    > library(multtest)
    > two_side <- (1 - pt(abs(data.sub), 50)) * 2
    > diff <- mt.rawp2adjp(two_side)
    > all_differ <- diff[[1]][37211:1, ]
    > all_differ
    > # list of differentially expressed genes
    > probe.names <-
    +   all_differ[[2]][all_differ[[1]][, "BY"] <= 0.01]
    Error in all_differ[[1]][, "BY"] : incorrect number of dimensions

Hi,

I am pretty new to R. What I am trying to do is find all differentially expressed genes and produce a list of them. Am I doing something wrong? I keep getting "incorrect number of dimensions". How do I find out the correct dimensions?

Thanks,
Keizer

-- Jim Holtman
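A small sketch of why this error can appear, using only base R. The matrix adjp below is a hypothetical stand-in for a table of adjusted p-values (one row per gene, one named column per method); it is not the object from the post, whose real structure only str() can reveal.

```r
# Hypothetical adjusted p-value matrix: 3 genes, columns "rawp" and "BY".
adjp <- matrix(c(0.001, 0.200, 0.004,
                 0.008, 0.900, 0.030),
               ncol = 2,
               dimnames = list(NULL, c("rawp", "BY")))

str(adjp)        # a 3 x 2 numeric matrix
adjp[, "BY"]     # fine: adjp has two dimensions

# [[1]] applied to a matrix returns a single number, which has no
# dimensions, so asking for a column of it fails.
one_value <- adjp[[1]]
str(one_value)
msg <- tryCatch(one_value[, "BY"], error = function(e) conditionMessage(e))
msg

# Indexing the matrix itself works.
keep <- adjp[, "BY"] <= 0.01
which(keep)
```

If str(all_differ) shows a matrix rather than a list, then all_differ[[1]] is a single number and the column lookup has to be done on all_differ directly.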
Re: [R] Fwd: Re: How to create following chart for visualizing multivariate time series
If you want color, then a slight addition to Henrique's solution will do it:

    x <- y <- seq(-4*pi, 4*pi, len=27)
    r <- sqrt(outer(x^2, y^2, "+"))
    library(lattice)
    colkey <- colorRampPalette(c('red','yellow','green'))(32)
    levelplot(r, colorkey=list(col=colkey), col.regions=(col=colkey))

On Sat, Mar 1, 2008 at 6:08 PM, Henrique Dallazuanna [EMAIL PROTECTED] wrote:

This works for me:

    x <- y <- seq(-4*pi, 4*pi, len=27)
    r <- sqrt(outer(x^2, y^2, "+"))
    library(lattice)
    levelplot(r, colorkey=list(col=gray((0:32)/32)), col.regions=(col=gray((0:32)/32)))

Is 'r' a matrix for you?

On 01/03/2008, David Winsemius [EMAIL PROTECTED] wrote:

Henrique Dallazuanna [EMAIL PROTECTED] wrote in news:[EMAIL PROTECTED]:

    library(lattice)
    levelplot(r, colorkey=list(col=gray((0:32)/32)), col.regions=(col=gray((0:32)/32)))

When I try that example, I get an error, even after updating lattice.

    > levelplot(r, colorkey=list(col=gray((0:32)/32)),
    +           col.regions=(col=gray((0:32)/32)))
    Error in UseMethod("levelplot") : no applicable method for "levelplot"

If I simply change colorkey=FALSE to colorkey=TRUE in the first levelplot help page example, I have what looks to me like success.

    levelplot(z~x*y, grid, cuts = 50, scales=list(log="e"), xlab="", ylab="",
              main="Weird Function", sub="with log scales",
              colorkey = TRUE, region = TRUE)

-- David Winsemius

On 29/02/2008, Megh Dal [EMAIL PROTECTED] wrote:

Hi Jim,

I think you did not get my point. I did not want to put red-blue color there. I want to add a palette which describes the values of r. Please have a look at the following:

http://bp0.blogger.com/_k3l6qPzizGs/RvDVglPknRI/AKo/itlWOvuuOtI/s1600-h/pairwise_kl_window60.png

Please see how a color palette is added on the right side of this plot, describing the value of the red color, the value of the blue color, etc. Is there any solution?

Regards,

jim holtman [EMAIL PROTECTED] wrote:

Try something like this:

    require(grDevices) # for colours
    x <- y <- seq(-4*pi, 4*pi, len=27)
    r <- sqrt(outer(x^2, y^2, "+"))
    image(x, y, r, col=gray((0:32)/32))
    colors <- colorRampPalette(c('red', 'yellow', 'blue')) # create your color spectrum
    image(x, y, r, col=colors(100))

On Thu, Feb 28, 2008 at 9:28 PM, Megh Dal wrote:

I used the ?image function to do that, like below:

    require(grDevices) # for colours
    x <- y <- seq(-4*pi, 4*pi, len=27)
    r <- sqrt(outer(x^2, y^2, "+"))
    image(x, y, r, col=gray((0:32)/32))

However, my next problem is to add a color palette for color description [as shown in the following link]. If anyone here can tell me how to do that, it will be good for me.

Megh Dal wrote:

Hi all,

Can anyone here please tell me whether it is possible to produce a chart like the one displayed at http://www.datawolf.blogspot.com/ in R for visualizing multivariate time series? If possible, how?

-- Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

-- Jim Holtman
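For readers who want a color key without lattice, here is a rough sketch using base graphics instead. Note that this swaps in filled.contour(), which nobody in the thread proposed; it draws a legend strip beside the plot. The palette and the number of levels are arbitrary choices.

```r
# Same test surface as in the thread.
x <- y <- seq(-4 * pi, 4 * pi, len = 27)
r <- sqrt(outer(x^2, y^2, "+"))

# Palette function: given n, returns n colours running red -> yellow -> blue.
pal <- colorRampPalette(c("red", "yellow", "blue"))

# Draw to a null PDF device so the sketch also runs non-interactively;
# drop this line (and dev.off() below) to see the plot on screen.
pdf(file = NULL)

# filled.contour() places a colour key to the right of the plot.
filled.contour(x, y, r, color.palette = pal, nlevels = 20,
               main = "r with colour key")

dev.off()
```

Unlike image(), filled.contour() interpolates between grid points, so the picture is smoother than the cell-by-cell display in the linked chart.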
Re: [R] an efficient pairwise matrix cell's comparison function
Does this do what you want?

    > A <- matrix(sample(0:2, 25, TRUE), ncol=5)
    > B <- matrix(1:25, ncol=5)
    > C <- ifelse(A == 0, 0, B)
    > A
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    1    1    2    1
    [2,]    1    0    1    1    0
    [3,]    0    0    1    0    2
    [4,]    0    1    2    0    0
    [5,]    1    2    1    2    2
    > B
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    6   11   16   21
    [2,]    2    7   12   17   22
    [3,]    3    8   13   18   23
    [4,]    4    9   14   19   24
    [5,]    5   10   15   20   25
    > C
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    6   11   16   21
    [2,]    2    0   12   17    0
    [3,]    0    0   13    0   23
    [4,]    0    9   14    0    0
    [5,]    5   10   15   20   25

On Sun, Mar 2, 2008 at 7:11 AM, Diogo André Alagador [EMAIL PROTECTED] wrote:

To all,

I am undertaking an analysis involving big matrices of about 3x200, which I have to handle in a more efficient way. So I would like some advice on building an efficient function to deliver the following result:

- starting with 2 matrices of the same dimension (e.g. A and B)

        0 0 3 5        6 0 0 5
    A = 0 0 6 4    B = 0 4 3 5
        0 0 5 0        1 0 0 9

- the function should deliver a C matrix (same dimension too), where each position C(i,j) compares A and B: if A(i,j)=0, then C(i,j)=0; if A(i,j)!=0, then C(i,j)=B(i,j)

        6 0 0 5
    C = 0 0 3 5
        0 0 0 0

Although not an expert, I could build a function with 2 loops (reading columns and rows), which is not quick. Maybe you can help me with this challenge.

Many thanks in advance,

Diogo André Alagador
Biodiversity Global Change Lab, Museo Nacional de Ciencias Naturales, CSIC, Madrid, España
Forest Research Centre, Instituto Superior de Agronomia, Universidade Técnica de Lisboa, Lisboa, Portugal

-- Jim Holtman
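As a sketch, here is the same rule applied to the A and B from the question, with two other vectorised forms that should give the same answer as ifelse(). Which is fastest on large matrices is not measured here. Note that the rule as stated gives 0 in position (1,1), because A(1,1) is 0.

```r
# A and B as laid out in the question (filled row by row).
A <- matrix(c(0, 0, 3, 5,
              0, 0, 6, 4,
              0, 0, 5, 0), nrow = 3, byrow = TRUE)
B <- matrix(c(6, 0, 0, 5,
              0, 4, 3, 5,
              1, 0, 0, 9), nrow = 3, byrow = TRUE)

# 1. ifelse(), as suggested in the reply above.
C1 <- ifelse(A == 0, 0, B)

# 2. Multiply by a logical mask: FALSE counts as 0, TRUE as 1.
#    (Assumes B has no NA or infinite values.)
C2 <- B * (A != 0)

# 3. Copy B and overwrite the cells where A is zero.
C3 <- B
C3[A == 0] <- 0

C1
```

All three avoid explicit loops over rows and columns; the third form modifies a copy in place and does not build the intermediate vectors that ifelse() needs.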
Re: [R] R data Export to Excel
If you are asking how to convert to multiple columns in Excel, look at the "Text to Columns" option, which I think is on the Data tab.

On Sun, Mar 2, 2008 at 9:59 PM, Keizer_71 [EMAIL PROTECTED] wrote:

Here is my R code:

    x <- 1:2
    y <- 2:141
    data.matrix <- data.matrix(data[, y])              # create data.matrix
    variableprobe <- apply(data.matrix[x, ], 1, var)
    variableprobe                                      # output variance across probesets
    hist(variableprobe)                                # display histogram of variableprobe
    write.table(cbind(data[1], Variance=apply(data[, y], 1, var)),
                file='c://variance.csv')               # export as a .csv file

The output in Excel is all in 1 column:

    ProbeID Variance
    1 224588_at 21.5825745738848

How do I separate them so that I have three columns?

    ProbeID  Variance
    1  224588_at  21.582

Thanks,
Kei

-- Jim Holtman
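Another option is to write a comma-separated file in the first place. write.table() separates fields with a space by default, which is why Excel puts everything in one column. A minimal sketch follows; the data frame is made up (the second probe ID and both variances are placeholders), and a temporary file stands in for the real output path.

```r
# Hypothetical data in the same shape as the post: an ID column plus a variance.
out <- data.frame(ProbeID  = c("224588_at", "224589_at"),
                  Variance = c(21.58257, 3.10420))

csv_file <- tempfile(fileext = ".csv")

# sep = "," makes the file truly comma-separated; row.names = FALSE drops
# the 1, 2, ... row labels so the header lines up with the data.
write.table(out, file = csv_file, sep = ",", row.names = FALSE)

# write.csv(out, csv_file, row.names = FALSE) is the usual shortcut.

# Read it back to confirm there are two separate columns.
back <- read.csv(csv_file)
str(back)
```

If the row numbers are wanted as a third column, leaving row.names at its default with write.csv() keeps them and adds an empty header cell for that column.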
Re: [R] help for the first poster- a simple question
FAQ 7.31 (You need to understand what floating point numbers are)

On 3/3/08, Xuejun Qin [EMAIL PROTECTED] wrote:

Hi, there,

I cannot get an accurate value for a calculation. For example:

    ld <- sqrt(1*0.05*0.95*0.05*0.95)
    0.05*0.95 - ld        gives -6.938894e-18
    0.05*0.95 - ld == 0   is FALSE

I met this problem in my program; how can I handle it?

Thanks.
xj.

-- Jim Holtman
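In practice the usual remedy is to compare with a tolerance rather than with ==. A short sketch using the numbers from the question follows; the 1e-12 threshold is an arbitrary illustrative choice.

```r
ld <- sqrt(1 * 0.05 * 0.95 * 0.05 * 0.95)
d  <- 0.05 * 0.95 - ld

d        # a tiny number; it need not be exactly 0
d == 0   # exact comparison, unreliable for floating point results

# all.equal() compares with a small relative tolerance; wrap it in
# isTRUE() because it returns a character description when values differ.
isTRUE(all.equal(0.05 * 0.95, ld))

# An explicit tolerance also works.
abs(d) < 1e-12
```

all.equal() is the form the FAQ entry points to; the explicit threshold is handy when the acceptable error is known from the problem.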
Re: [R] Tapply for Group Specific Means and Proportions
    1 3 3 3 3 ...
     $ TreeHt : num 6 6 6 6 8 8 7 7 7 7 ...

    > test <- sort((tapply(Final$TreeHt, INDEX=interaction(Final$testdate, Final$testtime), FUN=mean, na.rm=TRUE)))
    > data.frame(test)
                      test
    28Mar96.0752  6.000000
    28Mar96.1014  7.000000
    28Mar96.0924  7.333333
    29Mar96.0835  8.928571
    28Mar96.0954 10.000000

-- Jim Holtman
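Since the start of this message is missing from the archive, here is a self-contained sketch of the same tapply() pattern on made-up data. The column names follow the fragment above, but every value is invented for illustration.

```r
# Invented data with the same column names as the fragment above.
Final <- data.frame(
  testdate = c("28Mar96", "28Mar96", "28Mar96", "29Mar96", "29Mar96"),
  testtime = c("0752", "0752", "1014", "0835", "0835"),
  TreeHt   = c(6, 6, 7, 8, NA)
)

# One group per date/time combination that actually occurs in the data;
# drop = TRUE discards combinations with no rows.
grp <- interaction(Final$testdate, Final$testtime, drop = TRUE)

# Group means, ignoring missing heights, sorted from smallest to largest.
test <- sort(tapply(Final$TreeHt, INDEX = grp, FUN = mean, na.rm = TRUE))
data.frame(test)
```

Without drop = TRUE, unused date/time combinations come back as NA from tapply() and are then silently dropped by sort().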
Re: [R] simulation study using R
What is the format of the data you are storing (single value, multivalued vector, matrix, dataframe, ...)? This will help formulate a solution. What do you plan to do with the data? Are you going to do further analysis, write it to flat files, store it in a database, etc.? How big are the data objects you are manipulating?

On Mon, Mar 3, 2008 at 7:05 PM, Davood Tofighi [EMAIL PROTECTED] wrote:

Dear All,

I am running a Monte Carlo simulation study and have some questions on how to manage data storage efficiently at the end of each 1000-replication loop. I have three conditions coded using FOR {} loops, and a FOR loop that generates data for each condition, performs the analysis, and computes a statistic 1000 times. Therefore, for each condition, I will have 1000 statistic values. My question is: what's the best way to store the 1000 statistics for each condition? Any suggestion on how to manage such simulation studies is greatly appreciated.

Thanks,
-- Davood Tofighi
Department of Psychology
Arizona State University

-- Jim Holtman
Re: [R] simulation study using R
One of the things you might take a look at is the 'filehash' package. It is an easy way of storing/retrieving R objects. I have an application where my objects are matrices of about the same size, and I can quickly store the data and then come back later with a different script to do further analysis.

On 3/3/08, Davood Tofighi [EMAIL PROTECTED] wrote:

Thanks for your reply. For each condition, I will have a matrix or data frame of 1000 rows and 4 columns. I also have a total of 64 conditions for now. So, in total, I will have 64 matrices or data frames of 1000 rows and 4 columns. The format of the data I would like to store would be data frames or matrices. I also would like to store the data for later use, e.g., a plot of the empirical distribution of the chi^2, or to compute the power of chi^2 across 1000 reps for each condition.

-- Davood Tofighi
Department of Psychology
Arizona State University
P.O. BOX 871104
Tempe, AZ 85287-1104
Tel.: 480-727-7884

-- Jim Holtman
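For data of the size described in this thread (64 matrices of 1000 x 4), a plain list of preallocated matrices saved with saveRDS() may already be enough. The sketch below is a base-R alternative to the filehash suggestion, not a replacement for it; the data generation and the four statistics are placeholders, and only 3 conditions are run to keep it short.

```r
set.seed(1)
n_cond  <- 3      # placeholder; the thread mentions 64 conditions
n_reps  <- 1000
n_stats <- 4

# One list element per condition, each a preallocated n_reps x n_stats matrix.
results <- vector("list", n_cond)
names(results) <- paste0("cond", seq_len(n_cond))

for (k in seq_len(n_cond)) {
  out <- matrix(NA_real_, nrow = n_reps, ncol = n_stats,
                dimnames = list(NULL, paste0("stat", seq_len(n_stats))))
  for (i in seq_len(n_reps)) {
    y <- rnorm(20, mean = k)                        # placeholder data generation
    out[i, ] <- c(mean(y), sd(y), min(y), max(y))   # placeholder statistics
  }
  results[[k]] <- out
}

# Save everything in one file and reload it later from another script.
rds_file <- tempfile(fileext = ".rds")
saveRDS(results, rds_file)
again <- readRDS(rds_file)
colMeans(again$cond1)
```

Preallocating each matrix avoids growing objects inside the loop, and keeping the conditions in a named list makes later summaries a matter of lapply() or sapply() over the list.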
Re: [R] memory constraints in ubuntu gutsy
What type of data do you have? Will it be numeric or factors? If it is all numeric, then you will need over 4GB just to hold one copy of the object (700,000 * 800 * 8 bytes). That is to hold the final object; I don't know how much additional space is required during the processing. So your machine is probably not large enough to hold a single copy, and you would have to be using a 64-bit version of R.

What are you going to do with all of it at once? Can you read it in parts, store it in a database, and then just retrieve the columns you need for processing?

On 3/4/08, Randy Griffiths [EMAIL PROTECTED] wrote:

Hello All,

I have a very large data set (1.1GB) that I am trying to read into R. The file is tab delimited and contains headers; there are over 800 columns and almost 700,000 rows. I am using the Ubuntu 7.10 Gutsy Gibbon version of R. I am using Kernel Linux 2.6.22-14-generic. I have 3.1GB of RAM with the AMD Athlon(tm) 64 Processor 3200+. I downloaded R using the instructions from CRAN under Linux-Ubuntu.

I need to be able to read the whole data set into R, but when I try right now, it will only use 4.2GB of the swap space (50% of the 8.5GB currently available) and won't go any further. I am new to Linux, but anxious to learn. Is there a memory constraint with this build of R? Or is this something that can be fixed with hardware (like more RAM)? I thought that a 64-bit version of R would be able to handle data of this magnitude. Is there a different version of Linux that is better for reading in large data sets such as this one? I know that databases can be used for large data, but I need to run discriminant analysis or randomForest on all of the variables.

Any of your suggestions would be very much appreciated.

Sincerely,
Randy Griffiths

-- Jim Holtman
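The size estimate in the reply can be reproduced directly in R. This is only back-of-the-envelope arithmetic, assuming every column is stored as a double (8 bytes per value); factor or integer columns would take about half of that.

```r
n_rows <- 700000
n_cols <- 800
bytes_per_double <- 8

bytes <- n_rows * n_cols * bytes_per_double
bytes           # 4.48e+09 bytes
bytes / 2^30    # about 4.17 GiB for a single copy

# Sanity check on a small vector: roughly 8 bytes per element plus a
# small fixed header.
as.numeric(object.size(numeric(1e6))) / 1e6
```

If only some of the 800 columns are needed, read.table() can skip the rest by giving "NULL" for those entries of colClasses, which reduces the footprint in proportion.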