Re: [R] Help with read.csv2
On Thu, 2005-11-17 at 17:09 +0100, Matthieu Cornec wrote: Hello, I am importing the following file ;aa;bb;cc 1988;12;12;12 1989;78;78;12 1990;78;78;12 1991;78;78;12 1992;78;78;12 1993;78;78;12 1994;78;78;12 data-read.csv2(test.csv,header=T) -- it gives X aa bb cc 1 1988 12 12 12 2 1989 78 78 12 3 1990 78 78 12 4 1991 78 78 12 5 1992 78 78 12 6 1993 78 78 12 7 1994 78 78 12 How do I get : aa bb cc 1988 12 12 12 1989 78 78 12 1990 78 78 12 1991 78 78 12 1992 78 78 12 1993 78 78 12 1994 78 78 12 Are you indicating that you want the years to be the rownames and NOT a column in the data frame? If so, use the 'row.names' argument to indicate that the first column of the incoming data file contains the rownames: dat - read.csv2(clipboard, row.names = 1) dat aa bb cc 1988 12 12 12 1989 78 78 12 1990 78 78 12 1991 78 78 12 1992 78 78 12 1993 78 78 12 1994 78 78 12 str(dat) `data.frame': 7 obs. of 3 variables: $ aa: int 12 78 78 78 78 78 78 $ bb: int 12 78 78 78 78 78 78 $ cc: int 12 12 12 12 12 12 12 rownames(dat) [1] 1988 1989 1990 1991 1992 1993 1994 Note that you are getting the X in your initial example above, since you are missing the first column name in the text file. A better approach here might be (depending upon your intent) to use the 'col.names' argument to specify the column names explicitly: dat - read.csv2(clipboard, header = TRUE, col.names = c(Years, aa, bb, cc)) dat Years aa bb cc 1 1988 12 12 12 2 1989 78 78 12 3 1990 78 78 12 4 1991 78 78 12 5 1992 78 78 12 6 1993 78 78 12 7 1994 78 78 12 This way, you are defining the first column name during the import and your 'Years' are available as a data column. It all depends upon what you want to do with the data at this point. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] invert y-axis in barplot
On Wed, 2005-11-16 at 16:46 +, Jörg Schlingemann wrote: Hi! This is probably a very trivial question. Is there an easy way to invert the y-axis (low values on top) when using the function barplot()? Thanks, Jrg You mean something like this?: barplot(1:10, ylim = rev(c(0, 12))) or, something like this?: barplot(1:10, yaxt = n) axis(2, labels = rev(seq(0, 10, 2)), at = seq(0, 10, 2)) Note the use of rev() in each case. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] bug/feature with barplot?
On Mon, 2005-11-14 at 15:55 +0100, Karin Lagesen wrote: I have found a bug/feature with barplot that at least to me shows undesireable behaviour. When using barplot and plotting fewer groups/levels/factors(I am unsure what they are called) than the number of colors stated in a col statement, the colors wrap around such that the colors are not fixed to one group. This is mostly problematic when I make R figures using scripts, since I sometimes have empty input groups. I have in these cases experienced labeling the empty group as red, and then seeing a bar being red when that bar is actually from a different group. Reproducible example (I hope): barplot(VADeaths, beside=TRUE, col=c(red, green, blue, yellow, black)) barplot(VADeaths[1:4,], beside=TRUE, col=c(red, green, blue, yellow, black)) Now, I don't know if this is a bug or a feature, but it sure bugged me...:) Karin Most definitely not a bug. As with many vectorized function arguments, they will be recycled as required to match the length of other appropriate arguments. In this case, the number of colors (5) does not match the number of groups (4). Thus, they are out of synch with each other and you get the result you have. Not unexpected behavior. You should adjust your code and the function call so that the number of groups matches the number of colors. Something along the lines of the following: col - c(red, green, blue, yellow, black) no.groups - 4 barplot(VADeaths[1:no.groups, ], beside = TRUE, col = col[1:no.groups]) Now try: no.groups - 5 barplot(VADeaths[1:no.groups, ], beside = TRUE, col = col[1:no.groups]) no.groups - 3 barplot(VADeaths[1:no.groups, ], beside = TRUE, col = col[1:no.groups]) HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Coercion of percentages by as.numeric
On Mon, 2005-11-14 at 19:07 +0200, Brandt, T. (Tobias) wrote: -Original Message- From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] Sent: 14 November 2005 06:21 PM On 11/14/05, Brandt, T. (Tobias) [EMAIL PROTECTED] wrote: Hi Given that things like the following work a - c(-.1, 2.7 ,B) a [1] -.12.7 B as.numeric(a) [1] -0.1 2.7 NA Warning message: NAs introduced by coercion I naively expected that the following would behave differently. b - c('10%', '-20%', '30.0%', '.40%') b [1] 10% -20% 30.0% .40% as.numeric(b) [1] NA NA NA NA Warning message: NAs introduced by coercion Try this: as.numeric(sub(%, e-2, b)) Thank you, that accomplishes what I had intended. I would have thought though that the expression 53% would be a fairly standard representation of the number 0.53 and might be handled as such. Is there a specific reason for avoiding this behaviour? 53% is a 'shorthand' character representation of a mathematical concept. To wit, the specific representation of a fraction using 100 as the denominator (ie. 53 / 100). The symbol '%' can be replaced by the word percent, such as 53 percent, which is also a character representation. 0.53, in context, is a numeric representation of a proportion in the range of 0 - 1.0. I can imagine that it might add unnecessary overhead to routines like as.numeric which one would like to keep as fast as possible. Perhaps there are other areas though where it might be desirable? For example I'm thinking of the read.table function for reading in csv files since I have many of these that have been saved from excel and now contain numbers in the % format. In Excel, numbers displayed with a '%' are what you see visually. However, the internal representation (how the value is actually stored in the program) is still as a floating point value, without the '%'. For example: a - 53 a [1] 53 sprintf(%.0f%%, a) [1] 53% is.numeric(a) [1] TRUE is.numeric(sprintf(%.0f%%, a)) [1] FALSE Unfortunately (depending upon your perspective), Excel, and other similar programs, tend to export the visually displayed values and not the internal representations of them. Thus, as Gabor pointed out, you will need to do some 'editing' of the values before using them in R. You can either do this in Excel, by removing the % formatting, or post-import in R as Gabor has described. You need to keep separate the internal representation of a value and its printed or displayed representation for human readable consumption. as.numeric() does basically one thing and it does it well and properly. It is up to the user to ensure that it is passed the proper values. When that is not the case, it issues an appropriate warning message and returns NA. Of course, using Gabor's hint, you can also write your own variation of as.numeric(), creating a function that takes percent formatted values and converts them as you require. One of the many strengths of R, is that you can extend it to meet your own specific requirements when the base functions do not. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] as.integer with base other than ten.
On Mon, 2005-11-14 at 19:01 +, William Astle wrote: Is there an R function analogous to the C function strtol? I would like to convert a binary string into an integer. Cheers for advice Will There was some discussion in the past and you might want to search the archive for a more generic solution for any base to any base, but for binary to decimal specifically, something like the following will work: bin2dec - function(x) { b - as.numeric(unlist(strsplit(x, ))) pow - 2 ^ ((length(b) - 1):0) sum(pow[b == 1]) } The function takes the binary string and splits it up into individual numbers ('b'). It then creates a vector of powers of 2 as long as 'b' less one through 0 ('pow'). It then takes the sum of the values of pow, indexed by 'b == 1'. bin2dec(101) [1] 5 bin2dec() [1] 15 bin2dec(101) [1] 95 HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Orientation of tickmarks labels in boxplot/plot
On Wed, 2005-11-02 at 16:06 -0600, Michal Lijowski wrote: Hi, I have been trying draw tickmark labels along the y - axis perpendicular to the y axis while labels along the x - axis parallel to x axis while making box plot. Here is my test dataset. TData ID Ratio 1 0 7.075 2 0 7.414 3 0 7.403 4 0 7.168 5 0 6.820 6 0 7.294 7 0 7.238 8 0 7.938 9 1 7.708 10 1 8.691 11 1 8.714 12 1 8.066 13 1 8.949 14 1 8.590 15 1 8.714 16 1 8.601 boxplot(Ratio ~ ID, data=TData) makes box plot with tickmark labels parallel to the y - axis. So I try boxplot(Ratio ~ ID, data=TData, axes=FALSE) par(las=0) axis(1) and I get x - axis ranging from 0.5 to 2.5 (why?) and boxes at 1 and 2. par(las=2) axis(2) box() So, if I set tickmark labels parallel to y - axis somehow the x - axis range is not what I expect even if I use xlim = c(0.0, 3.0) in boxplot(Ratio ~ Id, data=TData, axes=FALSE, xlim=c(0.0, 3.0)) par(las=0) axis(1) Plots are in the attachments in pdf format. I appreciate any tips. I am using R 2.2.0 (2005-10-06) on FC4. Michal I suspect that you want this: boxplot(Ratio ~ ID, data=TData, las = 1) so that the x and y axis labels are horizontal. See ?par and review the options for 'las'. '1' is for axis tick mark labels to be horizontal. The x axis (horizontal axis if the boxes are vertical, the vertical axis if the boxes are horizontal) is set by default to the number of boxes (groups) +/- 0.5. 'xlim' is ignored in both cases. The boxes themselves are placed at integer values from 1:N, where N is the number of groups. There is the 'at' argument and there is an example of its use in ?boxplot. You can also search the archives for posts where variations on the use of 'at' have been posted (recently, in fact...) The actual plotting of the boxplots is done by bxp(), so review the help for that function as well. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] multiple boxplots
On Fri, 2005-10-28 at 10:26 -0700, J.M. Breiwick wrote: Hello, I want to plot 3 boxplots [ par(mfrow=c(3,1)) ] but the first one has 8 groups, the 2nd has 7 and the third has 6. But I the groups to line up: 1 2 3 4 5 6 7 8 2 3 4 5 6 7 8 3 4 5 6 7 8 where the numbers actually refer to years. I suspect I have to manipulate the function bxp(). Is there a relatively simple way to do this? Thanks for any help. Jeff Breiwick NMFS, Seattle Here is one approach. It is based upon using 2 principal concepts: 1. Create the first plot, save the plot region ranges from par(usr) and then use this information for the two subsequent plots, where we use the 'add = TRUE' argument. 2. Use the 'at' argument to specify the placement of the boxes in the second and third plots to line up with the first. So: # Create our data. dat - rnorm(80) years - rep(1991:1998, each = 10) # MyDF will be a data frame with two columns and we will use # this in the formula method for boxplot. # The second and third plots will use subset()s of MyDF MyDF - cbind(dat, years) # Set the plot matrix par(mfrow = c(3, 1)) # Create the first boxplot and save par(usr) boxplot(dat ~ years, MyDF) usr - par(usr) # Now open a new plot, setting it's par(usr) to # match the first plot. # Then use boxplot and set the boxes 'at' x pos 2:8 # and add it to the open plot plot.new() par(usr = usr) boxplot(dat ~ years, subset(MyDF, years %in% 1992:1998), at = 2:8, add = TRUE) # Rinse and repeat ;-) # Different subset and use 3:8 for 'at' plot.new() par(usr = usr) boxplot(dat ~ years, subset(MyDF, years %in% 1993:1998), at = 3:8, add = TRUE) Replace MyDF in the above with your actual datasets of course. This would of course be a bit easier if one could set an 'xlim' argument in boxplot(), but this is ignored by bxp(). HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] encrypted RData file?
On Thu, 2005-10-27 at 16:15 -0500, Na Li wrote: On 27 Oct 2005, Duncan Temple Lang wrote: Yes, it is of interest and was sitting on my todo list at some time. If you want to go ahead and provide code to do it, that would be terrific. There are other areas where encryption would be good to have, so a general mechanism would be nice. D. Na Li wrote: Hi, I wonder if there is interest/intention to allow for encrypted .RData files? One can certainly do that outside R manually but that will leave a decrypted RData file somewhere which one has to remember to delete. I was hoping someone has already done it. ;-( One possibility is to implement an interface package to gpgme library which itself is an interface to GnuPG. But I'm not sure how the input of passphrase can be handled without using clear text. Michael Seems to me that a better option would be to encrypt the full partition such that (unless you write the files to a non-encrypted partition) these issues are transparent. This would include the use of save(), save.image() and write() type functions to save what was an encrypted dataset/object to a unencrypted file. Of course, you would also have to encrypt the swap and tmp partitions (as appropriate) for similar reasons. On Linuxen/Unixen, full encryption of partitions is available via loopback devices and other mechanisms and some distros have this available as a built-in option. I believe that the FC folks finally have this on their list of functional additions for FC5. Windows of course can do something similar. The other consideration here, is that if R Core builds in some form of encryption, there is the potential for import/export restrictions on such technology since R is available via international CRAN mirrors. It may be best to provide for a plug-in encryption black box of sorts, so that folks can use a particular encryption schema that meets various legal/regulatory requirements. Of course, simply encrypting the file or even a complete partition has to be considered within a larger security strategy (ie. network security, physical access control, etc.) that meets a particular functional requirement (such as HIPAA here in the U.S.) HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Graphics window always overlaps console window!
On Tue, 2005-10-25 at 11:55 -0600, [EMAIL PROTECTED] wrote: Does anyone know how I can set up R so that when I make a graphic, the graphics window remains behind the console window? It's annoying to have to reach for the mouse every time I want to type another line of code (e.g., to add another line to the plot). Thanks. What operating system? Default window focus behavior is highly OS and even window manager specific and is not an R issue. Depending upon your OS and window manager, you may need to check the documentation and/or do a Google search on window focus for further information. Another alternative, if you are on Windows, is to review Windows FAQ 5.4 How do I move focus to a graphics window or the console?, but this is a programmatic approach and not a means to affect default behavior. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Graphics window always overlaps console window!
On Tue, 2005-10-25 at 13:07 -0500, Marc Schwartz (via MN) wrote: On Tue, 2005-10-25 at 11:55 -0600, [EMAIL PROTECTED] wrote: Does anyone know how I can set up R so that when I make a graphic, the graphics window remains behind the console window? It's annoying to have to reach for the mouse every time I want to type another line of code (e.g., to add another line to the plot). Thanks. What operating system? Default window focus behavior is highly OS and even window manager specific and is not an R issue. Depending upon your OS and window manager, you may need to check the documentation and/or do a Google search on window focus for further information. Another alternative, if you are on Windows, is to review Windows FAQ 5.4 How do I move focus to a graphics window or the console?, but this is a programmatic approach and not a means to affect default behavior. Yet another approach which I just remembered is that (if on Windows) MS offers a program called Tweak UI: http://www.microsoft.com/windowsxp/downloads/powertoys/xppowertoys.mspx as part of their Power Toys add-ons. You might want to review that to see if there is a setting for the handling of new window focus behavior. HTH, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Basic: setting resolution and size of an R graphic
On Mon, 2005-10-24 at 22:32 +0200, Dr. med. Peter Robinson wrote: Dear List, I am sorry if this perhaps a too basic question, but I have not found an answer in Google or in the R help system. I am trying to use R to do a very simple analysis of some data (RT-PCR and Western analysis) with a T-test and to plot the results as a histogram with error bars. (I have pasted an example script at the bottom of this mail). In order to use this for publication, I would like to adjust the resolution and size of the final image. However, even using file types such as postscript or pdf that are vector based, I get rather bad-looking results with pdf(file=test.pdf) source(script at bottom of mail) dev.off() using either pdf or postscript or jpg devices. Therefore I would like to ask the list, how to best produce a graphic from the script below that would fit into one column of a published article and have a high resolution (as eps, or failing that tiff or png)? Thanks in advance for any advice, Peter Snip of code What OS are you on? Running your example on FC4, I get the attached output for a pdf(). I suspect that on your OS, the height and width arguments are not appropriate by default. Thus, you may need to adjust your pdf (or postscript) function call to explicitly specify larger height and width arguments. Also note that to generate an EPS file, pay attention to the details section of ?postscript, taking note of the 'onefile', 'horizontal' and 'paper' arguments and settings. Also, check with your journal to see if they specify dimensions for such graphics so that you can abide by their specs if provided. If they are using LaTeX, there are means of specifying and/or adjusting the height and/or width specs in the code based upon proportions of various measures (ie. \includegraphics[width=0.9\textwidth]{GraphicsFile.eps} ). HTH, Marc Schwartz test.pdf Description: Adobe PDF document __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] make three plot to one plot
On Fri, 2005-10-21 at 21:13 +0200, Jan Sabee wrote: Dear all, I want to make three plot below to only one plot together with legend, how can I do that? I have tried with matplot function but I did not succeed. Thanks for your help. Sincerelly, Jan Sabee test.five.x - c(0.02,0.05,0.07,0.09,0.10,0.12,0.13,0.14,0.16,0.17,0.20,0.21,0.34,0.40) test.five.y - c(18,12,17,12,3,15,1,5,1,1,3,10,15,10) plot(test.five.x, test.five.y, type=l,lty=1, lwd=3, col='red') legend(par('usr')[2], par('usr')[4], xjust=1, c('five'), lwd=3, lty=1, col=c('red')) test.six.x - c(0.03,0.05,0.07,0.08,0.09,0.10,0.12,0.13,0.14,0.15,0.17,0.18,0.21,0.22,0.24,0.33,0.39,0.43) test.six.y - c(8,32,14,21,3,8,11,14,11,21,16,14,10,21,1,2,13,19) plot(test.six.x, test.six.y, type=l,lty=2, lwd=3, col='green') legend(par('usr')[2], par('usr')[4], xjust=1, c('six'), lwd=3, lty=2, col=c('green')) test.seven.x - c(0.04,0.05,0.08,0.09,0.10,0.12,0.13,0.14,0.16,0.17,0.19,0.21,0.22,0.24,0.29,0.32,0.38,0.40,0.46,0.50) test.seven.y - c(18,135,240,42,63,128,215,267,127,36,21,23,223,12,66,96,12,26,64,159) plot(test.seven.x, test.seven.y, type=l,lty=3, lwd=2, col='blue') legend(par('usr')[2], par('usr')[4], xjust=1, c('seven'), lwd=3, lty=3, col=c('blue')) Jan, matplot() won't work well here, because you do not have common x axis values across the three sets of data. Thus, you need to use plot() and then lines(). Try this: # First set up your data test.five.x - c(0.02,0.05,0.07,0.09,0.10,0.12, 0.13,0.14,0.16,0.17,0.20,0.21, 0.34,0.40) test.five.y - c(18,12,17,12,3,15,1,5,1,1,3,10, 15,10) test.six.x - c(0.03,0.05,0.07,0.08,0.09,0.10, 0.12,0.13,0.14,0.15,0.17,0.18, 0.21,0.22,0.24,0.33,0.39,0.43) test.six.y - c(8,32,14,21,3,8,11,14,11,21,16, 14,10,21,1,2,13,19) test.seven.x - c(0.04,0.05,0.08,0.09,0.10,0.12, 0.13,0.14,0.16,0.17,0.19,0.21, 0.22,0.24,0.29,0.32,0.38,0.40, 0.46,0.50) test.seven.y - c(18,135,240,42,63,128,215,267, 127,36,21,23,223,12,66,96,12, 26,64,159) # Now use plot to draw the first set of data # The key here is to properly set to the x and y axis # ranges ('xlim' and 'ylim') so that they are based # upon all three sets of data. # Note that I also set the x and y axis labels to . You # can adjust these and the plot title as you require. plot(test.five.x, test.five.y, type=l,lty=1, lwd=3, col=red, xlim = range(test.five.x, test.six.x, test.seven.x), ylim = range(test.five.y, test.six.y, test.seven.y), xlab = , ylab = ) # Now use lines() to add the second and third sets of data lines(test.six.x, test.six.y, lty=2, lwd=3, col=green) lines(test.seven.x, test.seven.y, lty=3, lwd=2, col=blue) # Now use legend(). Place it at the upper right hand corner # and set the line types and colors to reflect the above legend(topright, xjust = 1, legend = c(five, six, seven), lwd = 3, lty = 1:3, col = c(red, green, blue)) See ?lines for more information. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] peculiar matrices
On Fri, 2005-10-21 at 21:32 +, Ben Bolker wrote: As far as I can tell from reading The Fine Documentation (R Language Definition and Intro to R), matrices are supposed to be of homogeneous types. Yet giving matrix() an inhomogeneous list seems to work, although it produces a peculiar object: v = list(1:3,4,5,a) m = matrix(v,nrow=2) m [,1] [,2] [1,] Integer,3 5 [2,] 4 a m[1,] [[1]] [1] 1 2 3 [[2]] [1] 3 (this is R 2.1.1, running under Linux) Ben, If you review the structure of 'm' note: str(m) List of 4 $ : int [1:3] 1 2 3 $ : num 4 $ : num 5 $ : chr a - attr(*, dim)= int [1:2] 2 2 that it is actually a list, even though: class(m) [1] matrix Also: mode(m) [1] list typeof(m) [1] list If you remove the dim attributes, you get: dim(m) - NULL m [[1]] [1] 1 2 3 [[2]] [1] 4 [[3]] [1] 5 [[4]] [1] a So I would argue that it is consistent with the documentation in that, while the printed output is that of a matrix, it is a list, which of course can handle heterogeneous data types. This is Version 2.2.0 Patched (2005-10-20 r35979). Should there be a check/error? Or is this just analogous to the joke about going to the doctor and saying it hurts when I do this, and the doctor saying well then, don't do that? Maybe more like: Doctor, my eye hurts when I drink my tea. and the doctor says, Well, remove the spoon from the cup before you drink. ;-) Regards, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Automatic rounding of values after factors , converted to numeric, are multipled by a real number
On Thu, 2005-10-20 at 16:20 +0200, Peter Dalgaard wrote: Nelson, Gary (FWE) [EMAIL PROTECTED] writes: Peter, Thank you for your response. I knew how close the values are to integers, but I still don't understand why I don't have control over how the numbers are displayed (rounded or not)? It's all in the conventions. The digits in print() and friends are significant digits, so we first round to that many significant digits, then discard trailing zeros, which is why 12.51 [1] 12.5 12.50001 [1] 12.50001 The exception is that we do not discard significant digits to the left of the decimal point, unless we are using scientific notation print(12345678,digits=2) [1] 1.2e+07 print(12345678,digits=5) [1] 12345678 (the scipen options controls the logic for switching notation). For finer control we have the formatC function: format(1234.1,digits=9) # same thing as with print() [1] 1234.1 format(1234.1,digits=8) [1] 1234 formatC(1234.1,digits=5,format=f) [1] 1234.1 formatC(1234.1,digits=4,format=f) [1] 1234. Also, sprintf(): sprintf(%.9f, 1234.1) [1] 1234.1 sprintf(%.4f, 1234.1) [1] 1234. sprintf(%12.4f, 1234.1) [1]1234. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] having scaling problems with a histogram
On Thu, 2005-10-20 at 17:31 +0200, Christian Jones wrote: Hello,?xml:namespace prefix = o ns = urn:schemas-microsoft-com:office:office /o:p/o:p I would like to create a histogram from a data collumn consisting of 4 classes (0; 0.05;0.5;25;75). Due to the difference in scale the classes 0;0.05 and 0.5 are displayed within one combined bin by default with the code:Hist(x, scale=percent, breaks=Sturges). How can I display them all, or as unique classes, leaving out the rest of the x_axes scale? There was no enlightenment in the help function and the command : breaks=5 did not do the trick.o:p/o:p Thanks in advance o:p/o:p Chriso:p/o:p I may be mis-understanding what you are doing here, but it sounds like you should be using barplot() rather than hist(), which is referenced in the See Also in ?hist. This would be preferred if you want individual counts or proportions (or percents) from specific groups without binning. Where does the 'scale = percent' come from? I don't recognize that argument from the standard histogram plotting functions. Oh, wait a minute, just found it doing a search. The Hist() function is in John Fox's Rcmdr. Please be sure to note this if you are using a function from a contributed package. It saves time and increases the likelihood of your getting a reply. Also, I am guessing that the extraneous content here is the result of using MS Word as your e-mail editor? Please use plain text only, which is the defined format for this list and from reviewing the archive, has been pointed out to you previously by Frank Harrell and Ted Harding. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Boxplot labels
On Thu, 2005-10-20 at 14:46 -0400, Keith Sabol wrote: I am creating boxplots from a dataframe and would like to add to the standard output a marker representing the value from a particular row in the dataframe. .And I apologize if the solution is as trivial as it seems it should be. Thanks for any assistance. Keith Some example code would be helpful here. However, some hints that I suspect will be helpful: 1. Unless you use the 'at' argument in boxplot(), the box center positions are at integer values from 1:n, where n is the number of groups. 2. You can thus use the points() function to add symbols or the text() function to add labels to the existing plot as you may require. See ?points and/or ?text for more information. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Boxplot labels
On Thu, 2005-10-20 at 15:41 -0500, Marc Schwartz (via MN) wrote: On Thu, 2005-10-20 at 14:46 -0400, Keith Sabol wrote: I am creating boxplots from a dataframe and would like to add to the standard output a marker representing the value from a particular row in the dataframe. .And I apologize if the solution is as trivial as it seems it should be. Thanks for any assistance. Keith Some example code would be helpful here. However, some hints that I suspect will be helpful: 1. Unless you use the 'at' argument in boxplot(), the box center positions are at integer values from 1:n, where n is the number of groups. 2. You can thus use the points() function to add symbols or the text() function to add labels to the existing plot as you may require. See ?points and/or ?text for more information. One other note to add here is that boxplot() will return a list which contains various stats about your grouped data. This requires using the form: stats - boxplot(...) Thus using the 'stats$group' and 'stats$out' components will give you the x and y positions of any outliers. This will be helpful if these are the values that you might want to identify See the Value section in ?boxplot and ?boxplot.stats for more information. HTH, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] matching two plots
On Wed, 2005-10-19 at 18:27 +0100, Rui Cardoso wrote: Hi, I have a problem about graphics. I would like to plot two graphs: a barplot and curve. Here is the code: barplot(dpois(0:45,20),xlim=c(0,45),names=0:45) curve(dnorm(x,20,sqrt(20)),from=0,to=45,add=T) Both graphs are drawn in the same figure, however the scale in both graphs dooes not match. For some reason the second plot is shifted to left. I think there is a problem concerning the axis scale. Thanks a lot. Rui This came up this past summer and Gabor and I had a couple of different approaches to the solution. You can see the posts here: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/57431.html and http://finzi.psych.upenn.edu/R/Rhelp02a/archive/57432.html Using my approach with barplot(), what you need to do is to set the 'space' argument to 0 so that there is no space between the bars. This will then place the bar centers at increments of 0.5, in your case seq(0.5, 45.5, 5). Knowing this, you can then adjust the x axis position in curve() by +0.5 to coincide with the bars. So you end up with this: barplot(dpois(0:45, 20), names = 0:45, space = 0, ylim = c(0, .1)) curve(dnorm(x, 20 + 0.5, sqrt(20)), add = TRUE) HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Efficient ways of finding functions and Breslow-Day test for homogeneity of the odds ratio
On Tue, 2005-10-18 at 14:27 +0100, MJ Price, Social Medicine wrote: Dear all, I have been trying to find a function to calculate the Breslow-Day test for homogeneity of the odds ratio in R. I know the test can be preformed in SAS but i was wondering if anyone could help me to perform this in r. In addition i have the fullrefman file to search for functions in the basic R packages, does anyone have any suggestions of an efficient way of searching for functions in the other add-on packages on the website? Thanks in advance Malcolm RSiteSearch(Breslow), which will perform a search of the e-mail list archives and other online documentation, reveals this post: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/50202.html which in turn leads to this code: http://www.math.montana.edu/~jimrc/classes/stat524/Rcode/breslowday.test.r There is also code for the Woolf test in ?mantelhaen.test Further searches can be conducted using help.search(Keyword), which will search your installed set of packages. See ?help.search and ?RSiteSearch for more help on these. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] p-value calculation
On Tue, 2005-10-18 at 17:13 +0200, richard mendes wrote: hello everybody i'm very new at using R so probably this is a very stupid question. I have a problem calculating a p-value. When i do this with excel i can use the method CHIDIST for 1.2654 with 1 freedom degree i get the answer 0.261 i just want to do the same thing in R but i can't find a method. can somebody help me friendly regards richard pchisq(1.2654, 1, lower.tail = FALSE) [1] 0.2606314 See ?pchisq for more information. You might also want to read Chapter 8 Probability Distributions in An Introduction To R, available with your R installation or from the Documentation links on the main R web site. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Efficient ways of finding functions and Breslow-Day test for homogeneity of the odds ratio
On Wed, 2005-10-19 at 06:47 +1000, Tim Churches wrote: Marc Schwartz (via MN) wrote: There is also code for the Woolf test in ?mantelhaen.test Is there? How is it obtained? The documentation on mantelhaen.test in R 2.2.0 contains a note: Currently, no inference on homogeneity of the odds ratios is performed. and a quick scan of the source code for the function didn't reveal any meantion of Woolf's test. Tim C Review the code in the examples on the cited help page... :-) HTH, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] y axis in histograms
On Mon, 2005-10-17 at 16:51 +0200, Jorge Gaspar Sanz Salinas wrote: Hi all, This is my first post, I hope you will help me. I've some data to present with histograms. I have few values with almost 99% of the frequencies (thousands) and some other values with low frequencies (below one hundred) that I want to emphasize. I think if I could present the logarithms of the frequencies, these could be presented in a more convenient way, but I don't know how to deal with this. Thanks, and sorry for my English If you are plotting the counts, you should probably look at barplot(), which as of R version 2.2.0 supports log scale axes (ie. log = y). See ?barplot for more information. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] apply and plot
On Thu, 2005-10-13 at 14:50 +0100, Luis Ridao Cruz wrote: R-help, I use the code below to plot some data by applying apply function. But I don't know how I can get the argument type or col on the plot function to distinguish the different lines in the graph: apply ( my.data, 2, function ( x ) lines ( dimnames ( my.data ) [[1]] , x ) ) Thank you in advance Rather than trying the use the construct above, take a look at ?matlines and/or ?matplot (both on the same help page with ?matpoints.) I think that you will find these purposefully designed functions better suited to what I believe you are trying to do here. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Removing and restoring factor levels (TYPO CORRECTED)
On Thu, 2005-10-13 at 10:02 -0400, Duncan Murdoch wrote: Sorry, a typo in my previous message (parens in the wrong place in the conversion). Here it is corrected: I'm doing a big slow computation, and profiling shows that it is spending a lot of time in match(), apparently because I have code like x %in% listofxvals Both x and listofxvals are factors with the same levels, so I could probably speed this up by stripping off the levels and just treating them as integer vectors, then restoring the levels at the end. What is the safest way to do this? I am worried that at some point x and listofxvals will *not* have the same levels, and the optimization will give the wrong answer. So I need code that guarantees they have the same coding. I think this works, where master is a factor with the master list of levels (guaranteed to be a superset of the levels of x and listofxvals), but can anyone spot anything that might go wrong? # Strip the levels x - as.integer( factor(x, levels = levels(master) ) ) # Restore the levels x - structure( x, levels = levels(master), class = factor ) Thanks for any advice... Duncan Murdoch Duncan, With the predicate that 'master' has the full superset of all possible factor levels defined, it would seem that this would be a reasonable way to go. This approach would also seem to eliminate whatever overhead is encountered as a result of the coercion of 'x' as a factor to a character vector, which is done by match(). One question I have is, what is the advantage of using structure() versus: x - factor(x, levels = levels(master)) ? Thanks, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Removing and restoring factor levels (TYPO CORRECTED)
On Thu, 2005-10-13 at 14:31 -0400, Duncan Murdoch wrote: On 10/13/2005 1:07 PM, Marc Schwartz (via MN) wrote: On Thu, 2005-10-13 at 10:02 -0400, Duncan Murdoch wrote: Sorry, a typo in my previous message (parens in the wrong place in the conversion). Here it is corrected: I'm doing a big slow computation, and profiling shows that it is spending a lot of time in match(), apparently because I have code like x %in% listofxvals Both x and listofxvals are factors with the same levels, so I could probably speed this up by stripping off the levels and just treating them as integer vectors, then restoring the levels at the end. What is the safest way to do this? I am worried that at some point x and listofxvals will *not* have the same levels, and the optimization will give the wrong answer. So I need code that guarantees they have the same coding. I think this works, where master is a factor with the master list of levels (guaranteed to be a superset of the levels of x and listofxvals), but can anyone spot anything that might go wrong? # Strip the levels x - as.integer( factor(x, levels = levels(master) ) ) # Restore the levels x - structure( x, levels = levels(master), class = factor ) Thanks for any advice... Duncan Murdoch Duncan, With the predicate that 'master' has the full superset of all possible factor levels defined, it would seem that this would be a reasonable way to go. This approach would also seem to eliminate whatever overhead is encountered as a result of the coercion of 'x' as a factor to a character vector, which is done by match(). One question I have is, what is the advantage of using structure() versus: x - factor(x, levels = levels(master)) ? That one doesn't work. What factor(x, levels=levels(master)) says is to convert x to a factor, coding the values in it according the levels in master. But at this point x has values which are integers, so they won't match the levels of master, which are probably character strings. For example: master - factor(letters) print(x - factor(letters[1:3])) [1] a b c Levels: a b c print(x - as.integer( factor(x, levels = levels(master) ) ) ) [1] 1 2 3 print(x - factor(x, levels = levels(master))) [1] NA NA NA Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z I get NA's at the end because the values 1,2,3 aren't in the vector of factor levels (which are the lowercase letters). As opposed to: print(x - structure(x, levels = levels(master), class = factor )) [1] a b c Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z OK. Makes sense. Thanks for the clarification. Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How to generate for one vector matrix
On Thu, 2005-10-13 at 19:47 +0200, Jan Sabee wrote: Is there any routine to generate for one vector matrix. If I have X I want to generate start from zero to maximum value each vector. For example, I have a vector x = (4,2,3,1,4) I want to generate n=6 times, for 4, start 0 to 4, then 2 start 0 to 2, ect. The result something like this: generate(x,n=6) 1,1,2,1,4 1,2,3,0,3 4,0,1,1,1 3,1,0,1,4 0,0,3,0,0 4,1,3,0,4 Could anyone help me. Thanks. Regards, Jan Sabee If I am properly understanding what you are doing here, you have an initial vector of values. You want to create a matrix, whose columns are the result of random sampling with replacement from the initial vector 'n' times, where the sampling space for each column i is from 0:x[i]? If correct, this should do it: x - c(4, 2, 3, 1, 4) x [1] 4 2 3 1 4 sapply(x, function(x) sample(0:x, 6, replace = TRUE)) [,1] [,2] [,3] [,4] [,5] [1,]42112 [2,]30000 [3,]12013 [4,]42111 [5,]31310 [6,]01110 Just replace '6' in the sample() arguments with the 'n' you require. See ?sapply and ?sample. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] subsetting data frame using by() or tapply() or other
On Thu, 2005-10-13 at 14:28 -0600, Brian S Cade wrote: Ok so I see the problem that I'm having creating a new variable (LAG1DBC) in the example data transformation below is that tapply() is creating a list that is not dimensionally consistent with the data frame (data). So how do I go from the list output of tapply() to create a dimensionally consistent vector that can create the new variable in my original data frame? I've been trying to use a function like data$LAG1DBC - tapply(data$DBC, data$LOCID, function(x) c(NA, x[-length(x)])) which creates a list of dimension much smaller than the nrows in data. And I've tried things like using as.data.frame.array() or as.data.frame.list() in front of tapply() and still have the same problem. I know this can't be that unusual of a data manipulation and that someone has to have done similar things before. I want to go from something like this: LOCID POPULATION YEARDBC 1 algb-1 A 1992 0.70451575 2 algb-1 A 1993 0.59506851 3 algb-1 A 1997 0.84837544 4 algb-1 A 1998 0.50283182 5 algb-1 A 2000 0.91242707 6 algb-2 A 1992 0.09747155 7 algb-2 A 1993 0.84772253 8 algb-2 A 1997 0.43974081 9 algb-2 A 1998 0.83108544 10 algb-2 A 2000 0.22291192 11 algb-3 A 1992 0.44234175 12 algb-3 A 1993 0.54089534 5680 taylr-73 B 2001 0.43918082 5681 taylr-73 B 2002 0.34694427 5682 taylr-73 B 2003 3.35619190 5683 taylr-73 B 2004 0.71575815 5684 taylr-73 B 2005 0.42038506 5685 taylr-74 B 1992 3.88410354 5686 taylr-74 B 1993 3.32472557 5687 taylr-74 B 1994 3.29861501 5688 taylr-74 B 1996 0.48153827 5689 taylr-74 B 1997 3.63570636 5690 taylr-74 B 1998 1.94630194 to something like this: LOCID POPULATION YEARDBC LAG1DBC 1 algb-1 A 1992 0.70451575 NA 2 algb-1 A 1993 0.59506851 0.70451575 3 algb-1 A 1997 0.84837544 0.59506851 4 algb-1 A 1998 0.50283182 0.84837544 5 algb-1 A 2000 0.91242707 0.50283182 6 algb-2 A 1992 0.09747155 NA 7 algb-2 A 1993 0.84772253 0.09747155 8 algb-2 A 1997 0.43974081 0.84772253 9 algb-2 A 1998 0.83108544 0.43974081 10 algb-2 A 2000 0.22291192 0.83108544 11 algb-3 A 1992 0.44234175 NA 12 algb-3 A 1993 0.54089534 0.44234175 5680 taylr-73 B 2001 0.43918082 NA 5681 taylr-73 B 2002 0.34694427 0.43918082 5682 taylr-73 B 2003 3.35619190 0.34694427 5683 taylr-73 B 2004 0.71575815 3.35619190 5684 taylr-73 B 2005 0.42038506 0.71575815 5685 taylr-74 B 1992 3.88410354 NA 5686 taylr-74 B 1993 3.32472557 3.88410354 5687 taylr-74 B 1994 3.29861501 3.32472557 5688 taylr-74 B 1996 0.48153827 3.29861501 5689 taylr-74 B 1997 3.63570636 0.48153827 5690 taylr-74 B 1998 1.94630194 3.63570636 Brian Brian, Use unlist(): data$LAG1DBC - unlist(tapply(data$DBC, data$LOCID, function(x) c(NA, x[-length(x)]))) data LOCID POPULATION YEARDBCLAG1DBC 1 algb-1 A 1992 0.70451575 NA 2 algb-1 A 1993 0.59506851 0.70451575 3 algb-1 A 1997 0.84837544 0.59506851 4 algb-1 A 1998 0.50283182 0.84837544 5 algb-1 A 2000 0.91242707 0.50283182 6 algb-2 A 1992 0.09747155 NA 7 algb-2 A 1993 0.84772253 0.09747155 8 algb-2 A 1997 0.43974081 0.84772253 9 algb-2 A 1998 0.83108544 0.43974081 10 algb-2 A 2000 0.22291192 0.83108544 11 algb-3 A 1992 0.44234175 NA 12 algb-3 A 1993 0.54089534 0.44234175 5680 taylr-73 B 2001 0.43918082 NA 5681 taylr-73 B 2002 0.34694427 0.43918082 5682 taylr-73 B 2003 3.35619190 0.34694427 5683 taylr-73 B 2004 0.71575815 3.35619190 5684 taylr-73 B 2005 0.42038506 0.71575815 5685 taylr-74 B 1992 3.88410354 NA 5686 taylr-74 B 1993 3.32472557 3.88410354 5687 taylr-74 B 1994 3.29861501 3.32472557 5688 taylr-74 B 1996 0.48153827 3.29861501 5689 taylr-74 B 1997 3.63570636 0.48153827 5690 taylr-74 B 1998 1.94630194 3.63570636 HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] shell scripts in R
On Fri, 2005-10-14 at 00:04 +0200, Benedykt P. Barszcz wrote: Dnia czwartek, 13 października 2005 23:25, Andrew Robinson napisał: Marco, use the system command. ?system I hope that this helps, system(ls, intern = FALSE, ignore.stderr = TRUE) Error in as.character(args[[i]]) : cannot coerce to vector In fact it ignores the stderr ! As per ?system: command the system command to be invoked, as a string. Thus, system(ls, intern = FALSE, ignore.stderr = TRUE) HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] high resolution images for publication
On Thu, 2005-10-13 at 15:20 -0600, Chris Buddenhagen wrote: Dear all I am using R to produce ordinations library(vegan) and the plot function produced looks great on the screen but when I send it to jpg or pdf or eps the resolution is not so good. Can you tell me how to get high resolution images out of R for publication? It would be helpful to see the actual code that you are using, as is asked for in the Posting Guide. For publication, it would be rare to want to use a bitmapped format such as jpg/png. pdf and eps are vector based formats and would be generally preferred over the above. I can only hazard a guess to consider that perhaps your height and width arguments for pdf() and/or postscript() are not set properly, as there are no other resolution based settings for those formats. By design, the output resolution is target device dependent. Please provide the code that you are using and we can respond with more specific guidance. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Hmisc latex function
On Wed, 2005-10-12 at 08:33 -0500, Charles Dupont wrote: Marc Schwartz (via MN) wrote: On Tue, 2005-10-11 at 10:01 -0400, Rick Bilonick wrote: I'm using R 2.2.0 on an up-to-date version of Fedora Core 4 with the latest version of Hmisc. When I run an example from the latex function I get the following: x - matrix(1:6, nrow=2, dimnames=list(c('a','b'),c('c','d','enLine 2'))) x c d enLine 2 a 1 35 b 2 46 latex(x) # creates x.tex in working directory sh: line 0: cd: “/tmp/Rtmpl10983”: No such file or directory This is pdfeTeX, Version 3.141592-1.21a-2.2 (Web2C 7.5.4) entering extended mode ! I can't find file `“/tmp/Rtmpl10983/file643c9869”'. * “/tmp/Rtmpl10983/file643c9869” Please type another input file name: q (/usr/share/texmf/tex/latex/tools/q.tex LaTeX2e 2003/12/01 Babel v3.8d and hyphenation patterns for american, french, german, ngerman, b ahasa, basque, bulgarian, catalan, croatian, czech, danish, dutch, esperanto, e stonian, finnish, greek, icelandic, irish, italian, latin, magyar, norsk, polis h, portuges, romanian, russian, serbian, slovak, slovene, spanish, swedish, tur kish, ukrainian, nohyphenation, loaded. File ignored xdvi-motif.bin: Fatal error: /tmp/Rtmpl10983/file643c9869.dvi: No such file. How can I fix this? Rick B. I get the same results, also on FC4 with R 2.2.0. I am cc:ing Frank here for his input, but a quick review of the code and created files suggests that there may be conflict between the locations of some of the resultant files during the latex system call. Some files appear in a temporary R directory, while others appear in the current R working directory. For example, if I enter the full filename: /tmp/RtmpC12100/file643c9869.tex at the latex prompt, I get: latex(x) sh: line 0: cd: “/tmp/RtmpC12100”: No such file or directory This is pdfeTeX, Version 3.141592-1.21a-2.2 (Web2C 7.5.4) entering extended mode ! I can't find file `“/tmp/RtmpC12100/file643c9869”'. * “/tmp/RtmpC12100/file643c9869” Please type another input file name: *** loading the extensions datasource /tmp/RtmpC12100/file643c9869.tex (/tmp/RtmpC12100/file643c9869.tex LaTeX2e 2003/12/01 Babel v3.8d and hyphenation patterns for american, french, german, ngerman, b ahasa, basque, bulgarian, catalan, croatian, czech, danish, dutch, esperanto, e stonian, finnish, greek, icelandic, irish, italian, latin, magyar, norsk, polis h, portuges, romanian, russian, serbian, slovak, slovene, spanish, swedish, tur kish, ukrainian, nohyphenation, loaded. (/usr/share/texmf/tex/latex/base/report.cls Document Class: report 2004/02/16 v1.4f Standard LaTeX document class (/usr/share/texmf/tex/latex/base/size10.clo)) (/usr/share/texmf/tex/latex/geometry/geometry.sty (/usr/share/texmf/tex/latex/graphics/keyval.sty) (/usr/share/texmf/tex/latex/geometry/geometry.cfg)) No file file643c9869.aux. [1] (./file643c9869.aux) ) Output written on file643c9869.dvi (1 page, 368 bytes). Transcript written on file643c9869.log. xdvi-motif.bin: Fatal error: /tmp/RtmpC12100/file643c9869.dvi H, It works for me. Interesting. It almost looks like the temp dir is not being created, but thats not possible because R does that. It might be a Unicode issue with you system shell. Can you run this statement in R sys(paste('cd',dQuote(tempdir()),;, echo Hello BOB test.test, ;,cat test.test)) What version of Hmisc are you using? What local are you using? Charles Hmisc version 3.0-7, Dated 2005-09-15, which is the latest according to CRAN. sys(paste('cd',dQuote(tempdir()),;, + echo Hello BOB test.test, + ;,cat test.test)) sh: line 0: cd: “/tmp/RtmpGY5553”: No such file or directory [1] Hello BOB From a bash console: $ cd /tmp/RtmpGY5553 $ pwd /tmp/RtmpGY5553 $ locale LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 LC_NUMERIC=en_US.UTF-8 LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=en_US.UTF-8 LC_ADDRESS=en_US.UTF-8 LC_TELEPHONE=en_US.UTF-8 LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=en_US.UTF-8 LC_ALL= On the creation of the sys() call, it looks like the backquotes are causing the problem: paste('cd',dQuote(tempdir())) [1] cd “/tmp/RtmpGY5553” From a bash shell: $ cd “/tmp/RtmpGY5553” bash: cd: “/tmp/RtmpGY5553”: No such file or directory $ cd /tmp/RtmpGY5553 $ pwd /tmp/RtmpGY5553 According to ?dQuote: By default, sQuote and dQuote provide undirectional ASCII quotation style. In a UTF-8 locale (see l10n_info), the Unicode directional quotes are used. The See Also points to shQuote for quoting OS commands. HTH, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Correlation, by date, of two variables?
On Wed, 2005-10-12 at 12:01 -0700, t c wrote: I have a dataset with three variables: date, var1, var2 How can I calculate the correlation, by date, between var1 and var2? e.g. datevar1var2 1/1/200154 1/1/200185 1/1/200197 2/1/200172 2/1/200121 2/1/200146 3/1/200135 3/1/200143 3/1/200169 3/1/20017-1 the results I want: 1/1/2001 0.891042111 2/1/2001 0.075093926 3/1/2001 -0.263117406 t c, Given your series of posts here, I would highly recommend that before you consider spending money on a tutor or other similar resource, you take the time to read the freely available documentation that both R Core and useRs have kindly provided to the Community. The R Core manuals are all available with your downloaded installation of R and/or from the main R web site under Documentation as are the Contributed documents. You will find that your time will be well invested in that effort. If you read the Posting Guide for the e-mail lists, a link to which is provided at the bottom of every e-mail that comes through, you will find an excellent guide to the resources that are available to assist you in using R. Trying to learn R (as with any technical endeavor) without reading at least a basic set of the growing list of documents, books and publications on R is going to be problematic. A hint for you here: See ?by, ?tapply, ?split and ?lapply for guidance on how to generate summary statistics on subsets of data. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Two factor (or more) non-parametric comparison of means
I suspect that John may be looking for ?friedman.test, which I believe will allow for a two-way non-parametric test. kruskal.test() will perform a non-parametric test on two or more samples (or factor levels) as a generalization of wilcox.test() for one or two samples (or factor levels). HTH, Marc Schwartz On Tue, 2005-10-11 at 18:22 +0200, Johan Sandblom wrote: ?kruskal.test 2005/10/11, John Sorkin [EMAIL PROTECTED]: Can anyone suggest a good non-parametric test, and an R implementation of the test, that allows for two or more factors? Wilcoxon signed rank allows for only one. Thanks, John __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Hmisc latex function
On Tue, 2005-10-11 at 10:01 -0400, Rick Bilonick wrote: I'm using R 2.2.0 on an up-to-date version of Fedora Core 4 with the latest version of Hmisc. When I run an example from the latex function I get the following: x - matrix(1:6, nrow=2, dimnames=list(c('a','b'),c('c','d','enLine 2'))) x c d enLine 2 a 1 35 b 2 46 latex(x) # creates x.tex in working directory sh: line 0: cd: “/tmp/Rtmpl10983”: No such file or directory This is pdfeTeX, Version 3.141592-1.21a-2.2 (Web2C 7.5.4) entering extended mode ! I can't find file `“/tmp/Rtmpl10983/file643c9869”'. * “/tmp/Rtmpl10983/file643c9869” Please type another input file name: q (/usr/share/texmf/tex/latex/tools/q.tex LaTeX2e 2003/12/01 Babel v3.8d and hyphenation patterns for american, french, german, ngerman, b ahasa, basque, bulgarian, catalan, croatian, czech, danish, dutch, esperanto, e stonian, finnish, greek, icelandic, irish, italian, latin, magyar, norsk, polis h, portuges, romanian, russian, serbian, slovak, slovene, spanish, swedish, tur kish, ukrainian, nohyphenation, loaded. File ignored xdvi-motif.bin: Fatal error: /tmp/Rtmpl10983/file643c9869.dvi: No such file. How can I fix this? Rick B. I get the same results, also on FC4 with R 2.2.0. I am cc:ing Frank here for his input, but a quick review of the code and created files suggests that there may be conflict between the locations of some of the resultant files during the latex system call. Some files appear in a temporary R directory, while others appear in the current R working directory. For example, if I enter the full filename: /tmp/RtmpC12100/file643c9869.tex at the latex prompt, I get: latex(x) sh: line 0: cd: “/tmp/RtmpC12100”: No such file or directory This is pdfeTeX, Version 3.141592-1.21a-2.2 (Web2C 7.5.4) entering extended mode ! I can't find file `“/tmp/RtmpC12100/file643c9869”'. * “/tmp/RtmpC12100/file643c9869” Please type another input file name: *** loading the extensions datasource /tmp/RtmpC12100/file643c9869.tex (/tmp/RtmpC12100/file643c9869.tex LaTeX2e 2003/12/01 Babel v3.8d and hyphenation patterns for american, french, german, ngerman, b ahasa, basque, bulgarian, catalan, croatian, czech, danish, dutch, esperanto, e stonian, finnish, greek, icelandic, irish, italian, latin, magyar, norsk, polis h, portuges, romanian, russian, serbian, slovak, slovene, spanish, swedish, tur kish, ukrainian, nohyphenation, loaded. (/usr/share/texmf/tex/latex/base/report.cls Document Class: report 2004/02/16 v1.4f Standard LaTeX document class (/usr/share/texmf/tex/latex/base/size10.clo)) (/usr/share/texmf/tex/latex/geometry/geometry.sty (/usr/share/texmf/tex/latex/graphics/keyval.sty) (/usr/share/texmf/tex/latex/geometry/geometry.cfg)) No file file643c9869.aux. [1] (./file643c9869.aux) ) Output written on file643c9869.dvi (1 page, 368 bytes). Transcript written on file643c9869.log. xdvi-motif.bin: Fatal error: /tmp/RtmpC12100/file643c9869.dvi: No such file. The temporary .tex file is present, but the .dvi, .aux and .log files are created in the current working R directory. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Two factor (or more) non-parametric comparison of means
Peter, Good points all around. Hopefully John will provide further clarification. Best regards, Marc On Tue, 2005-10-11 at 19:46 +0200, Peter Dalgaard wrote: Marc Schwartz (via MN) [EMAIL PROTECTED] writes: I suspect that John may be looking for ?friedman.test, which I believe will allow for a two-way non-parametric test. You could be right, but you could also be wrong... It depends quite a bit on what is meant by a two-factor comparison of means. If he literally means means (!), then a permutation test (permuting within levels of the other factor) could be appropriate. It also makes a difference whether there's a single replication or more per cell in the cross-classification. kruskal.test() will perform a non-parametric test on two or more samples (or factor levels) as a generalization of wilcox.test() for one or two samples (or factor levels). HTH, Marc Schwartz On Tue, 2005-10-11 at 18:22 +0200, Johan Sandblom wrote: ?kruskal.test 2005/10/11, John Sorkin [EMAIL PROTECTED]: Can anyone suggest a good non-parametric test, and an R implementation of the test, that allows for two or more factors? Wilcoxon signed rank allows for only one. Thanks, John __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] wildcards and removing variables
On Mon, 2005-10-10 at 10:37 -0400, Afshartous, David wrote: All, Is there are a wildcard in R for varible names as in unix? For example, rm(results*) to remove all variable or function names that begin w/ results? cheers, Dave ps - please respond directly to [EMAIL PROTECTED] See ?ls, which has a 'pattern' argument, enabling the use of Regex to define the objects to be listed and subsequently removed using rm(). You can then use something like: rm(list = ls(pattern = \\bresults.)) HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] wildcards and removing variables
On Mon, 2005-10-10 at 16:01 +0100, Prof Brian Ripley wrote: On Mon, 10 Oct 2005, Marc Schwartz (via MN) wrote: On Mon, 2005-10-10 at 10:37 -0400, Afshartous, David wrote: All, Is there are a wildcard in R for varible names as in unix? For example, rm(results*) to remove all variable or function names that begin w/ results? cheers, Dave ps - please respond directly to [EMAIL PROTECTED] See ?ls, which has a 'pattern' argument, enabling the use of Regex to define the objects to be listed and subsequently removed using rm(). You can then use something like: rm(list = ls(pattern = \\bresults.)) One new feature of R-2.2.0 is glob2rx, which converts wildcards to regexps for you. E.g. rm(list = ls(pattern = glob2rc(results*))) (I think if does it a little better, as that trailing dot is not I think correct: result* matches result.) Thanks to both Gabor and Prof. Ripley for pointing out the use of glob2rc(). One of the new features I had forgotten about since reading NEWS. On the trailing dot, I was just in the process of drafting a follow up after realizing my error, since as you point out, 'results' would not be matched in that case. Thanks, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] spline.des
On Mon, 2005-10-03 at 15:28 -0500, lforzani wrote: Hello, I am using library fda and I can not run a lot of functions because I receive the error: Error in bsplineS(evalarg, breaks, norder, nderiv) : couldn't find function spline.des do you know how I can fix that? Thnaks. Liliana spline.des() is in the 'splines' package, installed with the basic R distribution. It appears that bsplineS() has a dependency on this not otherwise referenced in the fda package documentation. Looks like fda has not been updated in some time as well. I have cc'd Jim Ramsay on this reply as an FYI. You probably need a: library(splines) before calling your code above. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Getting eps into Word documents.
On Tue, 2005-10-04 at 08:38 -0500, Robert Baer wrote: On 03-Oct-05 Marc Schwartz (via MN) wrote: On Mon, 2005-10-03 at 16:31 -0300, Rolf Turner wrote: A student in one of my courses has asked me about getting R graphics output (under Linux) into a Word document. I.e. she wants to do her R thing under Linux, but then do her word processing using Word. --snip-- So use something like the following: postscript(RPlot.eps, height = 4, width = 4, horizontal = FALSE, onefile = FALSE, paper = special) plot(1:5) dev.off() You can then import the .eps file into Word or most other such applications that can import encapsulated postscript files. -snip More information is available from MS here: http://support.microsoft.com/?kbid=290362 HTH, Marc Schwartz --snip--- b) It won't work anyway if printed to a non-PostScript printer. True, which is the case irrespective of Word/Windows. If you don't have a PS printer locally or accessible via network, you can always install a PS printer driver and print to a file, which can then be printed by a third party if required. Well, as a lowly Windows and Office user, I most often right click on R grahics, cut to clipboard, and paste into Word. So one possiblility is for the student to install R on her own machine (Windows or Mac?). But I just tried Marc's suggestion, and it looks VERY VIABLE to me. I generated the graph from his code snippet and used Insert picture from file in Word 2003 to place the graphic in a Word document. I then tried printing on both an HP 4100 TN laserjet and an HP 960c deskjet. The image printed perfectly on both printers with crisp lines and text that apprear to be vector-based not degraded bitmapped representations. Certainly worth the student trying. Glad to hear that worked for you. I do think that the majority of problems with importing EPS files from R into other applications (Word, Powerpoint, Writer, Impress, etc.) are due to not properly configuring the postscript() arguments that are on the help page. I should also note that beyond plots, I use this same approach for putting LaTeX tables into these documents as well. There are times where I need create one or more nicely formatted tables for inclusion in another document, to then be used by someone else. I generate the LaTeX preamble, document and table code using R, then run: latex FileName.tex dvips -E FileName -o FileName.ps where the '-E' option to dvips attempts to create an EPS output file with a tight bounding box. The result is a single EPS file with the table (much like the plot example) ready for import. I had been using 'epstool' to create the image preview, but now that OO.org 2.0 has included this automatically upon import (as do the latest versions of Word), this step is no longer required. In this way, I can use R to create reproducible and publication ready tables for use by others in their documents, in situations where they are not (or cannot) use LaTeX for the entire document. HTH, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] gnomeGUI installation
On Mon, 2005-10-03 at 18:01 +0100, Prof Brian Ripley wrote: On Mon, 3 Oct 2005, Daniel Pick wrote: I have successfully downloaded the sources and built R as a shared library on a Red Hat Enterprise Level 3 box. I am now trying to build the GNOME GUI, but configure is barfing on glade. According to the system logs, the RPM for libglade2-devel-2.0.1-3.x86_64 is installed, but in /usr/bin, where gnomeGUI configure is looking, what's there is libglade-convert. How do I fix this? By reading the manual. In the R-admin manual it says `Please check you have all the requirements. You need at least the following packages or later installed: audiofile-0.2.1 esound-0.2.23 glib-1.2.10 gtk+-1.2.10 imlib-1.9.10 ORBit-0.5.12 gnome-libs-1.4.1.2 libxml-1.8.16 libglade-0.17 ... Remember that some package management systems (such as @acronym{RPM} and deb) make a distinction between the user version of a package and the developer version. The latter usually has the same name but with the extension @samp{-devel} or @samp{-dev}. If you use a pre-packaged version of @acronym{GNOME} then you must have the developer versions of the above packages in order to compile the R-GNOME console.' My FC3 box has gannet% rpm -qa | grep ^libglade libglade2-devel-2.4.0-5 libglade-devel-0.17-15 libglade2-2.4.0-5 libglade-0.17-15 and note that libglade2[-devel] is NOT a later version of libglade (in the same way that gtk2 is different from gtk). AFAICS it is likely that /usr/bin/libglade-config is provided by libglade-devel-0.17-15. That is correct, subject of course to RPM version differences between FC3/4 and RHEL3/4. Using RPM, this can be confirmed with: $ rpm -q --whatprovides /usr/bin/libglade-config libglade-devel-0.17-16 which is the output on FC4. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Getting eps into Word documents.
On Mon, 2005-10-03 at 16:31 -0300, Rolf Turner wrote: A student in one of my courses has asked me about getting R graphics output (under Linux) into a Word document. I.e. she wants to do her R thing under Linux, but then do her word processing using Word. Scanning around the r-help archives I encountered an inquiry about this topic --- eps into Word documents --- from Paul Johnson but found no replies to it. I tried contacting him but the email address in the archives appeared not to be valid. Does anyone know a satisfactory solution to the problem of including a graphic which exists in the form of a *.eps (encapsulated postscript) file into a Word document. If so, would you be willing to share it with me and my student? If so, please be gentle in your explanation. I am not myself (repeat ***NOT***) a user of Word! Hehe... :-) Rolf, just use the guidance provided in ?postscript. In the details section it indicates: The postscript produced by R is EPS (_Encapsulated PostScript_) compatible, and can be included into other documents, e.g., into LaTeX, using '\includegraphics{filename}'. For use in this way you will probably want to set 'horizontal = FALSE, onefile = FALSE, paper = special'. So use something like the following: postscript(RPlot.eps, height = 4, width = 4, horizontal = FALSE, onefile = FALSE, paper = special) plot(1:5) dev.off() You can then import the .eps file into Word or most other such applications that can import encapsulated postscript files. The recent versions of Word will also automatically generate a bitmapped preview of the plot upon import. BTW, OO.org 2.0, which is in late beta testing now, also generates EPS preview images upon import. The key to doing this successfully is using the arguments to postscript() as defined above. I have never had a problem with this. More information is available from MS here: http://support.microsoft.com/?kbid=290362 HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Getting eps into Word documents.
On Mon, 2005-10-03 at 22:00 +0100, Ted Harding wrote: Rolf ( Marc) On 03-Oct-05 Marc Schwartz (via MN) wrote: On Mon, 2005-10-03 at 16:31 -0300, Rolf Turner wrote: A student in one of my courses has asked me about getting R graphics output (under Linux) into a Word document. I.e. she wants to do her R thing under Linux, but then do her word processing using Word. Scanning around the r-help archives I encountered an inquiry about this topic --- eps into Word documents --- from Paul Johnson but found no replies to it. I tried contacting him but the email address in the archives appeared not to be valid. Does anyone know a satisfactory solution to the problem of including a graphic which exists in the form of a *.eps (encapsulated postscript) file into a Word document. If so, would you be willing to share it with me and my student? If so, please be gentle in your explanation. I am not myself (repeat ***NOT***) a user of Word! Hehe... :-) Rolf, just use the guidance provided in ?postscript. In the details section it indicates: The postscript produced by R is EPS (_Encapsulated PostScript_) compatible, and can be included into other documents, e.g., into LaTeX, using '\includegraphics{filename}'. For use in this way you will probably want to set 'horizontal = FALSE, onefile = FALSE, paper = special'. So use something like the following: postscript(RPlot.eps, height = 4, width = 4, horizontal = FALSE, onefile = FALSE, paper = special) plot(1:5) dev.off() You can then import the .eps file into Word or most other such applications that can import encapsulated postscript files. The recent versions of Word will also automatically generate a bitmapped preview of the plot upon import. BTW, OO.org 2.0, which is in late beta testing now, also generates EPS preview images upon import. The key to doing this successfully is using the arguments to postscript() as defined above. I have never had a problem with this. More information is available from MS here: http://support.microsoft.com/?kbid=290362 HTH, Marc Schwartz This suggestion could be problematic in that Ted, a) According to the MS web site above, it applies to recent Word (Office 2002/2003) or possibly earlier depending on installed graphics filters. The Word EPS filter has been around for some time. I recall using it years ago. Note that it is listed on that site in both categories of filters for some reason. In the older versions of Word, there was no preview image generated, hence you ended up with a box/frame place holder of sorts, unless you added a preview image before importing. That being said, one does need to install the import filters from the Office CD, which may not be part of the default settings on all prior versions. It may require going back into the Office set up program to install them. b) It won't work anyway if printed to a non-PostScript printer. True, which is the case irrespective of Word/Windows. If you don't have a PS printer locally or accessible via network, you can always install a PS printer driver and print to a file, which can then be printed by a third party if required. If one has ps2pdf/Ghostscript available, you can also convert the PS file to PDF and then print it via Acrobat or other PDF viewers. If either of these applies to Rolf's student, she could have problems. [Just to add my own disclaimer: the only version of Word I'm in any position to ever touch, and then only if driven to, belongs to Office 98; and I'm sure that this doesn't know a thing about PostScript!] Another option to consider, since she's doing her R work on Linux, is that recent versions of the ImageMagick program 'convert' have the capability to convert EPS into WMF (Windows Metafile; use file extension .wmf for 'convert') or EMF (Enhanced Metafile); use file extension .emf. The gubbins is built in to a file wmf.so in the lib/ImageMagick tree. Likewise, the program 'pstoedit' can do it (to .wmf or .emf), using library /usr/local/lib/pstoedit/libp2edrvwmf.so (on my machine). Most Linux distributions these days come with ImageMagick and pstoedit. If not already installed in the machine she's using it should be straightforward to get this dome. Hoping this helps, Ted. Ted, the critical issue with doing the WMF/EMF conversion on Linux is that the libEMF stuff generates lousy quality (and visually bitmapped) output. I have tried it in the past and gave up, staying with EPS/PDF formats for vector graphics. If one wants to convert the EPS file to WMF/EMF, it is best done on Windows using native Windows drivers, rather than on Linux. When SVG graphics go mainstream and cross-platform, that should help significantly with this whole issue. HTH, Marc __ R-help
Re: [R] boxplot and xlim confusion?
On Thu, 2005-09-29 at 15:28 +0200, Karin Lagesen wrote: Petr Pikal [EMAIL PROTECTED] writes: Hi Karin I did not have seen any answer for your question yet so here is a try. I gues you want the horizotal layout or your boxplot. boxplot(split(rnorm(30), rep(1:3, each=10)), horizontal =T, names=letters[1:3]) boxplot(split(rnorm(30), rep(1:3, each=10)), horizontal =T, names=c(NA,b,NA)) These look like the plots I want, yes. However, is there a way of locking the x-axis? When I do these several times in a row, the range of the x axis moves with the data being plotted. If I set xlim in the boxplot command, it makes no difference what so ever. That's because when you use 'horizontal = TRUE' in boxplot(), the x and y axes are rotated. Thus you need to set 'ylim' and not 'xlim', where ylim is the range of values in your data irrespective of the axis orientation. This is done in bxp(), which is the actual function doing the plotting for boxplot(). In bxp(), there is the follow code snippet: if (horizontal) plot.window(ylim = c(0.5, n + 0.5), xlim = ylim, log = log) else plot.window(xlim = c(0.5, n + 0.5), ylim = ylim, log = log) Note in the first case, that the xlim value for plot.window() is set to the ylim value passed from boxplot() or from bxp(), if called directly. So, the above examples by Petr should be something like: boxplot(split(rnorm(30), rep(1:3, each=10)), horizontal = TRUE, names=letters[1:3], ylim = c(-4, 4)) boxplot(split(rnorm(30), rep(1:3, each=10)), horizontal =TRUE, names=c(NA,b,NA), ylim = c(-4, 4)) HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Display values in piechart/barplot
On Thu, 2005-09-29 at 14:34 +0200, Volker Rehbock wrote: Is it possible to automatically display the underlying values of a piechart/barplot in the graphic? If so, which package/function/argument do I need for it? Thanks, Volker Using pie charts are not a particularly good way of displaying data, even though the pie() function is available in R. See ?pie for more information on this. For barplots, the following provide two approaches: # Place the bar values above the bars # Note that I set 'ylim' to make room for # the text labels above the bars vals - 1:5 names(vals) - LETTERS[1:5] mp - barplot(vals, ylim = c(0, 6)) text(mp, vals, labels = vals, pos = 3) # Place the bar values below the x axis vals - 1:5 names(vals) - LETTERS[1:5] mp - barplot(vals) mtext(side = 1, at = mp, text = vals, line = 3) Note that barplot() returns the bar midpoints, so you can use these values for annotation placement. See ?barplot, ?text and ?mtext for more information. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How to add frame (frame.plot=T) to Plot.Design?
On Thu, 2005-09-29 at 18:36 +0200, Jan Verbesselt wrote: Hi R-help, When using the package Design and the plot.Design function all the graphs of an lrm.fit (lrm()) are plotted without a frame (only with axes). How can the frame be added to these plots? In the plot.default function the frame can be added/or removed via the frame.plot=T/F, respectively. e.g., plot(1,axes=T,frame.plot=T). How can this be done with plot.Design(lrm.fit) Thanks, Jan See ?box HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Binary Logit Regression with R
On Thu, 2005-09-29 at 18:08 -0400, Johann Park wrote: Hi to all, I am a PH.D Student doing statistical analysis. I am totally new to R. I previously use Stata and am changing into R. I ususally do with logit regression with binary dependent variable (war occurence:1 or 0). I just want to know command to do that. More sepcifically, Let say, my Y is war occurence (occur=1, otherwise 0). And my independent variables (Xs) are trade, democracy, military poweretc. In Stata, I do like what follows: logit war trade democracy militarypower... Then I will get results. What are the equivalent command in R? Many thanks, JP See ?glm in the base stats package or ?lrm in Frank Harrell's Design package on CRAN. BTW, doing: help.search(logit) or RSiteSearch(logit) would provide you with the ability to do keyword searches of your current R installation or the online e-mail list and documentation archives, respectively. You should also review Chapter 11 - Statistical Models in R in An Introduction to R, which is installed with R or available online under the Documentation/Manuals link on the main R web site. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] correct syntax
On Thu, 2005-09-29 at 06:57 +1000, Stephen Choularton wrote: Hi I am trying to produce a little table like this: 0 1 0 601 408 1 290 2655 but I cannot get the syntax right. Can anyone help. Stephen Do you have the raw data or only the tabulated counts? If you have the raw data, see ?table for ways to create n-dimensional tables. If you only have the tabulated counts, then you can use matrix(): mat - matrix(c(601, 290, 408, 2655), ncol = 2) dimnames(mat) - list(c(0, 1), c(0, 1)) mat 01 0 601 408 1 290 2655 See ?matrix for more information. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Using unsplit - unsplit does not seem to reverse the effect of split
On Tue, 2005-09-27 at 19:12 +0200, Søren Højsgaard wrote: In data OME in MASS I would like to extract the first 5 observations per subject (=ID). So I do library(MASS) OMEsub - split(OME, OME$ID) OMEsub - lapply(OMEsub,function(x)x[1:5,]) unsplit(OMEsub, OME$ID) - which results in [[1]] [1] 1 1 1 1 1 [[2]] [1] 30 30 30 30 30 [[3]] [1] low low low low low Levels: N/A high low [[4]] [1] 35 35 40 40 45 [[5]] [1] coherent incoherent coherent incoherent coherent Levels: coherent incoherent [[6]] [1] 1 4 0 1 2 [[1094]] [1] 4 5 5 5 2 [[1095]] [1] 100 100 100 100 100 [[1096]] [1] 18 18 18 18 18 [[1097]] [1] N/A N/A N/A N/A N/A Levels: N/A high low There were 50 or more warnings (use warnings() to see the first 50) warnings() Warning messages: 1: number of items to replace is not a multiple of replacement length 2: number of items to replace is not a multiple of replacement length 3: number of items to replace is not a multiple of replacement length According to documentation unsplit is the reverse of split, but I must be missing a point somewhere... Can anyone help? Thanks in advance. Søren If you read the documentation for split/unsplit, you will also note that in the Details section it says: 'unsplit' works only with lists of vectors as opposed to lists of data frames, which is the result of your split() operation. Also note that in the Value section, it indicates: 'unsplit' returns a vector for which 'split(x, f)' equals 'value' as opposed to unsplit returning a data frame. Thus, use: OME1 - do.call(rbind, OMEsub) where OME1 will be the result of rbind()'ing the data frames in the OMEsub list. See ?do.call for more information. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Help: x11 position in the Unix environment
On Mon, 2005-09-26 at 17:45 +0200, Shengzhe Wu wrote: Hello, In the Unix environment, I open a window by x11(). May I specify the position of this window by specifying the position of the top left of the window as in Windows environment? Or some other parameters can be used to do that? Thank you, Shengzhe I don't believe so. In general, under Unix/Linux, the Window Manager determines window positioning upon startup unless the application overrides this behavior. Some applications let you specify application window positioning via command line 'geometry' arguments or via the use of an .Xresources file. Some WM's provide more or less functionality for this behavior relative to user customization. For example, Sawfish provides quite a bit, whereas Metacity hides much of it. You may want to check the documentation for the WM that you are using. There is also an application called Devil's Pie: http://www.burtonini.com/blog/computers/devilspie which provides additional Sawfish-like customization for window positioning, etc. However, this is global for a given window, not on a per instance basis. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Fligner-Policello robust rank test
On Thu, 2005-09-22 at 10:44 -0400, Elisabetta Manduchi wrote: Can anybody tell me if there is an R implementation of the Fligner-Policello robust rank test? Thanks, Elisabetta I don't know about the Fligner-Policello test, but the Fligner-Killeen test has been implemented in fligner.test(). From ?fligner.test: The Fligner-Killeen (median) test has been determined in a simulation study as one of the many tests for homogeneity of variances which is most robust against departures from normality, see Conover, Johnson Johnson (1981). It is a k-sample simple linear rank which uses the ranks of the absolute values of the centered samples and weights a(i) = qnorm((1 + i/(n+1))/2). The version implemented here uses median centering in each of the samples (F-K:med X^2 in the reference). FWIW, in a Google search, I located the following SAS macro by Paul von Hippel, which includes a comment that does not lead to optimism regarding software implementations: http://www.sociology.ohio-state.edu/people/ptv/macros/fligner_policello.htm The above should not be overly difficult to convert to R, presuming you know SAS macros... HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] how to extract the column name or value from the numerical value of the matrix
On Mon, 2005-09-19 at 12:30 -0700, shanmuha boopathy wrote: Dear sir, i have a matrix like x-c(1:20) A-matrix(x,4,5) A [,1] [,2] [,3] [,4] [,5] [1,]159 13 17 [2,]26 10 14 18 [3,]37 11 15 19 [4,]48 12 16 20 I want to extract the column value for the matrix value 11... or the row value for 14.. how it is possible? thanks for your help with regards, boopathy. Why did you copy r-devel? r-help is the appropriate forum. See ?which and take note of the 'arr.ind' argument: which(A == 11, arr.ind = TRUE) row col [1,] 3 3 which(A == 14, arr.ind = TRUE) row col [1,] 2 4 HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How do I get the row indices?
On Fri, 2005-09-16 at 10:34 -0700, Martin Lam wrote: Hi, I was wondering if it's possible to get the row numbers from a filtering. Here's an example: # give me the rows with sepal.length == 6.2 iris[(iris[,1]==6.2),] # output Sepal.Length Sepal.Width Petal.Length Petal.Width Species 69 6.2 2.2 4.5 1.5 versicolor 98 6.2 2.9 4.3 1.3 versicolor 127 6.2 2.8 4.8 1.8 virginica 149 6.2 3.4 5.4 2.3 virginica What I really want is that it return the row numbers: 69, 98, 127, 149. Thanks in advance, Martin Martin, First: Be very, very careful when performing exact equalities on floating point numbers. They won't always result in the answer you expect. For more information see R FAQ 7.31: Why doesn't R think these numbers are equal? Second: See ?all.equal, ?sapply and ?which. Here is a possible vectorized solution: which(sapply(iris[, 1], function(x) isTRUE(all.equal(x, 6.2 [1] 69 98 127 149 The above applies isTRUE(all.equal(x, 6.2)) for each element 'x' in iris[, 1], returning the indices of the TRUE results for the near equality comparison, based upon the tolerance argument in all.equal(). HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Graphics 'snapshots' in Linux?
On Thu, 2005-09-15 at 10:43 -0400, Tyler Smith wrote: Hi, I'm working on a MEPIS (Debian-based Linux) computer, using the emacs/ESS package to do my R work. I've got some plots that I label interactively using the locate function. With the Windows GUI there is an option to take a snapshot of the graphics output, saving it as an image file. Is there a way to do this with emacs/ESS? Thanks, Tyler Tyler, Take a look at ?dev.copy2eps and on the same page dev.copy(), which enable you to copy the current X11 plot supported output devices. You could do something like the following for an EPS file: plot(1:5) text(locator(1), Place Text Here) dev.copy2eps(file = MyPlot.eps) or the following for a PNG file: par(bg = white) plot(1:5) text(locator(1), Place Text Here) dev.copy(device = png, file = MyPlot.png) dev.off() Note that in the first example, dev.off() is not required, as the EPS output device is closed after the call. Also, note in the second example, you will need to set the background to white (unless already specified for whatever color you may be using), as the default output file will have a transparent background, even though the png() function shows the default as white. If my memory is correct this is because the X11 device itself has a transparent background by default and this is what is copied. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Forcing hist()
On Wed, 2005-09-14 at 20:17 +0200, Par Leijonhufvud wrote: I'm trying to create histogram (using hist()) that fullfill the following criteria: * data is on a ordinal scale (1, 2, 3, 4, 5) * I want bars centered over the number on the x-axis * I want 5 bars of equal width I have tried various versions of the hist() command, with no luck. what am I missing? /Par More than likely, you want to use barplot() and not hist(): Try: barplot(1:5, names.arg = 1:5) See ?barplot for more information. HTH, Marc Schwartz P.S. To R Core: It probably makes sense to add barplot to the See Also section of ?hist, since hist is listed in the See Also for ?barplot. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] axis of plot
On Thu, 2005-09-01 at 16:58 +0100, [EMAIL PROTECTED] wrote: hy, I need to have the 0 on the bottom left corner of the graph being joined , not with this little hole between x axis and y axis... I've saw this question with the answer one time but i'm unable to find it again.. thks. guillaume See 'xaxs' and 'yaxs' in ?par: plot(1:5, xaxs = i, yaxs = i, xlim = c(0, 5), ylim = c(0, 5)) By default, both are set to r, which adds +/- 4% to the range of each axis. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] png scaling problem
On Thu, 2005-09-01 at 21:51 +0200, Knut Krueger wrote: scaling-4 xywidth-480 resolution-150 png(filename = c:/r/anschluss/plots/4.png, width = xywidth*scaling, height = xywidth*scaling,pointsize = 12, bg = white, res = resolution*scaling) .. barplot(xrow,col = barcolors,cex.axis=scaling, ylab=mean time till attachment in sec,cex.lab=1.2*scaling) I tried to scale the barplot but there is one strange result: scaling=1 http://biostatistic.de/temp/1.png--- the ylab is ok scaling=2 http://biostatistic.de/temp/2.png--- the ylab is not ok scaling=4 http://biostatistic.de/temp/4.png--- the ylab is terrible is there any better solution to scale the resolution and the width/height? with regards Knut Probably a better first question is, why are you using a bitmapped graphics format if you need the image to be re-scaled? In general, bitmapped graphics do not resize well, though if you have a specific need and know a target image size, you can adjust various parameters to look decent. Are you going to view these images in a web browser, where you are concerned with display size and resolution? From your e-mail headers it appears you are on Windows. If you need a re-sizable graphic, use a vector based format such as wmf/emf, especially if you need the graphics embedded in a Windows application such as Word or Powerpoint. This is the default format under Windows when copying and pasting a graphic between applications. You can then, fairly freely, resize the image in the target application as you may require. If you are going to print the graphic directly or include it in a document for printing (as opposed to just viewing it), then use PDF or Postscript. The latter in EPS format, can be imported into many Windows applications like Word, including the generation of a preview image. However, they don't look good for direct use in presentations (unless you print to a PS file and then convert to PDF for viewing). See ?Devices for more information. With a better idea of how you plan to use the graphic(s), we can offer more specific recommendations on how to proceed. Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] crosstab for n-way contingency tables
On Tue, 2005-08-30 at 12:28 -0700, Isotta Felli wrote: Dear list. New to R, I'm looking for a way of using crosstab to output low-dimensional (higher than 2) contingency tables (frequencies, per-cents by rows, % by columns, mean, quantiles) I'm looking for something of the following sort dataframe: singers, categorical variates: voice category (soprano,mezzo-soprano, ...) , voice type( drammatic, spinto, lirico-spinto, lirico, leggero), school (german, italian, french, russian, anglo-saxon, other);repertory (opera, Lieder, oratorio, operetta) continuous variate: age I would like to tabulate the frequencies (relative percentages) say in the following way columns: school repertory rows : voice category voice type or to output in the cells of the above table, the statistics (mean/median/quantiles) for age I've seen that the function bwplot(age~school | repertory, data= singers, layout=c(4,2)) would do graphically something similar to what I want, but I desire the output also in tabular form Thanks Isotta I don't know that you will find a single function that will do all of what you desire, but you can look at the ctab() function, which is in the 'catspec' package on CRAN by John Hendrickx. This will do multi-way tables with summary row/col statistics. There is also the CrossTable() function in the 'gmodels' package on CRAN, though this will only do one way and two way tables, with summary row/col statistics. Neither of the above provide typical summary output for continuous data. They are more for categorical variables. For simple count multi-way output, you can also look at the ftable() function which is in the base package. Also look at the summary() function in the base package which will provide range, mean, quantile data for continuous variables. It may just be a matter of formatting the output in a fashion that you desire on a post analysis basis. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] ylim for graphic
On Mon, 2005-08-29 at 12:58 -0400, Doran, Harold wrote: Dear list: I have some data for which I am generating a series of barplots for percentages. One issue that I am dealing with is that I am trying to get the legend to print in a fixed location for each chart generated by the data. Because these charts are being created in a loop, with different data, my code searches the data to identify the maximum value in the data and then print the data values 5 points above the tallest bar. Now, in situations where the largest value is 100, I needed to create the y-axis high enough to accommodate the legend w/o crowding up the data or the bars. So, I have ylim =c(0,200) and then place the legend at c(2,150). Visually, this places things exactly where I want them. But, it seems silly to have a y-axis labeled as high as 200%. I'm certain there is a smarter technique. Is it possible to place the legend at a location higher than 100 to avoid the crowding of the bars and the data and then also have the labels for the y-axis not print after the value 100%? In looking at ?legend I didn't see an option that would address this. Below is some code that you can use to create a similar chart. Thanks, Harold math.bar - c(53,31,55,28,55,100) apmxpmeet - c(47, 50, 49, 50, 49, 46) par(ps=10) math.bar - rbind(math.bar, apmxpmeet) math.barplot - barplot(math.bar, beside=T, col=c('blue','orange'), names=c('Grade \n 3','Grade \n 4','Grade \n 5','Grade \n 6','Grade \n 7','Grade \n 8'), ylim=c(0,200), ylab=Percentage, xlab=Grade Level) tot - round(math.bar,digits=0) graph.max - max(math.bar, apmxpmeet, na.rm=T) text(math.barplot, graph.max+5, tot, xpd = TRUE, col = c(blue, orange) ) legend(2,150,legend=(c(Label A, Average)), fill=c(blue,orange)) Harold, A few thoughts: 1. Instead of fixing the y axis max value at 200, simply set ylim to c(0, max(math.bar * 1.2)) or a similar constant. In this case, you get an extra 20% above the max(y) value for the legend placement. 2. In the legend call, use: legend(topleft, legend=(c(Label A, Average)), fill = c(blue,orange)) This will place the legend at the topleft of the plot region, rather than you having to calculate the x and y coords. If you want it moved in from the upper left hand corner, you can use the 'inset' argument as well, which moves the legend by a proportion of the plot region limits (0 - 1): legend(topleft, legend=(c(Label A, Average)), fill = c(blue,orange), inset = .1) 3. You can use barplot(..., yaxt = n) to have the y axis not drawn and then use axis() to place the labels at locations of your choosing, which do not need to run the full length of the axis range: barplot(1:5, yaxt = n, ylim = c(0, 10)) axis(2, at = 0:5) 4. You can place the legend outside the plot region, enabling you to keep the y axis range to =100. This would need some tweaking, but the idea is the same: # Increase the size of the top margin par(mar = c(5, 4, 8, 2) + 0.1) # Draw a barplot barplot(1:5) # Disable clipping outside the plot region par(xpd = TRUE) # Now draw the legend, but move it up by 30% from the top left legend(topleft, legend = LETTERS[1:5], inset = c(0, -.3)) You could also place the legend to the right or left of the plot region if you prefer, adjusting the above accordingly. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Matrix oriented computing
On Fri, 2005-08-26 at 15:25 +0200, Peter Dalgaard wrote: Marc Schwartz [EMAIL PROTECTED] writes: x - c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975, 0.99, 0.995) df - c(1:100) mat - sapply(x, qchisq, df) dim(mat) [1] 100 11 str(mat) num [1:100, 1:11] 3.93e-05 1.00e-02 7.17e-02 2.07e-01 4.12e-01 ... outer() is perhaps a more natural first try... It does give the transpose of the sapply approach, though. round(t(outer(x,df,qchisq)),2) should be close. You should likely add dimnames. What I find interesting, is that I would have intuitively expected outer() to be faster than sapply(). However: system.time(mat - sapply(x, qchisq, df), gcFirst = TRUE) [1] 0.01 0.00 0.01 0.00 0.00 system.time(mat1 - round(t(outer(x, df, qchisq)), 2), gcFirst = TRUE) [1] 0.01 0.00 0.01 0.00 0.00 # No round() or t() to test for overhead system.time(mat2 - outer(x, df, qchisq), gcFirst = TRUE) [1] 0.01 0.00 0.02 0.00 0.00 # Bear in mind the round() on mat1 above all.equal(mat, mat1) [1] Mean relative difference: 4.905485e-05 all.equal(mat, t(mat2)) [1] TRUE Even when increasing the size of 'df' to 1:1000: system.time(mat - sapply(x, qchisq, df), gcFirst = TRUE) [1] 0.16 0.01 0.16 0.00 0.00 system.time(mat1 - round(t(outer(x, df, qchisq)), 2), gcFirst = TRUE) [1] 0.16 0.00 0.18 0.00 0.00 # No round() or t() to test for overhead system.time(mat2 - outer(x, df, qchisq), gcFirst = TRUE) [1] 0.16 0.01 0.17 0.00 0.00 It also seems that, at least in this case, t() and round() do not add much overhead. Best regards, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Matrix oriented computing
Prof. Ripley, Excellent point. Neither sapply() nor outer() are the elephant in the room in this situation. On Fri, 2005-08-26 at 16:55 +0100, Prof Brian Ripley wrote: Try profiling. Doing this many times to get an overview, e.g. for sapply with df=1:1000: % self% total self secondstotalsecondsname 98.26 6.78 98.26 6.78 FUN 0.58 0.04 0.58 0.04 unlist 0.29 0.02 0.87 0.06 as.vector 0.29 0.02 0.58 0.04 names- 0.29 0.02 0.29 0.02 names-.default 0.29 0.02 0.29 0.02 names so almost all the time is in qchisq. On Fri, 26 Aug 2005, Marc Schwartz (via MN) wrote: On Fri, 2005-08-26 at 15:25 +0200, Peter Dalgaard wrote: Marc Schwartz [EMAIL PROTECTED] writes: x - c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, 0.95, 0.975, 0.99, 0.995) df - c(1:100) mat - sapply(x, qchisq, df) dim(mat) [1] 100 11 str(mat) num [1:100, 1:11] 3.93e-05 1.00e-02 7.17e-02 2.07e-01 4.12e-01 ... outer() is perhaps a more natural first try... It does give the transpose of the sapply approach, though. round(t(outer(x,df,qchisq)),2) should be close. You should likely add dimnames. What I find interesting, is that I would have intuitively expected outer() to be faster than sapply(). However: system.time(mat - sapply(x, qchisq, df), gcFirst = TRUE) [1] 0.01 0.00 0.01 0.00 0.00 system.time(mat1 - round(t(outer(x, df, qchisq)), 2), gcFirst = TRUE) [1] 0.01 0.00 0.01 0.00 0.00 # No round() or t() to test for overhead system.time(mat2 - outer(x, df, qchisq), gcFirst = TRUE) [1] 0.01 0.00 0.02 0.00 0.00 # Bear in mind the round() on mat1 above all.equal(mat, mat1) [1] Mean relative difference: 4.905485e-05 all.equal(mat, t(mat2)) [1] TRUE Even when increasing the size of 'df' to 1:1000: system.time(mat - sapply(x, qchisq, df), gcFirst = TRUE) [1] 0.16 0.01 0.16 0.00 0.00 system.time(mat1 - round(t(outer(x, df, qchisq)), 2), gcFirst = TRUE) [1] 0.16 0.00 0.18 0.00 0.00 # No round() or t() to test for overhead system.time(mat2 - outer(x, df, qchisq), gcFirst = TRUE) [1] 0.16 0.01 0.17 0.00 0.00 It also seems that, at least in this case, t() and round() do not add much overhead. Definitely not for such small matrices. True and both are C functions, which of course helps as well. Best regards, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Unpaste Problem
On Fri, 2005-08-26 at 22:39 +0530, A Mani wrote: Hello, Easy ways to unpaste? xp - paste(x2, x3) # x2, x3 are two non-numeric columns. . . xfg - data.frame(xp,sc1, sc2, sc3) # sc1,sc2, sc3 are numeric cols. I want xp to be split up to form a new dataframe of the form (x3, sc1, sc2, sc3). IMPORTANT info : elements of xp have the form abcspaceefg, with abc in x2 and efg in x3. Thanks in advance, I think I understand what you are trying to do. Hopefully the below may be helpful: # Create the data frame with 3 rows x2 - letters[1:3] x3 - LETTERS[1:3] xp - paste(x2, x3) sc1 - rnorm(3) sc2 - rnorm(3) sc3 - rnorm(3) xfg - data.frame(xp, sc1, sc2, sc3) xfg xpsc1sc2sc3 1 a A 1.3479123 -1.0642578 0.2479218 2 b B -0.1586587 1.1237456 -1.3952176 3 c C 2.7807484 -0.9778066 -1.9322279 # Use strsplit() here to break apart 'xp' using as the split # Use [ in sapply() to get the second (2) element from each # 'xp' list pair. Note that I use as.character() here, since xfg$xp is # a factor. # See the output of: strsplit(as.character(xfg$xp), ) # for some insight into this approach xp.split - sapply(strsplit(as.character(xfg$xp), ), [, 2) # show post split values xp.split [1] A B C # Now cbind it all together into a new data frame # don't include column 1 from xfg (xp) xfg.new - cbind(xp.split, xfg[, -1]) xfg.new xp.splitsc1sc2sc3 1A 1.3479123 -1.0642578 0.2479218 2B -0.1586587 1.1237456 -1.3952176 3C 2.7807484 -0.9778066 -1.9322279 See ?strsplit for more information. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] help on retrieving output from by( ) for regression
That works also (using the example in ?lmList) library(lme4) ?lmList fm1 - lmList(breaks ~ wool | tension, warpbreaks) However, one still would need to use either sapply() or lapply() as below to get the details that Krishna is looking for. 'fm1' above is a list of models (S4 class 'lmList') not overly different from 'tmp' below, which is a list of models (S3 class 'by'). If you review str(fm1) and str(tmp), you would note that they are virtually identical, save the use of slots, etc. HTH, Marc Schwartz On Thu, 2005-08-25 at 12:17 -0400, Randy Johnson wrote: What about using lmList from the lme4 package? Randy On 8/25/05 9:44 AM, Marc Schwartz [EMAIL PROTECTED] wrote: Also, looking at the last example in ?by would be helpful: attach(warpbreaks) tmp - by(warpbreaks, tension, function(x) lm(breaks ~ wool, data = x)) # To get coefficients: sapply(tmp, coef) # To get residuals: sapply(tmp, resid) # To get the model matrix: sapply(tmp, model.matrix) To get the summary() output, I suspect that using: lapply(tmp, summary) would yield more familiar output as compared to using: sapply(tmp, summary) The output from the latter might require a bit more navigation through the resultant matrix, depending upon how the output is to be ultimately used. HTH, Marc Schwartz On Thu, 2005-08-25 at 14:57 +0200, TEMPL Matthias wrote: Look more carefully at ?lm at the See Also section ... X - rnorm(30) Y - rnorm(30) lm(Y~X) summary(lm(Y~X)) Best, Matthias Hi all I used a function qtrregr - by(AB, AB$qtr, function(AB) lm(AB$X~AB$Y)) objective is to run a regression on quartery subsets in the data set AB, having variables X and Y, grouped by variable qtr. Now i retrieved the output using qtrregr, however it only showed the coefficients (intercept and B) with out significant levels and residuals for each qtr. Can some on help me on how can retrieve the detailed regression output. rgds snvk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html ~~ Randy Johnson Laboratory of Genomic Diversity NCI-Frederick Bldg 560, Rm 11-85 Frederick, MD 21702 (301)846-1304 ~~ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] help on retrieving output from by( ) for regression
Sorry to reply to my own post, but in reviewing the NAMESPACE file for lme4, it looks like Doug is perhaps planning to add additional model object accessor methods for the lmList class, including resid() and summary(), which are commented out now. coef() is available presently. So when in place, it would appear that the use of lmList would be advantageous over the use of by(). Marc On Thu, 2005-08-25 at 11:52 -0500, Marc Schwartz (via MN) wrote: That works also (using the example in ?lmList) library(lme4) ?lmList fm1 - lmList(breaks ~ wool | tension, warpbreaks) However, one still would need to use either sapply() or lapply() as below to get the details that Krishna is looking for. 'fm1' above is a list of models (S4 class 'lmList') not overly different from 'tmp' below, which is a list of models (S3 class 'by'). If you review str(fm1) and str(tmp), you would note that they are virtually identical, save the use of slots, etc. HTH, Marc Schwartz On Thu, 2005-08-25 at 12:17 -0400, Randy Johnson wrote: What about using lmList from the lme4 package? Randy On 8/25/05 9:44 AM, Marc Schwartz [EMAIL PROTECTED] wrote: Also, looking at the last example in ?by would be helpful: attach(warpbreaks) tmp - by(warpbreaks, tension, function(x) lm(breaks ~ wool, data = x)) # To get coefficients: sapply(tmp, coef) # To get residuals: sapply(tmp, resid) # To get the model matrix: sapply(tmp, model.matrix) To get the summary() output, I suspect that using: lapply(tmp, summary) would yield more familiar output as compared to using: sapply(tmp, summary) The output from the latter might require a bit more navigation through the resultant matrix, depending upon how the output is to be ultimately used. HTH, Marc Schwartz On Thu, 2005-08-25 at 14:57 +0200, TEMPL Matthias wrote: Look more carefully at ?lm at the See Also section ... X - rnorm(30) Y - rnorm(30) lm(Y~X) summary(lm(Y~X)) Best, Matthias Hi all I used a function qtrregr - by(AB, AB$qtr, function(AB) lm(AB$X~AB$Y)) objective is to run a regression on quartery subsets in the data set AB, having variables X and Y, grouped by variable qtr. Now i retrieved the output using qtrregr, however it only showed the coefficients (intercept and B) with out significant levels and residuals for each qtr. Can some on help me on how can retrieve the detailed regression output. rgds snvk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html ~~ Randy Johnson Laboratory of Genomic Diversity NCI-Frederick Bldg 560, Rm 11-85 Frederick, MD 21702 (301)846-1304 ~~ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Remove NAs from Barplot
On Wed, 2005-08-24 at 13:06 -0400, Doran, Harold wrote: Dear List: I'm creating a series of barplots using Sweave that must assume a standard format. This is student achievement data and the x-axis must include all grades 3 to 8. In some cases, the data for a grade (or more than one grade) are missing in the vector math.bar, but are never missing for the vector apmxpmeet. The following sample code illustrates the issue. Using the code below to create the plot works fine. However, the following command is designed to place the data onto the plot: text(math.barplot, graph.max+5, format(tot), xpd = TRUE, col = c(blue, orange) ) This does work, but, it also places 'NA's on the plot. Is there a way to place the data onto the plot such that the numbers appear, but for the NAs not to appear? I've searched through ?text but was not able to find a solution. math.bar - c(NA,NA,NA,NA,55,48) apmxpmeet - c(47, 50, 49, 50, 49, 46) par(ps=10) math.bar - rbind(math.bar, apmxpmeet) math.barplot - barplot(math.bar, beside=T, col=c('blue','orange'), names=c('Grade \n 3','Grade \n 4','Grade \n 5','Grade \n 6','Grade \n 7','Grade \n 8'), ylim=c(0,120), ylab=Percentage, xlab=Grade Level) tot - round(math.bar,digits=0) graph.max - max(math.bar, apmxpmeet, na.rm=T) text(math.barplot, graph.max+5, format(tot), xpd = TRUE, col = c(blue, orange) ) tmp$sch_nameS05 - as.character(tmp$sch_nameS05) legend(2,120,legend=(c(tmp$sch_nameS05, Average)), fill=c(blue,orange)) Thanks, Harold Harold, Get rid of the 'format(tot)' in your text() call: text(math.barplot, graph.max+5, tot, xpd = TRUE, col = c(blue, orange)) By using format(tot), you are coercing the values to text. Thus: format(tot) [,1] [,2] [,3] [,4] [,5] [,6] math.bar NA NA NA NA 55 48 apmxpmeet 47 50 49 50 49 46 versus: tot [,1] [,2] [,3] [,4] [,5] [,6] math.barNA NA NA NA 55 48 apmxpmeet 47 50 49 50 49 46 In the first case, text() will draw NA, whereas in the second, the NA is ignored. Also, your code above is not entirely reproducible, since: tmp$sch_nameS05 - as.character(tmp$sch_nameS05) Error: Object tmp not found HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Seeking help with an apparently simple recoding problem
On Tue, 2005-08-23 at 10:12 -0500, Greg Blevins wrote: Hello, I have struggled, for longer than I care to admit, with this seemingly simple problem, but I cannot find a solution other than the use of long drawn out ifelse statements. I know there has to be a better way. Here is stripped down version of the situation: I start with: a - c(1,0,1,0,0,0,0) b - c(1,1,1,1,0,0,0) c - c(1,1,0,1,0,0,0) rbind(a,b,c) [,1] [,2] [,3] [,4] [,5] [,6] [,7] a1010000 b1111000 c1101000 I refer to column 3 as the target column, which at the end of the day will be NA in all instances. The logic involved: 1) If columns 2, 4 thru 7 do NOT include at least one '1', then recode columns 2 thru 7 to NA and recode column 1 to code 2. 2) If columns 2, 4 thru 7 contain at least one '1', then recode column 3 to NA. Desired recoding of the above three rows: [,1][,2][,3][,4][,5][,6][,7] a2NA NA NA NA NA NA b11 NA 1 0 0 0 c11 NA 1 0 0 0 Thanks you. You left out one key detail in the explanation, which is that the recoding appears to be done on a row by row basis, not overall. The following gets the job done, though there may be a more efficient approach: a - c(1,0,1,0,0,0,0) b - c(1,1,1,1,0,0,0) c - c(1,1,0,1,0,0,0) d - rbind(a, b, c) d [,1] [,2] [,3] [,4] [,5] [,6] [,7] a1010000 b1111000 c1101000 mod.row - function(x) { if (all(x[c(2, 4:7)] == 0)) { x[2:7] - NA x[1] - 2 } else { x[3] - NA } x } y - t(apply(d, 1, mod.row)) y [,1] [,2] [,3] [,4] [,5] [,6] [,7] a2 NA NA NA NA NA NA b11 NA1000 c11 NA1000 HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] plotting with same axes
On Mon, 2005-08-22 at 12:58 -0400, [EMAIL PROTECTED] wrote: I have used the 'par' command to overlay one plot on another. But how do I overlay it with the x-values plotted at the same points on the x-axis? Thank you, Steven The specific answer depends to an extent on the graphics you are plotting and whether there is a need to actually use par(new). In some cases there is an 'add = TRUE' option to plotting functions that allows for this. In others, you can create the initial plot and then use lines(), points() or the like to simply add further plotting components to the existing plot. If all else fails and you need to use par(new), then you will want to explicitly define the x and y axes to the same ranges by using the same 'xlim' and 'ylim' arguments in _both_ plotting functions. A reproducible example of what it is you are doing would enable us to give you more specific guidance, if the above is not helpful. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Copying rows from a matrix using a vector of indices
On Wed, 2005-08-17 at 12:02 -0700, Martin Lam wrote: Hi, I am trying to use a vector of indices to select some rows from a matrix. But before I can do that I somehow need to convert 'combinations' into a list, since 'mode(combinations)' says it's 'numerical'. Any idea how I can do that? library(combinat) combinations - t(combn(8,2)) indices - c(sample(1:length(combinations),10)) # convert ??? subset - combinations[indices] Thanks in advance, Martin Your goal is not entirely clear. Given your code above, you end up with: library(combinat) combinations - t(combn(8,2)) combinations [,1] [,2] [1,]12 [2,]13 [3,]14 [4,]15 [5,]16 [6,]17 [7,]18 [8,]23 [9,]24 [10,]25 [11,]26 [12,]27 [13,]28 [14,]34 [15,]35 [16,]36 [17,]37 [18,]38 [19,]45 [20,]46 [21,]47 [22,]48 [23,]56 [24,]57 [25,]58 [26,]67 [27,]68 [28,]78 combinations is a 28 x 2 matrix (or a 1d vector of length 56). Your use of sample() with '1:length(combinations)' is the same as 1:56. length(combinations) [1] 56 Hence, you end up with: set.seed(128) indices - sample(1:length(combinations), 10) indices [1] 41 53 38 46 50 44 25 11 43 7 I suspect that you really want to use '1:nrow(combinations)', which yields 1:28. nrow(combinations) [1] 28 Thus, set.seed(128) indices - sample(1:nrow(combinations), 10) indices [1] 21 26 18 22 23 20 11 5 27 3 Now, you can use: subset - combinations[indices, ] subset [,1] [,2] [1,]47 [2,]67 [3,]38 [4,]48 [5,]56 [6,]46 [7,]26 [8,]16 [9,]68 [10,]14 which yields the rows from combinations, defined by 'indices'. If correct, please review An Introduction to R and ?[ for more information on subsetting objects in R. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] recategorizing a vector into discrete states
On Tue, 2005-08-16 at 13:18 -0400, Allan Strand wrote: Hi All, I'm trying to take a numerical vector and produce a new vector of the same length where each element in the first is placed into a category given by a 'breaks-like' vector. The values in the result should equal the lower bounds of each category as defined in the breaks vector. I suspect that a vectorized solution is pretty simple, but I can't seem to figure it out today. Here is an example of my problem: Vector 'a' is the original vector. Vector 'b' gives the lower bounds of the categories. Vector 'c' is the result I am seeking. a - c(0.9, 11, 1.2, 2.4, 4.0, 5.0, 7.3, 8.1, 3.3, 4.5) b - c(0, 2, 4, 6, 8) c - c(0, 8, 0, 2, 4, 4, 6, 8, 2, 4) Any suggestions would be greatly appreciated. cheers, Allan Strand See ?cut a - c(0.9, 11, 1.2, 2.4, 4.0, 5.0, 7.3, 8.1, 3.3, 4.5) b - c(0, 2, 4, 6, 8) cut(a, c(b, Inf), labels = b, right = FALSE) [1] 0 8 0 2 4 4 6 8 2 4 Levels: 0 2 4 6 8 Note that cut() returns a factor. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] use different symbols for frequency in a plot
On Mon, 2005-08-08 at 11:57 -0700, Kerry Bush wrote: suppose I have the following data x-c(rep(.1,5),rep(.2,6),rep(.4,10),rep(.5,20)) y-c(rep(.5,3),rep(.6,8),rep(1.2,8),rep(2.5,18),rep(3,4)) If I plot(x,y) in R, I will only get seven distinct points. What I want to do is to use different symbols to show the frequency at each point. e.g. if the frequncey is between 1 and 5, then I plot the point as a circle; if the frequency is between 6 and 10, then I plot the point as a square; if the frequency is above 10, then I plot the point as a triangle. I am not sure how to do this in R. Can anybody help me? You might want to review this recent post by Deepayan Sarkar: https://stat.ethz.ch/pipermail/r-help/2005-July/074042.html with modest modification you can replace his example, which plots the frequencies with: x - c(rep(.1,5),rep(.2,6),rep(.4,10),rep(.5,20)) y - c(rep(.5,3),rep(.6,8),rep(1.2,8),rep(2.5,18),rep(3,4)) temp - data.frame(x, y) foo - subset(as.data.frame(table(temp)), Freq 0) foo x y Freq 1 0.1 0.53 5 0.1 0.62 6 0.2 0.66 11 0.4 1.28 15 0.4 2.52 16 0.5 2.5 16 20 0.5 34 # Use cut() to create the bins and specify the plotting symbols # for each bin, which are the 'label' values foo$sym - with(foo, cut(Freq, c(0, 5, 10, Inf), labels = c(21, 22, 24))) # convert 'foo' to all numeric from factors above for plotting foo - apply(foo, 2, function(x) as.numeric(as.character(x))) foo x y Freq sym 1 0.1 0.53 21 5 0.1 0.62 21 6 0.2 0.66 22 11 0.4 1.28 22 15 0.4 2.52 21 16 0.5 2.5 16 24 20 0.5 34 21 # Now do the plot. Keep in mind that 'foo' is now # a matrix, rather than a data frame plot(foo[, x], foo[, y], pch = foo[, sym]) HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] plotting horizontally
On Tue, 2005-07-26 at 08:58 +0700, [EMAIL PROTECTED] wrote: Hello, Is there any way to use plot() horizontally similar to boxplot(., horiz=TRUE)? I want to use to illustrate the distribution of y-values on an adjacent plot using layout(). Thanks in advance for any help. Regards, --Dan Have you looked at the last example in ?layout, which has a scatterplot with marginal histograms for the x and y axes? Alternatively, you can generally reverse the x and y values for most plots. For example, compare: d - density(rnorm(100)) # Vertical density plot plot(d) # Horizontal density plot plot(d$y, d$x, type = l, xlab = Density, main = density(y = rnorm(100)), ylab = paste(N =, d$n, Bandwidth =, formatC(d$bw))) HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] elegant solution to transform vector into percentages?
On Tue, 2005-07-26 at 15:48 -0400, [EMAIL PROTECTED] wrote: Hi, I am looking for an elegant way to transform a vector into percentages of values that meet certain criteria. store-c(1,1.4,3,1.1,0.3,0.6,4,5) # now I want to get the precentages of values # that fall into the categories =M , M =N , N # let M -.8 N - 1.2 # In my real example I have many more of these cutoff-points # What I did is: out - matrix(NA,1,3) out[1,1] - ( (sum(store=M)) /length(store) )*100 out[1,2] - ( (sum(store M store= N )) /length(store) )*100 out[1,3] - ( (sum(store N)) /length(store) )*100 colnames(out)-c(percent=M,percentM =N,percentN) out But this gets very tedious if I have many cutoff-points. Does anybody know a more elegant way to do this task? Thanks so much. Cheers, Jens Something alone the lines of: store - c(1, 1.4, 3, 1.1, 0.3, 0.6, 4, 5) M - 0.8 N - 1.2 x - hist(store, br = c(min(store), M, N, max(store)), plot = FALSE)$counts pct.x - prop.table(x) * 100 names(pct.x) - c(percent = M,percent M = N,percent N) pct.x percent = M percent M = Npercent N 25 25 50 I think that should do it. See ?hist for more information and take note of the 'include.lowest' and 'right' arguments relative to whether or not values are or are not included in the specified intervals. See ?prop.table as well. Also be acutely aware of potential problems with exact equality comparisons with floating point numbers and the break points...if you have a float value equal to a breakpoint in your vector. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] problems with submitting an eps-file created in R
On Fri, 2005-07-22 at 16:26 +0200, Christian Bieli wrote: Dear all I've got some problems submitting a manuscript, because I can't manage creating the favourable eps-file of a graph created in R. The journal's graphic requirements are as followed: format: eps width: max. 6 inches resolution: min. 1000 dpi supported fonts: Arial, Courier, Helvetica, Symbol, Times, Charcoal, Chicago, Geneva, Georgia, Monaco, Zapf, New York Itried to ways of getting appropriate file: 1.Creating eps-file in R by drawing into a x11-device and then: /dev.copy2eps(file = file.eps, onefile = TRUE, paper = a4, family = Helvetica, pointsize=1, print.it = FALSE, fonts = Helvetica) /2. Generating a postscript-file in R with / //postscript(file = file.ps, onefile = TRUE, paper = a4, family = Helvetica, width = 6, height = 2.2, pointsize=1, print.it = FALSE, fonts = Helvetica)/ an trying to convert it in ghostview. Neither approach brought the favoured result. The error message I got from the quality checking program was : Warning: Document is Missing Non-Standard Font One or more non-standard fonts used in this image is not embedded. Standard fonts are: Arial, Courier, Helvetica, Symbol, Times, Charcoal, Chicago, Geneva, Georgia, Monaco, Zapf, New York. In order to repair this problem, save your document with fonts embedded. Error: Missing Fonts Challenge One or more linked or used fonts cannot be found. This is caused by DigitalExpert not being able to locate fonts on your system. All fonts used in the document must be active or have a defined and valid path so that DigitalExpert can find them. Solution The only way to repair this problem is to make the fonts available to DigitalExpert, and then reprocess the files. In order to make fonts active, either activate them using a font management program, or move them into the system:fonts folder. 1. Obviously there's a font problem. Helvetica IS installed in my OS (windows 2000). Why DigitalExpert does not recognize the Helvetica font as a standard font? I thought the fonts option would embed the specified font in the file. Am I right or is there another way to embed fonts? 2. Is there a way to set up the resolution of the created ps/eps-file in R (until now I did the settings in ghostview)? 3. I did not find particulars about the pointsize option. What is the effect of changing the pointsize value? With best regards Christian You need to read the help for postscript(), which specifically tells you in the Details section to use: postscript(..., onefile = FALSE, horizontal = FALSE, paper = special) to generate EPS files. There is no resolution setting for a postscript file per se, since PS is a device independent vector based format and the resolution of the output is dependent upon the target device/viewer. One exception to this would be the embedding of a bitmapped object in a PS file, but that is not applicable here. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Question about 'text' (add lm summary to a plot)
Ok guys, So I played around with this a bit, going back to Dan's original requirements and using Thomas' do.call() approach with legend(). Gabor's approach using sapply() will also work here. I have the following: # Note the leading spaces here for alignment in the table # This could be automated with formatC() or sprintf() my.slope.1 -3.22 my.slope.2 - 0.13 my.inter.1 - -10.66 my.inter.2 - 1.96 my.Rsqua - 0.97 plot(1:5) L - list(Intercept:, Slope:, bquote(paste(R^2, :)), bquote(.(my.inter.1) %+-% .(my.inter.2)), bquote(.(my.slope.1) %+-% .(my.slope.2)), bquote(.(my.Rsqua))) par(family = mono) legend(topleft, legend = do.call(expression, L), ncol = 2) Note however, that while using the mono font helps with vertical alignment of numbers, the +/- sign still comes out in the default font, which is bold[er] than the text. If one uses the default font, which is variable spaced, it is problematic to get the proper alignment for the numbers. I even tried using phantom(), but that didn't quite get it, since the spacing is variable, as opposed to LaTeX's mono numeric spacing with default fonts. Also, note that I am only using two columns, rather than three, since trying to place the : as a middle column results in spacing that is too wide, given that the text.width argument is a scalar and is set to the maximum width of the character vectors. Note also that even with mono spaced fonts, the exponent in R^2 is still horizontally smaller than the other characters. Thus, spacing on that line may also be affected depending upon what else one might attempt. Not sure where else to go from here. HTH, Marc Schwartz On Fri, 2005-07-22 at 14:01 -0400, Gabor Grothendieck wrote: You are right. One would have to use do.call as you did or the sapply method of one of my previous posts: a - 7 plot(1) L - list(bquote(alpha==.(a)),bquote(alpha^2+1==.(a^2+1))) legend(topleft,legend=sapply(L, as.expression)) On 7/22/05, Thomas Lumley [EMAIL PROTECTED] wrote: On Fri, 22 Jul 2005, Gabor Grothendieck wrote: I think legend accepts a list argument directly so that could be simplified to just: a-7 plot(1) L - list(bquote(alpha==.(a)),bquote(alpha^2+1==.(a^2+1))) legend(topleft,legend=L) Except that it wouldn't then work: the mathematical stuff comes out as text. The same comment seems to apply to my prior suggestion about as.expression(bquote(...)), namely that one can just write the following as text also supports a list argument: And this doesn't work either: you end up with %+-% rather than the plus-or-minus symbol. The reason I gave the do.call() version is that I had tried these simpler versions and they didn't work. -thomas __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] R:plot and dots
On Thu, 2005-07-21 at 16:18 +0200, Clark Allan wrote: hi all a very simple question. i have plot(x,y) but i would like to add in on the plot the observation number associated with each point. how can this be done? / allan If you mean the unique observation number associated with each x,y pair, you can use: text(x, y, labels = ObsNumberVector, pos = 3) after the plot(x, y) call: df - data.frame(x = rnorm(10), y = rnorm(10), ID = 1:10) with(df, plot(x, y)) with(df, text(x, y, labels = ID, pos = 3)) See ?text for more information. Note that I used pos = 3 which places the label above the data point. There are other positioning parameters available, which are noted in the help file. Note also that you might have to adjust the plot axis limits depending upon where you place the text and your extreme points. If you mean the frequency of each x,y pair (if there is more than one observation per x,y pair), you might want to review Deepayan's recent post here: https://stat.ethz.ch/pipermail/r-help/2005-July/074042.html HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Question about 'text' (add lm summary to a plot)
[Note: the initial posts have been re-arranged to attempt to maintain the flow from top to bottom] Dan Bolser writes: I would like to annotate my plot with a little box containing the slope, intercept and R^2 of a lm on the data. I would like it to look like... ++ | Slope : 3.45 +- 0.34 | | Intercept : -10.43 +- 1.42 | | R^2 : 0.78 | ++ However I can't make anything this neat, and I can't find out how to combine this with symbols for R^2 / +- (plus minus). Below is my best attempt (which is franky quite pour). Can anyone improve on the below? Specifically, aligned text and numbers, aligned decimal places, symbol for R^2 in the text (expression(R^2) seems to fail with 'paste') and +- Cheers, Dan. dat.lm - lm(dat$AVG_CH_PAIRS ~ dat$CHAINS) abline(coef(dat.lm),lty=2,lwd=1.5) dat.lm.sum - summary(dat.lm) dat.lm.sum attributes(dat.lm.sum) my.text.1 - paste(Slope : , round(dat.lm.sum$coefficients[2],2), +/-, round(dat.lm.sum$coefficients[4],2)) my.text.2 - paste(Intercept : , round(dat.lm.sum$coefficients[1],2), +/-, round(dat.lm.sum$coefficients[3],2)) my.text.3 - paste(R^2 : , round(dat.lm.sum$r.squared,2)) my.text.1 my.text.2 my.text.3 ## Add legend text(x=3, y=300, paste(my.text.1, my.text.2, my.text.3, sep=\n), adj=c(0,0), cex=1 On Thu, 21 Jul 2005, Christoph Buser wrote: Dear Dan I can only help you with your third problem, expression and paste. You can use: plot(1:5,1:5, type = n) text(2,4,expression(paste(Slope : , 3.45%+-%0.34, sep = )), pos = 4) text(2,3.8,expression(paste(Intercept : , -10.43%+-%1.42)), pos = 4) text(2,3.6,expression(paste(R^2,: , 0.78, sep = )), pos = 4) I do not have an elegant solution for the alignment. On Thu, 2005-07-21 at 19:55 +0100, Dan Bolser wrote: Cheers for this. I was trying to get it to work, but the problem is that I need to replace the values above with variables, from the following code... dat.lm - lm(dat$AVG_CH_PAIRS ~ dat$CHAINS) dat.lm.sum - summary(dat.lm) my.slope.1 - round(dat.lm.sum$coefficients[2],2) my.slope.2 - round(dat.lm.sum$coefficients[4],2) my.inter.1 - round(dat.lm.sum$coefficients[1],2) my.inter.2 - round(dat.lm.sum$coefficients[3],2) my.Rsqua.1 - round(dat.lm.sum$r.squared,2) Anything I try results in either the words 'paste(Slope:, my.slope.1, %+-%my.slope.2,sep=)' being written to the plot, or just 'my.slope.1+-my.slope2' (where the +- is correctly written). I want to script it up and write all three lines to the plot with 'sep=\n', rather than deciding three different heights. Thanks very much for what you gave, its a good start for me to figure out how I am supposed to be telling R what to do! Any way to just get fixed width fonts with text? (for the alignment problem) Dan, Here is one approach. It may not be the best, but it gets the job done. You can certainly take this and encapsulate it in a function to automate the text/box placement and to pass values as arguments. A couple of quick concepts: 1. As far as I know, plotmath cannot do multiple lines, so each line in your box needs to be done separately. 2. The horizontal alignment is a bit problematic when using expression() or bquote() since I don't believe that multiple spaces are honored as such after parsing. Thus I break up each component (label, : and values) into separate text() calls. The labels are left justified. 3. The alignment for the numeric values are done with right justification. So, as long as you use a consistent number of decimals in the value outputs (2 here), you should be OK. This means you might need to use formatC() or sprintf() to control the numeric output values on either side of the +/- sign. 4. In the variable replacement, note the use of substitute() and the list of x and y arguments as replacement values in the expressions. # Set your values my.slope.1 - 3.45 my.slope.2 - 0.34 my.inter.1 - -10.43 my.inter.2 - 1.42 my.Rsqua - 0.78 # Create the initial plot as per Christoph's post plot(1:5, 1:5, type = n) #- # Do the Slope #- text(1, 4.5, Slope, pos = 4) text(2, 4.5, :) text(3, 4.5, substitute(x %+-% y, list(x = my.slope.1, y = my.slope.2)), pos = 2) #- # Do the Intercept #- text(1, 4.25, Intercept, pos = 4) text(2, 4.25, :) text(3, 4.25, substitute(x %+-% y, list(x = my.inter.1, y = my.inter.2)), pos = 2)
Re: [R] using argument names (of indeterminate number) within a function
On Tue, 2005-07-19 at 17:29 +0200, Dirk Enzmann wrote: Although I tried to find an answer in the manuals and archives, I cannot solve this (please excuse that my English and/or R programming skills are not good enough to state my problem more clearly): I want to write a function with an indeterminate (not pre-defined) number of arguments and think that I should use the ... construct and the match.call() function. The goal is to write a function that (among other things) uses cbind() to combine a not pre-defined number of vectors specified in the function call. For example, if my vectors are x1, x2, x3, ... xn, within the function I want to use cbind(x1, x2) or cbind(x1, x3, x5) or ... depending on the vector names I use in the funcion call. Additionally, the function has other arguments. In the archives I found the following thread (followed by Marc Schwartz) http://finzi.psych.upenn.edu/R/Rhelp02a/archive/15186.html [R] returning argument names from Peter Dalgaard BSA on 2003-04-10 (stdin) that seems to contain the solution to my problem, but I am stuck because sapply(match.call()[-1], deparse) gives me a vector of strings and I don't know how to use the names in this vector in the cbind() function. Up to now my (clearly deficit) function looks like: test - function(..., mvalid=1) { args = sapply(match.call()[-1], deparse) # and here, I don't know how the vector names in args # can be used in the cbind() function to follow: # # temp - cbind( ??? if (mvalid 1) { # here it goes on } } Ultimately, I want that the function can be called like test(x1,x2,mvalid=1) or test(x1,x3,x5,mavlid=2) and that within the function cbind(x1,x2) or cbind(x1,x3,x5) will be used. Can someone give and explain an example / a solution on how to proceed? Hi Dirk, How about this: my.cbind - function(...) { do.call(cbind, list(...)) } a - 1:10 b - 11:20 c - 21:30 d - 31:40 my.cbind(a, b) [,1] [,2] [1,]1 11 [2,]2 12 [3,]3 13 [4,]4 14 [5,]5 15 [6,]6 16 [7,]7 17 [8,]8 18 [9,]9 19 [10,] 10 20 my.cbind(b, c, d) [,1] [,2] [,3] [1,] 11 21 31 [2,] 12 22 32 [3,] 13 23 33 [4,] 14 24 34 [5,] 15 25 35 [6,] 16 26 36 [7,] 17 27 37 [8,] 18 28 38 [9,] 19 29 39 [10,] 20 30 40 my.cbind(a, b, c, d) [,1] [,2] [,3] [,4] [1,]1 11 21 31 [2,]2 12 22 32 [3,]3 13 23 33 [4,]4 14 24 34 [5,]5 15 25 35 [6,]6 16 26 36 [7,]7 17 27 37 [8,]8 18 28 38 [9,]9 19 29 39 [10,] 10 20 30 40 The use of list(...) in the function allows you to use list based functions such as do.call() or lapply() against the argument objects directly without having to deparse and re-parse the character names of the arguments, which is the approach that Peter used in his response in the thread you referenced. The OP in that thread wanted the argument names as character vectors, as opposed to the argument objects themselves, which is what you need here. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] .gct file
On Tue, 2005-07-19 at 12:28 -0400, Duncan Murdoch wrote: On 7/19/2005 12:10 PM, mark salsburg wrote: I have two files to compare, one is a regular txt file that I can read in no prob. The other is a .gct file (How do I read in this one?) I tried a simple read.table(data.gct, header = T) How do you suggest reading in this file?? .gct is not a standard filename extension. You need to know what is in that file. Where did you get it? What program created it? Chances are the easiest thing to do is to get the program that created it to export in a well known format, e.g. .csv. Duncan Murdoch A quick Google search would suggest Gene Cluster Text file: http://www.broad.mit.edu/cancer/software/genepattern/tutorial/gp_tutorial_fileformats.html#gct produced by Gene Pattern: http://www.broad.mit.edu/cancer/software/genepattern/ If correct, I would point Mark to the Bioconductor folks for more information and assistance: http://www.bioconductor.org/ HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] .gct file
On Tue, 2005-07-19 at 13:16 -0400, mark salsburg wrote: ok so the gct file looks like this: #1.2 (version number) 7283 19 (matrix size) Name Description Values ... .. How can I tell R to disregard the first two lines and start reading the 3rd line in this gct file. I would just delete them, but I do not know how to open a gct. file thank you On 7/19/05, Duncan Murdoch [EMAIL PROTECTED] wrote: On 7/19/2005 12:10 PM, mark salsburg wrote: I have two files to compare, one is a regular txt file that I can read in no prob. The other is a .gct file (How do I read in this one?) I tried a simple read.table(data.gct, header = T) How do you suggest reading in this file?? .gct is not a standard filename extension. You need to know what is in that file. Where did you get it? What program created it? Chances are the easiest thing to do is to get the program that created it to export in a well known format, e.g. .csv. Duncan Murdoch The above would be consistent with the info in my reply. I guess if the format is consistent, as per Mark's example above, you can use: read.table(data.gct, skip = 2, header = TRUE) which will start by skipping the first two lines and then reading in the header row and then the data. See ?read.table HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] .gct file
For the TAB delimited columns, adjust the 'sep' argument to: read.table(data.gct, skip = 2, header = TRUE, sep = \t) The 'quote' argument is by default: quote = \' which should take care of the quoted strings and bring them in as a single value. The above presumes that the header row is also TAB delimited. If not, you may have to set 'skip = 3' to skip over the header row and manually set the column names. HTH, Marc Schwartz On Tue, 2005-07-19 at 13:52 -0400, mark salsburg wrote: This is all extremely helpful. The data turns out is a little atypical, the columns are tab-delemited except for the description columns DATA1.gct looks like this #1.2 23 3423 NAME DESCRIPTION VALUE gene1 a protein inducer 1123 . . .. How do I get R to read the data as tab delemited, but read in the 2nd coloumn as one value based on the quotation marks.. thanks.. On 7/19/05, Marc Schwartz (via MN) [EMAIL PROTECTED] wrote: On Tue, 2005-07-19 at 13:16 -0400, mark salsburg wrote: ok so the gct file looks like this: #1.2 (version number) 7283 19 (matrix size) Name Description Values ... .. How can I tell R to disregard the first two lines and start reading the 3rd line in this gct file. I would just delete them, but I do not know how to open a gct. file thank you On 7/19/05, Duncan Murdoch [EMAIL PROTECTED] wrote: On 7/19/2005 12:10 PM, mark salsburg wrote: I have two files to compare, one is a regular txt file that I can read in no prob. The other is a .gct file (How do I read in this one?) I tried a simple read.table(data.gct, header = T) How do you suggest reading in this file?? .gct is not a standard filename extension. You need to know what is in that file. Where did you get it? What program created it? Chances are the easiest thing to do is to get the program that created it to export in a well known format, e.g. .csv. Duncan Murdoch The above would be consistent with the info in my reply. I guess if the format is consistent, as per Mark's example above, you can use: read.table(data.gct, skip = 2, header = TRUE) which will start by skipping the first two lines and then reading in the header row and then the data. See ?read.table HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] .gct file
On Tue, 2005-07-19 at 13:08 -0500, Marc Schwartz (via MN) wrote: For the TAB delimited columns, adjust the 'sep' argument to: read.table(data.gct, skip = 2, header = TRUE, sep = \t) The 'quote' argument is by default: quote = \' which should take care of the quoted strings and bring them in as a single value. The above presumes that the header row is also TAB delimited. If not, you may have to set 'skip = 3' to skip over the header row and manually set the column names. One correction. If the final para applies and you need to use 'skip = 3', you would also need to leave out the 'header = TRUE' argument, which defaults to FALSE. Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Can't get sample function from An Introduction to R to work
On Fri, 2005-07-15 at 15:40 -0500, David Groos wrote: I'm trying to figure out R, a piece at a time, hours at a time... I was trying to copy the sample function in, An Introduction to R (for version 2.1.0) by W. N. Venables, D. M. Smith, page 42. Section 10.1 Simple examples provides a sample function which I tried to duplicate (I'm using Mac OS X 10.3.9, and R for Mac OS X Aqua GUI v1.11). The following is what I typed and the last line is R's response when I hit the return key after the penultimate line. I've re-checked and re-typed the code many times to no avail. I wasn't able to find this issue using search options, either. Any help is GREATLY appreciated! twosam-function(y1, y2) { + n1-length(y1);n2 -length(y2) + yb1-mean(y1); yb2-mean(y2) + s1-var(y1);s2-var(y2) + s-((n1-1)*s1 + (n2-1)*s2)/(n1+n2-2) + tst-(yb1-yb2)/sqrt(s*(1/n1+1/n2)) Error: syntax error David The code as you have above (without the + on each line) works for me both in ESS and in the R console under Linux. There should be another + on the next line, in anticipation of the remaining two lines: tst } Try to copy and paste the following into the console as is: twosam-function(y1, y2) { n1 - length(y1);n2 - length(y2) yb1 - mean(y1); yb2 - mean(y2) s1 - var(y1);s2 - var(y2) s - ((n1-1)*s1 + (n2-1)*s2)/(n1+n2-2) tst - (yb1-yb2)/sqrt(s*(1/n1+1/n2)) and see what happens. You should be left at a: + on a new line again. It is possible that there is a bug in the Aqua GUI, but not using a Mac, I cannot replicate it. You might want to consider subscribing and posting to the R-SIG-Mac e-mail list, which is focused on Mac users of R. More information is here: https://stat.ethz.ch/mailman/listinfo/r-sig-mac HTH, Marc Schwartz P.S. Greetings from Eden Prairie __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] plot the number of replicates at the same point
On Thu, 2005-07-14 at 12:30 -0500, Deepayan Sarkar wrote: On 7/14/05, Kerry Bush [EMAIL PROTECTED] wrote: Thank you for thinking about the problem for me. However, I have found that your method doesn't work at all. You may test the following example: x1=c(0.6,0.4,.4,.4,.2,.2,.2,0,0) x2=c(0.4,.2,.4,.6,0,.2,.4,0,.2) x1=rep(x1,4) x2=rep(x2,4) temp=data.frame(x1,x2) temp1=table(temp) plot(temp$x1,temp$x2,cex=0) text(as.numeric(rownames(temp1)), as.numeric(colnames(temp1)), temp1) what I got here is not what I wanted. You may compare with plot(x1,x2) I actually want some plots similar to what SAS proc plot produced. Does anybody have a clue of how to do this easily in R? Try this: x1=c(0.6,0.4,.4,.4,.2,.2,.2,0,0) x2=c(0.4,.2,.4,.6,0,.2,.4,0,.2) x1=rep(x1,4) x2=rep(x2,4) temp=data.frame(x1,x2) foo - subset(as.data.frame(table(temp)), Freq 0) foo$x1 - as.numeric(as.character(foo$x1)) foo$x2 - as.numeric(as.character(foo$x2)) with(foo, plot(x1, x2, type = n)) with(foo, text(x1, x2, lab = Freq)) -Deepayan Great solution Deepayan. I was just in the process of working on something similar. One quick modification is that the two plot()/text() functions can be combined into: with(foo, plot(x1, x2, pch = as.character(Freq))) Best regards, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] plot the number of replicates at the same point
On Thu, 2005-07-14 at 15:08 -0500, Deepayan Sarkar wrote: On 7/14/05, Marc Schwartz (via MN) [EMAIL PROTECTED] wrote: On Thu, 2005-07-14 at 12:30 -0500, Deepayan Sarkar wrote: On 7/14/05, Kerry Bush [EMAIL PROTECTED] wrote: Thank you for thinking about the problem for me. However, I have found that your method doesn't work at all. You may test the following example: x1=c(0.6,0.4,.4,.4,.2,.2,.2,0,0) x2=c(0.4,.2,.4,.6,0,.2,.4,0,.2) x1=rep(x1,4) x2=rep(x2,4) temp=data.frame(x1,x2) temp1=table(temp) plot(temp$x1,temp$x2,cex=0) text(as.numeric(rownames(temp1)), as.numeric(colnames(temp1)), temp1) what I got here is not what I wanted. You may compare with plot(x1,x2) I actually want some plots similar to what SAS proc plot produced. Does anybody have a clue of how to do this easily in R? Try this: x1=c(0.6,0.4,.4,.4,.2,.2,.2,0,0) x2=c(0.4,.2,.4,.6,0,.2,.4,0,.2) x1=rep(x1,4) x2=rep(x2,4) temp=data.frame(x1,x2) foo - subset(as.data.frame(table(temp)), Freq 0) foo$x1 - as.numeric(as.character(foo$x1)) foo$x2 - as.numeric(as.character(foo$x2)) with(foo, plot(x1, x2, type = n)) with(foo, text(x1, x2, lab = Freq)) -Deepayan Great solution Deepayan. I was just in the process of working on something similar. One quick modification is that the two plot()/text() functions can be combined into: with(foo, plot(x1, x2, pch = as.character(Freq))) Only as long as all(Freq 10) (I've been bitten by this before. :-) ). Deepayan D'oh! Indeed you are correct, since 'pch' is a single character: plot(1:20, pch = as.character(1:20)) as opposed to: plot(1:20, type = n) text(1:20, lab = 1:20) Thanks for the correction. Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] question for IF ELSE usage
On Tue, 2005-07-12 at 16:22 +0200, ecoinfo wrote: Hi R users, Maybe the question is too simple. In a IF ... ELSE ... statement if(cond) cons.expr else alt.expr, IF and ELSE should be at the same line? For example, if (x1==12) { y1 - 5 }else { y1 - 3 } is right, while if (x1==12) { y1 - 5 } else # Error: syntax error { y1 - 3 } is wrong? Thanks Note the following from the Details section of ?if Note that it is a common mistake to forget to put braces ({ .. }) around your statements, e.g., after if(..) or for(). In particular, you should not have a newline between } and else to avoid a syntax error in entering a if ... else construct at the keyboard or via source. For that reason, one (somewhat extreme) attitude of defensive programming is to always use braces, e.g., for if clauses. One other approach is the following: if (x1 == 12) { y1 - 5 } else { y1 - 3 } Note the presence of both braces on the 'else' line. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] simple question: reading lower triangular matrix
On Tue, 2005-07-12 at 14:37 -0300, Rogério Rosa da Silva wrote: Dear list, I will like to learn how to read a lower triangular matrix in R. The input file *.txt have the following format: A B C D E A 0 B 10 C 250 D 3680 E 479 100 How this can be done? Thanks in advance for your help Rogério I don't know that this is the easiest way of doing it, but here is one approach: # I saved your data above in a file called test.txt # Read the first line of test.txt to get the colnames as chars col.names - unlist(read.table(test.txt, nrow = 1, as.is = TRUE)) # now read the rest of the file using 'fill = TRUE' to pad lines # with NAs # skip the first line # set the row.names as the first column in the text file # coerce to a matrix df - as.matrix(read.table(test.txt, fill = TRUE, skip = 1, row.names = 1)) # Now set the colnames of df colnames(df) - col.names df A B C D E A 0 NA NA NA NA B 1 0 NA NA NA C 2 5 0 NA NA D 3 6 8 0 NA E 4 7 9 10 0 If you should further want to set the diagonal to NA: diag(df) - NA df A B C D E A NA NA NA NA NA B 1 NA NA NA NA C 2 5 NA NA NA D 3 6 8 NA NA E 4 7 9 10 NA See ?read.table for more information on the file reading part. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Plotting Character Variable
On Thu, 2005-07-07 at 11:47 -0400, [EMAIL PROTECTED] wrote: Any ideas about the following problem: I have a matrix (A) that looks like this: gene_names values hsa-mir-124 0.3 hsa-mir-234 0.1 hsa-mir-344 0.4 hsa-mir-333 0.7 . ... (This is a 2 by 22283 matrix: quite large) To split hairs, it would be a 22283 by 2 object in R's [row, column] approach to indexing. I am also presuming that A is a data frame, since you seem to have two different data types above, with gene_names being a factor? I would like to plot the values, but output the gene_names as the plotting symbol. I have tried regular x,y plots, but since the gene_names are quite large and there are 22283 of them, it's impossible to fit them on the x-axis. Basically, can I plot the above matrix plot(gene_names, value) where the gene_names are used as the plotting symbol. thank you, Well... I would defer to those with more experience in plotting genetic data, but from a practical standpoint, it seems to me to be highly problematic to plot 20,000 data points with labels and have them be human readable without an STMunless you have _very_ wide paper on a large format plotter ;-) That being said, one approach is to rotate the x axis labels vertically, to make more room, while using points for the plotting symbols: # Adjust bottom margin to make room for vertical labels par(mar = c(7, 4, 4, 2)) plot(1:nrow(A), A$values, xaxt = n, ann = FALSE, las = 2) # use 'las = 3' to rotate the labels axis(1, at = 1:nrow(A), labels = as.character(A$gene_names), las = 3) You might want to review some of the tools available at the Bioconductor site to see if there are specialized plotting functions for this type of data: http://www.bioconductor.org/ HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] xmat[1, 2:3] - NULL
On Thu, 2005-07-07 at 10:20 -0700, Mikkel Grum wrote: I have a situation where I'm filling out a dataframe from a database. Sometimes the database query doesn't get anything, so I end up trying to place NULL in the dataframe like below. temp - NULL xmat - as.data.frame(matrix(NA, 2, 3)) xmat[1, 2:3] - temp Error in if (m n * p (n * p)%%m) stop(gettextf(replacement has %d items, need %d, : missing value where TRUE/FALSE needed I can't get the programme to accept that sometimes what the query looks for just doesn't exist, and I just want to move on to the next calculation leaving the dataframe with a missing value in the given cell. It's a real show stopper and I haven't found a way round it. Best wishes, Mikkel PS. I'm using dbGetQuery to query an SQLite database. NULL represents a zero length object in R. Thus, trying to set only the first row in a data frame to NULL makes no sense, since you cannot have a 0 length object that also has a single row (as you seem to be trying to do above). Since a data frame is a series of lists, you could do the following: temp - NULL xmat - as.data.frame(matrix(NA, 2, 3)) xmat V1 V2 V3 1 NA NA NA 2 NA NA NA xmat[, 1] - temp xmat V2 V3 1 NA NA 2 NA NA which removes the first column in the data frame. This is the same as: xmat[, -1] V2 V3 1 NA NA 2 NA NA You could also set the entire xmat to NULL as follows: xmat V1 V2 V3 1 NA NA NA 2 NA NA NA xmat - NULL xmat NULL You can then test to see if 'xmat' is a NULL: is.null(xmat) [1] TRUE and base a boolean expression and resultant action on that result: if (!is.null(xmat)) { do_calculations... } If your calculations are on a row by row basis, where NA's represent missing data, you can also use one of several functions to eliminate those rows. See ?na.action, ?na.omit and ?complete.cases for more information and examples. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] the format of the result
On Fri, 2005-07-01 at 19:40 +0800, ronggui wrote: I write a function to get the frequency and prop of a variable. freq-function(x,digits=3) {naa-is.na(x) nas-sum(naa) if (any(naa)) x-x[!naa] n-length(x) ta-table(x) prop-prop.table(ta)*100 res-rbind(ta,prop) rownames(res)-c(Freq,Prop) cat(Missing value(s) are,nas,.\n) cat(Valid case(s) are,n,.\n) cat(Total case(s) are,(n+nas),.\n\n) print(res,digits=(digits+2)) cat(\n) } freq(sample(letters[1:3],48,T),2) Missing value(s) are 0 . Valid case(s) are 48 . Total case(s) are 48 . a b c Freq 11.00 20.00 17.00 Prop 22.92 41.67 35.42 and i want the result to be like a b c Freq 11.00 20.00 17.00 Prop 22.92% 41.67% 35.42% how should i change my function to get what i want? Here is a modification of the function that I think should work. Note that part of the output formatting process has to take into account the a priori unknowns involving your 'digits' argument, the lengths of the dimnames resulting from the table and the lengths of the frequency counts in the table. Thus, a fair amount of the code is establishing the 'width' argument, which is then used in formatC() so that the output can be column aligned properly. Note that by default, table() will exclude NA, so you do not need to subset 'x' before using table(). Also, note that I change Prop to Pct. freq - function(x, digits = 3) { n - length(x) missing - sum(is.na(x)) ta - table(x) pct - prop.table(ta) * 100 width - max(nchar(unlist(dimnames(ta))) + 1, nchar(ta) + digits + 1, 5 + digits) Vals - paste(formatC(unlist(dimnames(ta)), format = s, width = width), collapse = ) Freq - paste(formatC(ta, format = f, digits = digits, width = width), collapse = ) Pct - paste(formatC(pct, format = f, digits = digits, width = width), %, sep = , collapse = ) cat(Missing value(s) are, missing, .\n) cat(Valid case(s) are, n - missing,.\n) cat(Total case(s) are, n, .\n\n) cat(, Vals, \n) cat(Freq, Freq, \n) cat(Pct , Pct, \n) cat(\n) } Thus: freq(sample(letters[1:3], 48, TRUE), 2) Missing value(s) are 0 . Valid case(s) are 48 . Total case(s) are 48 . abc Freq 28.00 8.0012.00 Pct58.33% 16.67% 25.00% freq(sample(c(letters[1:3], NA), 1000, TRUE), 2) Missing value(s) are 257 . Valid case(s) are 743 . Total case(s) are 1000 . abc Freq 250.00 218.00 275.00 Pct33.65% 29.34% 37.01% freq(iris$Species) Missing value(s) are 0 . Valid case(s) are 150 . Total case(s) are 150 . setosa versicolorvirginica Freq 50.000 50.000 50.000 Pct 33.333% 33.333% 33.333% freq(iris$Species, 0) Missing value(s) are 0 . Valid case(s) are 150 . Total case(s) are 150 . setosa versicolorvirginica Freq 50 50 50 Pct 33% 33% 33% HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How to rotate the axisnames in a BARPLOT
On Thu, 2005-06-30 at 23:05 +0200, [EMAIL PROTECTED] wrote: Hi all, - how can I do a barplot with rotated axis labels? I've seen the example for just a plot in the FAQ, but I'll missing the coordinates to plot my text at the right position beneath the bars. Is there any (easy?) solution? - how can I set the y-axis in a barplot to logarithmic scale? Many thanks in advance! Best Regards Tom The same concept for rotating axis labels in a regular plot per the FAQ can be used in a barplot, as long as you know that barplot() returns the bar midpoints. So: ## Increase bottom margin to make room for rotated labels par(mar = c(7, 4, 4, 2) + 0.1) ## Create plot and get bar midpoints in 'mp' mp - barplot(1:10) ## Set up x axis with tick marks alone axis(1, at = mp, labels = FALSE) ## Create some text labels labels - paste(Label, 1:10, sep = ) ## Plot x axis labels at mp text(mp, par(usr)[3] - 0.5, srt = 45, adj = 1, labels = labels, xpd = TRUE) ## Plot x axis label at line 4 mtext(1, text = X Axis Label, line = 4) See ?barplot for more information. With respect to the log scales, you would need to use barplot2(), which is in the 'gplots' package on CRAN. barplot2() supports log scales using the same 'log' argument as plot(). HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] faster algorithm for Kendall's tau
On Tue, 2005-06-28 at 13:03 -0400, ferdinand principia wrote: Hi, I need to calculate Kendall's tau for large data vectors (length 100'000). Is somebody aware of a faster algorithm or package function than cor(, method=kendall)? There are ties in the data to be considered (Kendall's tau-b). Any suggestions? Regards Ferdinand The time intensive part of the process is typically the ranking/ordering of the vector pairs to calculate the numbers of concordant and discordant pairs. If the number of _unique pairs_ in your data is substantially less than the number of total pairs (in other words, creating a smaller 2d contingency table from a pair of your vectors makes sense), then the following may be of help. # Calculate CONcordant Pairs in a table # cycle through x[r, c] and multiply by # sum(x elements below and to the right of x[r, c]) # x = table concordant - function(x) { x - matrix(as.numeric(x), dim(x)) # get sum(matrix values r AND c) # for each matrix[r, c] mat.lr - function(r, c) { lr - x[(r.x r) (c.x c)] sum(lr) } # get row and column index for each # matrix element r.x - row(x) c.x - col(x) # return the sum of each matrix[r, c] * sums # using mapply to sequence thru each matrix[r, c] sum(x * mapply(mat.lr, r = r.x, c = c.x)) } # Calculate DIScordant Pairs in a table # cycle through x[r, c] and multiply by # sum(x elements below and to the left of x[r, c]) # x = table discordant - function(x) { x - matrix(as.numeric(x), dim(x)) # get sum(matrix values r AND c) # for each matrix[r, c] mat.ll - function(r, c) { ll - x[(r.x r) (c.x c)] sum(ll) } # get row and column index for each # matrix element r.x - row(x) c.x - col(x) # return the sum of each matrix[r, c] * sums # using mapply to sequence thru each matrix[r, c] sum(x * mapply(mat.ll, r = r.x, c = c.x)) } # Calculate Kendall's Tau-b # x = table calc.KTb - function(x) { x - matrix(as.numeric(x), dim(x)) c - concordant(x) d - discordant(x) n - sum(x) SumR - rowSums(x) SumC - colSums(x) KTb - (2 * (c - d)) / sqrt(((n ^ 2) - (sum(SumR ^ 2))) * ((n ^ 2) - (sum(SumC ^ 2 KTb } Note that I made some modifications of the above, relative to prior versions that I have posted to handle large numbers of pairs to avoid integer overflows in summations. Hence the: x - matrix(as.numeric(x), dim(x)) conversion in each function. Now, create some random test data, with 100,000 elements in each vector, sampling from 'letters', which would yield a 26 x 26 table: a - sample(letters, 10, replace = TRUE) b - sample(letters, 10, replace = TRUE) dim(table(a, b)) [1] 26 26 system.time(print(calc.KTb(table(a, b [1] 0.0006906088 [1] 0.77 0.02 0.83 0.00 0.00 Note that in the above, the initial table takes most of the time: system.time(table(a, b)) [1] 0.55 0.00 0.56 0.00 0.00 Hence: tab.ab - table(a, b) system.time(print(calc.KTb(tab.ab))) [1] 0.0006906088 [1] 0.25 0.01 0.27 0.00 0.00 I should note that I also ran: system.time(print(cor(a, b, method = kendall))) [1] 0.0006906088 [1] 694.80 7.72 931.89 0.00 0.00 Nice to know the results work out at least... :-) I have not tested with substantially larger 2d matrices, but would envision that as the dimensions of the resultant tabulation increases, my method probably approaches and may even become less efficient than the approach implemented in cor(). Some testing would validate this and perhaps point to coding the concordant() and discordant() functions in C for improvement in timing. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] conversion
On Tue, 2005-06-28 at 17:23 -0400, Anna Oganyan wrote: Dear List, How can I convert a list with elements being character strings, like: c(1,2,3,4), “c(1,3,4,2) … to a list with elements as numerical vectors: c(1,2,3,4), c(1,3,4,2)…? Thanks! Anna l - list(c(1,2,3,4), c(1,3,4,2)) l [[1]] [1] c(1,2,3,4) [[2]] [1] c(1,3,4,2) Now use lapply() over each list element in 'l', converting the character vectors to R expressions and then evaluating them: lapply(l, function(x) eval(parse(text = x))) [[1]] [1] 1 2 3 4 [[2]] [1] 1 3 4 2 See ?lapply, ?eval and ?parse. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html