Re: [R] Comparing rows of matrices with different dimensions
On Mon, 2005-11-21 at 16:57 +0100, Antje Döring wrote: > > Hi there, > > > > I have a question, which I thought is very easy to solve, but somehow > I can't find a solution. Probably someone could help me quickly? > > > > Here it is: > > > > I have two matrices: > > > > a > > [,1] [,2] [,3] > > [1,]149 > > [2,]26 10 > > [3,]36 11 > > [4,]48 12 > > > > > > b > > [,1] [,2] > > [1,]14 > > [2,]25 > > [3,]36 > > > > Now I want to find out which rows of b can also be found in a (without > its last column). So the solution must be something like either "TRUE > FALSE TRUE" or the rows where their is a match (rows 1 and 3) > > > > Till now I have tried things like b %in% a[,1:2] or so but that > doesn't work because I want to compare the WHOLE row of b with the > whole row of a without column 3. > > > > Thank you very much for any help. > > > > Regards, Antje Here is one possible approach, though not tested beyond this example: > which(apply(matrix(b %in% a, dim(b)), 1, all)) [1] 1 3 The steps are: # which elements of b are in a > b %in% a [1] TRUE TRUE TRUE TRUE FALSE TRUE # Turn that into a matrix the same shape as b > matrix(b %in% a, dim(b)) [,1] [,2] [1,] TRUE TRUE [2,] TRUE FALSE [3,] TRUE TRUE # get T/F for which rows are all TRUE > apply(matrix(b %in% a, dim(b)), 1, all) [1] TRUE FALSE TRUE # Now get the indices > which(apply(matrix(b %in% a, dim(b)), 1, all)) [1] 1 3 HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Help with read.csv2
On Thu, 2005-11-17 at 17:09 +0100, Matthieu Cornec wrote: > Hello, > I am importing the following file > ;aa;bb;cc > 1988;12;12;12 > 1989;78;78;12 > 1990;78;78;12 > 1991;78;78;12 > 1992;78;78;12 > 1993;78;78;12 > 1994;78;78;12 > > data<-read.csv2("test.csv",header=T) > -- > it gives > X aa bb cc > 1 1988 12 12 12 > 2 1989 78 78 12 > 3 1990 78 78 12 > 4 1991 78 78 12 > 5 1992 78 78 12 > 6 1993 78 78 12 > 7 1994 78 78 12 > How do I get : > > aa bb cc > 1988 12 12 12 > 1989 78 78 12 > 1990 78 78 12 > 1991 78 78 12 > 1992 78 78 12 > 1993 78 78 12 > 1994 78 78 12 > Are you indicating that you want the years to be the rownames and NOT a column in the data frame? If so, use the 'row.names' argument to indicate that the first column of the incoming data file contains the rownames: > dat <- read.csv2("clipboard", row.names = 1) > dat aa bb cc 1988 12 12 12 1989 78 78 12 1990 78 78 12 1991 78 78 12 1992 78 78 12 1993 78 78 12 1994 78 78 12 > str(dat) `data.frame': 7 obs. of 3 variables: $ aa: int 12 78 78 78 78 78 78 $ bb: int 12 78 78 78 78 78 78 $ cc: int 12 12 12 12 12 12 12 > rownames(dat) [1] "1988" "1989" "1990" "1991" "1992" "1993" "1994" Note that you are getting the "X" in your initial example above, since you are missing the first column name in the text file. A better approach here might be (depending upon your intent) to use the 'col.names' argument to specify the column names explicitly: > dat <- read.csv2("clipboard", header = TRUE, col.names = c("Years", "aa", "bb", "cc")) > dat Years aa bb cc 1 1988 12 12 12 2 1989 78 78 12 3 1990 78 78 12 4 1991 78 78 12 5 1992 78 78 12 6 1993 78 78 12 7 1994 78 78 12 This way, you are defining the first column name during the import and your 'Years' are available as a data column. It all depends upon what you want to do with the data at this point. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] dev.copy legend problem
On Thu, 2005-11-17 at 17:03 +0100, Florence Combes wrote: > Dear all, > > We are facing this problem for long, and so ask for your help. > > We are plotting 2 graphs in a postscript device (left part -layout > function-), and the common legend for these graphs on the right part. > The legend in the postscript device looks ok: this is color lines with > numbers on the right (6 columns) , see the code below: > > > nblock<-c(1:48) > > leg<-paste(c(1:npin)," ",sep=" ") > > legend(0,19,legend = leg, col=rainbow(nblock), lty=1, > merge=TRUE,ncol=6,bty="n",cex=0.6) > > The problem we are facing is that we dev.copy to a pdf device and then, the > legend doesn't look the same: numbers overlap a little lines. The problem with using dev.copy() is that it can result in slightly (or notably) different plotting characteristics, which are device dependent. A plot that is viewed on the screen, for example, may or may not (most likely not) look the same when created using the same code to a postscript device or a pdf device. This can be further exacerbated if the plot device dimensions and pointsizes are not explicitly defined, as "device shrinkage" may occur. I suspect that this is what you are experiencing. If you want to end up with a PDF plot, use pdf() and set the various plotting parameters (ie. font size, etc.) based upon what you see there, not what you see on the screen or in another device. One other possibility, just to throw it out there since you are on Linux, is to use ps2pdf from a console to convert the PS file to PDF. This uses Ghostscript to do the conversion and will generally work well, but testing with your particular plot, given possible issues with fonts, etc. would still be a good idea. See 'man ps2pdf' for more information. > Has someone already encountered such a thing ? > > Any help apreciated, thanks a lot. > > Florence. > > > R 2.1.0 on a Linux Debian. Have not been able to get your SysAdmin to upgrade you yet? :-) HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] [Rd] Scan data from a .txt file
I have a feeling that Vasu wants (mistakenly) this: dat <- read.table("clipboard", header = FALSE) > dat V1 V2 V3 V4 1 Name Weight Height Gender 2 Anne150 65 F 3Rob160 68 M 4 George180 65 M 5 Greg205 69 M > str(dat) `data.frame': 5 obs. of 4 variables: $ V1: Factor w/ 5 levels "Anne","George",..: 4 1 5 2 3 $ V2: Factor w/ 5 levels "150","160","180",..: 5 1 2 3 4 $ V3: Factor w/ 4 levels "65","68","69",..: 4 1 2 1 3 $ V4: Factor w/ 3 levels "F","Gender","M": 2 1 3 3 3 > dat$V1 [1] Name Anne RobGeorge Greg Levels: Anne George Greg Name Rob > dat$V2 [1] Weight 150160180205 Levels: 150 160 180 205 Weight > dat$V3 [1] Height 65 68 65 69 Levels: 65 68 69 Height > dat$V4 [1] Gender F M M M Levels: F Gender M So that the colnames are actually part of the data frame columns. Vasu, note however that all values become factors or you can convert to character, for example: > as.character(dat$V1) [1] "Name" "Anne" "Rob""George" "Greg" neither of which I suspect is what you really want. You can access the column names of the data frame using colnames(): > dat <- read.table("clipboard", header = TRUE) > dat Name Weight Height Gender 1 Anne150 65 F 2Rob160 68 M 3 George180 65 M 4 Greg205 69 M > colnames(dat) [1] "Name" "Weight" "Height" "Gender" This keeps the column names separate from the actual data, which unless we are missing something here, is the proper way to do this. Think of a data frame as a rectangular data set, which can contain more than one data type across the columns, much like a spreadsheet. The difference here (unlike a spreadsheet) is that the first row does not contain the column names/labels. These are separate from the data itself, which in a typical spreadsheet would start on row 2. Note as Andy pointed out, that in this case, you should use read.table(), not scan(). Review "An Introduction To R" and the "R Data Import/Export" manuals for more information. Both are available with your installation and/or from the main R web site under Documentation. HTH, Marc Schwartz On Thu, 2005-11-17 at 10:41 -0500, Liaw, Andy wrote: > [Re-directing to R-help, as this is more appropriate there.] > > I tried copying the snippet of data into the windows clipboard and tried it: > > > dat <- read.table("clipboard", header=T) > > dat > Name Weight Height Gender > 1 Anne150 65 F > 2Rob160 68 M > 3 George180 65 M > 4 Greg205 69 M > > str(dat) > `data.frame': 4 obs. of 4 variables: > $ Name : Factor w/ 4 levels "Anne","George",..: 1 4 2 3 > $ Weight: int 150 160 180 205 > $ Height: int 65 68 65 69 > $ Gender: Factor w/ 2 levels "F","M": 1 2 2 2 > > dat <- read.table("clipboard", header=T, row=1) > > str(dat) > `data.frame': 4 obs. of 3 variables: > $ Weight: int 150 160 180 205 > $ Height: int 65 68 65 69 > $ Gender: Factor w/ 2 levels "F","M": 1 2 2 2 > > dat >Weight Height Gender > Anne 150 65 F > Rob 160 68 M > George180 65 M > Greg 205 69 M > > Don't see how it "doesn't work". Please give more detail on what "doesn't > work" means. > > Andy > > > From: Vasundhara Akkineni > > > > Hi all, > > Am trying to read data from a .txt file in such a way that i > > can access the > > column names too. For example, the data in the table.txt file > > is as below: > > Name Weight Height Gender > > Anne 150 65 F > > Rob 160 68 M > > George 180 65 M > > Greg 205 69 M > > i used the following commands: > > data<-scan("table.txt",list("",0,0,0),sep="") > > a<-data[[1]] > > b<-data[[2]] > > c<-data[[3]] > > d<-data[[4]] > > But this doesn't work because of type mismatch. I want to > > pull the col > > names also into the respective lists. For example i want 'b' to have > > (weight,150,160,180,205) so that i can access the col name > > and also the > > induvidual weights. I tried using the read.table method too, > > but couldn't > > get this working. Can someone suggest a way to do this. > > Thanks, > > Vasu. > > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] invert y-axis in barplot
On Wed, 2005-11-16 at 16:46 +, Jörg Schlingemann wrote: > Hi! > > > > This is probably a very trivial question. Is there an easy way to invert the > y-axis (low values on top) when using the function barplot()? > > > > Thanks, > > Jrg You mean something like this?: barplot(1:10, ylim = rev(c(0, 12))) or, something like this?: barplot(1:10, yaxt = "n") axis(2, labels = rev(seq(0, 10, 2)), at = seq(0, 10, 2)) Note the use of rev() in each case. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] as.integer with base other than ten.
On Mon, 2005-11-14 at 19:01 +, William Astle wrote: > Is there an R function analogous to the C function strtol? > I would like to convert a binary string into an integer. > > Cheers for advice > > Will There was some discussion in the past and you might want to search the archive for a more generic solution for any base to any base, but for binary to decimal specifically, something like the following will work: bin2dec <- function(x) { b <- as.numeric(unlist(strsplit(x, ""))) pow <- 2 ^ ((length(b) - 1):0) sum(pow[b == 1]) } The function takes the binary string and splits it up into individual numbers ('b'). It then creates a vector of powers of 2 as long as 'b' less one through 0 ('pow'). It then takes the sum of the values of pow, indexed by 'b == 1'. > bin2dec("101") [1] 5 > bin2dec("") [1] 15 > bin2dec("101") [1] 95 HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Coercion of percentages by as.numeric
On Mon, 2005-11-14 at 19:07 +0200, Brandt, T. (Tobias) wrote: > > >-Original Message- > >From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] > >Sent: 14 November 2005 06:21 PM > > > >On 11/14/05, Brandt, T. (Tobias) <[EMAIL PROTECTED]> wrote: > >> Hi > >> > >> Given that things like the following work > >> > >> > a <- c("-.1"," 2.7 ","B") > >> > a > >> [1] "-.1" " 2.7 " "B" > >> > as.numeric(a) > >> [1] -0.1 2.7 NA > >> Warning message: > >> NAs introduced by coercion > >> > > >> > >> I naively expected that the following would behave differently. > >> > >> > b <- c('10%', '-20%', '30.0%', '.40%') > >> > b > >> [1] "10%" "-20%" "30.0%" ".40%" > >> > as.numeric(b) > >> [1] NA NA NA NA > >> Warning message: > >> NAs introduced by coercion > > > >Try this: > > > >as.numeric(sub("%", "e-2", b)) > > > > Thank you, that accomplishes what I had intended. > > I would have thought though that the expression "53%" would be a fairly > standard representation of the number 0.53 and might be handled as such. Is > there a specific reason for avoiding this behaviour? "53%" is a 'shorthand' character representation of a mathematical concept. To wit, the specific representation of a fraction using 100 as the denominator (ie. 53 / 100). The symbol '%' can be replaced by the word "percent", such as "53 percent", which is also a character representation. 0.53, in context, is a numeric representation of a proportion in the range of 0 - 1.0. > I can imagine that it might add unnecessary overhead to routines like > "as.numeric" which one would like to keep as fast as possible. > > Perhaps there are other areas though where it might be desirable? For > example I'm thinking of the read.table function for reading in csv files > since I have many of these that have been saved from excel and now contain > numbers in the "%" format. In Excel, numbers displayed with a '%' are what you see visually. However, the internal representation (how the value is actually stored in the program) is still as a floating point value, without the '%'. For example: > a <- 53 > a [1] 53 > sprintf("%.0f%%", a) [1] "53%" > is.numeric(a) [1] TRUE > is.numeric(sprintf("%.0f%%", a)) [1] FALSE Unfortunately (depending upon your perspective), Excel, and other similar programs, tend to export the visually displayed values and not the internal representations of them. Thus, as Gabor pointed out, you will need to do some 'editing' of the values before using them in R. You can either do this in Excel, by removing the "%" formatting, or post-import in R as Gabor has described. You need to keep separate the internal representation of a value and its printed or displayed representation for human readable consumption. as.numeric() does basically one thing and it does it well and properly. It is up to the user to ensure that it is passed the proper values. When that is not the case, it issues an appropriate warning message and returns NA. Of course, using Gabor's hint, you can also write your own variation of as.numeric(), creating a function that takes percent formatted values and converts them as you require. One of the many strengths of R, is that you can extend it to meet your own specific requirements when the base functions do not. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] bug/feature with barplot?
On Mon, 2005-11-14 at 15:55 +0100, Karin Lagesen wrote: > I have found a bug/feature with barplot that at least to me shows > undesireable behaviour. When using barplot and plotting fewer > groups/levels/factors(I am unsure what they are called) than the number > of colors stated in a col statement, the colors wrap around such that > the colors are not fixed to one group. This is mostly problematic when > I make R figures using scripts, since I sometimes have empty input > groups. I have in these cases experienced labeling the empty group as > red, and then seeing a bar being red when that bar is actually from a > different group. > > Reproducible example (I hope): > > barplot(VADeaths, beside=TRUE, col=c("red", "green", "blue", "yellow", > "black")) > barplot(VADeaths[1:4,], beside=TRUE, col=c("red", "green", "blue", "yellow", > "black")) > > Now, I don't know if this is a bug or a feature, but it sure bugged me...:) > > Karin Most definitely not a bug. As with many vectorized function arguments, they will be recycled as required to match the length of other appropriate arguments. In this case, the number of colors (5) does not match the number of groups (4). Thus, they are "out of synch" with each other and you get the result you have. Not unexpected behavior. You should adjust your code and the function call so that the number of groups matches the number of colors. Something along the lines of the following: col <- c("red", "green", "blue", "yellow", "black") no.groups <- 4 barplot(VADeaths[1:no.groups, ], beside = TRUE, col = col[1:no.groups]) Now try: no.groups <- 5 barplot(VADeaths[1:no.groups, ], beside = TRUE, col = col[1:no.groups]) no.groups <- 3 barplot(VADeaths[1:no.groups, ], beside = TRUE, col = col[1:no.groups]) HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Orientation of tickmarks labels in boxplot/plot
On Wed, 2005-11-02 at 16:06 -0600, Michal Lijowski wrote: > Hi, > > I have been trying draw tickmark labels along > the y - axis perpendicular to the y axis while > labels along the x - axis parallel to x axis > while making box plot. > > Here is my test dataset. > > TData >ID Ratio > 1 0 7.075 > 2 0 7.414 > 3 0 7.403 > 4 0 7.168 > 5 0 6.820 > 6 0 7.294 > 7 0 7.238 > 8 0 7.938 > 9 1 7.708 > 10 1 8.691 > 11 1 8.714 > 12 1 8.066 > 13 1 8.949 > 14 1 8.590 > 15 1 8.714 > 16 1 8.601 > > boxplot(Ratio ~ ID, data=TData) > > makes box plot with tickmark labels parallel to the y - axis. > So I try > > boxplot(Ratio ~ ID, data=TData, axes=FALSE) > par(las=0) > axis(1) > and I get x - axis ranging from 0.5 to 2.5 (why?) and > boxes at 1 and 2. > par(las=2) > axis(2) > box() > So, if I set tickmark labels parallel to y - axis > somehow the x - axis range is not what I expect even > if I use xlim = c(0.0, 3.0) > in boxplot(Ratio ~ Id, data=TData, axes=FALSE, xlim=c(0.0, 3.0)) > par(las=0) > axis(1) > > Plots are in the attachments in pdf format. > > I appreciate any tips. > > I am using R 2.2.0 (2005-10-06) on FC4. > > Michal I suspect that you want this: boxplot(Ratio ~ ID, data=TData, las = 1) so that the x and y axis labels are horizontal. See ?par and review the options for 'las'. '1' is for axis tick mark labels to be horizontal. The x axis (horizontal axis if the boxes are vertical, the vertical axis if the boxes are horizontal) is set by default to the number of boxes (groups) +/- 0.5. 'xlim' is ignored in both cases. The boxes themselves are placed at integer values from 1:N, where N is the number of groups. There is the 'at' argument and there is an example of its use in ?boxplot. You can also search the archives for posts where variations on the use of 'at' have been posted (recently, in fact...) The actual plotting of the boxplots is done by bxp(), so review the help for that function as well. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] R Graphs in Powerpoint
One other option, just to throw it out there, though it involves a few more steps. 1. Generate the R plots as EPS files. 2. Import them into Powerpoint onto the required slides. Resize and/or place as required. Recent versions of Powerpoint will auto-generate a bitmapped preview image upon import. 3. Print the full Powerpoint presentation to a PS file, using a PS printer driver. This will result in high quality images. 4. Convert the PS file to PDF, using Ghostscript (ps2pdf) or similar. 5. Display the presentation using Acrobat Reader in full screen mode to your audience. This works well, as long as you are not using complex object/slide transitions, animations and the like in Powerpoint and takes advantage of the higher quality vector format of EPS graphics as opposed to the bitmapped graphic formats. HTH, Marc Schwartz On Tue, 2005-11-01 at 13:43 -0800, Smith, Daniel (DHS-DEODC-EHIB) wrote: > I've tried several methods in OS X, and here's what works best for me. > Save the R graphic as a PDF file. Open it with Apple's "Preview" > application, and save it as a PNG file. The resulting .png file can > be inserted into MS Word or PowerPoint, can be resized, and looks good > on either OS X or Windows. There are other programs available for > translating the pdf file to png (like the shareware application > Graphic Converter), but I've found that Preview produces the best > results. > > Daniel Smith > Environmental Health Investigations Branch > California Dept of Health Services > > > -Original Message- > Date: Mon, 31 Oct 2005 15:14:06 -0800 > From: Jarrett Byrnes <[EMAIL PROTECTED]> > Subject: [R] R Graphs in Powerpoint > To: r-help@stat.math.ethz.ch > Message-ID: <[EMAIL PROTECTED]> > Content-Type: text/plain; charset=US-ASCII; format=flowed > > Hey, all. Quick question. I'm attempting to use some of the great > graphs generated in R for an upcoming talk that I'm writing in > Powerpoint. Copying and pasting (I'm using OSX) yields graphs that > look great in Powerpoint - until I resize them. Then fonts, points, > and lines all become quite pixelated and blurry. Even if I size the > window properly first, and then copy and paste in the graph, when I > then view the slideshow, the graphs come out pixelated and blurry. > > Is there any good solution to this, or is this some fundamental > incompatibility that I can't get around? > > -Jarrett __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] multiple boxplots
On Fri, 2005-10-28 at 10:26 -0700, J.M. Breiwick wrote: > Hello, > > I want to plot 3 boxplots [ par(mfrow=c(3,1)) ] but the first one has 8 > groups, the 2nd has 7 and the third has 6. But I the groups to line up: > > 1 2 3 4 5 6 7 8 >2 3 4 5 6 7 8 > 3 4 5 6 7 8 > > where the numbers actually refer to years. I suspect I have to manipulate > the function bxp(). Is there a relatively simple way to do this? Thanks for > any help. > > Jeff Breiwick > NMFS, Seattle Here is one approach. It is based upon using 2 principal concepts: 1. Create the first plot, save the plot region ranges from par("usr") and then use this information for the two subsequent plots, where we use the 'add = TRUE' argument. 2. Use the 'at' argument to specify the placement of the boxes in the second and third plots to line up with the first. So: # Create our data. dat <- rnorm(80) years <- rep(1991:1998, each = 10) # MyDF will be a data frame with two columns and we will use # this in the formula method for boxplot. # The second and third plots will use subset()s of MyDF MyDF <- cbind(dat, years) # Set the plot matrix par(mfrow = c(3, 1)) # Create the first boxplot and save par("usr") boxplot(dat ~ years, MyDF) usr <- par("usr") # Now open a new plot, setting it's par("usr") to # match the first plot. # Then use boxplot and set the boxes 'at' x pos 2:8 # and add it to the open plot plot.new() par(usr = usr) boxplot(dat ~ years, subset(MyDF, years %in% 1992:1998), at = 2:8, add = TRUE) # Rinse and repeat ;-) # Different subset and use 3:8 for 'at' plot.new() par(usr = usr) boxplot(dat ~ years, subset(MyDF, years %in% 1993:1998), at = 3:8, add = TRUE) Replace MyDF in the above with your actual datasets of course. This would of course be a bit easier if one could set an 'xlim' argument in boxplot(), but this is ignored by bxp(). HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] encrypted RData file?
On Thu, 2005-10-27 at 16:15 -0500, Na Li wrote: > On 27 Oct 2005, Duncan Temple Lang wrote: > > > Yes, it is of interest and was sitting on my todo list at > > some time. If you want to go ahead and provide code to do it, > > that would be terrific. There are other areas where encryption > > would be good to have, so a general mechanism would be nice. > > > > D. > > > > Na Li wrote: > > > Hi, I wonder if there is interest/intention to allow for encrypted .RData > > > files? One can certainly do that outside R manually but that will leave a > > > decrypted RData file somewhere which one has to remember to delete. > > > > > I was hoping someone has already done it. ;-( > > One possibility is to implement an interface package to gpgme library which > itself is an interface to GnuPG. > > But I'm not sure how the input of passphrase can be handled without using > clear text. > > Michael Seems to me that a better option would be to encrypt the full partition such that (unless you write the files to a non-encrypted partition) these issues are transparent. This would include the use of save(), save.image() and write() type functions to save what was an encrypted dataset/object to a unencrypted file. Of course, you would also have to encrypt the swap and tmp partitions (as appropriate) for similar reasons. On Linuxen/Unixen, full encryption of partitions is available via loopback devices and other mechanisms and some distros have this available as a built-in option. I believe that the FC folks finally have this on their list of functional additions for FC5. Windows of course can do something similar. The other consideration here, is that if R Core builds in some form of encryption, there is the potential for import/export restrictions on such technology since R is available via international CRAN mirrors. It may be best to provide for a plug-in "encryption black box" of sorts, so that folks can use a particular encryption schema that meets various legal/regulatory requirements. Of course, simply encrypting the file or even a complete partition has to be considered within a larger security strategy (ie. network security, physical access control, etc.) that meets a particular functional requirement (such as HIPAA here in the U.S.) HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Graphics window always overlaps console window!
On Tue, 2005-10-25 at 13:07 -0500, Marc Schwartz (via MN) wrote: > On Tue, 2005-10-25 at 11:55 -0600, [EMAIL PROTECTED] wrote: > > Does anyone know how I can set up R so that when I make a graphic, the > > graphics window remains behind the console window? It's annoying to > > have to reach for the mouse every time I want to type another line of > > code (e.g., to add another line to the plot). Thanks. > > What operating system? > > Default window focus behavior is highly OS and even window manager > specific and is not an R issue. > > Depending upon your OS and window manager, you may need to check the > documentation and/or do a Google search on "window focus" for further > information. > > Another alternative, if you are on Windows, is to review Windows FAQ 5.4 > "How do I move focus to a graphics window or the console?", but this is > a programmatic approach and not a means to affect default behavior. Yet another approach which I just remembered is that (if on Windows) MS offers a program called Tweak UI: http://www.microsoft.com/windowsxp/downloads/powertoys/xppowertoys.mspx as part of their Power Toys add-ons. You might want to review that to see if there is a setting for the handling of new window focus behavior. HTH, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Graphics window always overlaps console window!
On Tue, 2005-10-25 at 11:55 -0600, [EMAIL PROTECTED] wrote: > Does anyone know how I can set up R so that when I make a graphic, the > graphics window remains behind the console window? It's annoying to > have to reach for the mouse every time I want to type another line of > code (e.g., to add another line to the plot). Thanks. What operating system? Default window focus behavior is highly OS and even window manager specific and is not an R issue. Depending upon your OS and window manager, you may need to check the documentation and/or do a Google search on "window focus" for further information. Another alternative, if you are on Windows, is to review Windows FAQ 5.4 "How do I move focus to a graphics window or the console?", but this is a programmatic approach and not a means to affect default behavior. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Basic: setting resolution and size of an R graphic
On Mon, 2005-10-24 at 22:32 +0200, Dr. med. Peter Robinson wrote: > Dear List, > > I am sorry if this perhaps a too basic question, but I have not found an > answer in Google or in the R help system. I am trying to use R to do a > very simple analysis of some data (RT-PCR and Western analysis) with a > T-test and > to plot the results as a histogram with error bars. (I have pasted an > example script at the bottom of this mail). > In order to use this for publication, I would like to adjust the > resolution and size of the final image. However, even using file types > such as postscript or pdf that are vector based, I get rather bad-looking > results with > >pdf(file="test.pdf") > >source("script at bottom of mail") > >dev.off() > > using either pdf or postscript or jpg devices. > > > Therefore I would like to ask the list, how to best produce a graphic from > the script below that would fit into one column of a published article and > have a high resolution (as eps, or failing that tiff or png)? > Thanks in advance for any advice, > > Peter What OS are you on? Running your example on FC4, I get the attached output for a pdf(). I suspect that on your OS, the height and width arguments are not appropriate by default. Thus, you may need to adjust your pdf (or postscript) function call to explicitly specify larger height and width arguments. Also note that to generate an EPS file, pay attention to the details section of ?postscript, taking note of the 'onefile', 'horizontal' and 'paper' arguments and settings. Also, check with your journal to see if they specify dimensions for such graphics so that you can abide by their specs if provided. If they are using LaTeX, there are means of specifying and/or adjusting the height and/or width specs in the code based upon proportions of various measures (ie. \includegraphics[width=0.9\textwidth]{GraphicsFile.eps} ). HTH, Marc Schwartz test.pdf Description: Adobe PDF document __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] peculiar matrices
On Fri, 2005-10-21 at 21:32 +, Ben Bolker wrote: > As far as I can tell from reading The Fine Documentation > (R Language Definition and Intro to R), matrices are supposed > to be of homogeneous types. Yet giving matrix() an inhomogeneous > list seems to work, although it produces a peculiar object: > > v = list(1:3,4,5,"a") > m = matrix(v,nrow=2) > m > > [,1] [,2] > [1,] Integer,3 5 > [2,] 4 "a" > > > m[1,] > > [[1]] > [1] 1 2 3 > > [[2]] > [1] 3 > > (this is R 2.1.1, running under Linux) Ben, If you review the structure of 'm' note: > str(m) List of 4 $ : int [1:3] 1 2 3 $ : num 4 $ : num 5 $ : chr "a" - attr(*, "dim")= int [1:2] 2 2 that it is actually a list, even though: > class(m) [1] "matrix" Also: > mode(m) [1] "list" > typeof(m) [1] "list" If you remove the dim attributes, you get: > dim(m) <- NULL > m [[1]] [1] 1 2 3 [[2]] [1] 4 [[3]] [1] 5 [[4]] [1] "a" So I would argue that it is consistent with the documentation in that, while the printed output is that of a matrix, it is a list, which of course can handle heterogeneous data types. This is Version 2.2.0 Patched (2005-10-20 r35979). > Should there be a check/error? Or is this just analogous to > the joke about going to the doctor and saying "it hurts when > I do this", and the doctor saying "well then, don't do that"? Maybe more like: "Doctor, my eye hurts when I drink my tea." and the doctor says, "Well, remove the spoon from the cup before you drink." ;-) Regards, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] make three plot to one plot
On Fri, 2005-10-21 at 21:13 +0200, Jan Sabee wrote: > Dear all, > I want to make three plot below to only one plot together with legend, > how can I do that? > I have tried with matplot function but I did not succeed. > Thanks for your help. > Sincerelly, > Jan Sabee > > test.five.x <- > c(0.02,0.05,0.07,0.09,0.10,0.12,0.13,0.14,0.16,0.17,0.20,0.21,0.34,0.40) > test.five.y <- c(18,12,17,12,3,15,1,5,1,1,3,10,15,10) > plot(test.five.x, test.five.y, type="l",lty=1, lwd=3, col='red') > legend(par('usr')[2], par('usr')[4], xjust=1, > c('five'), > lwd=3, > lty=1, > col=c('red')) > test.six.x <- > c(0.03,0.05,0.07,0.08,0.09,0.10,0.12,0.13,0.14,0.15,0.17,0.18,0.21,0.22,0.24,0.33,0.39,0.43) > test.six.y <- c(8,32,14,21,3,8,11,14,11,21,16,14,10,21,1,2,13,19) > plot(test.six.x, test.six.y, type="l",lty=2, lwd=3, col='green') > legend(par('usr')[2], par('usr')[4], xjust=1, > c('six'), > lwd=3, > lty=2, > col=c('green')) > test.seven.x <- > c(0.04,0.05,0.08,0.09,0.10,0.12,0.13,0.14,0.16,0.17,0.19,0.21,0.22,0.24,0.29,0.32,0.38,0.40,0.46,0.50) > test.seven.y <- > c(18,135,240,42,63,128,215,267,127,36,21,23,223,12,66,96,12,26,64,159) > plot(test.seven.x, test.seven.y, type="l",lty=3, lwd=2, col='blue') > legend(par('usr')[2], par('usr')[4], xjust=1, > c('seven'), > lwd=3, > lty=3, > col=c('blue')) Jan, matplot() won't work well here, because you do not have common x axis values across the three sets of data. Thus, you need to use plot() and then lines(). Try this: # First set up your data test.five.x <- c(0.02,0.05,0.07,0.09,0.10,0.12, 0.13,0.14,0.16,0.17,0.20,0.21, 0.34,0.40) test.five.y <- c(18,12,17,12,3,15,1,5,1,1,3,10, 15,10) test.six.x <- c(0.03,0.05,0.07,0.08,0.09,0.10, 0.12,0.13,0.14,0.15,0.17,0.18, 0.21,0.22,0.24,0.33,0.39,0.43) test.six.y <- c(8,32,14,21,3,8,11,14,11,21,16, 14,10,21,1,2,13,19) test.seven.x <- c(0.04,0.05,0.08,0.09,0.10,0.12, 0.13,0.14,0.16,0.17,0.19,0.21, 0.22,0.24,0.29,0.32,0.38,0.40, 0.46,0.50) test.seven.y <- c(18,135,240,42,63,128,215,267, 127,36,21,23,223,12,66,96,12, 26,64,159) # Now use plot to draw the first set of data # The key here is to properly set to the x and y axis # ranges ('xlim' and 'ylim') so that they are based # upon all three sets of data. # Note that I also set the x and y axis labels to "". You # can adjust these and the plot title as you require. plot(test.five.x, test.five.y, type="l",lty=1, lwd=3, col="red", xlim = range(test.five.x, test.six.x, test.seven.x), ylim = range(test.five.y, test.six.y, test.seven.y), xlab = "", ylab = "") # Now use lines() to add the second and third sets of data lines(test.six.x, test.six.y, lty=2, lwd=3, col="green") lines(test.seven.x, test.seven.y, lty=3, lwd=2, col="blue") # Now use legend(). Place it at the upper right hand corner # and set the line types and colors to reflect the above legend("topright", xjust = 1, legend = c("five", "six", "seven"), lwd = 3, lty = 1:3, col = c("red", "green", "blue")) See ?lines for more information. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Boxplot labels
On Thu, 2005-10-20 at 15:41 -0500, Marc Schwartz (via MN) wrote: > On Thu, 2005-10-20 at 14:46 -0400, Keith Sabol wrote: > > I am creating boxplots from a dataframe and would like to add to the > > standard output a marker representing the value from a particular row in > > the > > dataframe. > > > > .And I apologize if the solution is as trivial as it seems it should be. > > > > Thanks for any assistance. > > > > Keith > > Some example code would be helpful here. > > However, some hints that I suspect will be helpful: > > 1. Unless you use the 'at' argument in boxplot(), the box center > positions are at integer values from 1:n, where n is the number of > groups. > > 2. You can thus use the points() function to add symbols or the text() > function to add labels to the existing plot as you may require. > See ?points and/or ?text for more information. One other note to add here is that boxplot() will return a list which contains various stats about your grouped data. This requires using the form: stats <- boxplot(...) Thus using the 'stats$group' and 'stats$out' components will give you the x and y positions of any outliers. This will be helpful if these are the values that you might want to identify See the "Value" section in ?boxplot and ?boxplot.stats for more information. HTH, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Boxplot labels
On Thu, 2005-10-20 at 14:46 -0400, Keith Sabol wrote: > I am creating boxplots from a dataframe and would like to add to the > standard output a marker representing the value from a particular row in the > dataframe. > > .And I apologize if the solution is as trivial as it seems it should be. > > Thanks for any assistance. > > Keith Some example code would be helpful here. However, some hints that I suspect will be helpful: 1. Unless you use the 'at' argument in boxplot(), the box center positions are at integer values from 1:n, where n is the number of groups. 2. You can thus use the points() function to add symbols or the text() function to add labels to the existing plot as you may require. See ?points and/or ?text for more information. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] having scaling problems with a histogram
On Thu, 2005-10-20 at 17:31 +0200, Christian Jones wrote: > > Hello, "urn:schemas-microsoft-com:office:office" /> > > I would like to create a histogram from a data collumn consisting of 4 > classes (0; 0.05;0.5;25;75). Due to the difference in scale the > classes 0;0.05 and 0.5 are displayed within one combined bin by > default with the code:Hist(x, scale="percent", breaks="Sturges"). How > can I display them all, or as unique classes, leaving out the rest of > the x_axes scale? There was no enlightenment in the help function and > the command : breaks=5 did not do the trick. > > Thanks in advance > > Chris I may be mis-understanding what you are doing here, but it sounds like you should be using barplot() rather than hist(), which is referenced in the "See Also" in ?hist. This would be preferred if you want individual counts or proportions (or percents) from specific groups without binning. Where does the 'scale = "percent"' come from? I don't recognize that argument from the standard histogram plotting functions. Oh, wait a minute, just found it doing a search. The Hist() function is in John Fox's Rcmdr. Please be sure to note this if you are using a function from a contributed package. It saves time and increases the likelihood of your getting a reply. Also, I am guessing that the extraneous content here is the result of using MS Word as your e-mail editor? Please use plain text only, which is the defined format for this list and from reviewing the archive, has been pointed out to you previously by Frank Harrell and Ted Harding. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Automatic rounding of values after factors , converted to numeric, are multipled by a real number
On Thu, 2005-10-20 at 16:20 +0200, Peter Dalgaard wrote: > "Nelson, Gary (FWE)" <[EMAIL PROTECTED]> writes: > > > Peter, > > > > Thank you for your response. I knew how close the values are to > > integers, but I still don't understand why I don't have control over > > how the numbers are displayed (rounded or not)? > > It's all in the conventions. The digits in print() and friends are > significant digits, so we first round to that many significant digits, > then discard trailing zeros, which is why > > > 12.51 > [1] 12.5 > > 12.50001 > [1] 12.50001 > > The exception is that we do not discard significant digits to the left > of the decimal point, unless we are using scientific notation > > > print(12345678,digits=2) > [1] 1.2e+07 > > print(12345678,digits=5) > [1] 12345678 > > (the "scipen" options controls the logic for switching notation). > > For finer control we have the formatC function: > > > format(1234.1,digits=9) # same thing as with print() > [1] "1234.1" > > format(1234.1,digits=8) > [1] "1234" > > formatC(1234.1,digits=5,format="f") > [1] "1234.1" > > formatC(1234.1,digits=4,format="f") > [1] "1234." Also, sprintf(): > sprintf("%.9f", 1234.1) [1] "1234.1" > sprintf("%.4f", 1234.1) [1] "1234." > sprintf("%12.4f", 1234.1) [1] " 1234." HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] matching two plots
On Wed, 2005-10-19 at 18:27 +0100, Rui Cardoso wrote: > Hi, > > I have a problem about graphics. I would like to plot two graphs: a barplot > and curve. Here is the code: > > > barplot(dpois(0:45,20),xlim=c(0,45),names=0:45) > > curve(dnorm(x,20,sqrt(20)),from=0,to=45,add=T) > > Both graphs are drawn in the same figure, however the scale in both graphs > dooes not match. For some reason the second plot is shifted to left. I > think there is a problem concerning the axis scale. > > Thanks a lot. > > Rui This came up this past summer and Gabor and I had a couple of different approaches to the solution. You can see the posts here: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/57431.html and http://finzi.psych.upenn.edu/R/Rhelp02a/archive/57432.html Using my approach with barplot(), what you need to do is to set the 'space' argument to 0 so that there is no space between the bars. This will then place the bar centers at increments of 0.5, in your case seq(0.5, 45.5, 5). Knowing this, you can then adjust the x axis position in curve() by +0.5 to coincide with the bars. So you end up with this: barplot(dpois(0:45, 20), names = 0:45, space = 0, ylim = c(0, .1)) curve(dnorm(x, 20 + 0.5, sqrt(20)), add = TRUE) HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Efficient ways of finding functions and Breslow-Day test for homogeneity of the odds ratio
On Wed, 2005-10-19 at 06:47 +1000, Tim Churches wrote: > Marc Schwartz (via MN) wrote: > > > There is also code for the Woolf test in ?mantelhaen.test > > Is there? How is it obtained? The documentation on mantelhaen.test in R > 2.2.0 contains a note: "Currently, no inference on homogeneity of the > odds ratios is performed." and a quick scan of the source code for the > function didn't reveal any meantion of Woolf's test. > > Tim C Review the code in the examples on the cited help page... :-) HTH, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] p-value calculation
On Tue, 2005-10-18 at 17:13 +0200, richard mendes wrote: > hello everybody > > i'm very new at using R so probably this is a very stupid question. > I have a problem calculating a p-value. When i do this with excel i > can use the method CHIDIST for 1.2654 with 1 freedom degree i get the > answer 0.261 > > i just want to do the same thing in R but i can't find a method. > can somebody help me > > friendly regards > > richard > pchisq(1.2654, 1, lower.tail = FALSE) [1] 0.2606314 See ?pchisq for more information. You might also want to read Chapter 8 "Probability Distributions" in "An Introduction To R", available with your R installation or from the Documentation links on the main R web site. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Efficient ways of finding functions and Breslow-Day test for homogeneity of the odds ratio
On Tue, 2005-10-18 at 14:27 +0100, MJ Price, Social Medicine wrote: > Dear all, > > I have been trying to find a function to calculate the Breslow-Day test for > homogeneity of the odds ratio in R. I know the test can be preformed in SAS > but i was wondering if anyone could help me to perform this in r. > > In addition i have the fullrefman file to search for functions in the basic > R packages, does anyone have any suggestions of an efficient way of > searching for functions in the other add-on packages on the website? > > Thanks in advance > > Malcolm RSiteSearch("Breslow"), which will perform a search of the e-mail list archives and other online documentation, reveals this post: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/50202.html which in turn leads to this code: http://www.math.montana.edu/~jimrc/classes/stat524/Rcode/breslowday.test.r There is also code for the Woolf test in ?mantelhaen.test Further searches can be conducted using help.search("Keyword"), which will search your installed set of packages. See ?help.search and ?RSiteSearch for more help on these. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] y axis in histograms
On Mon, 2005-10-17 at 16:51 +0200, Jorge Gaspar Sanz Salinas wrote: > Hi all, > > This is my first post, I hope you will help me. > > I've some data to present with histograms. I have few values with almost 99% > of > the frequencies (thousands) and some other values with low frequencies (below > one hundred) that I want to emphasize. I think if I could present the > logarithms > of the frequencies, these could be presented in a more convenient way, but I > don't know how to deal with this. > > Thanks, and sorry for my English If you are plotting the counts, you should probably look at barplot(), which as of R version 2.2.0 supports log scale axes (ie. log = "y"). See ?barplot for more information. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] high resolution images for publication
On Thu, 2005-10-13 at 15:20 -0600, Chris Buddenhagen wrote: > Dear all > I am using R to produce ordinations library(vegan) and the plot function > produced looks great on the screen but when I send it to jpg or pdf or eps > the resolution is not so good. Can you tell me how to get high resolution > images out of R for publication? It would be helpful to see the actual code that you are using, as is asked for in the Posting Guide. For publication, it would be rare to want to use a bitmapped format such as jpg/png. pdf and eps are vector based formats and would be generally preferred over the above. I can only hazard a guess to consider that perhaps your height and width arguments for pdf() and/or postscript() are not set properly, as there are no other resolution based settings for those formats. By design, the output resolution is target device dependent. Please provide the code that you are using and we can respond with more specific guidance. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] shell scripts in R
On Fri, 2005-10-14 at 00:04 +0200, Benedykt P. Barszcz wrote: > Dnia czwartek, 13 października 2005 23:25, Andrew Robinson napisał: > > Marco, > > > > use the system command. > > > > ?system > > > > I hope that this helps, > > system(ls, intern = FALSE, ignore.stderr = TRUE) > Error in as.character(args[[i]]) : cannot coerce to vector > > In fact it ignores the stderr ! As per ?system: command the system command to be invoked, as a string. Thus, system("ls", intern = FALSE, ignore.stderr = TRUE) HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] subsetting data frame using by() or tapply() or other
On Thu, 2005-10-13 at 14:28 -0600, Brian S Cade wrote: > Ok so I see the problem that I'm having creating a new variable (LAG1DBC) > in the example data transformation below is that tapply() is creating a > list that is not dimensionally consistent with the data frame (data). So > how do I go from the list output of tapply() to create a dimensionally > consistent vector that can create the new variable in my original data > frame? I've been trying to use a function like > data$LAG1DBC <- tapply(data$DBC, data$LOCID, function(x) c(NA, > x[-length(x)])) > which creates a list of dimension much smaller than the nrows in data. And > I've tried things like using as.data.frame.array() or as.data.frame.list() > in front of tapply() and still have the same problem. I know this can't > be that unusual of a data manipulation and that someone has to have done > similar things before. > > I want to go from something like this: > >LOCID POPULATION YEARDBC > 1 algb-1 A 1992 0.70451575 > 2 algb-1 A 1993 0.59506851 > 3 algb-1 A 1997 0.84837544 > 4 algb-1 A 1998 0.50283182 > 5 algb-1 A 2000 0.91242707 > 6 algb-2 A 1992 0.09747155 > 7 algb-2 A 1993 0.84772253 > 8 algb-2 A 1997 0.43974081 > 9 algb-2 A 1998 0.83108544 > 10 algb-2 A 2000 0.22291192 > 11 algb-3 A 1992 0.44234175 > 12 algb-3 A 1993 0.54089534 > 5680 taylr-73 B 2001 0.43918082 > 5681 taylr-73 B 2002 0.34694427 > 5682 taylr-73 B 2003 3.35619190 > 5683 taylr-73 B 2004 0.71575815 > 5684 taylr-73 B 2005 0.42038506 > 5685 taylr-74 B 1992 3.88410354 > 5686 taylr-74 B 1993 3.32472557 > 5687 taylr-74 B 1994 3.29861501 > 5688 taylr-74 B 1996 0.48153827 > 5689 taylr-74 B 1997 3.63570636 > 5690 taylr-74 B 1998 1.94630194 > > to something like this: > >LOCID POPULATION YEARDBC LAG1DBC > 1 algb-1 A 1992 0.70451575 NA > 2 algb-1 A 1993 0.59506851 0.70451575 > 3 algb-1 A 1997 0.84837544 0.59506851 > 4 algb-1 A 1998 0.50283182 0.84837544 > 5 algb-1 A 2000 0.91242707 0.50283182 > 6 algb-2 A 1992 0.09747155 NA > 7 algb-2 A 1993 0.84772253 0.09747155 > 8 algb-2 A 1997 0.43974081 0.84772253 > 9 algb-2 A 1998 0.83108544 0.43974081 > 10 algb-2 A 2000 0.22291192 0.83108544 > 11 algb-3 A 1992 0.44234175 NA > 12 algb-3 A 1993 0.54089534 0.44234175 > 5680 taylr-73 B 2001 0.43918082 NA > 5681 taylr-73 B 2002 0.34694427 0.43918082 > 5682 taylr-73 B 2003 3.35619190 0.34694427 > 5683 taylr-73 B 2004 0.71575815 3.35619190 > 5684 taylr-73 B 2005 0.42038506 0.71575815 > 5685 taylr-74 B 1992 3.88410354 NA > 5686 taylr-74 B 1993 3.32472557 3.88410354 > 5687 taylr-74 B 1994 3.29861501 3.32472557 > 5688 taylr-74 B 1996 0.48153827 3.29861501 > 5689 taylr-74 B 1997 3.63570636 0.48153827 > 5690 taylr-74 B 1998 1.94630194 3.63570636 > > Brian Brian, Use unlist(): > data$LAG1DBC <- unlist(tapply(data$DBC, data$LOCID, function(x) c(NA, x[-length(x)]))) > data LOCID POPULATION YEARDBCLAG1DBC 1 algb-1 A 1992 0.70451575 NA 2 algb-1 A 1993 0.59506851 0.70451575 3 algb-1 A 1997 0.84837544 0.59506851 4 algb-1 A 1998 0.50283182 0.84837544 5 algb-1 A 2000 0.91242707 0.50283182 6 algb-2 A 1992 0.09747155 NA 7 algb-2 A 1993 0.84772253 0.09747155 8 algb-2 A 1997 0.43974081 0.84772253 9 algb-2 A 1998 0.83108544 0.43974081 10 algb-2 A 2000 0.22291192 0.83108544 11 algb-3 A 1992 0.44234175 NA 12 algb-3 A 1993 0.54089534 0.44234175 5680 taylr-73 B 2001 0.43918082 NA 5681 taylr-73 B 2002 0.34694427 0.43918082 5682 taylr-73 B 2003 3.35619190 0.34694427 5683 taylr-73 B 2004 0.71575815 3.35619190 5684 taylr-73 B 2005 0.42038506 0.71575815 5685 taylr-74 B 1992 3.88410354 NA 5686 taylr-74 B 1993 3.32472557 3.88410354 5687 taylr-74 B 1994 3.29861501 3.32472557 5688 taylr-74 B 1996 0.48153827 3.29861501 5689 taylr-74 B 1997 3.63570636 0.48153827 5690 taylr-74 B 1998 1.94630194 3.63570636 HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do
Re: [R] How to generate for one vector matrix
On Thu, 2005-10-13 at 19:47 +0200, Jan Sabee wrote: > Is there any routine to generate for one vector matrix. > If I have X I want to generate start from zero to maximum value each vector. > > For example, I have a vector x = (4,2,3,1,4) > I want to generate n=6 times, for 4, start 0 to 4, then 2 start 0 to 2, ect. > > The result something like this: > > generate(x,n=6) > 1,1,2,1,4 > 1,2,3,0,3 > 4,0,1,1,1 > 3,1,0,1,4 > 0,0,3,0,0 > 4,1,3,0,4 > > Could anyone help me. Thanks. > Regards, > Jan Sabee If I am properly understanding what you are doing here, you have an initial vector of values. You want to create a matrix, whose columns are the result of random sampling with replacement from the initial vector 'n' times, where the sampling space for each column "i" is from 0:x[i]? If correct, this should do it: > x <- c(4, 2, 3, 1, 4) > x [1] 4 2 3 1 4 > sapply(x, function(x) sample(0:x, 6, replace = TRUE)) [,1] [,2] [,3] [,4] [,5] [1,]42112 [2,]30000 [3,]12013 [4,]42111 [5,]31310 [6,]01110 Just replace '6' in the sample() arguments with the 'n' you require. See ?sapply and ?sample. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Removing and restoring factor levels (TYPO CORRECTED)
On Thu, 2005-10-13 at 14:31 -0400, Duncan Murdoch wrote: > On 10/13/2005 1:07 PM, Marc Schwartz (via MN) wrote: > > On Thu, 2005-10-13 at 10:02 -0400, Duncan Murdoch wrote: > >> Sorry, a typo in my previous message (parens in the wrong place in the > >> conversion). > >> > >> Here it is corrected: > >> > >> I'm doing a big slow computation, and profiling shows that it is > >> spending a lot of time in match(), apparently because I have code like > >> > >> x %in% listofxvals > >> > >> Both x and listofxvals are factors with the same levels, so I could > >> probably speed this up by stripping off the levels and just treating > >> them as integer vectors, then restoring the levels at the end. > >> > >> What is the safest way to do this? I am worried that at some point x > >> and listofxvals will *not* have the same levels, and the optimization > >> will give the wrong answer. So I need code that guarantees they have > >> the same coding. > >> > >> I think this works, where "master" is a factor with the master list of > >> levels (guaranteed to be a superset of the levels of x and listofxvals), > >> but can anyone spot anything that might go wrong? > >> > >> # Strip the levels > >> x <- as.integer( factor(x, levels = levels(master) ) ) > >> > >> # Restore the levels > >> x <- structure( x, levels = levels(master), class = "factor" ) > >> > >> Thanks for any advice... > >> > >> Duncan Murdoch > > > > Duncan, > > > > With the predicate that 'master' has the full superset of all possible > > factor levels defined, it would seem that this would be a reasonable way > > to go. > > > > This approach would also seem to eliminate whatever overhead is > > encountered as a result of the coercion of 'x' as a factor to a > > character vector, which is done by match(). > > > > One question I have is, what is the advantage of using structure() > > versus: > > > >x <- factor(x, levels = levels(master)) > > > > ? > > That one doesn't work. What "factor(x, levels=levels(master))" says is > to convert x to a factor, coding the values in it according the levels > in master. But at this point x has values which are integers, so they > won't match the levels of master, which are probably character strings. > > For example: > > > master <- factor(letters) > > print(x <- factor(letters[1:3])) > [1] a b c > Levels: a b c > > print(x <- as.integer( factor(x, levels = levels(master) ) ) ) > [1] 1 2 3 > > print(x <- factor(x, levels = levels(master))) > [1] > Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z > > I get NA's at the end because the values 1,2,3 aren't in the vector of > factor levels (which are the lowercase letters). As opposed to: > print(x <- structure(x, levels = levels(master), class = "factor" )) [1] a b c Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z OK. Makes sense. Thanks for the clarification. Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Removing and restoring factor levels (TYPO CORRECTED)
On Thu, 2005-10-13 at 10:02 -0400, Duncan Murdoch wrote: > Sorry, a typo in my previous message (parens in the wrong place in the > conversion). > > Here it is corrected: > > I'm doing a big slow computation, and profiling shows that it is > spending a lot of time in match(), apparently because I have code like > > x %in% listofxvals > > Both x and listofxvals are factors with the same levels, so I could > probably speed this up by stripping off the levels and just treating > them as integer vectors, then restoring the levels at the end. > > What is the safest way to do this? I am worried that at some point x > and listofxvals will *not* have the same levels, and the optimization > will give the wrong answer. So I need code that guarantees they have > the same coding. > > I think this works, where "master" is a factor with the master list of > levels (guaranteed to be a superset of the levels of x and listofxvals), > but can anyone spot anything that might go wrong? > > # Strip the levels > x <- as.integer( factor(x, levels = levels(master) ) ) > > # Restore the levels > x <- structure( x, levels = levels(master), class = "factor" ) > > Thanks for any advice... > > Duncan Murdoch Duncan, With the predicate that 'master' has the full superset of all possible factor levels defined, it would seem that this would be a reasonable way to go. This approach would also seem to eliminate whatever overhead is encountered as a result of the coercion of 'x' as a factor to a character vector, which is done by match(). One question I have is, what is the advantage of using structure() versus: x <- factor(x, levels = levels(master)) ? Thanks, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] apply and plot
On Thu, 2005-10-13 at 14:50 +0100, Luis Ridao Cruz wrote: > R-help, > > I use the code below to plot some data by applying "apply" function. > But I don't know how I can get the argument "type" or "col" on the > "plot" function to distinguish the different lines > in the graph: > > apply ( my.data, 2, function ( x ) lines ( dimnames ( my.data ) [[1]] , > x ) ) > > > Thank you in advance Rather than trying the use the construct above, take a look at ?matlines and/or ?matplot (both on the same help page with ?matpoints.) I think that you will find these purposefully designed functions better suited to what I believe you are trying to do here. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Correlation, by date, of two variables?
On Wed, 2005-10-12 at 12:01 -0700, t c wrote: > I have a dataset with three variables: date, var1, var2 > How can I calculate the correlation, by date, between var1 and var2? > > e.g. > > > datevar1var2 > 1/1/200154 > 1/1/200185 > 1/1/200197 > > > > 2/1/200172 > 2/1/200121 > 2/1/200146 > > > > 3/1/200135 > 3/1/200143 > 3/1/200169 > 3/1/20017-1 > > > > the results I want: > 1/1/2001 0.891042111 > 2/1/2001 0.075093926 > 3/1/2001 -0.263117406 t c, Given your series of posts here, I would highly recommend that before you consider spending money on a tutor or other similar resource, you take the time to read the freely available documentation that both R Core and useRs have kindly provided to the Community. The R Core manuals are all available with your downloaded installation of R and/or from the main R web site under Documentation as are the Contributed documents. You will find that your time will be well invested in that effort. If you read the Posting Guide for the e-mail lists, a link to which is provided at the bottom of every e-mail that comes through, you will find an excellent guide to the resources that are available to assist you in using R. Trying to learn R (as with any technical endeavor) without reading at least a basic set of the growing list of documents, books and publications on R is going to be problematic. A hint for you here: See ?by, ?tapply, ?split and ?lapply for guidance on how to generate summary statistics on subsets of data. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Hmisc latex function
On Wed, 2005-10-12 at 08:33 -0500, Charles Dupont wrote: > Marc Schwartz (via MN) wrote: > > On Tue, 2005-10-11 at 10:01 -0400, Rick Bilonick wrote: > > > >>I'm using R 2.2.0 on an up-to-date version of Fedora Core 4 with the > >>latest version of Hmisc. When I run an example from the latex function I > >>get the following: > >> > >> > >>>x <- matrix(1:6, nrow=2, dimnames=list(c('a','b'),c('c','d','enLine > >> > >>2'))) > >> > >>>x > >> > >> c d enLine 2 > >>a 1 35 > >>b 2 46 > >> > >>>latex(x) # creates x.tex in working directory > >> > >>sh: line 0: cd: “/tmp/Rtmpl10983”: No such file or directory > >>This is pdfeTeX, Version 3.141592-1.21a-2.2 (Web2C 7.5.4) > >>entering extended mode > >>! I can't find file `“/tmp/Rtmpl10983/file643c9869”'. > >><*> “/tmp/Rtmpl10983/file643c9869” > >> > >>Please type another input file name: q > >>(/usr/share/texmf/tex/latex/tools/q.tex > >>LaTeX2e <2003/12/01> > >>Babel and hyphenation patterns for american, french, german, > >>ngerman, b > >>ahasa, basque, bulgarian, catalan, croatian, czech, danish, dutch, > >>esperanto, e > >>stonian, finnish, greek, icelandic, irish, italian, latin, magyar, > >>norsk, polis > >>h, portuges, romanian, russian, serbian, slovak, slovene, spanish, > >>swedish, tur > >>kish, ukrainian, nohyphenation, loaded. > >>File ignored > >>xdvi-motif.bin: Fatal error: /tmp/Rtmpl10983/file643c9869.dvi: No such > >>file. > >> > >> > >>How can I fix this? > >> > >>Rick B. > > > > > > I get the same results, also on FC4 with R 2.2.0. > > > > I am cc:ing Frank here for his input, but a quick review of the code and > > created files suggests that there may be conflict between the locations > > of some of the resultant files during the latex system call. Some files > > appear in a temporary R directory, while others appear in the current R > > working directory. > > > > For example, if I enter the full filename: > > > > /tmp/RtmpC12100/file643c9869.tex > > > > at the latex prompt, I get: > > > > > >>latex(x) > > > > sh: line 0: cd: “/tmp/RtmpC12100”: No such file or directory > > This is pdfeTeX, Version 3.141592-1.21a-2.2 (Web2C 7.5.4) > > entering extended mode > > ! I can't find file `“/tmp/RtmpC12100/file643c9869”'. > > <*> “/tmp/RtmpC12100/file643c9869” > > > > Please type another input file name: *** loading the extensions > > datasource > > /tmp/RtmpC12100/file643c9869.tex > > (/tmp/RtmpC12100/file643c9869.tex > > LaTeX2e <2003/12/01> > > Babel and hyphenation patterns for american, french, german, > > ngerman, b > > ahasa, basque, bulgarian, catalan, croatian, czech, danish, dutch, > > esperanto, e > > stonian, finnish, greek, icelandic, irish, italian, latin, magyar, > > norsk, polis > > h, portuges, romanian, russian, serbian, slovak, slovene, spanish, > > swedish, tur > > kish, ukrainian, nohyphenation, loaded. > > (/usr/share/texmf/tex/latex/base/report.cls > > Document Class: report 2004/02/16 v1.4f Standard LaTeX document class > > (/usr/share/texmf/tex/latex/base/size10.clo)) > > (/usr/share/texmf/tex/latex/geometry/geometry.sty > > (/usr/share/texmf/tex/latex/graphics/keyval.sty) > > (/usr/share/texmf/tex/latex/geometry/geometry.cfg)) > > No file file643c9869.aux. > > [1] (./file643c9869.aux) ) > > Output written on file643c9869.dvi (1 page, 368 bytes). > > Transcript written on file643c9869.log. > > xdvi-motif.bin: Fatal error: /tmp/RtmpC12100/file643c9869.dvi > > > H, It works for me. Interesting. > > It almost looks like the temp dir is not being created, but thats not > possible because R does that. It might be a Unicode issue with you > system shell. Can you run this statement in R > > sys(paste('cd',dQuote(tempdir()),";", > "echo Hello BOB > test.test", > ";","cat test.test")) > > > What version of Hmisc are you using? What local are you using? > > Charles Hmisc version 3.0-7, Dated 2005-09-15, which is the latest according to CRAN. > sys(paste('cd',dQuote(tempdir()),";", + "echo Hello BOB > test.test", + ";","cat test.test&qu
Re: [R] Two factor (or more) non-parametric comparison of means
Peter, Good points all around. Hopefully John will provide further clarification. Best regards, Marc On Tue, 2005-10-11 at 19:46 +0200, Peter Dalgaard wrote: > "Marc Schwartz (via MN)" <[EMAIL PROTECTED]> writes: > > > I suspect that John may be looking for ?friedman.test, which I believe > > will allow for a two-way non-parametric test. > > You could be right, but you could also be wrong... It depends quite a > bit on what is meant by a "two-factor comparison of means". If he > literally means means (!), then a permutation test (permuting within > levels of the other factor) could be appropriate. It also makes a > difference whether there's a single replication or more per cell in > the cross-classification. > > > kruskal.test() will perform a non-parametric test on two or more samples > > (or factor levels) as a generalization of wilcox.test() for one or two > > samples (or factor levels). > > > > HTH, > > > > Marc Schwartz > > > > On Tue, 2005-10-11 at 18:22 +0200, Johan Sandblom wrote: > > > ?kruskal.test > > > > > > 2005/10/11, John Sorkin <[EMAIL PROTECTED]>: > > > > Can anyone suggest a good non-parametric test, and an R > > > implementation of the test, that allows for two or more factors? > > > Wilcoxon signed rank allows for only one. > > > > Thanks, > > > > John > > > > __ > > R-help@stat.math.ethz.ch mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide! > > http://www.R-project.org/posting-guide.html > > > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Hmisc latex function
On Tue, 2005-10-11 at 10:01 -0400, Rick Bilonick wrote: > I'm using R 2.2.0 on an up-to-date version of Fedora Core 4 with the > latest version of Hmisc. When I run an example from the latex function I > get the following: > > > x <- matrix(1:6, nrow=2, dimnames=list(c('a','b'),c('c','d','enLine > 2'))) > > x > c d enLine 2 > a 1 35 > b 2 46 > > latex(x) # creates x.tex in working directory > sh: line 0: cd: “/tmp/Rtmpl10983”: No such file or directory > This is pdfeTeX, Version 3.141592-1.21a-2.2 (Web2C 7.5.4) > entering extended mode > ! I can't find file `“/tmp/Rtmpl10983/file643c9869”'. > <*> “/tmp/Rtmpl10983/file643c9869” > > Please type another input file name: q > (/usr/share/texmf/tex/latex/tools/q.tex > LaTeX2e <2003/12/01> > Babel and hyphenation patterns for american, french, german, > ngerman, b > ahasa, basque, bulgarian, catalan, croatian, czech, danish, dutch, > esperanto, e > stonian, finnish, greek, icelandic, irish, italian, latin, magyar, > norsk, polis > h, portuges, romanian, russian, serbian, slovak, slovene, spanish, > swedish, tur > kish, ukrainian, nohyphenation, loaded. > File ignored > xdvi-motif.bin: Fatal error: /tmp/Rtmpl10983/file643c9869.dvi: No such > file. > > > How can I fix this? > > Rick B. I get the same results, also on FC4 with R 2.2.0. I am cc:ing Frank here for his input, but a quick review of the code and created files suggests that there may be conflict between the locations of some of the resultant files during the latex system call. Some files appear in a temporary R directory, while others appear in the current R working directory. For example, if I enter the full filename: /tmp/RtmpC12100/file643c9869.tex at the latex prompt, I get: > latex(x) sh: line 0: cd: “/tmp/RtmpC12100”: No such file or directory This is pdfeTeX, Version 3.141592-1.21a-2.2 (Web2C 7.5.4) entering extended mode ! I can't find file `“/tmp/RtmpC12100/file643c9869”'. <*> “/tmp/RtmpC12100/file643c9869” Please type another input file name: *** loading the extensions datasource /tmp/RtmpC12100/file643c9869.tex (/tmp/RtmpC12100/file643c9869.tex LaTeX2e <2003/12/01> Babel and hyphenation patterns for american, french, german, ngerman, b ahasa, basque, bulgarian, catalan, croatian, czech, danish, dutch, esperanto, e stonian, finnish, greek, icelandic, irish, italian, latin, magyar, norsk, polis h, portuges, romanian, russian, serbian, slovak, slovene, spanish, swedish, tur kish, ukrainian, nohyphenation, loaded. (/usr/share/texmf/tex/latex/base/report.cls Document Class: report 2004/02/16 v1.4f Standard LaTeX document class (/usr/share/texmf/tex/latex/base/size10.clo)) (/usr/share/texmf/tex/latex/geometry/geometry.sty (/usr/share/texmf/tex/latex/graphics/keyval.sty) (/usr/share/texmf/tex/latex/geometry/geometry.cfg)) No file file643c9869.aux. [1] (./file643c9869.aux) ) Output written on file643c9869.dvi (1 page, 368 bytes). Transcript written on file643c9869.log. xdvi-motif.bin: Fatal error: /tmp/RtmpC12100/file643c9869.dvi: No such file. The temporary .tex file is present, but the .dvi, .aux and .log files are created in the current working R directory. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Two factor (or more) non-parametric comparison of means
I suspect that John may be looking for ?friedman.test, which I believe will allow for a two-way non-parametric test. kruskal.test() will perform a non-parametric test on two or more samples (or factor levels) as a generalization of wilcox.test() for one or two samples (or factor levels). HTH, Marc Schwartz On Tue, 2005-10-11 at 18:22 +0200, Johan Sandblom wrote: > ?kruskal.test > > 2005/10/11, John Sorkin <[EMAIL PROTECTED]>: > > Can anyone suggest a good non-parametric test, and an R > implementation of the test, that allows for two or more factors? > Wilcoxon signed rank allows for only one. > > Thanks, > > John __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] wildcards and removing variables
On Mon, 2005-10-10 at 16:01 +0100, Prof Brian Ripley wrote: > On Mon, 10 Oct 2005, Marc Schwartz (via MN) wrote: > > > On Mon, 2005-10-10 at 10:37 -0400, Afshartous, David wrote: > >>All, > >> > >>Is there are a wildcard in R for varible names as in unix? For example, > >> > >>rm(results*) > >> > >>to remove all variable or function names that begin w/ "results"? > >> > >>cheers, > >>Dave > >>ps - please respond directly to [EMAIL PROTECTED] > > > > > > See ?ls, which has a 'pattern' argument, enabling the use of Regex to > > define the objects to be listed and subsequently removed using rm(). > > > > You can then use something like: > > > > rm(list = ls(pattern = "\\bresults.")) > > One new feature of R-2.2.0 is glob2rx, which converts wildcards to regexps > for you. E.g. > > rm(list = ls(pattern = glob2rc("results*"))) > > (I think if does it a little better, as that trailing dot is not I think > correct: result* matches result.) Thanks to both Gabor and Prof. Ripley for pointing out the use of glob2rc(). One of the new features I had forgotten about since reading NEWS. On the trailing dot, I was just in the process of drafting a follow up after realizing my error, since as you point out, 'results' would not be matched in that case. Thanks, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] wildcards and removing variables
On Mon, 2005-10-10 at 10:37 -0400, Afshartous, David wrote: > All, > > Is there are a wildcard in R for varible names as in unix? For example, > > rm(results*) > > to remove all variable or function names that begin w/ "results"? > > cheers, > Dave > ps - please respond directly to [EMAIL PROTECTED] See ?ls, which has a 'pattern' argument, enabling the use of Regex to define the objects to be listed and subsequently removed using rm(). You can then use something like: rm(list = ls(pattern = "\\bresults.")) HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] spline.des
On Mon, 2005-10-03 at 15:28 -0500, lforzani wrote: > Hello, I am using library fda and I can not run a lot of functions because > I receive the error: > > Error in bsplineS(evalarg, breaks, norder, nderiv) : > couldn't find function "spline.des" > > > do you know how I can fix that? Thnaks. Liliana spline.des() is in the 'splines' package, installed with the basic R distribution. It appears that bsplineS() has a dependency on this not otherwise referenced in the fda package documentation. Looks like fda has not been updated in some time as well. I have cc'd Jim Ramsay on this reply as an FYI. You probably need a: library(splines) before calling your code above. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Getting eps into Word documents.
On Tue, 2005-10-04 at 08:38 -0500, Robert Baer wrote: > > > On 03-Oct-05 Marc Schwartz (via MN) wrote: > > > > On Mon, 2005-10-03 at 16:31 -0300, Rolf Turner wrote: > > > >> A student in one of my courses has asked me about getting R graphics > > > >> output (under Linux) into a Word document. I.e. she wants to do her > > > >> R thing under Linux, but then do her word processing using Word. > --snip-- > > > > So use something like the following: > > > > > > > > > > > > postscript("RPlot.eps", height = 4, width = 4, > > > >horizontal = FALSE, onefile = FALSE, > > > >paper = "special") > > > > > > > > plot(1:5) > > > > > > > > dev.off() > > > > > > > > > > > > You can then import the .eps file into Word or most other such > > > > applications that can import encapsulated postscript files. > > -snip > > > > More information is available from MS here: > > > > > > > > http://support.microsoft.com/?kbid=290362 > > > > > > > > HTH, > > > > > > > > Marc Schwartz > --snip--- > > > b) It won't work anyway if printed to a non-PostScript printer. > > > > True, which is the case irrespective of Word/Windows. If you don't have > > a PS printer locally or accessible via network, you can always install a > > PS printer driver and print to a file, which can then be printed by a > > third party if required. > > > Well, as a lowly Windows and Office user, I most often right click on R > grahics, cut to clipboard, and paste into Word. So one possiblility is for > the student to install R on her own machine (Windows or Mac?). > > But I just tried Marc's suggestion, and it looks VERY VIABLE to me. I > generated the graph from his code snippet and used "Insert picture from > file" in Word 2003 to place the graphic in a Word document. I then tried > printing on both an HP 4100 TN laserjet and an HP 960c deskjet. The image > printed perfectly on both printers with crisp lines and text that apprear to > be vector-based not degraded bitmapped representations. Certainly worth the > student trying. Glad to hear that worked for you. I do think that the majority of problems with importing EPS files from R into other applications (Word, Powerpoint, Writer, Impress, etc.) are due to not properly configuring the postscript() arguments that are on the help page. I should also note that beyond plots, I use this same approach for putting LaTeX tables into these documents as well. There are times where I need create one or more nicely formatted tables for inclusion in another document, to then be used by someone else. I generate the LaTeX preamble, document and table code using R, then run: latex FileName.tex dvips -E FileName -o FileName.ps where the '-E' option to dvips attempts to create an EPS output file with a tight bounding box. The result is a single EPS file with the table (much like the plot example) ready for import. I had been using 'epstool' to create the image preview, but now that OO.org 2.0 has included this automatically upon import (as do the latest versions of Word), this step is no longer required. In this way, I can use R to create reproducible and publication ready tables for use by others in their documents, in situations where they are not (or cannot) use LaTeX for the entire document. HTH, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Getting eps into Word documents.
On Mon, 2005-10-03 at 22:00 +0100, Ted Harding wrote: > Rolf (& Marc) > > On 03-Oct-05 Marc Schwartz (via MN) wrote: > > On Mon, 2005-10-03 at 16:31 -0300, Rolf Turner wrote: > >> A student in one of my courses has asked me about getting R graphics > >> output (under Linux) into a Word document. I.e. she wants to do her > >> R thing under Linux, but then do her word processing using Word. > >> > >> Scanning around the r-help archives I encountered an inquiry about > >> this topic --- eps into Word documents --- from Paul Johnson but > >> found no replies to it. I tried contacting him but the email address > >> in the archives appeared not to be valid. Does anyone know a > >> satisfactory solution to the problem of including a graphic which > >> exists in the form of a *.eps (encapsulated postscript) file into a > >> Word document. If so, would you be willing to share it with me and > >> my student? > >> > >> If so, please be gentle in your explanation. I am not myself (repeat > >> ***NOT***) a user of Word! > > > > Hehe... :-) > > > > Rolf, just use the guidance provided in ?postscript. In the details > > section it indicates: > > > > The postscript produced by R is EPS (_Encapsulated PostScript_) > > compatible, and can be included into other documents, e.g., into > > LaTeX, using '\includegraphics{}'. For use in this way > > you will probably want to set 'horizontal = FALSE, onefile = > > FALSE, paper = "special"'. > > > > So use something like the following: > > > > > > postscript("RPlot.eps", height = 4, width = 4, > >horizontal = FALSE, onefile = FALSE, > >paper = "special") > > > > plot(1:5) > > > > dev.off() > > > > > > You can then import the .eps file into Word or most other such > > applications that can import encapsulated postscript files. > > > > The recent versions of Word will also automatically generate a > > bitmapped preview of the plot upon import. BTW, OO.org 2.0, > > which is in late beta testing now, also generates EPS preview > > images upon import. > > > > The key to doing this successfully is using the arguments to > > postscript() as defined above. I have never had a problem with this. > > > > More information is available from MS here: > > > > http://support.microsoft.com/?kbid=290362 > > > > HTH, > > > > Marc Schwartz > > This suggestion could be problematic in that Ted, > a) According to the MS web site above, it applies to recent Word >(Office 2002/2003) or possibly earlier "depending on installed >graphics filters". The Word EPS filter has been around for some time. I recall using it years ago. Note that it is listed on that site in both categories of filters for some reason. In the older versions of Word, there was no preview image generated, hence you ended up with a box/frame place holder of sorts, unless you added a preview image before importing. That being said, one does need to install the import filters from the Office CD, which may not be part of the default settings on all prior versions. It may require going back into the Office set up program to install them. > b) It won't work anyway if printed to a non-PostScript printer. True, which is the case irrespective of Word/Windows. If you don't have a PS printer locally or accessible via network, you can always install a PS printer driver and print to a file, which can then be printed by a third party if required. If one has ps2pdf/Ghostscript available, you can also convert the PS file to PDF and then print it via Acrobat or other PDF viewers. > If either of these applies to Rolf's student, she could have problems. > > [Just to add my own "disclaimer": the only version of Word I'm in > any position to ever touch, and then only if driven to, belongs > to Office 98; and I'm sure that this doesn't know a thing about > PostScript!] > > Another option to consider, since she's doing her R work on Linux, > is that recent versions of the ImageMagick program 'convert' have > the capability to convert EPS into WMF (Windows Metafile; use > file extension ".wmf" for 'convert') or EMF (Enhanced Metafile); > use file extension ".emf". The gubbins is built in to a file > "wmf.so" in the lib/ImageMagick tree. > > Likewise, the program 'pstoedit' can do it (to ".wmf" or ".emf"), > usi
Re: [R] Getting eps into Word documents.
On Mon, 2005-10-03 at 16:31 -0300, Rolf Turner wrote: > A student in one of my courses has asked me about getting R graphics > output (under Linux) into a Word document. I.e. she wants to do her > R thing under Linux, but then do her word processing using Word. > > Scanning around the r-help archives I encountered an inquiry about > this topic --- eps into Word documents --- from Paul Johnson but > found no replies to it. I tried contacting him but the email address > in the archives appeared not to be valid. Does anyone know a > satisfactory solution to the problem of including a graphic which > exists in the form of a *.eps (encapsulated postscript) file into a > Word document. If so, would you be willing to share it with me and > my student? > > If so, please be gentle in your explanation. I am not myself (repeat > ***NOT***) a user of Word! Hehe... :-) Rolf, just use the guidance provided in ?postscript. In the details section it indicates: The postscript produced by R is EPS (_Encapsulated PostScript_) compatible, and can be included into other documents, e.g., into LaTeX, using '\includegraphics{}'. For use in this way you will probably want to set 'horizontal = FALSE, onefile = FALSE, paper = "special"'. So use something like the following: postscript("RPlot.eps", height = 4, width = 4, horizontal = FALSE, onefile = FALSE, paper = "special") plot(1:5) dev.off() You can then import the .eps file into Word or most other such applications that can import encapsulated postscript files. The recent versions of Word will also automatically generate a bitmapped preview of the plot upon import. BTW, OO.org 2.0, which is in late beta testing now, also generates EPS preview images upon import. The key to doing this successfully is using the arguments to postscript() as defined above. I have never had a problem with this. More information is available from MS here: http://support.microsoft.com/?kbid=290362 HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] gnomeGUI installation
On Mon, 2005-10-03 at 18:01 +0100, Prof Brian Ripley wrote: > On Mon, 3 Oct 2005, Daniel Pick wrote: > > > I have successfully downloaded the sources and built R as a shared > > library on a Red Hat Enterprise Level 3 box. I am now trying to build the > > GNOME GUI, but configure is barfing on glade. According to the system > > logs, the RPM for libglade2-devel-2.0.1-3.x86_64 is installed, but in > > /usr/bin, where gnomeGUI configure is looking, what's there is > > libglade-convert. How do I fix this? > > By reading the manual. In the R-admin manual it says > > `Please check you have all the requirements. You need at least the > following packages or later installed: > > audiofile-0.2.1 > esound-0.2.23 > glib-1.2.10 > gtk+-1.2.10 > imlib-1.9.10 > ORBit-0.5.12 > gnome-libs-1.4.1.2 > libxml-1.8.16 > libglade-0.17 > > ... > > Remember that some package management systems (such as @acronym{RPM} and > deb) make a distinction between the user version of a package and the > developer version. The latter usually has the same name but with the > extension @samp{-devel} or @samp{-dev}. If you use a pre-packaged > version of @acronym{GNOME} then you must have the developer versions of > the above packages in order to compile the R-GNOME console.' > > My FC3 box has > > gannet% rpm -qa | grep ^libglade > libglade2-devel-2.4.0-5 > libglade-devel-0.17-15 > libglade2-2.4.0-5 > libglade-0.17-15 > > and note that libglade2[-devel] is NOT a later version of libglade (in the > same way that gtk2 is different from gtk). > > AFAICS it is likely that /usr/bin/libglade-config is provided by > libglade-devel-0.17-15. That is correct, subject of course to RPM version differences between FC3/4 and RHEL3/4. Using RPM, this can be confirmed with: $ rpm -q --whatprovides /usr/bin/libglade-config libglade-devel-0.17-16 which is the output on FC4. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Binary Logit Regression with R
On Thu, 2005-09-29 at 18:08 -0400, Johann Park wrote: > Hi to all, > > I am a PH.D Student doing statistical analysis. > I am totally new to R. I previously use Stata and am changing into R. I > ususally do with logit regression with binary dependent variable (war > occurence:1 or 0). > > I just want to know command to do that. More sepcifically, > > Let say, my Y is war occurence (occur=1, otherwise 0). And my independent > variables (Xs) are trade, democracy, military poweretc. > > In Stata, I do like what follows: > > logit war trade democracy militarypower... > > Then I will get results. > > What are the equivalent command in R? > > Many thanks, > > JP See ?glm in the base stats package or ?lrm in Frank Harrell's Design package on CRAN. BTW, doing: help.search("logit") or RSiteSearch("logit") would provide you with the ability to do keyword searches of your current R installation or the online e-mail list and documentation archives, respectively. You should also review Chapter 11 - Statistical Models in R in "An Introduction to R", which is installed with R or available online under the Documentation/Manuals link on the main R web site. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How to add frame (frame.plot=T) to Plot.Design?
On Thu, 2005-09-29 at 18:36 +0200, Jan Verbesselt wrote: > Hi R-help, > > > > When using the package Design and the plot.Design function all the graphs of > an lrm.fit (lrm()) are plotted without a frame (only with axes). How can the > frame be added to these plots? > > > > In the plot.default function the frame can be added/or removed via the > frame.plot=T/F, respectively. > > e.g., plot(1,axes=T,frame.plot=T). How can this be done with > plot.Design(lrm.fit) > > > > Thanks, > > Jan See ?box HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Display values in piechart/barplot
On Thu, 2005-09-29 at 14:34 +0200, Volker Rehbock wrote: > Is it possible to automatically display the underlying values of a > piechart/barplot in the graphic? If so, which package/function/argument do I > need for it? > Thanks, > Volker Using pie charts are not a particularly good way of displaying data, even though the pie() function is available in R. See ?pie for more information on this. For barplots, the following provide two approaches: # Place the bar values above the bars # Note that I set 'ylim' to make room for # the text labels above the bars vals <- 1:5 names(vals) <- LETTERS[1:5] mp <- barplot(vals, ylim = c(0, 6)) text(mp, vals, labels = vals, pos = 3) # Place the bar values below the x axis vals <- 1:5 names(vals) <- LETTERS[1:5] mp <- barplot(vals) mtext(side = 1, at = mp, text = vals, line = 3) Note that barplot() returns the bar midpoints, so you can use these values for annotation placement. See ?barplot, ?text and ?mtext for more information. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] boxplot and xlim confusion?
On Thu, 2005-09-29 at 15:28 +0200, Karin Lagesen wrote: > "Petr Pikal" <[EMAIL PROTECTED]> writes: > > > Hi Karin > > > > I did not have seen any answer for your question yet so here is a > > try. > > > > I gues you want the horizotal layout or your boxplot. > > > > boxplot(split(rnorm(30), rep(1:3, each=10)), horizontal =T, > > names=letters[1:3]) > > > > boxplot(split(rnorm(30), rep(1:3, each=10)), horizontal =T, > > names=c(NA,"b",NA)) > > These look like the plots I want, yes. However, is there a way of > locking the x-axis? When I do these several times in a row, the range > of the x axis moves with the data being plotted. If I set xlim in the > boxplot command, it makes no difference what so ever. That's because when you use 'horizontal = TRUE' in boxplot(), the x and y axes are rotated. Thus you need to set 'ylim' and not 'xlim', where ylim is the range of values in your data irrespective of the axis orientation. This is done in bxp(), which is the actual function doing the plotting for boxplot(). In bxp(), there is the follow code snippet: if (horizontal) plot.window(ylim = c(0.5, n + 0.5), xlim = ylim, log = log) else plot.window(xlim = c(0.5, n + 0.5), ylim = ylim, log = log) Note in the first case, that the xlim value for plot.window() is set to the ylim value passed from boxplot() or from bxp(), if called directly. So, the above examples by Petr should be something like: boxplot(split(rnorm(30), rep(1:3, each=10)), horizontal = TRUE, names=letters[1:3], ylim = c(-4, 4)) boxplot(split(rnorm(30), rep(1:3, each=10)), horizontal =TRUE, names=c(NA,"b",NA), ylim = c(-4, 4)) HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] correct syntax
On Thu, 2005-09-29 at 06:57 +1000, Stephen Choularton wrote: > Hi > > I am trying to produce a little table like this: > > 0 1 > 0 601 408 > 1 290 2655 > > but I cannot get the syntax right. > > Can anyone help. > > Stephen Do you have the raw data or only the tabulated counts? If you have the raw data, see ?table for ways to create n-dimensional tables. If you only have the tabulated counts, then you can use matrix(): > mat <- matrix(c(601, 290, 408, 2655), ncol = 2) > dimnames(mat) <- list(c(0, 1), c(0, 1)) > mat 01 0 601 408 1 290 2655 See ?matrix for more information. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Using unsplit - unsplit does not seem to reverse the effect of split
On Tue, 2005-09-27 at 19:12 +0200, Søren Højsgaard wrote: > In data OME in MASS I would like to extract the first 5 observations > per subject (=ID). So I do > > library(MASS) > OMEsub <- split(OME, OME$ID) > OMEsub <- lapply(OMEsub,function(x)x[1:5,]) > unsplit(OMEsub, OME$ID) > > - which results in > > [[1]] > [1] 1 1 1 1 1 > [[2]] > [1] 30 30 30 30 30 > [[3]] > [1] low low low low low > Levels: N/A high low > [[4]] > [1] 35 35 40 40 45 > [[5]] > [1] coherent incoherent coherent incoherent coherent > Levels: coherent incoherent > [[6]] > [1] 1 4 0 1 2 > > > > [[1094]] > [1] 4 5 5 5 2 > [[1095]] > [1] 100 100 100 100 100 > [[1096]] > [1] 18 18 18 18 18 > [[1097]] > [1] N/A N/A N/A N/A N/A > Levels: N/A high low > There were 50 or more warnings (use warnings() to see the first 50) > > warnings() > Warning messages: > 1: number of items to replace is not a multiple of replacement length > 2: number of items to replace is not a multiple of replacement length > 3: number of items to replace is not a multiple of replacement length > > > According to documentation unsplit is the reverse of split, but I must > be missing a point somewhere... Can anyone help? Thanks in advance. > Søren If you read the documentation for split/unsplit, you will also note that in the Details section it says: 'unsplit' works only with lists of vectors as opposed to lists of data frames, which is the result of your split() operation. Also note that in the Value section, it indicates: 'unsplit' returns a vector for which 'split(x, f)' equals 'value' as opposed to unsplit returning a data frame. Thus, use: OME1 <- do.call("rbind", OMEsub) where OME1 will be the result of rbind()'ing the data frames in the OMEsub list. See ?do.call for more information. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Help: x11 position in the Unix environment
On Mon, 2005-09-26 at 17:45 +0200, Shengzhe Wu wrote: > Hello, > > In the Unix environment, I open a window by x11(). May I specify the > position of this window by specifying the position of the top left of > the window as in Windows environment? Or some other parameters can be > used to do that? > > Thank you, > Shengzhe I don't believe so. In general, under Unix/Linux, the Window Manager determines window positioning upon startup unless the application overrides this behavior. Some applications let you specify application window positioning via command line 'geometry' arguments or via the use of an .Xresources file. Some WM's provide more or less functionality for this behavior relative to user customization. For example, Sawfish provides quite a bit, whereas Metacity hides much of it. You may want to check the documentation for the WM that you are using. There is also an application called Devil's Pie: http://www.burtonini.com/blog/computers/devilspie which provides additional Sawfish-like customization for window positioning, etc. However, this is global for a given window, not on a per instance basis. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Fligner-Policello robust rank test
On Thu, 2005-09-22 at 10:44 -0400, Elisabetta Manduchi wrote: > Can anybody tell me if there is an R implementation of the > Fligner-Policello robust rank test? > Thanks, > Elisabetta I don't know about the Fligner-Policello test, but the Fligner-Killeen test has been implemented in fligner.test(). From ?fligner.test: The Fligner-Killeen (median) test has been determined in a simulation study as one of the many tests for homogeneity of variances which is most robust against departures from normality, see Conover, Johnson & Johnson (1981). It is a k-sample simple linear rank which uses the ranks of the absolute values of the centered samples and weights a(i) = qnorm((1 + i/(n+1))/2). The version implemented here uses median centering in each of the samples (F-K:med X^2 in the reference). FWIW, in a Google search, I located the following SAS macro by Paul von Hippel, which includes a comment that does not lead to optimism regarding software implementations: http://www.sociology.ohio-state.edu/people/ptv/macros/fligner_policello.htm The above should not be overly difficult to convert to R, presuming you know SAS macros... HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] how to extract the column name or value from the numerical value of the matrix
On Mon, 2005-09-19 at 12:30 -0700, shanmuha boopathy wrote: > Dear sir, > i have a matrix like > x<-c(1:20) > A<-matrix(x,4,5) > > A > [,1] [,2] [,3] [,4] [,5] > [1,]159 13 17 > [2,]26 10 14 18 > [3,]37 11 15 19 > [4,]48 12 16 20 > > I want to extract the column value for the matrix > value 11... > > or the row value for 14.. > how it is possible? > > thanks for your help > > with regards, > boopathy. Why did you copy r-devel? r-help is the appropriate forum. See ?which and take note of the 'arr.ind' argument: > which(A == 11, arr.ind = TRUE) row col [1,] 3 3 > which(A == 14, arr.ind = TRUE) row col [1,] 2 4 HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How do I get the row indices?
On Fri, 2005-09-16 at 10:34 -0700, Martin Lam wrote: > Hi, > > I was wondering if it's possible to get the row > numbers from a filtering. Here's an example: > > # give me the rows with sepal.length == 6.2 > iris[(iris[,1]==6.2),] > > # output > Sepal.Length Sepal.Width Petal.Length Petal.Width > Species > 69 6.2 2.2 4.5 1.5 > versicolor > 98 6.2 2.9 4.3 1.3 > versicolor > 127 6.2 2.8 4.8 1.8 > virginica > 149 6.2 3.4 5.4 2.3 > virginica > > What I really want is that it return the row numbers: > 69, 98, 127, 149. > > Thanks in advance, > > Martin Martin, First: Be very, very careful when performing exact equalities on floating point numbers. They won't always result in the answer you expect. For more information see R FAQ 7.31: Why doesn't R think these numbers are equal? Second: See ?all.equal, ?sapply and ?which. Here is a possible vectorized solution: > which(sapply(iris[, 1], function(x) isTRUE(all.equal(x, 6.2 [1] 69 98 127 149 The above applies isTRUE(all.equal(x, 6.2)) for each element 'x' in iris[, 1], returning the indices of the TRUE results for the near equality comparison, based upon the tolerance argument in all.equal(). HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Graphics 'snapshots' in Linux?
On Thu, 2005-09-15 at 10:43 -0400, Tyler Smith wrote: > Hi, > > I'm working on a MEPIS (Debian-based Linux) computer, using the > emacs/ESS package to do my R work. I've got some plots that I label > interactively using the locate function. With the Windows GUI there is > an option to take a snapshot of the graphics output, saving it as an > image file. Is there a way to do this with emacs/ESS? > > Thanks, > > Tyler Tyler, Take a look at ?dev.copy2eps and on the same page dev.copy(), which enable you to copy the current X11 plot supported output devices. You could do something like the following for an EPS file: plot(1:5) text(locator(1), "Place Text Here") dev.copy2eps(file = "MyPlot.eps") or the following for a PNG file: par(bg = "white") plot(1:5) text(locator(1), "Place Text Here") dev.copy(device = png, file = "MyPlot.png") dev.off() Note that in the first example, dev.off() is not required, as the EPS output device is closed after the call. Also, note in the second example, you will need to set the background to white (unless already specified for whatever color you may be using), as the default output file will have a transparent background, even though the png() function shows the default as white. If my memory is correct this is because the X11 device itself has a transparent background by default and this is what is copied. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Forcing hist()
On Wed, 2005-09-14 at 20:17 +0200, Par Leijonhufvud wrote: > I'm trying to create histogram (using hist()) that fullfill the following > criteria: > > * data is on a ordinal scale (1, 2, 3, 4, 5) > * I want bars centered over the number on the x-axis > * I want 5 bars of equal width > > I have tried various versions of the hist() command, with no luck. what > am I missing? > > /Par More than likely, you want to use barplot() and not hist(): Try: barplot(1:5, names.arg = 1:5) See ?barplot for more information. HTH, Marc Schwartz P.S. To R Core: It probably makes sense to add barplot to the See Also section of ?hist, since hist is listed in the See Also for ?barplot. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] png scaling problem
On Thu, 2005-09-01 at 21:51 +0200, Knut Krueger wrote: > scaling<-4 > xywidth<-480 > resolution<-150 > png(filename = "c:/r/anschluss/plots/4.png", width = xywidth*scaling, > height = xywidth*scaling,pointsize = 12, bg = "white", res = > resolution*scaling) > .. > > barplot(xrow,col = barcolors,cex.axis=scaling, ylab="mean time till > attachment in sec",cex.lab=1.2*scaling) > > I tried to scale the barplot but there is one strange result: > scaling=1 > http://biostatistic.de/temp/1.png--- the ylab is ok > > scaling=2 > http://biostatistic.de/temp/2.png--- the ylab is not ok > > scaling=4 > http://biostatistic.de/temp/4.png--- the ylab is terrible > > is there any better solution to scale the resolution and the width/height? > > > with regards > Knut Probably a better first question is, why are you using a bitmapped graphics format if you need the image to be re-scaled? In general, bitmapped graphics do not resize well, though if you have a specific need and know a target image size, you can adjust various parameters to look decent. Are you going to view these images in a web browser, where you are concerned with display size and resolution? >From your e-mail headers it appears you are on Windows. If you need a re-sizable graphic, use a vector based format such as wmf/emf, especially if you need the graphics embedded in a Windows application such as Word or Powerpoint. This is the default format under Windows when copying and pasting a graphic between applications. You can then, fairly freely, resize the image in the target application as you may require. If you are going to print the graphic directly or include it in a document for printing (as opposed to just viewing it), then use PDF or Postscript. The latter in EPS format, can be imported into many Windows applications like Word, including the generation of a preview image. However, they don't look good for direct use in presentations (unless you print to a PS file and then convert to PDF for viewing). See ?Devices for more information. With a better idea of how you plan to use the graphic(s), we can offer more specific recommendations on how to proceed. Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] axis of plot
On Thu, 2005-09-01 at 16:58 +0100, [EMAIL PROTECTED] wrote: > hy, > > I need to have the 0 on the bottom left corner of the graph being > joined , not with this little hole between x axis and y axis... > > I've saw this question with the answer one time but i'm unable to find > it again.. > > > thks. > guillaume See 'xaxs' and 'yaxs' in ?par: plot(1:5, xaxs = "i", yaxs = "i", xlim = c(0, 5), ylim = c(0, 5)) By default, both are set to "r", which adds +/- 4% to the range of each axis. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] crosstab for n-way contingency tables
On Tue, 2005-08-30 at 12:28 -0700, Isotta Felli wrote: > Dear list. > > New to R, I'm looking for a way of using crosstab to output > low-dimensional (higher than 2) contingency tables (frequencies, > per-cents by rows, % by columns, mean, quantiles) I'm looking for > something of the following sort > > dataframe: singers, > categorical variates: voice category (soprano,mezzo-soprano, ...) , > voice type( drammatic, spinto, lirico-spinto, lirico, leggero), school > (german, italian, french, russian, anglo-saxon, other);repertory > (opera, Lieder, oratorio, operetta) > continuous variate: age > > I would like to tabulate the frequencies (relative percentages) say > in the following way > columns: school repertory > rows : voice category voice type > > or to output in the cells of the above table, the statistics > (mean/median/quantiles) for age > > > I've seen that the function bwplot(age~school | repertory, data= > singers, layout=c(4,2)) would do graphically something similar to what > I want, but I desire the output also in tabular form > > Thanks > > Isotta I don't know that you will find a single function that will do all of what you desire, but you can look at the ctab() function, which is in the 'catspec' package on CRAN by John Hendrickx. This will do multi-way tables with summary row/col statistics. There is also the CrossTable() function in the 'gmodels' package on CRAN, though this will only do one way and two way tables, with summary row/col statistics. Neither of the above provide typical summary output for continuous data. They are more for categorical variables. For simple count multi-way output, you can also look at the ftable() function which is in the base package. Also look at the summary() function in the base package which will provide range, mean, quantile data for continuous variables. It may just be a matter of formatting the output in a fashion that you desire on a post analysis basis. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] ylim for graphic
On Mon, 2005-08-29 at 12:58 -0400, Doran, Harold wrote: > Dear list: > > I have some data for which I am generating a series of barplots for > percentages. One issue that I am dealing with is that I am trying to get > the legend to print in a fixed location for each chart generated by the > data. Because these charts are being created in a loop, with different > data, my code searches the data to identify the maximum value in the > data and then print the data values 5 points above the tallest bar. > > Now, in situations where the largest value is 100, I needed to create > the y-axis high enough to accommodate the legend w/o crowding up the > data or the bars. So, I have ylim =c(0,200) and then place the legend at > c(2,150). Visually, this places things exactly where I want them. But, > it seems silly to have a y-axis labeled as high as 200%. I'm certain > there is a smarter technique. Is it possible to place the legend at a > location higher than 100 to avoid the crowding of the bars and the data > and then also have the labels for the y-axis not print after the value > 100%? > > In looking at ?legend I didn't see an option that would address this. > Below is some code that you can use to create a similar chart. > > Thanks, > Harold > > > math.bar <- c(53,31,55,28,55,100) > apmxpmeet <- c(47, 50, 49, 50, 49, 46) > par(ps=10) > math.bar <- rbind(math.bar, apmxpmeet) > math.barplot <- barplot(math.bar, beside=T, col=c('blue','orange'), > names=c('Grade \n 3','Grade \n 4','Grade \n 5','Grade \n 6','Grade \n > 7','Grade \n 8'), > ylim=c(0,200), ylab="Percentage", xlab="Grade Level") > tot <- round(math.bar,digits=0) > graph.max <- max(math.bar, apmxpmeet, na.rm=T) > text(math.barplot, graph.max+5, tot, xpd = TRUE, col = c("blue", > "orange") ) > legend(2,150,legend=(c("Label A", "Average")), fill=c("blue","orange")) Harold, A few thoughts: 1. Instead of fixing the y axis max value at 200, simply set ylim to c(0, max(math.bar * 1.2)) or a similar constant. In this case, you get an extra 20% above the max(y) value for the legend placement. 2. In the legend call, use: legend("topleft", legend=(c("Label A", "Average")), fill = c("blue","orange")) This will place the legend at the topleft of the plot region, rather than you having to calculate the x and y coords. If you want it moved in from the upper left hand corner, you can use the 'inset' argument as well, which moves the legend by a proportion of the plot region limits (0 - 1): legend("topleft", legend=(c("Label A", "Average")), fill = c("blue","orange"), inset = .1) 3. You can use barplot(..., yaxt = "n") to have the y axis not drawn and then use axis() to place the labels at locations of your choosing, which do not need to run the full length of the axis range: barplot(1:5, yaxt = "n", ylim = c(0, 10)) axis(2, at = 0:5) 4. You can place the legend outside the plot region, enabling you to keep the y axis range to <=100. This would need some tweaking, but the idea is the same: # Increase the size of the top margin par(mar = c(5, 4, 8, 2) + 0.1) # Draw a barplot barplot(1:5) # Disable clipping outside the plot region par(xpd = TRUE) # Now draw the legend, but move it up by 30% from the top left legend("topleft", legend = LETTERS[1:5], inset = c(0, -.3)) You could also place the legend to the right or left of the plot region if you prefer, adjusting the above accordingly. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Unpaste Problem
On Fri, 2005-08-26 at 22:39 +0530, A Mani wrote: > Hello, > Easy ways to "unpaste"? > xp <- paste(x2, x3) # x2, x3 are two non-numeric columns. > . > . > xfg <- data.frame(xp,sc1, sc2, sc3) # sc1,sc2, sc3 are numeric cols. > > I want xp to be split up to form a new dataframe of the form (x3, sc1, > sc2, sc3). > IMPORTANT info : elements of xp have the form abcefg, with abc > in x2 and efg in x3. > > Thanks in advance, I think I understand what you are trying to do. Hopefully the below may be helpful: # Create the data frame with 3 rows x2 <- letters[1:3] x3 <- LETTERS[1:3] xp <- paste(x2, x3) sc1 <- rnorm(3) sc2 <- rnorm(3) sc3 <- rnorm(3) xfg <- data.frame(xp, sc1, sc2, sc3) > xfg xpsc1sc2sc3 1 a A 1.3479123 -1.0642578 0.2479218 2 b B -0.1586587 1.1237456 -1.3952176 3 c C 2.7807484 -0.9778066 -1.9322279 # Use strsplit() here to break apart 'xp' using " " as the split # Use "[" in sapply() to get the second (2) element from each # 'xp' list pair. Note that I use as.character() here, since xfg$xp is # a factor. # See the output of: strsplit(as.character(xfg$xp), " ") # for some insight into this approach xp.split <- sapply(strsplit(as.character(xfg$xp), " "), "[", 2) # show post split values > xp.split [1] "A" "B" "C" # Now cbind it all together into a new data frame # don't include column 1 from xfg (xp) xfg.new <- cbind(xp.split, xfg[, -1]) > xfg.new xp.splitsc1sc2sc3 1A 1.3479123 -1.0642578 0.2479218 2B -0.1586587 1.1237456 -1.3952176 3C 2.7807484 -0.9778066 -1.9322279 See ?strsplit for more information. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Matrix oriented computing
Prof. Ripley, Excellent point. Neither sapply() nor outer() are the "elephant in the room" in this situation. On Fri, 2005-08-26 at 16:55 +0100, Prof Brian Ripley wrote: > Try profiling. Doing this many times to get an overview, e.g. for sapply > with df=1:1000: > > % self% total > self secondstotalsecondsname > 98.26 6.78 98.26 6.78 "FUN" >0.58 0.04 0.58 0.04 "unlist" >0.29 0.02 0.87 0.06 "as.vector" >0.29 0.02 0.58 0.04 "names<-" >0.29 0.02 0.29 0.02 "names<-.default" >0.29 0.02 0.29 0.02 "names" > > so almost all the time is in qchisq. > > > On Fri, 26 Aug 2005, Marc Schwartz (via MN) wrote: > > > On Fri, 2005-08-26 at 15:25 +0200, Peter Dalgaard wrote: > >> Marc Schwartz <[EMAIL PROTECTED]> writes: > >> > >>> x <- c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, > >>>0.95, 0.975, 0.99, 0.995) > >>> > >>> df <- c(1:100) > >>> > >>> mat <- sapply(x, qchisq, df) > >>> > >>>> dim(mat) > >>> [1] 100 11 > >>> > >>>> str(mat) > >>> num [1:100, 1:11] 3.93e-05 1.00e-02 7.17e-02 2.07e-01 4.12e-01 ... > >> > >> outer() is perhaps a more natural first try... It does give the > >> transpose of the sapply approach, though. > >> > >> round(t(outer(x,df,qchisq)),2) > >> > >> should be close. You should likely add dimnames. > > > > > > > > What I find interesting, is that I would have intuitively expected > > outer() to be faster than sapply(). However: > > > > > >> system.time(mat <- sapply(x, qchisq, df), gcFirst = TRUE) > > [1] 0.01 0.00 0.01 0.00 0.00 > > > >> system.time(mat1 <- round(t(outer(x, df, qchisq)), 2), > > gcFirst = TRUE) > > [1] 0.01 0.00 0.01 0.00 0.00 > > > > # No round() or t() to test for overhead > >> system.time(mat2 <- outer(x, df, qchisq), gcFirst = TRUE) > > [1] 0.01 0.00 0.02 0.00 0.00 > > > > > > # Bear in mind the round() on mat1 above > >> all.equal(mat, mat1) > > [1] "Mean relative difference: 4.905485e-05" > > > >> all.equal(mat, t(mat2)) > > [1] TRUE > > > > > > Even when increasing the size of 'df' to 1:1000: > > > > > >> system.time(mat <- sapply(x, qchisq, df), gcFirst = TRUE) > > [1] 0.16 0.01 0.16 0.00 0.00 > > > >> system.time(mat1 <- round(t(outer(x, df, qchisq)), 2), gcFirst = > > TRUE) > > [1] 0.16 0.00 0.18 0.00 0.00 > > > >> # No round() or t() to test for overhead > >> system.time(mat2 <- outer(x, df, qchisq), gcFirst = TRUE) > > [1] 0.16 0.01 0.17 0.00 0.00 > > > > > > > > It also seems that, at least in this case, t() and round() do not add > > much overhead. > > Definitely not for such small matrices. True and both are C functions, which of course helps as well. Best regards, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Matrix oriented computing
On Fri, 2005-08-26 at 15:25 +0200, Peter Dalgaard wrote: > Marc Schwartz <[EMAIL PROTECTED]> writes: > > > x <- c(0.005, 0.010, 0.025, 0.05, 0.1, 0.5, 0.9, > >0.95, 0.975, 0.99, 0.995) > > > > df <- c(1:100) > > > > mat <- sapply(x, qchisq, df) > > > > > dim(mat) > > [1] 100 11 > > > > > str(mat) > > num [1:100, 1:11] 3.93e-05 1.00e-02 7.17e-02 2.07e-01 4.12e-01 ... > > outer() is perhaps a more natural first try... It does give the > transpose of the sapply approach, though. > > round(t(outer(x,df,qchisq)),2) > > should be close. You should likely add dimnames. What I find interesting, is that I would have intuitively expected outer() to be faster than sapply(). However: > system.time(mat <- sapply(x, qchisq, df), gcFirst = TRUE) [1] 0.01 0.00 0.01 0.00 0.00 > system.time(mat1 <- round(t(outer(x, df, qchisq)), 2), gcFirst = TRUE) [1] 0.01 0.00 0.01 0.00 0.00 # No round() or t() to test for overhead > system.time(mat2 <- outer(x, df, qchisq), gcFirst = TRUE) [1] 0.01 0.00 0.02 0.00 0.00 # Bear in mind the round() on mat1 above > all.equal(mat, mat1) [1] "Mean relative difference: 4.905485e-05" > all.equal(mat, t(mat2)) [1] TRUE Even when increasing the size of 'df' to 1:1000: > system.time(mat <- sapply(x, qchisq, df), gcFirst = TRUE) [1] 0.16 0.01 0.16 0.00 0.00 > system.time(mat1 <- round(t(outer(x, df, qchisq)), 2), gcFirst = TRUE) [1] 0.16 0.00 0.18 0.00 0.00 > # No round() or t() to test for overhead > system.time(mat2 <- outer(x, df, qchisq), gcFirst = TRUE) [1] 0.16 0.01 0.17 0.00 0.00 It also seems that, at least in this case, t() and round() do not add much overhead. Best regards, Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] help on retrieving output from by( ) for regression
Sorry to reply to my own post, but in reviewing the NAMESPACE file for lme4, it looks like Doug is perhaps planning to add additional model object accessor methods for the lmList class, including resid() and summary(), which are commented out now. coef() is available presently. So when in place, it would appear that the use of lmList would be advantageous over the use of by(). Marc On Thu, 2005-08-25 at 11:52 -0500, Marc Schwartz (via MN) wrote: > That works also (using the example in ?lmList) > > library(lme4) > > ?lmList > > fm1 <- lmList(breaks ~ wool | tension, warpbreaks) > > > However, one still would need to use either sapply() or lapply() as > below to get the details that Krishna is looking for. > > 'fm1' above is a list of models (S4 class 'lmList') not overly different > from 'tmp' below, which is a list of models (S3 class 'by'). > > If you review str(fm1) and str(tmp), you would note that they are > virtually identical, save the use of slots, etc. > > HTH, > > Marc Schwartz > > On Thu, 2005-08-25 at 12:17 -0400, Randy Johnson wrote: > > What about using lmList from the lme4 package? > > > > Randy > > > > > > On 8/25/05 9:44 AM, "Marc Schwartz" <[EMAIL PROTECTED]> wrote: > > > > > Also, looking at the last example in ?by would be helpful: > > > > > > attach(warpbreaks) > > > tmp <- by(warpbreaks, tension, function(x) lm(breaks ~ wool, data = x)) > > > > > > # To get coefficients: > > > sapply(tmp, coef) > > > > > > # To get residuals: > > > sapply(tmp, resid) > > > > > > # To get the model matrix: > > > sapply(tmp, model.matrix) > > > > > > > > > > > > To get the summary() output, I suspect that using: > > > > > > lapply(tmp, summary) > > > > > > would yield more familiar output as compared to using: > > > > > > sapply(tmp, summary) > > > > > > The output from the latter might require a bit more "navigation" through > > > the resultant matrix, depending upon how the output is to be ultimately > > > used. > > > > > > HTH, > > > > > > Marc Schwartz > > > > > > > > > > > > On Thu, 2005-08-25 at 14:57 +0200, TEMPL Matthias wrote: > > >> Look more carefully at > > >> ?lm > > >> at the See Also section ... > > >> > > >> X <- rnorm(30) > > >> Y <- rnorm(30) > > >> lm(Y~X) > > >> summary(lm(Y~X)) > > >> > > >> Best, > > >> Matthias > > >> > > >> > > >>> Hi all > > >>> > > >>> I used a function > > >>>> qtrregr <- by(AB, AB$qtr, function(AB) lm(AB$X~AB$Y)) > > >>> > > >>> objective is to run a regression on quartery subsets in the > > >>> data set AB, having variables X and Y, grouped by variable qtr. > > >>> > > >>> Now i retrieved the output using qtrregr, however it only > > >>> showed the coefficients (intercept and B) with out > > >>> significant levels and residuals for each qtr. Can some on > > >>> help me on how can retrieve the detailed regression output. > > >>> > > >>> rgds > > >>> > > >>> snvk > > >>> > > > > > > __ > > > R-help@stat.math.ethz.ch mailing list > > > https://stat.ethz.ch/mailman/listinfo/r-help > > > PLEASE do read the posting guide! > > > http://www.R-project.org/posting-guide.html > > > > > > > ~~ > > Randy Johnson > > Laboratory of Genomic Diversity > > NCI-Frederick > > Bldg 560, Rm 11-85 > > Frederick, MD 21702 > > (301)846-1304 > > ~~ > > > > __ > > R-help@stat.math.ethz.ch mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide! > > http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] help on retrieving output from by( ) for regression
That works also (using the example in ?lmList) library(lme4) ?lmList fm1 <- lmList(breaks ~ wool | tension, warpbreaks) However, one still would need to use either sapply() or lapply() as below to get the details that Krishna is looking for. 'fm1' above is a list of models (S4 class 'lmList') not overly different from 'tmp' below, which is a list of models (S3 class 'by'). If you review str(fm1) and str(tmp), you would note that they are virtually identical, save the use of slots, etc. HTH, Marc Schwartz On Thu, 2005-08-25 at 12:17 -0400, Randy Johnson wrote: > What about using lmList from the lme4 package? > > Randy > > > On 8/25/05 9:44 AM, "Marc Schwartz" <[EMAIL PROTECTED]> wrote: > > > Also, looking at the last example in ?by would be helpful: > > > > attach(warpbreaks) > > tmp <- by(warpbreaks, tension, function(x) lm(breaks ~ wool, data = x)) > > > > # To get coefficients: > > sapply(tmp, coef) > > > > # To get residuals: > > sapply(tmp, resid) > > > > # To get the model matrix: > > sapply(tmp, model.matrix) > > > > > > > > To get the summary() output, I suspect that using: > > > > lapply(tmp, summary) > > > > would yield more familiar output as compared to using: > > > > sapply(tmp, summary) > > > > The output from the latter might require a bit more "navigation" through > > the resultant matrix, depending upon how the output is to be ultimately > > used. > > > > HTH, > > > > Marc Schwartz > > > > > > > > On Thu, 2005-08-25 at 14:57 +0200, TEMPL Matthias wrote: > >> Look more carefully at > >> ?lm > >> at the See Also section ... > >> > >> X <- rnorm(30) > >> Y <- rnorm(30) > >> lm(Y~X) > >> summary(lm(Y~X)) > >> > >> Best, > >> Matthias > >> > >> > >>> Hi all > >>> > >>> I used a function > qtrregr <- by(AB, AB$qtr, function(AB) lm(AB$X~AB$Y)) > >>> > >>> objective is to run a regression on quartery subsets in the > >>> data set AB, having variables X and Y, grouped by variable qtr. > >>> > >>> Now i retrieved the output using qtrregr, however it only > >>> showed the coefficients (intercept and B) with out > >>> significant levels and residuals for each qtr. Can some on > >>> help me on how can retrieve the detailed regression output. > >>> > >>> rgds > >>> > >>> snvk > >>> > > > > __ > > R-help@stat.math.ethz.ch mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide! > > http://www.R-project.org/posting-guide.html > > > > ~~ > Randy Johnson > Laboratory of Genomic Diversity > NCI-Frederick > Bldg 560, Rm 11-85 > Frederick, MD 21702 > (301)846-1304 > ~~ > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Remove NAs from Barplot
On Wed, 2005-08-24 at 13:06 -0400, Doran, Harold wrote: > Dear List: > > I'm creating a series of barplots using Sweave that must assume a > standard format. This is student achievement data and the x-axis must > include all grades 3 to 8. In some cases, the data for a grade (or more > than one grade) are missing in the vector math.bar, but are never > missing for the vector apmxpmeet. The following sample code illustrates > the issue. > > Using the code below to create the plot works fine. However, the > following command is designed to place the data onto the plot: > > text(math.barplot, graph.max+5, format(tot), xpd = TRUE, col = c("blue", > "orange") ) > > This does work, but, it also places 'NA's on the plot. Is there a way to > place the data onto the plot such that the numbers appear, but for the > NAs not to appear? I've searched through ?text but was not able to find > a solution. > > math.bar <- c(NA,NA,NA,NA,55,48) > apmxpmeet <- c(47, 50, 49, 50, 49, 46) > > par(ps=10) > math.bar <- rbind(math.bar, apmxpmeet) > math.barplot <- barplot(math.bar, beside=T, col=c('blue','orange'), > names=c('Grade \n 3','Grade \n 4','Grade \n 5','Grade \n 6','Grade \n > 7','Grade \n 8'), > ylim=c(0,120), ylab="Percentage", xlab="Grade Level") > tot <- round(math.bar,digits=0) > graph.max <- max(math.bar, apmxpmeet, na.rm=T) > text(math.barplot, graph.max+5, format(tot), xpd = TRUE, col = c("blue", > "orange") ) > tmp$sch_nameS05 <- as.character(tmp$sch_nameS05) > legend(2,120,legend=(c(tmp$sch_nameS05, "Average")), > fill=c("blue","orange")) > > Thanks, > Harold Harold, Get rid of the 'format(tot)' in your text() call: text(math.barplot, graph.max+5, tot, xpd = TRUE, col = c("blue", "orange")) By using format(tot), you are coercing the values to text. Thus: > format(tot) [,1] [,2] [,3] [,4] [,5] [,6] math.bar "NA" "NA" "NA" "NA" "55" "48" apmxpmeet "47" "50" "49" "50" "49" "46" versus: > tot [,1] [,2] [,3] [,4] [,5] [,6] math.barNA NA NA NA 55 48 apmxpmeet 47 50 49 50 49 46 In the first case, text() will draw "NA", whereas in the second, the NA is ignored. Also, your code above is not entirely reproducible, since: > tmp$sch_nameS05 <- as.character(tmp$sch_nameS05) Error: Object "tmp" not found HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Seeking help with an apparently simple recoding problem
On Tue, 2005-08-23 at 10:12 -0500, Greg Blevins wrote: > Hello, > > I have struggled, for longer than I care to admit, with this seemingly > simple problem, but I cannot find a solution other than the use of > long drawn out ifelse statements. I know there has to be a better > way. Here is stripped down version of the situation: > > I start with: > a <- c(1,0,1,0,0,0,0) > b <- c(1,1,1,1,0,0,0) > c <- c(1,1,0,1,0,0,0) > > rbind(a,b,c) > [,1] [,2] [,3] [,4] [,5] [,6] [,7] > a1010000 > b1111000 > c1101000 > > I refer to column 3 as the target column, which at the end of the day > will be NA in all instances. > > The logic involved: > > 1) If columns 2, 4 thru 7 do NOT include at least one '1', then recode > columns 2 thru 7 to NA and recode column 1 to code 2. > > 2) If columns 2, 4 thru 7 contain at least one '1', then recode column > 3 to NA. > > Desired recoding of the above three rows: > [,1][,2][,3][,4][,5][,6][,7] > a2NA NA NA NA NA NA > b11 NA 1 0 0 0 > c11 NA 1 0 0 0 > > Thanks you. You left out one key detail in the explanation, which is that the recoding appears to be done on a row by row basis, not overall. The following gets the job done, though there may be a more efficient approach: > a <- c(1,0,1,0,0,0,0) > b <- c(1,1,1,1,0,0,0) > c <- c(1,1,0,1,0,0,0) > d <- rbind(a, b, c) > d [,1] [,2] [,3] [,4] [,5] [,6] [,7] a1010000 b1111000 c1101000 > mod.row <- function(x) { if (all(x[c(2, 4:7)] == 0)) { x[2:7] <- NA x[1] <- 2 } else { x[3] <- NA } x } > y <- t(apply(d, 1, mod.row)) > y [,1] [,2] [,3] [,4] [,5] [,6] [,7] a2 NA NA NA NA NA NA b11 NA1000 c11 NA1000 HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] plotting with same axes
On Mon, 2005-08-22 at 12:58 -0400, [EMAIL PROTECTED] wrote: > I have used the 'par' command to > overlay one plot on another. But how > do I overlay it with the x-values > plotted at the same points on the > x-axis? > > Thank you, > Steven The specific answer depends to an extent on the graphics you are plotting and whether there is a need to actually use par("new"). In some cases there is an 'add = TRUE' option to plotting functions that allows for this. In others, you can create the initial plot and then use lines(), points() or the like to simply add further plotting components to the existing plot. If all else fails and you need to use par("new"), then you will want to explicitly define the x and y axes to the same ranges by using the same 'xlim' and 'ylim' arguments in _both_ plotting functions. A reproducible example of what it is you are doing would enable us to give you more specific guidance, if the above is not helpful. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Copying rows from a matrix using a vector of indices
On Wed, 2005-08-17 at 12:02 -0700, Martin Lam wrote: > Hi, > > I am trying to use a vector of indices to select some > rows from a matrix. But before I can do that I somehow > need to convert 'combinations' into a list, since > 'mode(combinations)' says it's 'numerical'. Any idea > how I can do that? > > library("combinat") > > combinations <- t(combn(8,2)) > > indices <- c(sample(1:length(combinations),10)) > > # convert > ??? > > subset <- combinations[indices] > > Thanks in advance, > > Martin Your goal is not entirely clear. Given your code above, you end up with: > library("combinat") > > combinations <- t(combn(8,2)) > combinations [,1] [,2] [1,]12 [2,]13 [3,]14 [4,]15 [5,]16 [6,]17 [7,]18 [8,]23 [9,]24 [10,]25 [11,]26 [12,]27 [13,]28 [14,]34 [15,]35 [16,]36 [17,]37 [18,]38 [19,]45 [20,]46 [21,]47 [22,]48 [23,]56 [24,]57 [25,]58 [26,]67 [27,]68 [28,]78 combinations is a 28 x 2 matrix (or a 1d vector of length 56). Your use of sample() with '1:length(combinations)' is the same as 1:56. > length(combinations) [1] 56 Hence, you end up with: > set.seed(128) > indices <- sample(1:length(combinations), 10) > indices [1] 41 53 38 46 50 44 25 11 43 7 I suspect that you really want to use '1:nrow(combinations)', which yields 1:28. > nrow(combinations) [1] 28 Thus, > set.seed(128) > indices <- sample(1:nrow(combinations), 10) > indices [1] 21 26 18 22 23 20 11 5 27 3 Now, you can use: > subset <- combinations[indices, ] > subset [,1] [,2] [1,]47 [2,]67 [3,]38 [4,]48 [5,]56 [6,]46 [7,]26 [8,]16 [9,]68 [10,]14 which yields the rows from combinations, defined by 'indices'. If correct, please review "An Introduction to R" and ?"[" for more information on subsetting objects in R. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] recategorizing a vector into discrete states
On Tue, 2005-08-16 at 13:18 -0400, Allan Strand wrote: > Hi All, > > I'm trying to take a numerical vector and produce a new vector of the > same length where each element in the first is placed into a category > given by a 'breaks-like' vector. The values in the result should equal > the lower bounds of each category as defined in the breaks vector. > > I suspect that a vectorized solution is pretty simple, but I can't > seem to figure it out today. Here is an example of my problem: > > Vector 'a' is the original vector. Vector 'b' gives the lower bounds > of the categories. Vector 'c' is the result I am seeking. > > a <- c(0.9, 11, 1.2, 2.4, 4.0, 5.0, 7.3, 8.1, 3.3, 4.5) > b <- c(0, 2, 4, 6, 8) > > c <- c(0, 8, 0, 2, 4, 4, 6, 8, 2, 4) > > Any suggestions would be greatly appreciated. > > cheers, > Allan Strand See ?cut > a <- c(0.9, 11, 1.2, 2.4, 4.0, 5.0, 7.3, 8.1, 3.3, 4.5) > b <- c(0, 2, 4, 6, 8) > cut(a, c(b, Inf), labels = b, right = FALSE) [1] 0 8 0 2 4 4 6 8 2 4 Levels: 0 2 4 6 8 Note that cut() returns a factor. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] use different symbols for frequency in a plot
On Mon, 2005-08-08 at 11:57 -0700, Kerry Bush wrote: > suppose I have the following data > > x<-c(rep(.1,5),rep(.2,6),rep(.4,10),rep(.5,20)) > y<-c(rep(.5,3),rep(.6,8),rep(1.2,8),rep(2.5,18),rep(3,4)) > > If I plot(x,y) in R, I will only get seven distinct > points. What I want to do is to use different symbols > to show the frequency at each point. > > e.g. if the frequncey is between 1 and 5, then I plot > the point as a circle; if the frequency is between 6 > and 10, then I plot the point as a square; if the > frequency is above 10, then I plot the point as a > triangle. > > I am not sure how to do this in R. Can anybody help me? You might want to review this recent post by Deepayan Sarkar: https://stat.ethz.ch/pipermail/r-help/2005-July/074042.html with modest modification you can replace his example, which plots the frequencies with: x <- c(rep(.1,5),rep(.2,6),rep(.4,10),rep(.5,20)) y <- c(rep(.5,3),rep(.6,8),rep(1.2,8),rep(2.5,18),rep(3,4)) temp <- data.frame(x, y) foo <- subset(as.data.frame(table(temp)), Freq > 0) > foo x y Freq 1 0.1 0.53 5 0.1 0.62 6 0.2 0.66 11 0.4 1.28 15 0.4 2.52 16 0.5 2.5 16 20 0.5 34 # Use cut() to create the bins and specify the plotting symbols # for each bin, which are the 'label' values foo$sym <- with(foo, cut(Freq, c(0, 5, 10, Inf), labels = c(21, 22, 24))) # convert 'foo' to all numeric from factors above for plotting foo <- apply(foo, 2, function(x) as.numeric(as.character(x))) > foo x y Freq sym 1 0.1 0.53 21 5 0.1 0.62 21 6 0.2 0.66 22 11 0.4 1.28 22 15 0.4 2.52 21 16 0.5 2.5 16 24 20 0.5 34 21 # Now do the plot. Keep in mind that 'foo' is now # a matrix, rather than a data frame plot(foo[, "x"], foo[, "y"], pch = foo[, "sym"]) HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] elegant solution to transform vector into percentages?
On Tue, 2005-07-26 at 15:48 -0400, [EMAIL PROTECTED] wrote: > Hi, > > I am looking for an elegant way to transform a vector into percentages of > values > that meet certain criteria. > > store<-c(1,1.4,3,1.1,0.3,0.6,4,5) > > # now I want to get the precentages of values > # that fall into the categories <=M , >M & <=N , >N > # let > M <-.8 > N <- 1.2 > # In my real example I have many more of these cutoff-points > > # What I did is: > > out <- matrix(NA,1,3) > > out[1,1] <- ( (sum(store<=M)) /length(store) )*100 > out[1,2] <- ( (sum(store> M & store<= N )) /length(store) )*100 > out[1,3] <- ( (sum(store> N)) /length(store) )*100 > > colnames(out)<-c("percent<=M","percent>M & <=N","percent>N") > out > > But this gets very tedious if I have many cutoff-points. Does anybody know a > more elegant way to do this task? > > Thanks so much. > > Cheers, > Jens Something alone the lines of: store <- c(1, 1.4, 3, 1.1, 0.3, 0.6, 4, 5) M <- 0.8 N <- 1.2 x <- hist(store, br = c(min(store), M, N, max(store)), plot = FALSE)$counts pct.x <- prop.table(x) * 100 names(pct.x) <- c("percent <= M","percent > M & <= N","percent > N") > pct.x percent <= M percent > M & <= Npercent > N 25 25 50 I think that should do it. See ?hist for more information and take note of the 'include.lowest' and 'right' arguments relative to whether or not values are or are not included in the specified intervals. See ?prop.table as well. Also be acutely aware of potential problems with exact equality comparisons with floating point numbers and the break points...if you have a float value equal to a breakpoint in your vector. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] plotting horizontally
On Tue, 2005-07-26 at 08:58 +0700, [EMAIL PROTECTED] wrote: > Hello, > > Is there any way to use plot() horizontally similar to > boxplot(., horiz=TRUE)? I want to use to illustrate > the distribution of y-values on an adjacent plot using > layout(). > > Thanks in advance for any help. > > Regards, > > --Dan Have you looked at the last example in ?layout, which has a scatterplot with marginal histograms for the x and y axes? Alternatively, you can generally reverse the x and y values for most plots. For example, compare: d <- density(rnorm(100)) # Vertical density plot plot(d) # Horizontal density plot plot(d$y, d$x, type = "l", xlab = "Density", main = "density(y = rnorm(100))", ylab = paste("N =", d$n, " Bandwidth =", formatC(d$bw))) HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Question about 'text' (add lm summary to a plot)
Ok guys, So I played around with this a bit, going back to Dan's original requirements and using Thomas' do.call() approach with legend(). Gabor's approach using sapply() will also work here. I have the following: # Note the leading spaces here for alignment in the table # This could be automated with formatC() or sprintf() my.slope.1 <- " 3.22" my.slope.2 <- "0.13" my.inter.1 <- "-10.66" my.inter.2 <- "1.96" my.Rsqua <-" 0.97" plot(1:5) L <- list("Intercept:", "Slope:", bquote(paste(R^2, ":")), bquote(.(my.inter.1) %+-% .(my.inter.2)), bquote(.(my.slope.1) %+-% .(my.slope.2)), bquote(.(my.Rsqua))) par(family = "mono") legend("topleft", legend = do.call("expression", L), ncol = 2) Note however, that while using the mono font helps with vertical alignment of numbers, the +/- sign still comes out in the default font, which is bold[er] than the text. If one uses the default font, which is variable spaced, it is problematic to get the proper alignment for the numbers. I even tried using phantom(), but that didn't quite get it, since the spacing is variable, as opposed to LaTeX's mono numeric spacing with default fonts. Also, note that I am only using two columns, rather than three, since trying to place the ":" as a middle column results in spacing that is too wide, given that the text.width argument is a scalar and is set to the maximum width of the character vectors. Note also that even with mono spaced fonts, the exponent in R^2 is still horizontally smaller than the other characters. Thus, spacing on that line may also be affected depending upon what else one might attempt. Not sure where else to go from here. HTH, Marc Schwartz On Fri, 2005-07-22 at 14:01 -0400, Gabor Grothendieck wrote: > You are right. One would have to use do.call as you did > or the sapply method of one of my previous posts: > > a <- 7 > plot(1) > L <- list(bquote(alpha==.(a)),bquote(alpha^2+1==.(a^2+1))) > legend("topleft",legend=sapply(L, as.expression)) > > > On 7/22/05, Thomas Lumley <[EMAIL PROTECTED]> wrote: > > On Fri, 22 Jul 2005, Gabor Grothendieck wrote: > > > > > > I think legend accepts a list argument directly so that could be > > > simplified to just: > > > > > > a<-7 > > > plot(1) > > > L <- list(bquote(alpha==.(a)),bquote(alpha^2+1==.(a^2+1))) > > > legend("topleft",legend=L) > > > > Except that it wouldn't then work: the mathematical stuff comes out as > > text. > > > > > The same comment seems to apply to my prior suggestion about > > > as.expression(bquote(...)), namely that one can just write the > > > following as text also supports a list argument: > > > > And this doesn't work either: you end up with %+-% rather than the > > plus-or-minus symbol. > > > > The reason I gave the do.call() version is that I had tried these simpler > > versions and they didn't work. > > > >-thomas > > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] problems with submitting an eps-file created in R
On Fri, 2005-07-22 at 16:26 +0200, Christian Bieli wrote: > Dear all > > I've got some problems submitting a manuscript, because I can't > manage > creating the favourable eps-file of a graph created in R. The > journal's > graphic requirements are as followed: > format: eps > width: max. 6 inches > resolution: min. 1000 dpi > supported fonts: Arial, Courier, Helvetica, Symbol, Times, Charcoal, > Chicago, Geneva, Georgia, Monaco, Zapf, New York > > Itried to ways of getting appropriate file: > > 1.Creating eps-file in R by drawing into a x11-device and then: > /dev.copy2eps(file = "file.eps", onefile = TRUE, paper = "a4", family > = > "Helvetica", pointsize=1, print.it = FALSE, fonts = "Helvetica") > > /2. Generating a postscript-file in R with / > //postscript(file = "file.ps", onefile = TRUE, paper = "a4", family = > "Helvetica", width = 6, height = 2.2, pointsize=1, print.it = FALSE, > fonts = "Helvetica")/ > an trying to convert it in ghostview. > > Neither approach brought the favoured result. The error message I got > from the quality checking program was : > > Warning: Document is Missing Non-Standard Font > > One or more non-standard fonts used in this image is not embedded. > Standard fonts are: Arial, Courier, Helvetica, Symbol, Times, > Charcoal, Chicago, Geneva, Georgia, Monaco, Zapf, New York. > > In order to repair this problem, save your document with fonts > embedded. > > Error: Missing Fonts > > Challenge > One or more linked or used fonts cannot be found. This is caused by > DigitalExpert not being able to locate fonts on your system. All fonts > used in the document must be active or have a defined and valid path > so that DigitalExpert can find them. > > Solution > The only way to repair this problem is to make the fonts available to > DigitalExpert, and then reprocess the files. In order to make fonts > active, either activate them using a font management program, or move > them into the system:fonts folder. > > > > 1. Obviously there's a font problem. Helvetica IS installed in my OS > (windows 2000). Why "DigitalExpert" does not recognize the > "Helvetica" > font as a standard font? I thought the fonts option would embed the > specified font in the file. Am I right or is there another way to > embed > fonts? > 2. Is there a way to set up the resolution of the created ps/eps-file > in > R (until now I did the settings in ghostview)? > 3. I did not find particulars about the pointsize option. What is the > effect of changing the pointsize value? > > With best regards > Christian You need to read the help for postscript(), which specifically tells you in the Details section to use: postscript(..., onefile = FALSE, horizontal = FALSE, paper = "special") to generate EPS files. There is no resolution setting for a postscript file per se, since PS is a device independent vector based format and the resolution of the output is dependent upon the target device/viewer. One exception to this would be the embedding of a bitmapped object in a PS file, but that is not applicable here. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Question about 'text' (add lm summary to a plot)
[Note: the initial posts have been re-arranged to attempt to maintain the flow from top to bottom] > >Dan Bolser writes: > > > > > > I would like to annotate my plot with a little box containing the slope, > > > intercept and R^2 of a lm on the data. > > > > > > I would like it to look like... > > > > > > ++ > > > | Slope : 3.45 +- 0.34 | > > > | Intercept : -10.43 +- 1.42 | > > > | R^2 : 0.78 | > > > ++ > > > > > > However I can't make anything this neat, and I can't find out how to > > > combine this with symbols for R^2 / +- (plus minus). > > > > > > Below is my best attempt (which is franky quite pour). Can anyone > > > improve on the below? > > > > > > Specifically, > > > > > > aligned text and numbers, > > > aligned decimal places, > > > symbol for R^2 in the text (expression(R^2) seems to fail with > > > 'paste') and +- > > > > > > > > > > > > Cheers, > > > Dan. > > > > > > > > > dat.lm <- lm(dat$AVG_CH_PAIRS ~ dat$CHAINS) > > > > > > abline(coef(dat.lm),lty=2,lwd=1.5) > > > > > > > > > dat.lm.sum <- summary(dat.lm) > > > dat.lm.sum > > > > > > attributes(dat.lm.sum) > > > > > > my.text.1 <- > > > paste("Slope : ", round(dat.lm.sum$coefficients[2],2), > > > "+/-", round(dat.lm.sum$coefficients[4],2)) > > > > > > my.text.2 <- > > > paste("Intercept : ", round(dat.lm.sum$coefficients[1],2), > > > "+/-", round(dat.lm.sum$coefficients[3],2)) > > > > > > my.text.3 <- > > > paste("R^2 : ", round(dat.lm.sum$r.squared,2)) > > > > > > my.text.1 > > > my.text.2 > > > my.text.3 > > > > > > > > > ## Add legend > > > text(x=3, > > > y=300, > > > paste(my.text.1, > > >my.text.2, > > >my.text.3, > > >sep="\n"), > > > adj=c(0,0), > > > cex=1 > On Thu, 21 Jul 2005, Christoph Buser wrote: > > >Dear Dan > > > >I can only help you with your third problem, expression and > >paste. You can use: > > > >plot(1:5,1:5, type = "n") > >text(2,4,expression(paste("Slope : ", 3.45%+-%0.34, sep = "")), pos = 4) > >text(2,3.8,expression(paste("Intercept : ", -10.43%+-%1.42)), pos = 4) > >text(2,3.6,expression(paste(R^2,": ", "0.78", sep = "")), pos = 4) > >I do not have an elegant solution for the alignment. On Thu, 2005-07-21 at 19:55 +0100, Dan Bolser wrote: > Cheers for this. > > I was trying to get it to work, but the problem is that I need to replace > the values above with variables, from the following code... > > > dat.lm <- lm(dat$AVG_CH_PAIRS ~ dat$CHAINS) > dat.lm.sum <- summary(dat.lm) > > my.slope.1 <- round(dat.lm.sum$coefficients[2],2) > my.slope.2 <- round(dat.lm.sum$coefficients[4],2) > > my.inter.1 <- round(dat.lm.sum$coefficients[1],2) > my.inter.2 <- round(dat.lm.sum$coefficients[3],2) > > my.Rsqua.1 <- round(dat.lm.sum$r.squared,2) > > > Anything I try results in either the words 'paste("Slope:", my.slope.1, > %+-%my.slope.2,sep="")' being written to the plot, or just > 'my.slope.1+-my.slope2' (where the +- is correctly written). > > I want to script it up and write all three lines to the plot with > 'sep="\n"', rather than deciding three different heights. > Thanks very much for what you gave, its a good start for me to figure out > how I am supposed to be telling R what to do! > > Any way to just get fixed width fonts with text? (for the alignment > problem) Dan, Here is one approach. It may not be the best, but it gets the job done. You can certainly take this and encapsulate it in a function to automate the text/box placement and to pass values as arguments. A couple of quick concepts: 1. As far as I know, plotmath cannot do multiple lines, so each line in your box needs to be done separately. 2. The horizontal alignment is a bit problematic when using expression() or bquote() since I don't believe that multiple spaces are honored as such after parsing. Thus I break up each component (label, ":" and values) into separate text() calls. The labels are left justified. 3. The alignment for the numeric values are done with right justification. So, as long as you use a consistent number of decimals in the value outputs (2 here), you should be OK. This means you might need to use formatC() or sprintf() to control the numeric output values on either side of the +/- sign. 4. In the variable replacement, note the use of substitute() and the list of x and y arguments as replacement values in the expressions. # Set your values my.slope.1 <- 3.45 my.slope.2 <- 0.34 my.inter.1 <- -10.43 my.inter.2 <- 1.42 my.Rsqua <- 0.78 # Create the initial plot as per Christoph's post plot(1:5, 1:5, type = "n") #- # Do the Slope #- text(1, 4.5, "Slope", pos = 4) text(2, 4.5, ":") text(3, 4.5, substitute(x %+-% y, list(x = my.slope.1, y = my.slope.2)), pos =
Re: [R] R:plot and dots
On Thu, 2005-07-21 at 16:18 +0200, Clark Allan wrote: > hi all > > a very simple question. > > i have plot(x,y) > > but i would like to add in on the plot the observation number associated > with each point. > > how can this be done? > > / > allan If you mean the unique observation number associated with each x,y pair, you can use: text(x, y, labels = ObsNumberVector, pos = 3) after the plot(x, y) call: df <- data.frame(x = rnorm(10), y = rnorm(10), ID = 1:10) with(df, plot(x, y)) with(df, text(x, y, labels = ID, pos = 3)) See ?text for more information. Note that I used pos = 3 which places the label above the data point. There are other positioning parameters available, which are noted in the help file. Note also that you might have to adjust the plot axis limits depending upon where you place the text and your extreme points. If you mean the frequency of each x,y pair (if there is more than one observation per x,y pair), you might want to review Deepayan's recent post here: https://stat.ethz.ch/pipermail/r-help/2005-July/074042.html HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] .gct file
On Tue, 2005-07-19 at 13:08 -0500, Marc Schwartz (via MN) wrote: > For the TAB delimited columns, adjust the 'sep' argument to: > > read.table("data.gct", skip = 2, header = TRUE, sep = "\t") > > The 'quote' argument is by default: > > quote = "\"'" > > which should take care of the quoted strings and bring them in as a > single value. > > The above presumes that the header row is also TAB delimited. If not, > you may have to set 'skip = 3' to skip over the header row and manually > set the column names. One correction. If the final para applies and you need to use 'skip = 3', you would also need to leave out the 'header = TRUE' argument, which defaults to FALSE. Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] .gct file
For the TAB delimited columns, adjust the 'sep' argument to: read.table("data.gct", skip = 2, header = TRUE, sep = "\t") The 'quote' argument is by default: quote = "\"'" which should take care of the quoted strings and bring them in as a single value. The above presumes that the header row is also TAB delimited. If not, you may have to set 'skip = 3' to skip over the header row and manually set the column names. HTH, Marc Schwartz On Tue, 2005-07-19 at 13:52 -0400, mark salsburg wrote: > This is all extremely helpful. > > The data turns out is a little atypical, the columns are tab-delemited > except for the description columns > > > DATA1.gct looks like this > > #1.2 > 23 3423 > NAME DESCRIPTION VALUE > gene1 "a protein inducer" 1123 > . . .. > > How do I get R to read the data as tab delemited, but read in the 2nd > coloumn as one value based on the quotation marks.. > > thanks.. > > On 7/19/05, Marc Schwartz (via MN) <[EMAIL PROTECTED]> wrote: > > On Tue, 2005-07-19 at 13:16 -0400, mark salsburg wrote: > > > ok so the gct file looks like this: > > > > > > #1.2 (version number) > > > 7283 19 (matrix size) > > > Name Description Values > > > ... .. > > > > > > How can I tell R to disregard the first two lines and start reading > > > the 3rd line in this gct file. I would just delete them, but I do not > > > know how to open a gct. file > > > > > > thank you > > > > > > On 7/19/05, Duncan Murdoch <[EMAIL PROTECTED]> wrote: > > > > On 7/19/2005 12:10 PM, mark salsburg wrote: > > > > > I have two files to compare, one is a regular txt file that I can read > > > > > in no prob. > > > > > > > > > > The other is a .gct file (How do I read in this one?) > > > > > > > > > > I tried a simple > > > > > > > > > > read.table("data.gct", header = T) > > > > > > > > > > How do you suggest reading in this file?? > > > > > > > > > > > > > .gct is not a standard filename extension. You need to know what is in > > > > that file. Where did you get it? What program created it? > > > > > > > > Chances are the easiest thing to do is to get the program that created > > > > it to export in a well known format, e.g. .csv. > > > > > > > > Duncan Murdoch > > > > > > The above would be consistent with the info in my reply. > > > > I guess if the format is consistent, as per Mark's example above, you > > can use: > > > > read.table("data.gct", skip = 2, header = TRUE) > > > > which will start by skipping the first two lines and then reading in the > > header row and then the data. > > > > See ?read.table > > > > HTH, > > > > Marc Schwartz > > > > > > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] .gct file
On Tue, 2005-07-19 at 13:16 -0400, mark salsburg wrote: > ok so the gct file looks like this: > > #1.2 (version number) > 7283 19 (matrix size) > Name Description Values > ... .. > > How can I tell R to disregard the first two lines and start reading > the 3rd line in this gct file. I would just delete them, but I do not > know how to open a gct. file > > thank you > > On 7/19/05, Duncan Murdoch <[EMAIL PROTECTED]> wrote: > > On 7/19/2005 12:10 PM, mark salsburg wrote: > > > I have two files to compare, one is a regular txt file that I can read > > > in no prob. > > > > > > The other is a .gct file (How do I read in this one?) > > > > > > I tried a simple > > > > > > read.table("data.gct", header = T) > > > > > > How do you suggest reading in this file?? > > > > > > > .gct is not a standard filename extension. You need to know what is in > > that file. Where did you get it? What program created it? > > > > Chances are the easiest thing to do is to get the program that created > > it to export in a well known format, e.g. .csv. > > > > Duncan Murdoch The above would be consistent with the info in my reply. I guess if the format is consistent, as per Mark's example above, you can use: read.table("data.gct", skip = 2, header = TRUE) which will start by skipping the first two lines and then reading in the header row and then the data. See ?read.table HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] .gct file
On Tue, 2005-07-19 at 12:28 -0400, Duncan Murdoch wrote: > On 7/19/2005 12:10 PM, mark salsburg wrote: > > I have two files to compare, one is a regular txt file that I can read > > in no prob. > > > > The other is a .gct file (How do I read in this one?) > > > > I tried a simple > > > > read.table("data.gct", header = T) > > > > How do you suggest reading in this file?? > > > > .gct is not a standard filename extension. You need to know what is in > that file. Where did you get it? What program created it? > > Chances are the easiest thing to do is to get the program that created > it to export in a well known format, e.g. .csv. > > Duncan Murdoch A quick Google search would suggest "Gene Cluster Text" file: http://www.broad.mit.edu/cancer/software/genepattern/tutorial/gp_tutorial_fileformats.html#gct produced by Gene Pattern: http://www.broad.mit.edu/cancer/software/genepattern/ If correct, I would point Mark to the Bioconductor folks for more information and assistance: http://www.bioconductor.org/ HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] using argument names (of indeterminate number) within a function
On Tue, 2005-07-19 at 17:29 +0200, Dirk Enzmann wrote: > Although I tried to find an answer in the manuals and archives, I cannot > solve this (please excuse that my English and/or R programming skills > are not good enough to state my problem more clearly): > > I want to write a function with an indeterminate (not pre-defined) > number of arguments and think that I should use the "..." construct and > the match.call() function. The goal is to write a function that (among > other things) uses cbind() to combine a not pre-defined number of > vectors specified in the function call. For example, if my vectors are > x1, x2, x3, ... xn, within the function I want to use cbind(x1, x2) or > cbind(x1, x3, x5) or ... depending on the vector names I use in the > funcion call. Additionally, the function has other arguments. > > In the archives I found the following thread (followed by Marc Schwartz) > > http://finzi.psych.upenn.edu/R/Rhelp02a/archive/15186.html > [R] returning argument names from Peter Dalgaard BSA on 2003-04-10 (stdin) > > that seems to contain the solution to my problem, but I am stuck because > sapply(match.call()[-1], deparse) gives me a vector of strings and I > don't know how to use the names in this vector in the cbind() function. > > Up to now my (clearly deficit) function looks like: > > test <- function(..., mvalid=1) > { >args = sapply(match.call()[-1], deparse) > # and here, I don't know how the vector names in args > # can be used in the cbind() function to follow: > # > # temp <- cbind( ??? >if (mvalid > 1) >{ > # here it goes on >} > } > > Ultimately, I want that the function can be called like > test(x1,x2,mvalid=1) > or > test(x1,x3,x5,mavlid=2) > and that within the function > cbind(x1,x2) > or cbind(x1,x3,x5) > will be used. > > Can someone give and explain an example / a solution on how to proceed? Hi Dirk, How about this: my.cbind <- function(...) { do.call("cbind", list(...)) } > a <- 1:10 > b <- 11:20 > c <- 21:30 > d <- 31:40 > my.cbind(a, b) [,1] [,2] [1,]1 11 [2,]2 12 [3,]3 13 [4,]4 14 [5,]5 15 [6,]6 16 [7,]7 17 [8,]8 18 [9,]9 19 [10,] 10 20 > my.cbind(b, c, d) [,1] [,2] [,3] [1,] 11 21 31 [2,] 12 22 32 [3,] 13 23 33 [4,] 14 24 34 [5,] 15 25 35 [6,] 16 26 36 [7,] 17 27 37 [8,] 18 28 38 [9,] 19 29 39 [10,] 20 30 40 > my.cbind(a, b, c, d) [,1] [,2] [,3] [,4] [1,]1 11 21 31 [2,]2 12 22 32 [3,]3 13 23 33 [4,]4 14 24 34 [5,]5 15 25 35 [6,]6 16 26 36 [7,]7 17 27 37 [8,]8 18 28 38 [9,]9 19 29 39 [10,] 10 20 30 40 The use of list(...) in the function allows you to use list based functions such as do.call() or lapply() against the argument objects directly without having to deparse and re-parse the character names of the arguments, which is the approach that Peter used in his response in the thread you referenced. The OP in that thread wanted the argument names as character vectors, as opposed to the argument objects themselves, which is what you need here. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Can't get sample function from "An Introduction to R" to work
On Fri, 2005-07-15 at 15:40 -0500, David Groos wrote: > I'm trying to figure out R, a piece at a time, hours at a time... I > was trying to copy the sample function in, "An Introduction to R" (for > version 2.1.0) by W. N. Venables, D. M. Smith, page 42. Section 10.1 > "Simple examples" provides a sample function which I tried to duplicate > (I'm using Mac OS X 10.3.9, and "R for Mac OS X Aqua GUI v1.11). The > following is what I typed and the last line is R's response when I hit > the return key after the penultimate line. I've re-checked and > re-typed the code many times to no avail. I wasn't able to find this > issue using search options, either. Any help is GREATLY appreciated! > > > twosam<-function(y1, y2) { > + n1<-length(y1);n2 <-length(y2) > + yb1<-mean(y1); yb2<-mean(y2) > + s1<-var(y1);s2<-var(y2) > + s<-((n1-1)*s1 + (n2-1)*s2)/(n1+n2-2) > + tst<-(yb1-yb2)/sqrt(s*(1/n1+1/n2)) > Error: syntax error > > David The code as you have above (without the "+" on each line) works for me both in ESS and in the R console under Linux. There should be another "+" on the next line, in anticipation of the remaining two lines: tst } Try to copy and paste the following into the console as is: twosam<-function(y1, y2) { n1 <- length(y1);n2 <- length(y2) yb1 <- mean(y1); yb2 <- mean(y2) s1 <- var(y1);s2 <- var(y2) s <- ((n1-1)*s1 + (n2-1)*s2)/(n1+n2-2) tst <- (yb1-yb2)/sqrt(s*(1/n1+1/n2)) and see what happens. You should be left at a: + on a new line again. It is possible that there is a bug in the Aqua GUI, but not using a Mac, I cannot replicate it. You might want to consider subscribing and posting to the R-SIG-Mac e-mail list, which is focused on Mac users of R. More information is here: https://stat.ethz.ch/mailman/listinfo/r-sig-mac HTH, Marc Schwartz P.S. Greetings from Eden Prairie __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] plot the number of replicates at the same point
On Thu, 2005-07-14 at 15:08 -0500, Deepayan Sarkar wrote: > On 7/14/05, Marc Schwartz (via MN) <[EMAIL PROTECTED]> wrote: > > On Thu, 2005-07-14 at 12:30 -0500, Deepayan Sarkar wrote: > > > On 7/14/05, Kerry Bush <[EMAIL PROTECTED]> wrote: > > > > Thank you for thinking about the problem for me. > > > > However, I have found that your method doesn't work at > > > > all. > > > > > > > > You may test the following example: > > > > > > > > x1=c(0.6,0.4,.4,.4,.2,.2,.2,0,0) > > > > x2=c(0.4,.2,.4,.6,0,.2,.4,0,.2) > > > > x1=rep(x1,4) > > > > x2=rep(x2,4) > > > > temp=data.frame(x1,x2) > > > > temp1=table(temp) > > > > plot(temp$x1,temp$x2,cex=0) > > > > text(as.numeric(rownames(temp1)), > > > > as.numeric(colnames(temp1)), temp1) > > > > > > > > what I got here is not what I wanted. You may compare > > > > with > > > > plot(x1,x2) > > > > > > > > I actually want some plots similar to what SAS proc > > > > plot produced. > > > > > > > > Does anybody have a clue of how to do this easily in > > > > R? > > > > > > Try this: > > > > > > > > > x1=c(0.6,0.4,.4,.4,.2,.2,.2,0,0) > > > x2=c(0.4,.2,.4,.6,0,.2,.4,0,.2) > > > x1=rep(x1,4) > > > x2=rep(x2,4) > > > temp=data.frame(x1,x2) > > > > > > foo <- subset(as.data.frame(table(temp)), Freq > 0) > > > foo$x1 <- as.numeric(as.character(foo$x1)) > > > foo$x2 <- as.numeric(as.character(foo$x2)) > > > > > > with(foo, plot(x1, x2, type = "n")) > > > with(foo, text(x1, x2, lab = Freq)) > > > > > > > > > -Deepayan > > > > > > Great solution Deepayan. I was just in the process of working on > > something similar. > > > > One quick modification is that the two plot()/text() functions can be > > combined into: > > > > with(foo, plot(x1, x2, pch = as.character(Freq))) > > Only as long as all(Freq < 10) (I've been bitten by this before. :-) ). > > Deepayan D'oh! Indeed you are correct, since 'pch' is a single character: plot(1:20, pch = as.character(1:20)) as opposed to: plot(1:20, type = "n") text(1:20, lab = 1:20) Thanks for the correction. Marc __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] plot the number of replicates at the same point
On Thu, 2005-07-14 at 12:30 -0500, Deepayan Sarkar wrote: > On 7/14/05, Kerry Bush <[EMAIL PROTECTED]> wrote: > > Thank you for thinking about the problem for me. > > However, I have found that your method doesn't work at > > all. > > > > You may test the following example: > > > > x1=c(0.6,0.4,.4,.4,.2,.2,.2,0,0) > > x2=c(0.4,.2,.4,.6,0,.2,.4,0,.2) > > x1=rep(x1,4) > > x2=rep(x2,4) > > temp=data.frame(x1,x2) > > temp1=table(temp) > > plot(temp$x1,temp$x2,cex=0) > > text(as.numeric(rownames(temp1)), > > as.numeric(colnames(temp1)), temp1) > > > > what I got here is not what I wanted. You may compare > > with > > plot(x1,x2) > > > > I actually want some plots similar to what SAS proc > > plot produced. > > > > Does anybody have a clue of how to do this easily in > > R? > > Try this: > > > x1=c(0.6,0.4,.4,.4,.2,.2,.2,0,0) > x2=c(0.4,.2,.4,.6,0,.2,.4,0,.2) > x1=rep(x1,4) > x2=rep(x2,4) > temp=data.frame(x1,x2) > > foo <- subset(as.data.frame(table(temp)), Freq > 0) > foo$x1 <- as.numeric(as.character(foo$x1)) > foo$x2 <- as.numeric(as.character(foo$x2)) > > with(foo, plot(x1, x2, type = "n")) > with(foo, text(x1, x2, lab = Freq)) > > > -Deepayan Great solution Deepayan. I was just in the process of working on something similar. One quick modification is that the two plot()/text() functions can be combined into: with(foo, plot(x1, x2, pch = as.character(Freq))) Best regards, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] simple question: reading lower triangular matrix
On Tue, 2005-07-12 at 14:37 -0300, Rogério Rosa da Silva wrote: > Dear list, > > I will like to learn how to read a lower triangular matrix in R. The > input file *.txt have the following format: > > A B C D E > A 0 > B 10 > C 250 > D 3680 > E 479 100 > > > How this can be done? > > Thanks in advance for your help > > Rogério I don't know that this is the easiest way of doing it, but here is one approach: # I saved your data above in a file called "test.txt" # Read the first line of test.txt to get the colnames as chars col.names <- unlist(read.table("test.txt", nrow = 1, as.is = TRUE)) # now read the rest of the file using 'fill = TRUE' to pad lines # with NAs # skip the first line # set the row.names as the first column in the text file # coerce to a matrix df <- as.matrix(read.table("test.txt", fill = TRUE, skip = 1, row.names = 1)) # Now set the colnames of df colnames(df) <- col.names > df A B C D E A 0 NA NA NA NA B 1 0 NA NA NA C 2 5 0 NA NA D 3 6 8 0 NA E 4 7 9 10 0 If you should further want to set the diagonal to NA: > diag(df) <- NA > df A B C D E A NA NA NA NA NA B 1 NA NA NA NA C 2 5 NA NA NA D 3 6 8 NA NA E 4 7 9 10 NA See ?read.table for more information on the file reading part. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] question for IF ELSE usage
On Tue, 2005-07-12 at 16:22 +0200, ecoinfo wrote: > Hi R users, > Maybe the question is too simple. > In a IF ... ELSE ... statement "if(cond) cons.expr else alt.expr", IF and > ELSE should be at the same line? > For example, > if (x1==12) > { > y1 <- 5 > }else > { > y1 <- 3 > } > is right, while > if (x1==12) > { > y1 <- 5 > } > else # Error: syntax error > { > y1 <- 3 > } > is wrong? > Thanks Note the following from the Details section of ?"if" "Note that it is a common mistake to forget to put braces ({ .. }) around your statements, e.g., after if(..) or for(). In particular, you should not have a newline between } and else to avoid a syntax error in entering a if ... else construct at the keyboard or via source. For that reason, one (somewhat extreme) attitude of defensive programming is to always use braces, e.g., for if clauses." One other approach is the following: if (x1 == 12) { y1 <- 5 } else { y1 <- 3 } Note the presence of both braces on the 'else' line. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] xmat[1, 2:3] <- NULL
On Thu, 2005-07-07 at 10:20 -0700, Mikkel Grum wrote: > I have a situation where I'm filling out a dataframe > from a database. Sometimes the database query doesn't > get anything, so I end up trying to place NULL in the > dataframe like below. > > > temp <- NULL > > xmat <- as.data.frame(matrix(NA, 2, 3)) > > xmat[1, 2:3] <- temp > Error in if (m < n * p && (n * p)%%m) > stop(gettextf("replacement has %d items, need %d", : > missing value where TRUE/FALSE needed > > I can't get the programme to accept that sometimes > what the query looks for just doesn't exist, and I > just want to move on to the next calculation leaving > the dataframe with a missing value in the given cell. > It's a real show stopper and I haven't found a way > round it. > > Best wishes, > Mikkel > > PS. I'm using dbGetQuery to query an SQLite database. NULL represents a zero length object in R. Thus, trying to set only the first row in a data frame to NULL makes no sense, since you cannot have a 0 length object that also has a single row (as you seem to be trying to do above). Since a data frame is a series of lists, you could do the following: > temp <- NULL > xmat <- as.data.frame(matrix(NA, 2, 3)) > xmat V1 V2 V3 1 NA NA NA 2 NA NA NA > xmat[, 1] <- temp > xmat V2 V3 1 NA NA 2 NA NA which removes the first column in the data frame. This is the same as: > xmat[, -1] V2 V3 1 NA NA 2 NA NA You could also set the entire xmat to NULL as follows: > xmat V1 V2 V3 1 NA NA NA 2 NA NA NA > xmat <- NULL > xmat NULL You can then test to see if 'xmat' is a NULL: > is.null(xmat) [1] TRUE and base a boolean expression and resultant action on that result: if (!is.null(xmat)) { do_calculations... } If your calculations are on a row by row basis, where NA's represent missing data, you can also use one of several functions to eliminate those rows. See ?na.action, ?na.omit and ?complete.cases for more information and examples. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Plotting Character Variable
On Thu, 2005-07-07 at 11:47 -0400, [EMAIL PROTECTED] wrote: > Any ideas about the following problem: > > I have a matrix (A) that looks like this: > > gene_names values > hsa-mir-124 0.3 > hsa-mir-234 0.1 > hsa-mir-344 0.4 > hsa-mir-333 0.7 > . ... > > (This is a 2 by 22283 matrix: quite large) To split hairs, it would be a 22283 by 2 object in R's [row, column] approach to indexing. I am also presuming that "A" is a data frame, since you seem to have two different data types above, with gene_names being a factor? > I would like to plot the values, but output the gene_names as the plotting > symbol. I have tried regular x,y plots, but since the gene_names are quite > large and there are 22283 of them, it's impossible to fit them on the x-axis. > > Basically, can I plot the above matrix > > plot(gene_names, value) where the gene_names are used as the plotting symbol. > > thank you, Well... I would defer to those with more experience in plotting genetic data, but from a practical standpoint, it seems to me to be highly problematic to plot >20,000 data points with labels and have them be human readable without an STMunless you have _very_ wide paper on a large format plotter ;-) That being said, one approach is to rotate the x axis labels vertically, to make more room, while using points for the plotting symbols: # Adjust bottom margin to make room for vertical labels par(mar = c(7, 4, 4, 2)) plot(1:nrow(A), A$values, xaxt = "n", ann = FALSE, las = 2) # use 'las = 3' to rotate the labels axis(1, at = 1:nrow(A), labels = as.character(A$gene_names), las = 3) You might want to review some of the tools available at the Bioconductor site to see if there are specialized plotting functions for this type of data: http://www.bioconductor.org/ HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] the format of the result
On Fri, 2005-07-01 at 19:40 +0800, ronggui wrote: > I write a function to get the frequency and prop of a variable. > > freq<-function(x,digits=3) > {naa<-is.na(x) > nas<-sum(naa) > if (any(naa)) > x<-x[!naa] > n<-length(x) > ta<-table(x) > prop<-prop.table(ta)*100 > res<-rbind(ta,prop) > rownames(res)<-c("Freq","Prop") > cat("Missing value(s) are",nas,".\n") > cat("Valid case(s) are",n,".\n") > cat("Total case(s) are",(n+nas),".\n\n") > print(res,digits=(digits+2)) > cat("\n") > } > > > freq(sample(letters[1:3],48,T),2) > Missing value(s) are 0 . > Valid case(s) are 48 . > Total case(s) are 48 . > > a b c > Freq 11.00 20.00 17.00 > Prop 22.92 41.67 35.42 > > and i want the result to be like > a b c > Freq 11.00 20.00 17.00 > Prop 22.92% 41.67% 35.42% > > how should i change my function to get what i want? Here is a modification of the function that I think should work. Note that part of the output formatting process has to take into account the a priori unknowns involving your 'digits' argument, the lengths of the dimnames resulting from the table and the lengths of the frequency counts in the table. Thus, a fair amount of the code is establishing the 'width' argument, which is then used in formatC() so that the output can be column aligned properly. Note that by default, table() will exclude "NA", so you do not need to subset 'x' before using table(). Also, note that I change "Prop" to "Pct". freq <- function(x, digits = 3) { n <- length(x) missing <- sum(is.na(x)) ta <- table(x) pct <- prop.table(ta) * 100 width <- max(nchar(unlist(dimnames(ta))) + 1, nchar(ta) + digits + 1, 5 + digits) Vals <- paste(formatC(unlist(dimnames(ta)), format = "s", width = width), collapse = " ") Freq <- paste(formatC(ta, format = "f", digits = digits, width = width), collapse = " ") Pct <- paste(formatC(pct, format = "f", digits = digits, width = width), "%", sep = "", collapse = " ") cat("Missing value(s) are", missing, ".\n") cat("Valid case(s) are", n - missing,".\n") cat("Total case(s) are", n, ".\n\n") cat("", Vals, "\n") cat("Freq", Freq, "\n") cat("Pct ", Pct, "\n") cat("\n") } Thus: > freq(sample(letters[1:3], 48, TRUE), 2) Missing value(s) are 0 . Valid case(s) are 48 . Total case(s) are 48 . abc Freq 28.00 8.0012.00 Pct58.33% 16.67% 25.00% > freq(sample(c(letters[1:3], NA), 1000, TRUE), 2) Missing value(s) are 257 . Valid case(s) are 743 . Total case(s) are 1000 . abc Freq 250.00 218.00 275.00 Pct33.65% 29.34% 37.01% > freq(iris$Species) Missing value(s) are 0 . Valid case(s) are 150 . Total case(s) are 150 . setosa versicolorvirginica Freq 50.000 50.000 50.000 Pct 33.333% 33.333% 33.333% > freq(iris$Species, 0) Missing value(s) are 0 . Valid case(s) are 150 . Total case(s) are 150 . setosa versicolorvirginica Freq 50 50 50 Pct 33% 33% 33% HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How to rotate the axisnames in a BARPLOT
On Thu, 2005-06-30 at 23:05 +0200, [EMAIL PROTECTED] wrote: > Hi all, > > - how can I do a barplot with rotated axis labels? I've seen the example for > just a plot in the FAQ, but I'll missing the coordinates to plot my text at > the right position beneath the bars. > Is there any (easy?) solution? > > - how can I set the y-axis in a barplot to logarithmic scale? > > Many thanks in advance! > > Best Regards > Tom The same concept for rotating axis labels in a regular plot per the FAQ can be used in a barplot, as long as you know that barplot() returns the bar midpoints. So: ## Increase bottom margin to make room for rotated labels par(mar = c(7, 4, 4, 2) + 0.1) ## Create plot and get bar midpoints in 'mp' mp <- barplot(1:10) ## Set up x axis with tick marks alone axis(1, at = mp, labels = FALSE) ## Create some text labels labels <- paste("Label", 1:10, sep = " ") ## Plot x axis labels at mp text(mp, par("usr")[3] - 0.5, srt = 45, adj = 1, labels = labels, xpd = TRUE) ## Plot x axis label at line 4 mtext(1, text = "X Axis Label", line = 4) See ?barplot for more information. With respect to the log scales, you would need to use barplot2(), which is in the 'gplots' package on CRAN. barplot2() supports log scales using the same 'log' argument as plot(). HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] conversion
On Tue, 2005-06-28 at 17:23 -0400, Anna Oganyan wrote: > Dear List, > How can I convert a list with elements being character strings, like: > "c(1,2,3,4)", “c(1,3,4,2) … to a list with elements as numerical > vectors: c(1,2,3,4), c(1,3,4,2)…? > Thanks! > Anna > l <- list("c(1,2,3,4)", "c(1,3,4,2)") > l [[1]] [1] "c(1,2,3,4)" [[2]] [1] "c(1,3,4,2)" Now use lapply() over each list element in 'l', converting the character vectors to R expressions and then evaluating them: > lapply(l, function(x) eval(parse(text = x))) [[1]] [1] 1 2 3 4 [[2]] [1] 1 3 4 2 See ?lapply, ?eval and ?parse. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] faster algorithm for Kendall's tau
On Tue, 2005-06-28 at 13:03 -0400, ferdinand principia wrote: > Hi, > > I need to calculate Kendall's tau for large data > vectors (length > 100'000). > Is somebody aware of a faster algorithm or package > function than "cor(, method="kendall")"? > There are ties in the data to be considered (Kendall's > tau-b). > > Any suggestions? > > Regards > Ferdinand The time intensive part of the process is typically the ranking/ordering of the vector pairs to calculate the numbers of concordant and discordant pairs. If the number of _unique pairs_ in your data is substantially less than the number of total pairs (in other words, creating a smaller 2d contingency table from a pair of your vectors makes sense), then the following may be of help. # Calculate CONcordant Pairs in a table # cycle through x[r, c] and multiply by # sum(x elements below and to the right of x[r, c]) # x = table concordant <- function(x) { x <- matrix(as.numeric(x), dim(x)) # get sum(matrix values > r AND > c) # for each matrix[r, c] mat.lr <- function(r, c) { lr <- x[(r.x > r) & (c.x > c)] sum(lr) } # get row and column index for each # matrix element r.x <- row(x) c.x <- col(x) # return the sum of each matrix[r, c] * sums # using mapply to sequence thru each matrix[r, c] sum(x * mapply(mat.lr, r = r.x, c = c.x)) } # Calculate DIScordant Pairs in a table # cycle through x[r, c] and multiply by # sum(x elements below and to the left of x[r, c]) # x = table discordant <- function(x) { x <- matrix(as.numeric(x), dim(x)) # get sum(matrix values > r AND < c) # for each matrix[r, c] mat.ll <- function(r, c) { ll <- x[(r.x > r) & (c.x < c)] sum(ll) } # get row and column index for each # matrix element r.x <- row(x) c.x <- col(x) # return the sum of each matrix[r, c] * sums # using mapply to sequence thru each matrix[r, c] sum(x * mapply(mat.ll, r = r.x, c = c.x)) } # Calculate Kendall's Tau-b # x = table calc.KTb <- function(x) { x <- matrix(as.numeric(x), dim(x)) c <- concordant(x) d <- discordant(x) n <- sum(x) SumR <- rowSums(x) SumC <- colSums(x) KTb <- (2 * (c - d)) / sqrt(((n ^ 2) - (sum(SumR ^ 2))) * ((n ^ 2) - (sum(SumC ^ 2 KTb } Note that I made some modifications of the above, relative to prior versions that I have posted to handle large numbers of pairs to avoid integer overflows in summations. Hence the: x <- matrix(as.numeric(x), dim(x)) conversion in each function. Now, create some random test data, with 100,000 elements in each vector, sampling from 'letters', which would yield a 26 x 26 table: a <- sample(letters, 10, replace = TRUE) b <- sample(letters, 10, replace = TRUE) > dim(table(a, b)) [1] 26 26 > system.time(print(calc.KTb(table(a, b [1] 0.0006906088 [1] 0.77 0.02 0.83 0.00 0.00 Note that in the above, the initial table takes most of the time: > system.time(table(a, b)) [1] 0.55 0.00 0.56 0.00 0.00 Hence: > tab.ab <- table(a, b) > system.time(print(calc.KTb(tab.ab))) [1] 0.0006906088 [1] 0.25 0.01 0.27 0.00 0.00 I should note that I also ran: > system.time(print(cor(a, b, method = "kendall"))) [1] 0.0006906088 [1] 694.80 7.72 931.89 0.00 0.00 Nice to know the results work out at least... :-) I have not tested with substantially larger 2d matrices, but would envision that as the dimensions of the resultant tabulation increases, my method probably approaches and may even become less efficient than the approach implemented in cor(). Some testing would validate this and perhaps point to coding the concordant() and discordant() functions in C for improvement in timing. HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html