Re: [R] two questions for R beginners
On 03/26/2010 02:58 PM, Steve Powell wrote:
> For psychologists like me (possibly for others) by far the most time-consuming detail is variable labels. I need them for just about every analysis I do. We can use special packages like Hmisc and its function spss.get to import the labels, but then nearly all the other packages don't respect the labels, even simple things like list. So I find myself either adding them back in at every step or making my own versions of the functions. People coming from SPSS just expect the output of basic functions like factanal to display the labels, or at least to have the option of doing so. Respecting/preserving variable labels in more core functions would be an enormous help for social scientists IMHO.

Hi Steve,
From another psychologist, this is one reason that I have been rewriting a number of functions to read and display the "variable.labels" attribute produced by the read.spss function in the foreign package.

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
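As a rough illustration of the attribute Jim mentions: read.spss() can attach a "variable.labels" attribute (a named character vector, one label per column) to the data frame it returns. The sketch below builds such a data frame by hand instead of reading an SPSS file, and label_of() is a hypothetical helper written for this example; it is not part of foreign, Hmisc, or base R.

```r
# Hand-made stand-in for a data frame returned by read.spss()
# (assumption: labels are stored as a named character vector).
dat <- data.frame(q1 = c(3, 4, 5), q2 = c(1, 2, 2))
attr(dat, "variable.labels") <- c(q1 = "Life satisfaction",
                                  q2 = "Trust in others")

# Hypothetical helper: look up the label for each requested column and
# fall back to the column name when no label is stored.
label_of <- function(data, vars = names(data)) {
  labs <- attr(data, "variable.labels")
  if (is.null(labs)) return(vars)
  out <- unname(labs[vars])
  missing_label <- is.na(out)
  out[missing_label] <- vars[missing_label]
  out
}

label_of(dat)

# Carry the labels into output, here as names on the column means.
means <- colMeans(dat)
names(means) <- label_of(dat, names(means))
means
```

The lookup is done at display time, so nothing in the analysis itself has to know about the labels.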
Re: [R] two questions for R beginners
For psychologists like me (possibly for others) by far the most time-consuming detail is variable labels. I need them for just about every analysis I do. We can use special packages like Hmisc and its function spss.get to import the labels, but then nearly all the other packages don't respect the labels, even simple things like list. So I find myself either adding them back in at every step or making my own versions of the functions. People coming from SPSS just expect the output of basic functions like factanal to display the labels, or at least to have the option of doing so. Respecting/preserving variable labels in more core functions would be an enormous help for social scientists IMHO.

What helped? Lots of things - r-seek and quick-R are my favourites, along with amazing people who reply to problems on r-help.
Re: [R] two questions for R beginners
Patrick,

1. Implicit intercepts. Implicit intercepts are not too bad for the main model, but they creep in occasionally in strange places where they might not be expected. For example, in some of the variance structures specified in lme, (~x) automatically expands to (~1+x). Venables said in the "Exegeses" paper: "For teaching purposes it would be useful to have a switch that required users to include the intercept term in formulae if it is needed. This would definitely help more students than it would hinder. In other words it should be possible to override the automatic intercept term."

2. Working with colors. There are a number of functions in R for working with colors, and since colors can be specified by palette number, name, hexadecimal string, values between 0 and 1, or values between 0 and 255, things can be confusing. One problem is that not all functions accept the same type of arguments or produce the same type of return values. For example, the awkward need of "t" and conversion to [0,255] in adding alpha levels to a color:

R> rgb(t(col2rgb(c("navy","maroon"))),alpha=120,max=255)

3. Factors. R tries to convert everything that it possibly can into a factor. Except, occasionally, it doesn't try. Further, after sub-setting data so that some factor levels have no data, too many functions fail. I shouldn't need to use "drop.levels" from the gdata package all over the place to keep automated scripts running smoothly. Let's not forget:

R> as.numeric(factor(c(NA,0,1)))
[1] NA  1  2

4.
R> is.list(list(1)[1])
[1] TRUE
R> is.matrix(matrix(1)[1,])
[1] FALSE

Ouch. Ouch. Ouch.

5. Most useful: "apropos" and Rseek.

Best,
Kevin

On Thu, Feb 25, 2010 at 11:31 AM, Patrick Burns wrote:
> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?
>
> * What documents helped you the most in this
> initial phase?
>
> I especially want to hear from people who are
> lazy and impatient.
>
> Feel free to write to me off-list. Definitely
> write off-list if you are just confirming what
> has been said on-list.
>
> --
> Patrick Burns
> pbu...@pburns.seanet.com
> http://www.burns-stat.com
> (home of 'The R Inferno' and 'A Guide for the Unwilling S User')

--
Kevin Wright
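Items 3 and 4 in Kevin's list can be reproduced, and worked around, in a few lines of base R. This is only a sketch with made-up values; note that droplevels() was added to base R in version 2.12.0, after this thread, so the factor() re-call is shown as the version-independent alternative.

```r
# Item 3: as.numeric() on a factor returns the internal level codes,
# not the values shown when the factor is printed.
f <- factor(c(NA, 0, 1))
codes  <- as.numeric(f)                 # NA 1 2
values <- as.numeric(as.character(f))   # NA 0 1

# Subsetting a factor keeps levels that no longer occur in the data.
g <- factor(c("a", "b", "c"))
sub <- g[g != "c"]
kept    <- levels(sub)                  # "a" "b" "c"
dropped <- levels(droplevels(sub))      # "a" "b" (base R >= 2.12.0)
refit   <- levels(factor(sub))          # "a" "b" (any version)

# Item 4: `[` drops dimensions by default; drop = FALSE keeps the matrix.
m <- matrix(1)
row_default <- is.matrix(m[1, ])                # FALSE
row_kept    <- is.matrix(m[1, , drop = FALSE])  # TRUE
```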
Re: [R] two questions for R beginners
As a biologist recycled to biostats, I always worked with Excel and then SPSS, and moving to R was difficult (and still is, since I am still learning). Being self-taught, I learn R by looking for examples in Google, which often takes me to Rwiki or elsewhere. I sometimes post questions and most of the answers have been helpful, but sometimes the answers were too short or didn't give enough hints as to how to proceed, and that has stopped me from asking again so as not to annoy the experts. I have not answered many questions from newbies, but when I have, I have tried to explain as much as I could. Sometimes I find it better not to answer than to give a short, vague answer. Please, examples, examples, examples!

I found the different data types most difficult, since I understand Excel as a data frame with columns and rows, and that's it. Then, as someone has already commented, the class, mode and str functions helped a lot. But I think that, for me, examples are the way to let people learn.

From that, I moved on to using loops, and am still nervous when people suggest using *apply functions; I can't get down to using them! I find loops more logical, and can't see how to move them to *apply.

Finally, I am not a Linux expert, and I cannot get round to installing and organising a proper R directory and keeping it updated. I once tried to use a package that needed the development version of R and was only prepared for Linux R, but couldn't keep the R-devel versions updated. Some more step-by-step instructions would help sometimes.

Thanks for a great tool!

> Date: Tue, 2 Mar 2010 12:44:23 -0600
> From: keo.orms...@gmail.com
> To: landronim...@gmail.com
> CC: r-help@r-project.org; pbu...@pburns.seanet.com
> Subject: Re: [R] two questions for R beginners
>
> Liviu Andronic wrote:
> > On Mon, Mar 1, 2010 at 11:49 PM, Liviu Andronic wrote:
> >
> >> On 3/1/10, Keo Ormsby wrote:
> >>
> >>> Perhaps my biggest problem was that I couldn't (and still haven't) seen
> >>> *absolute beginners* documents.
> >>
> >> there was once a link posted on r-sig-teaching that would probably fit
> >> your needs, but I cannot find it now.
> >
> > OK, I found it. Below is an excerpt of that r-sig-teaching e-mail.
> > Liviu
> >
> > On Thu, Jul 2, 2009 at 2:19 PM, Robert W. Hayden wrote:
> >
> >> I think such a website would be a real asset. It would be most useful
> >> if it either were restricted to intro. stats. OR organized so that
> >> materials for real beginners were easy to extract from all the
> >> materials for programmers and Ph.D. statisticians. As a relative
> >> beginner myself, I find the usual resources useless. In self defense,
> >> I created materials for my own beginning students:
> >>
> >> http://courses.statistics.com/software/R/Rhome.htm
> >>
> Hi Liviu,
> This is indeed the best site for introduction I have seen, although it
> still assumes some things that at first might seem unintuitive to the
> absolute beginner I talk about. For instance, on the first page, it
> shows that you can do sqrt(x), where x can be a vector, and get back a
> vector of the square roots of each number. Although this is high school
> matrix algebra, most users expect the input to a square root function
> to be a single number, not a matrix, as in Excel or a calculator. Other
> concepts that are not explicitly introduced are the "R workspace", the use
> of arguments in functions (with or without the "="), etc. Others are
> things like diff(range(rainfall)), where you have the output of one
> function used as the input to another, all in the same command line. All
> these things seem very basic, but can be difficult if you are trying to
> learn on your own with no prior experience in programming.
> I hope I am not sounding too difficult and contrarian; I am just trying
> to share my experience of starting with R, and of trying to convey
> this learning to my colleagues and students. In the end, I did find
> everything I needed to learn, and now I feel at ease with R, and I
> believe that almost anybody who can use Excel or something like it
> could learn R.
>
> Thank you for the information,
> Best wishes,
> Keo.
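Since the plea above is for examples, here is a small sketch of the three ideas mentioned in this message: a vectorised function, one function's output used as another's input, and a loop rewritten with an *apply function. The rainfall numbers and the data frame are made up for illustration.

```r
rainfall <- c(12.5, 0, 3.2, 20.1)   # made-up data

# sqrt() works element by element on a whole vector.
roots <- sqrt(rainfall)

# Nested calls: the two-step and one-step versions give the same result.
r <- range(rainfall)                      # smallest and largest value
spread_two_steps <- diff(r)
spread_one_step  <- diff(range(rainfall))

# A loop over the columns of a data frame ...
cols <- data.frame(a = 1:3, b = 4:6)
means_loop <- numeric(ncol(cols))
for (i in seq_along(cols)) {
  means_loop[i] <- mean(cols[[i]])
}
names(means_loop) <- names(cols)

# ... and the same computation with sapply(): "apply mean to each column".
means_apply <- sapply(cols, mean)
```

Reading sapply(cols, mean) as "for each column of cols, call mean and collect the results" may make the step from loops less of a jump.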
Re: [R] two questions for R beginners
John,

I felt a short, somewhat strong reply was in order.

One of the inherent aspects of the language is that R demands more of an understanding from users about what is taking place. Model formulae, for example, are close to what one would use if they were to write the model on paper. I consider this a strong feature.

The confusing aspects that you point out are not the result of syntax. Syntax in R is well specified and, I believe, far easier to work with than many programming languages. English is a confusing language. C++ is a confusing language. One may have far more success learning, say, French if he/she does not like the syntax or grammar of English, or visual Pascal if the syntax of C++ is not preferred, rather than changing the language. If one wants to do business in a particular area, then it generally behooves one to suck it up and learn the native tongue or hire someone for that part. If one wants the program that is the standard for other world class statistics packages, which also happens to have a very amenable license agreement, then it behooves one to suck it up and learn R.

R is what it is. If someone does not like it, he/she can use something else, pay far more for an inferior product which will also take longer to do a calculation and handle less data at once, while risking that the content of their understanding of statistics is diminished for it.

Not that there is not room for development in R, but the sort of development you demand will evolve according to similar laws as those that govern economics and/or change in spoken language. You'd need major financial backing, and a strong influence over the culture of those who use R to pull this off. Other than that, you'll have to wait for the dialect to change over time from the cumulative effect of contributions from people the world over who all want something different out of the language.

If someone wants to take on the R challenge for him/herself, however, then there is likely no better technical support in the world than the R community, albeit perhaps after dispensing with some of the niceties.

Sincerely,
KeithC.

-----Original Message-----
From: John Sorkin [mailto:jsor...@grecc.umaryland.edu]
Sent: Tuesday, March 02, 2010 4:46 AM
To: Karl Ove Hufthammer; r-h...@stat.math.ethz.ch
Subject: Re: [R] two questions for R beginners

Please take what follows not as an ad hominem statement, but rather as an attempt to improve what is already an excellent program, that has been built as a result of many, many hours of dedicated work by many, many unpaid, unsung volunteers.

It troubles me a bit that when a confusing aspect of R is pointed out the response is not to try to improve the language so as to avoid the confusion, but rather to state that the confusion is inherent in the language. I understand that to make changes that would avoid the confusing aspect of the language that has been discussed in this thread would take time and effort by an R wizard (which I am not), time and effort that would not be compensated in the traditional sense. This does not mean that we should not acknowledge the confusion. If we want R to be the de facto lingua franca of statistical analysis, doesn't it make sense to strive for syntax that is as straightforward and consistent as possible?

Again, please understand that my comment is made with deepest respect for the many people who have unselfishly contributed to the R project. Many thanks to each and every one of you.

John

>>> Karl Ove Hufthammer 3/2/2010 4:00 AM >>>
On Mon, 01 Mar 2010 10:00:07 -0500 Duncan Murdoch wrote:
> Suppose X is a dataframe or a matrix. What would you expect to get
> from X[1]? What about as.vector(X), or as.numeric(X)?

All this of course depends on the type of object one is speaking of. There are plenty of surprises available, and it's best to use the most logical way of extracting. E.g., to extract the top-left element of a 2D structure (data frame or matrix), use 'X[1,1]'.

Luckily, R provides some shortcuts. For example, you can write 'X[2,3]' on a data frame, just as if it was a matrix, even though the underlying structure is completely different. (This doesn't work on a normal list; there you have to type the whole 'X[[3]][2]'.)

The behaviour of the 'as.' functions may sometimes be surprising, at least for me. For example, 'as.data.frame' on a named vector gives a single-column data frame, instead of a single-row data frame. (I'm not sure what's the recommended way of converting a named vector to a row data frame, but 'as.data.frame(t(X))' works, even though both 'X' and 't(X)' look like a row of numbers.)

> The point is that a dataframe is a list, and a matrix isn't. If users
> don't understand that, then they'll be confused somewhere. Making
> matrices more list-like in one respect
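The indexing behaviour discussed in the quoted exchange can be checked directly. The objects below are made up for illustration; the comments describe what each expression returns in base R.

```r
m <- matrix(1:12, nrow = 3)   # a vector with a dim attribute
d <- as.data.frame(m)         # a list of four columns

first_m <- m[1]               # first element of the underlying vector
first_d <- d[1]               # a one-column data frame (list-style indexing)

# Two-index extraction uses the same syntax on both objects.
top_left_same <- m[1, 1] == d[1, 1]

# Row 2, column 3: on a plain list the column comes first.
cell_df   <- d[2, 3]
cell_list <- as.list(d)[[3]][2]

# as.data.frame() on a named vector gives one column; transpose for one row.
x <- c(a = 1, b = 2, c = 3)
dim_col <- dim(as.data.frame(x))      # 3 rows, 1 column
dim_row <- dim(as.data.frame(t(x)))   # 1 row, 3 columns
```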
Re: [R] two questions for R beginners
On 03/04/2010 08:20 AM, David Winsemius wrote:
> ...
>> Perhaps the print methods for data.frame and matrix should
>> announce the class of the object being printed.
>
> Yes! An enthusiastic vote for highlighting this fundamental
> distinction. There is already quite enough conflation of these two
> very dissimilar object classes.

If so, please make it an option with an argument like "show.class" or "print.fancy" that can be set globally in options. Otherwise those of us who depend upon the sparse displays of R objects in our functions (e.g. in the prettyR package) will suffer the results.

Jim
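One way the opt-in behaviour Jim asks for could look is sketched below. Both the wrapper print_with_class() and the option name "show.class" are hypothetical (the name is taken from the suggestion above); neither exists in base R, and the default print methods are left untouched.

```r
# Hypothetical wrapper: announce the class before printing, but only
# when the (made-up) global option "show.class" is TRUE.
print_with_class <- function(x, ...) {
  if (isTRUE(getOption("show.class", default = FALSE))) {
    cat("<", paste(class(x), collapse = "/"), ">\n", sep = "")
  }
  print(x, ...)
  invisible(x)
}

m <- matrix(1:4, nrow = 2)
d <- as.data.frame(m)

old <- options(show.class = TRUE)
print_with_class(m)   # class line, then the usual matrix display
print_with_class(d)   # class line, then the usual data frame display

options(show.class = FALSE)
print_with_class(d)   # sparse display only, as before
options(old)          # restore the previous setting
```

With the option off by default, code that depends on the current sparse output would see no change.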
Re: [R] two questions for R beginners
On Mar 3, 2010, at 12:15 PM, William Dunlap wrote:

> If R made
>     matrix$columnName
> mean the same as
>     matrix[, "columnName"]
> (a vector) so matrices looked more like data.frames, would we also want the following to work as they do with data.frames?
>     with(matrix, log(columnName)) # log of that column as a vector
>     matrix["columnName"] # 1-column matrix
>     matrix[["columnName"]] # vector equivalent of that 1-column matrix
>     lm(responseColumn~predictorColumn, data=matrix)
>     eval(quote(columnName), envir=matrix)
> The last 2 bump into the rule allowing envir to be a frame number (since a 1x1 matrix is currently taken as the frame number now).
>
> Perhaps the print methods for data.frame and matrix should announce the class of the object being printed.

Yes! An enthusiastic vote for highlighting this fundamental distinction. There is already quite enough conflation of these two very dissimilar object classes.

--
David Winsemius

> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Patrick Burns
>> Sent: Wednesday, March 03, 2010 2:44 AM
>> To: r-help@r-project.org
>> Subject: Re: [R] two questions for R beginners
>>
>> I think Duncan's example of a list that is a matrix is a compelling argument not to do the change.
>>
>> A matrix that is a list with both names and dimnames *is* probably rare (but certainly imaginable). A matrix that is a list is not so rare, and the proposed double meaning of '$' would certainly be confusing in that case.
>>
>> Pat
>>
>> On 02/03/2010 17:55, Duncan Murdoch wrote:
>>> On 02/03/2010 11:53 AM, William Dunlap wrote:
>>>>> -----Original Message-----
>>>>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of John Sorkin
>>>>> Sent: Tuesday, March 02, 2010 3:46 AM
>>>>> To: Karl Ove Hufthammer; r-h...@stat.math.ethz.ch
>>>>> Subject: Re: [R] two questions for R beginners
>>>>>
>>>>> Please take what follows not as an ad hominem statement, but rather as an attempt to improve what is already an excellent program, that has been built as a result of many, many hours of dedicated work by many, many unpaid, unsung volunteers.
>>>>>
>>>>> It troubles me a bit that when a confusing aspect of R is pointed out the response is not to try to improve the language so as to avoid the confusion, but rather to state that the confusion is inherent in the language. I understand that to make changes that would avoid the confusing aspect of the language that has been discussed in this thread would take time and effort by an R wizard (which I am not), time and effort that would not be compensated in the traditional sense. This does not mean that we should not acknowledge the confusion. If we want R to be the de facto lingua franca of statistical analysis, doesn't it make sense to strive for syntax that is as straightforward and consistent as possible?
>>>>
>>>> Whenever one changes the language that way old code will break.
>>>
>>> I think in this case not much code would break. Mostly when people have a matrix M and ask for M$column they'll get an error; the proposal is that they'll get the requested column. (It is possible to have a list with names that is also a matrix with dimnames, but I think that is a pretty unusual construction.) But I haven't been convinced that the proposal is a net improvement to the language.
>>>
>>> Duncan Murdoch
>>>
>>>> The developers can, with a lot of effort, fix their own code, and perhaps even user-written code on CRAN, but code that thousands of users have written will break. There is a lot of code out there that was written by trial and error and by folks who no longer work at an institution: the code works but no one knows exactly why it works. Telling folks they need to change that code because we have a cleaner but different syntax now is not good. Why would one spend time writing a package that might stop working when R is "upgraded"?
>>>>
>>>> I think the solution is not to change current semantics but to write functions that behave better and encourage users to use them, gradually abandoning the old constructs.
>>>>
>>>> Bill Dunlap
>>>> Spotfire, TIBCO Software
>>>> wdunlap tibco.com
>>>>
>>>>> Again, please understand that my comment is made with deepest respect for the many people who have unselfishly contributed to the R project. Many thanks to each and every one of you.
>>>>>
>>>>> John
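The `$` question debated in this message can be made concrete with three small objects: a data frame, an atomic matrix, and the "matrix that is a list" that Pat refers to. The objects are invented for illustration; the behaviour shown is that of current base R, not of the proposed change.

```r
m <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("x", "y")))
d <- as.data.frame(m)

# Data frame idioms from the list above.
col_dollar <- d$x                # vector
col_double <- d[["x"]]           # same vector
col_single <- d["x"]             # one-column data frame
col_logged <- with(d, log(x))    # column found by name inside d

# On an atomic matrix, column access goes through [ , ]; `$` is an error.
col_matrix <- m[, "x"]
dollar_on_matrix <- tryCatch(m$x, error = function(e) "error")

# A list with a dim attribute is both a list and a matrix.
M <- matrix(list(1, "a", TRUE, 2.5), nrow = 2,
            dimnames = list(NULL, c("x", "y")))
list_and_matrix <- is.list(M) && is.matrix(M)
col_by_dimname  <- M[, "x"]      # list holding 1 and "a"
col_by_name     <- M$x           # NULL: `$` looks at names(), not dimnames
```

For M, `$` already has a meaning (list element by name), which is the double meaning the objection above is about.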
Re: [R] two questions for R beginners
Bill,

The points you make are well taken; one needs to know when to stop. I would suggest standardizing the methods used to refer to elements of a matrix and a dataframe and going no further. Why do I say this? A beginner, or even a more experienced R user, probably envisions a dataframe and a matrix as having the same structure, but not the same contents. Both appear to be multi-dimensional structures that can store data, albeit data of different types. A matrix stores numerical values, a dataframe stores data of mixed types. This being the case, it makes sense to assume that A%*%B will work when A and B are matrices, but C%*%D will not work when C and D are dataframes. This is quite logical and intuitive. It is an extension of the truism that one can perform the arithmetic operation 2*3, but can't perform the operation "Bill"*"John" (I use quotes to indicate that the names are proper names and not variable names).

Despite the observation that one can reasonably expect that there are certain operations that one can perform on matrices, but not on dataframes (and conversely), the apparent similarity in structure of the two objects makes one assume (incorrectly at this time) that the syntax used to access elements of an array and a dataframe should be the same. I submit that having similar syntax for accessing elements of the two structures will help users learn R. It will not cause them to assume that one can perform exactly the same operations on the two structures.

I apologize to other members of the listserver for the length of this subthread. It appears that I have lost the argument, and have not convinced those who would need to make the changes to allow matrices and dataframes to have similar syntax for addressing elements of the respective structures. I do not expect I will be adding any additional comments to this thread, but will continue to follow contributions other people make. Perhaps I will learn that I am not the only person who feels that the syntax should be consistent, but given what I have read so far, I doubt it.

I thank everyone who has contributed to the discussion.

John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

>>> "William Dunlap" 3/3/2010 1:15 PM >>>
If R made
    matrix$columnName
mean the same as
    matrix[, "columnName"]
(a vector) so matrices looked more like data.frames, would we also want the following to work as they do with data.frames?
    with(matrix, log(columnName)) # log of that column as a vector
    matrix["columnName"] # 1-column matrix
    matrix[["columnName"]] # vector equivalent of that 1-column matrix
    lm(responseColumn~predictorColumn, data=matrix)
    eval(quote(columnName), envir=matrix)
The last 2 bump into the rule allowing envir to be a frame number (since a 1x1 matrix is currently taken as the frame number now).

Perhaps the print methods for data.frame and matrix should announce the class of the object being printed.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
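The distinction John draws, same element syntax but different permitted operations, can be tried out as follows. This is a sketch with made-up matrices; the error from %*% on a data frame is caught rather than assumed to have any particular wording.

```r
A <- matrix(1:4, nrow = 2)
B <- diag(2)                 # 2 x 2 identity matrix
D <- as.data.frame(A)

# Matrix multiplication is defined for matrices ...
AB <- A %*% B

# ... but not for data frames; capture the failure instead of stopping.
df_product <- tryCatch(D %*% B, error = function(e) "error")

# Converting explicitly makes the intent clear and works.
via_matrix <- as.matrix(D) %*% B

# Two-index element access already uses the same syntax on both.
same_cell <- A[1, 2] == D[1, 2]
```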
Re: [R] two questions for R beginners
If R made
    matrix$columnName
mean the same as
    matrix[, "columnName"]
(a vector) so matrices looked more like data.frames, would we also want the following to work as they do with data.frames?
    with(matrix, log(columnName)) # log of that column as a vector
    matrix["columnName"] # 1-column matrix
    matrix[["columnName"]] # vector equivalent of that 1-column matrix
    lm(responseColumn~predictorColumn, data=matrix)
    eval(quote(columnName), envir=matrix)
The last 2 bump into the rule allowing envir to be a frame number (since a 1x1 matrix is currently taken as the frame number now).

Perhaps the print methods for data.frame and matrix should announce the class of the object being printed.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On Behalf Of Patrick Burns
> Sent: Wednesday, March 03, 2010 2:44 AM
> To: r-help@r-project.org
> Subject: Re: [R] two questions for R beginners
>
> I think Duncan's example of a list that is
> a matrix is a compelling argument not to do
> the change.
>
> A matrix that is a list with both names and
> dimnames *is* probably rare (but certainly
> imaginable). A matrix that is a list is not
> so rare, and the proposed double meaning of
> '$' would certainly be confusing in that case.
>
> Pat
Re: [R] two questions for R beginners
Hi

that is why I think of a matrix as just a vector with dimensions, and of a
data.frame as a rectangular structure similar to an Excel table. That has
saved me a lot of surprises. But I must admit I am not a real beginner
nowadays, although I still learn when using R, reading the help list and
sometimes trying to help others.

Regards
Petr

"John Sorkin" wrote on 03.03.2010 16:30:39:

> Petr,
> On the other hand . . .
>
> > mat<-matrix(1:12, 3,4)
> > dat<-as.data.frame(mat)
> > mat
>      [,1] [,2] [,3] [,4]
> [1,]    1    4    7   10
> [2,]    2    5    8   11
> [3,]    3    6    9   12
> > dat
>   V1 V2 V3 V4
> 1  1  4  7 10
> 2  2  5  8 11
> 3  3  6  9 12
>
> What you are demonstrating by your example is the manner in which the
> data are organized deep in the guts of R, not the way people, especially
> R beginners visualize objects in their mind. When I think of the integer
> sixty-nine, I visualize 69, not 1000101 despite the fact that 69, as an
> integer is represented in the computer as 1000101.
> John
>
> <snip>
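For readers who want to try the "vector with dimensions" view for themselves, here is a minimal base-R sketch. The variable names are made up for illustration; it only shows that reshaping changes the dim attribute and leaves the values alone, whereas a data frame is a list of columns.

```r
# A matrix is an atomic vector that carries a "dim" attribute.
x <- 1:12
dim(x) <- c(3, 4)                  # now a 3 x 4 matrix
is_mat <- is.matrix(x)             # TRUE
attrs  <- names(attributes(x))     # "dim" is the only attribute

dim(x) <- c(2, 2, 3)               # same 12 values, now a 3-d array
is_mat_3d <- is.matrix(x)          # FALSE: an array, but no longer a matrix

dim(x) <- NULL                     # back to a plain vector
is_plain <- is.null(dim(x))        # TRUE

# A data frame, by contrast, is a list of equal-length columns.
dat <- as.data.frame(matrix(1:12, 3, 4))
df_is_list <- is.list(dat)         # TRUE
n_cols <- length(dat)              # 4: the number of columns, not 12
```

The values of x never change along the way; only the attribute that tells R how to lay them out does.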
Re: [R] two questions for R beginners
Petr,
On the other hand . . .

> mat<-matrix(1:12, 3,4)
> dat<-as.data.frame(mat)
> mat
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> dat
  V1 V2 V3 V4
1  1  4  7 10
2  2  5  8 11
3  3  6  9 12

What you are demonstrating by your example is the manner in which the data
are organized deep in the guts of R, not the way people, especially R
beginners, visualize objects in their mind. When I think of the integer
sixty-nine, I visualize 69, not 1000101, despite the fact that 69, as an
integer, is represented in the computer as 1000101.
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

>>> Petr PIKAL 3/3/2010 9:44 AM >>>
<snip>
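A small sketch for anyone following along, using the same mat and dat as in the message above: the two objects print alike and agree under two-index subsetting, yet differ as soon as length() or a single index is used. This is only an illustration of the point being debated, not an argument for either side.

```r
mat <- matrix(1:12, 3, 4)
dat <- as.data.frame(mat)

# Two-index subsetting gives the same cell in both objects.
same_cell <- mat[2, 3] == dat[2, 3]   # TRUE: both are 8

# length() and single-index subsetting do not agree.
len_mat <- length(mat)    # 12: number of elements
len_dat <- length(dat)    # 4: number of columns
first_mat <- mat[1]       # the integer 1
first_dat <- dat[1]       # a one-column data frame holding column V1
```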
Re: [R] two questions for R beginners
"John Sorkin" wrote on 01.03.2010 15:19:10:

> If it looks like a duck and quacks like a duck, it ought to behave like
> a duck.
>
> To the user a matrix and a dataframe look alike . . . except a dataframe
> can

Well, a matrix looks like a data.frame only at first sight.

mat<-matrix(1:12, 3,4)
dat<-as.data.frame(mat)

str(dat)
'data.frame':   3 obs. of  4 variables:
 $ V1: int  1 2 3
 $ V2: int  4 5 6
 $ V3: int  7 8 9
 $ V4: int  10 11 12
str(mat)
 int [1:3, 1:4] 1 2 3 4 5 6 7 8 9 10 ...

These look pretty different to me.

Regards
Petr

> hold non-numeric values. Thus to the users, a matrix looks like a special
> case of a DF, or perhaps conversely. If you can address elements of one
> structure using a given syntax, you should be able to address elements of
> the other structure using the same syntax. To do otherwise leads to
> confusion and is counterintuitive.
> John
>
> >>> Petr PIKAL 3/1/2010 8:57 AM >>>
> Hi
>
> r-help-boun...@r-project.org wrote on 01.03.2010 13:03:24:
>
> <snip>
>
> > > I understand that 2 dimensional rectangular matrix looks quite
> > > similar to data frame however it is only a vector with dimensions.
> > > As such it can have items of only one type (numeric, character, ...).
> > > And you can easily change dimensions of matrix.
> > >
> > > matrix<-1:12
> > > dim(matrix) <- c(2,6)
> > > matrix
> > > dim(matrix) <- c(2,2,3)
> > > matrix
> > > dim(matrix) <-NULL
> > > matrix
> > >
> > > So rectangular structure of printed matrix is a kind of coincidence
> > > only, whereas rectangular structure of data frame is its main feature.
> > >
> > > Regards
> > > Petr
> > >>
> > >> --
> > >> Karl Ove Hufthammer
> >
> > Petr, I think that could be confusing! The way I see it is that
> > a matrix is a special case of an array, whose "dimension" attribute
> > is of length 2 (number of "rows", number of "columns"); and "row"
> > and "column" refer to the rectangular display which you see when
> > R prints the matrix. And this, of course, derives directly from
> > the historic rectangular view of a matrix when written down.
> >
> > When you went from "dim(matrix)<-c(2,6)" to "dim(matrix)<-c(2,2,3)"
> > you stripped it of its special title of "matrix" and cast it out
> > into the motley mob of arrays (some of whom are matrices, but
> > "matrix" no longer is).
> >
> > So the "rectangular structure of printed matrix" is not a coincidence,
> > but is its main feature!
>
> Ok. Point taken. However I feel that possibility to manipulate
> matrix/array dimensions by simple changing them as I showed above
> together with perceiving matrix as a **vector with dimensions** prevented
> me especially in early days from using matrices instead of data frames and
> vice versa.
>
> Consider cbind and rbind confusing results for vectors with unequal mode.
> Far too often we can see something like that
>
> > cbind(1:2,letters[1:2])
>      [,1] [,2]
> [1,] "1"  "a"
> [2,] "2"  "b"
>
> instead of
>
> > data.frame(1:2,letters[1:2])
>   X1.2 letters.1.2.
> 1    1            a
> 2    2            b
>
> and then a question why does not the result behave as expected. Each type
> of object has some features which is good for some type of
> manipulation/analysis/plotting but quite detrimental for others.
>
> Regards
> Petr
>
> >
> > To come back to Karl's query about why "$" works for a dataframe
> > but not for a matrix, note that "$" is the extractor for getting
> > a named component of a list. So, Karl, when you did
> >
> > d=head(iris[1:4])
> >
> > you created a dataframe:
> >
> > str(d)
> > # 'data.frame': 6 obs. of 4 variables:
> > # $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4
> > # $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9
> > # $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7
> > # $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4
> >
> > (with named components "Sepal.Length", ... , "Petal.Width"),
> > and a dataframe is a special case of a general list. In a
> > general list, the separate components can each be anything.
> > In a dataframe, each component is a vector; the different
> > vectors may be of different types (logical, numeric, ... )
> > but of course the elements of any single vector must be
> > of the same type; and, in a dataframe, all the vectors must
> > have the same length (otherwise it is a general list, not
> > a dataframe).
> >
> > So, when you print a dataframe, R chooses to display it
> > as a rectangular structure. On the other hand, when you
> > print a general list, R displays it quite differently:
> >
> > d
> > # Sepal.Length Sepal.Width Petal.Length Pe
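Two points from the quoted discussion can be checked directly in base R. The sketch below uses the built-in iris data; the column names num and chr are invented for the example. It shows that "$" is simply list extraction applied to a data frame, and that cbind() on vectors of different modes coerces everything to one type.

```r
d <- head(iris[1:4])

# A data frame is a list, so "$" extracts a named component (a column).
d_is_list <- is.list(d)                                   # TRUE
same_col  <- identical(d$Sepal.Length, d[["Sepal.Length"]])  # TRUE

# cbind() on an integer and a character vector builds a matrix, and a
# matrix holds a single type, so the numbers are coerced to character.
m <- cbind(1:2, letters[1:2])
m_type <- typeof(m)                                       # "character"

# data.frame() keeps each column's own type.
df <- data.frame(num = 1:2, chr = letters[1:2])
num_is_int <- is.integer(df$num)                          # TRUE
```

Whether the chr column comes back as character or as a factor depends on the R version's stringsAsFactors default, so the sketch makes no claim about it.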
Re: [R] two questions for R beginners
I think Duncan's example of a list that is a matrix is a compelling
argument not to make the change.

A matrix that is a list with both names and dimnames *is* probably rare
(but certainly imaginable). A matrix that is a list is not so rare, and
the proposed double meaning of '$' would certainly be confusing in that
case.

Pat

On 02/03/2010 17:55, Duncan Murdoch wrote:
> On 02/03/2010 11:53 AM, William Dunlap wrote:
> > <snip>
> > Whenever one changes the language that way old code will break.
>
> I think in this case not much code would break. Mostly when people have
> a matrix M and ask for M$column they'll get an error; the proposal is
> that they'll get the requested column. (It is possible to have a list
> with names that is also a matrix with dimnames, but I think that is a
> pretty unusual construction.)
>
> But I haven't been convinced that the proposal is a net improvement to
> the language.
>
> Duncan Murdoch
>
> <snip>
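To make the "list that is a matrix" case concrete, here is a minimal base-R sketch. The object lst_mat and its names are invented for illustration; the point is only that such an object can carry both names and dimnames, so "$" and column extraction can already mean different things.

```r
# A list with a dim attribute is both a list and a matrix.
lst_mat <- matrix(list(1, "a", TRUE, 2:3), nrow = 2,
                  dimnames = list(NULL, c("x", "y")))
both <- c(is.list(lst_mat), is.matrix(lst_mat))   # TRUE TRUE

cell <- lst_mat[[2, "y"]]       # the element in row 2, column "y": 2:3

# Give the same object element names as well as dimnames.
names(lst_mat) <- c("x", "p", "q", "r")

by_dollar <- lst_mat$x          # list extraction: the element named "x" (1)
by_column <- lst_mat[, "x"]     # matrix extraction: column "x" (two elements)
```

Here the name "x" refers to a single element under "$" and to a whole column under matrix indexing, which is the double meaning the message above worries about.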
Re: [R] two questions for R beginners
Liviu Andronic wrote:
> On Mon, Mar 1, 2010 at 11:49 PM, Liviu Andronic wrote:
> > On 3/1/10, Keo Ormsby wrote:
> > > Perhaps my biggest problem was that I couldn't (and still haven't)
> > > seen *absolute beginners* documents.
> >
> > there was once a link posted on r-sig-teaching that would probably fit
> > your needs, but I cannot find it now.
>
> OK, I found it. Below is an excerpt of that r-sig-teaching e-mail.
> Liviu
>
> On Thu, Jul 2, 2009 at 2:19 PM, Robert W. Hayden wrote:
> > I think such a website would be a real asset. It would be most useful
> > if it either were restricted to intro. stats. OR organized so that
> > materials for real beginners were easy to extract from all the
> > materials for programmers and Ph.D. statisticians. As a relative
> > beginner myself, I find the usual resources useless. In self defense,
> > I created materials for my own beginning students:
> >
> > http://courses.statistics.com/software/R/Rhome.htm

Hi Liviu,

This is indeed the best introductory site I have seen, although it still
assumes some things that might at first seem unintuitive to the absolute
beginner I am talking about. For instance, the first page shows that you
can do sqrt(x), where x can be a vector, and get back a vector of the
square roots of each number. Although this is high school matrix algebra,
most users expect the input to a square root function to be a single
number, not a matrix, as in Excel or on a calculator. Other concepts that
are not explicitly introduced are the "R workspace", the use of arguments
in functions (with or without the "="), etc. Others are things like
diff(range(rainfall)), where the output of one function is used as the
input to another, all in the same command line.

All these things seem very basic, but can be difficult if you are trying
to learn on your own with no prior experience in programming. I hope I am
not sounding too difficult and contrarian; I am just trying to share my
experience of starting with R, and of trying to convey this learning to
my colleagues and students. In the end, I did find everything I needed to
learn, and now I feel at ease with R, and I believe that almost anybody
who can use Excel or something like it could learn R.

Thank you for the information,
Best wishes,
Keo.
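The three stumbling blocks mentioned above (vectorized functions, arguments given with or without "=", and one function's output fed into another) can be seen in a few lines. This is a minimal sketch; the rainfall values are made up for illustration and are not taken from the course site.

```r
rainfall <- c(12.5, 0, 3.2, 25.1, 7.8)    # invented values

# sqrt() works on a whole vector at once, element by element.
roots <- sqrt(c(1, 4, 9))                 # 1 2 3

# Nested calls: range() returns c(min, max), and diff() subtracts them.
spread <- diff(range(rainfall))           # max minus min

# An argument can be matched by position or by name; both calls are equal.
r_pos  <- round(3.14159, 2)
r_name <- round(3.14159, digits = 2)

# The workspace is the set of objects created so far; ls() lists them.
objs <- ls()
```

Writing the nested call in two steps (rng <- range(rainfall); diff(rng)) gives the same result and may be easier for a first reading.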
Re: [R] two questions for R beginners
William,
I agree that changing syntax can lead to problems. I don't, however,
think extending the language will break existing code. Providing a common
syntax for accessing matrices and dataframes will not change the way
things have been done to date, but rather how things will be done in the
future.
John

John Sorkin
jsor...@grecc.umaryland.edu

-----Original Message-----
From: "William Dunlap"
To: John Sorkin
To: Karl Ove Hufthammer
Sent: 3/2/2010 11:53:45 AM
Subject: RE: [R] two questions for R beginners

<snip>
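One way to picture a "common syntax" added on top of R rather than changed within it is an ordinary helper function. The function get_column below is hypothetical (it is not part of R or of any package mentioned in this thread); it is only a sketch of the idea, assuming the column is requested by name.

```r
# Hypothetical helper, for illustration only: extract a named column from
# either a data frame or a matrix with column names.
get_column <- function(x, name) {
  if (is.data.frame(x)) {
    x[[name]]          # list-style extraction for data frames
  } else if (is.matrix(x)) {
    x[, name]          # matrix-style extraction by column name
  } else {
    stop("x must be a data frame or a matrix")
  }
}

mat <- matrix(1:6, 2, dimnames = list(NULL, c("a", "b", "c")))
dat <- as.data.frame(mat)

from_mat <- get_column(mat, "b")   # 3 4
from_dat <- get_column(dat, "b")   # 3 4
```

It is worth noting that base R already has one spelling that works on both kinds of object: X[, "b"] returns the same column for mat and for dat.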
Re: [R] two questions for R beginners
On 02/03/2010 11:53 AM, William Dunlap wrote:
> <snip>
> Whenever one changes the language that way old code will break.

I think in this case not much code would break. Mostly when people have a
matrix M and ask for M$column they'll get an error; the proposal is that
they'll get the requested column. (It is possible to have a list with
names that is also a matrix with dimnames, but I think that is a pretty
unusual construction.)

But I haven't been convinced that the proposal is a net improvement to
the language.

Duncan Murdoch

> The developers can, with a lot of effort, fix their own code, and
> perhaps even user-written code on CRAN, but code that thousands of users
> have written will break.
>
> <snip>
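The current behaviour described above is easy to confirm. In this minimal sketch the matrix M and its column names are invented; tryCatch() is used so that the error can be captured instead of stopping the script.

```r
M <- matrix(1:6, 2, dimnames = list(NULL, c("a", "b", "c")))

# "$" on an ordinary (atomic) matrix is an error; capture its message.
# In current R versions the message says that the $ operator is invalid
# for atomic vectors.
res <- tryCatch(M$a, error = function(e) conditionMessage(e))

# Extraction by column name works with matrix indexing.
col_a <- M[, "a"]          # 1 2
```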
Re: [R] two questions for R beginners
> -----Original Message-----
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On Behalf Of John Sorkin
> Sent: Tuesday, March 02, 2010 3:46 AM
> To: Karl Ove Hufthammer; r-h...@stat.math.ethz.ch
> Subject: Re: [R] two questions for R beginners
>
> Please take what follows not as an ad hominem statement, but
> rather as an attempt to improve what is already an excellent
> program, that has been built as a result of many, many hours
> of dedicated work by many, many unpaid, unsung volunteers.
>
> It troubles me a bit that when a confusing aspect of R is
> pointed out the response is not to try to improve the
> language so as to avoid the confusion, but rather to state
> that the confusion is inherent in the language. I understand
> that to make changes that would avoid the confusing aspect of
> the language that has been discussed in this thread would
> take time and effort by an R wizard (which I am not), time
> and effort that would not be compensated in the traditional
> sense. This does not mean that we should not acknowledge the
> confusion. If we want R to be the de facto lingua franca of
> statistical analysis doesn't it make sense to strive for
> syntax that is as straightforward and consistent as possible?

Whenever one changes the language that way old code will break. The
developers can, with a lot of effort, fix their own code, and perhaps
even user-written code on CRAN, but code that thousands of users have
written will break. There is a lot of code out there that was written by
trial and error and by folks who no longer work at an institution: the
code works but no one knows exactly why it works. Telling folks they need
to change that code because we have a cleaner but different syntax now is
not good.

Why would one spend time writing a package that might stop working when R
is "upgraded"?

I think the solution is not to change current semantics but to write
functions that behave better and encourage users to use them, gradually
abandoning the old constructs.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

>
> Again, please understand that my comment is made with deepest
> respect for the many people who have unselfishly contributed
> to the R project. Many thanks to each and every one of you.
>
> John
>
> >>> Karl Ove Hufthammer 3/2/2010 4:00 AM >>>
> On Mon, 01 Mar 2010 10:00:07 -0500 Duncan Murdoch wrote:
> > Suppose X is a dataframe or a matrix. What would you expect to get
> > from X[1]? What about as.vector(X), or as.numeric(X)?
>
> All this of course depends on type of object one is speaking of. There
> are plenty of surprises available, and it's best to use the most logical
> way of extracting. E.g., to extract the top-left element of a 2D
> structure (data frame or matrix), use 'X[1,1]'.
>
> Luckily, R provides some shortcuts. For example, you can write 'X[2,3]'
> on a data frame, just as if it was a matrix, even though the underlying
> structure is completely different. (This doesn't work on a normal list;
> there you have to type the whole 'X[[3]][2]'.)
>
> The behaviour of the 'as.' functions may sometimes be surprising, at
> least for me. For example, 'as.data.frame' on a named vector gives a
> single-column data frame, instead of a single-row data frame.
>
> (I'm not sure what's the recommended way of converting a named vector to
> row data frame, but 'as.data.frame(t(X))' works, even though both 'X'
> and 't(X)' look like a row of numbers.)
>
> > The point is that a dataframe is a list, and a matrix isn't. If users
> > don't understand that, then they'll be confused somewhere. Making
> > matrices more list-like in one respect will just move the confusion
> > elsewhere. The solution is to understand the difference.
>
> My main problem is not understanding the difference, which is easy, but
> knowing which type of object I have when I get the output of a function
> in a package. If I know the object is a named vector or a matrix with
> column names, it's easy enough to type 'X[,"colname"]', and if it's a
> data frame one may use the shortcut 'X$colname'.
>
> Usually, it *is* documented what the return value of a function is, but
> just looking at the output is much faster, and *usually* gives the
> correct answer.
>
> For example, 'mean' applied on a data frame gives a named vector, not a
> data frame, which is somewhat surprising (given that the columns of a
> data fra
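The conversions and shortcuts described in the quoted text can be tried out as follows. This is a small base-R sketch with invented data; it assumes nothing beyond as.data.frame(), t() and ordinary subsetting.

```r
x <- c(a = 1, b = 2, c = 3)          # a named vector

col_df <- as.data.frame(x)           # one column, three rows
row_df <- as.data.frame(t(x))        # one row, three columns named a, b, c

dat <- data.frame(u = 1:3, v = 4:6, w = 7:9)
cell_df <- dat[2, 3]                 # row 2, column 3 of a data frame: 8

# A plain list has no two-index shortcut: pick the component, then the
# element within it.
lst <- as.list(dat)
cell_list <- lst[[3]][2]             # also 8
```

Checking dim() or str() on a function's return value is a quick way to find out which of these shapes one actually has.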
Re: [R] two questions for R beginners
On Tue, Mar 2, 2010 at 7:27 AM, Duncan Murdoch wrote: > John Sorkin wrote: >> >> Please take what follows not as an ad hominem statement, but rather as an >> attempt to improve what is already an excellent program, that has been built >> as a result of many, many hours of dedicated work by many, many unpaid, >> unsung volunteers. >> >> It troubles me a bit that when a confusing aspect of R is pointed out the >> response is not to try to improve the language so as to avoid the confusion, >> but rather to state that the confusion is inherent in the language. I >> understand that to make changes that would avoid the confusing aspect of the >> language that has been discussed in this thread would take time and effort >> by an R wizard (which I am not), time and effort that would not be >> compensated in the traditional sense. This does not mean that we should not >> acknowledge the confusion. If we what R to be the de facto lingua franca of >> statistical analysis doesn't it make sense to strive for syntax that is as >> straight forward and consistent as possible? > > I think you've misunderstood the argument. It would not be hard to make the > suggested change. I don't object to it because it would be too much work, I > object to it because I think it is not an improvement. Dataframes and > matrices are different, and there is no way to avoid that fact. > The arguments in favour of the change seem to be these: Users of zoo have some experience with this since zoo uses matrices to represent 2d time series and originally did not support $ as a column extractor but now does. I was originally opposed to adding it for the reasons you state but it was eventually added and having used it for some time now since it got into the package I must say that it is very convenient and I now regard it as a definite improvement in user experience. Certainly I use the feature all the time. 
Re: [R] two questions for R beginners
John Sorkin wrote:
> Please take what follows not as an ad hominem statement, but rather as an attempt to improve what is already an excellent program, that has been built as a result of many, many hours of dedicated work by many, many unpaid, unsung volunteers.
>
> It troubles me a bit that when a confusing aspect of R is pointed out the response is not to try to improve the language so as to avoid the confusion, but rather to state that the confusion is inherent in the language. I understand that to make changes that would avoid the confusing aspect of the language that has been discussed in this thread would take time and effort by an R wizard (which I am not), time and effort that would not be compensated in the traditional sense. This does not mean that we should not acknowledge the confusion. If we want R to be the de facto lingua franca of statistical analysis doesn't it make sense to strive for syntax that is as straightforward and consistent as possible?

I think you've misunderstood the argument. It would not be hard to make the suggested change. I don't object to it because it would be too much work, I object to it because I think it is not an improvement. Dataframes and matrices are different, and there is no way to avoid that fact.

The arguments in favour of the change seem to be these:

- Dataframes and matrices are similar in some respects, so they should be similar in more. In fact, I believe that the source of confusion is the fact that they are similar, so this would not improve things. People would still be confused by the differences, which are unavoidable.

- Using $ to extract a column of a matrix would be convenient. I agree, it saves 4 keystrokes to type X$column instead of X[,"column"]. But I think it increases confusion, so the savings are not worthwhile. For example, the col2rgb function returns a matrix with rows named red, green and blue.
But under your proposal, I'd still need to use X["red",] to extract the red component, because columns are components, but rows are not. You are complaining that the lack of $ for matrices is an unnecessary asymmetry, and unnecessary asymmetries are confusing. But your proposal introduces a new one! - Some functions return matrices when I expect a dataframe, or vice versa. That will continue to be true regardless of whether the proposed change is made. You need to read the documentation. If it is unclear, it should be improved, the language shouldn't be changed so that sloppy documentation is accurate. - You suggested this so anyone who disagrees must be lazy. Which really is an ad hominem argument, despite your disclaimer. I think you should respect the fact that there are people who disagree with the value of your suggestion. (Which is also an ad hominem attack, but isn't central to my argument.) Duncan Murdoch Again, please understand that my comment is made with deepest respect for the many people who have unselfishly contributed to the R project. Many thanks to each and every one of you. John Karl Ove Hufthammer 3/2/2010 4:00 AM >>> On Mon, 01 Mar 2010 10:00:07 -0500 Duncan Murdoch wrote: Suppose X is a dataframe or a matrix. What would you expect to get from X[1]? What about as.vector(X), or as.numeric(X)? All this of course depends on type of object one is speaking of. There are plenty of surprises available, and it's best to use the most logical way of extracting. E.g., to extract the top-left element of a 2D structure (data frame or matrix), use 'X[1,1]'. Luckily, R provides some shortcuts. For example, you can write 'X[2,3]' on a data frame, just as if it was a matrix, even though the underlying structure is completely different. (This doesn't work on a normal list; there you have to type the whole 'X[[2]][3]'.) The behaviour of the 'as.' functions may sometimes be surprising, at least for me. 
For example, 'as.data.frame' on a named vector gives a single-column data frame, instead of a single-row data frame. (I'm not sure what's the recommended way of converting a named vector to row data frame, but 'as.data.frame(t(X))' works, even though both 'X' and 't(X)' looks like a row of numbers.) The point is that a dataframe is a list, and a matrix isn't. If users don't understand that, then they'll be confused somewhere. Making matrices more list-like in one respect will just move the confusion elsewhere. The solution is to understand the difference. My main problem is not understanding the difference, which is easy, but knowing which type of I have when I get the output a function in a package. If I know the object is a named vector or a matrix with column names, it's easy enough to type 'X[,"colname"]', and if it's a data frame one may use the shortcut 'X$colname'. Usually, it *is* documented what the return value of a function is, but just looking at the output is much faster, and *usually* gives the co
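Duncan's col2rgb point above can be checked directly. A small sketch, assuming a stock R session where grDevices is attached (the default). The variable names and the two colours are only illustrative choices.

```r
# col2rgb() returns an integer matrix: one column per colour,
# rows named "red", "green" and "blue".
rgb_matrix <- col2rgb(c(first = "red", second = "blue"))

component_names <- rownames(rgb_matrix)
red_row      <- rgb_matrix["red", ]    # a row: the red component of each colour
first_column <- rgb_matrix[, "first"]  # a column: all components of one colour

# Transposing and converting gives a data frame in which the
# components become columns, so `$` applies.
rgb_df <- as.data.frame(t(rgb_matrix))
red_from_df <- rgb_df$red
```

Here the components are rows of the matrix, so even a '$' for matrix columns would not reach them without the transpose.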
Re: [R] two questions for R beginners
Please take what follows not as an ad hominem statement, but rather as an attempt to improve what is already an excellent program, that has been built as a result of many, many hours of dedicated work by many, many unpaid, unsung volunteers.

It troubles me a bit that when a confusing aspect of R is pointed out the response is not to try to improve the language so as to avoid the confusion, but rather to state that the confusion is inherent in the language. I understand that to make changes that would avoid the confusing aspect of the language that has been discussed in this thread would take time and effort by an R wizard (which I am not), time and effort that would not be compensated in the traditional sense. This does not mean that we should not acknowledge the confusion. If we want R to be the de facto lingua franca of statistical analysis doesn't it make sense to strive for syntax that is as straightforward and consistent as possible?

Again, please understand that my comment is made with deepest respect for the many people who have unselfishly contributed to the R project. Many thanks to each and every one of you.

John

>>> Karl Ove Hufthammer 3/2/2010 4:00 AM >>>
On Mon, 01 Mar 2010 10:00:07 -0500 Duncan Murdoch wrote:
> Suppose X is a dataframe or a matrix. What would you expect to get from
> X[1]? What about as.vector(X), or as.numeric(X)?

All this of course depends on the type of object one is speaking of. There are plenty of surprises available, and it's best to use the most logical way of extracting. E.g., to extract the top-left element of a 2D structure (data frame or matrix), use 'X[1,1]'. Luckily, R provides some shortcuts. For example, you can write 'X[2,3]' on a data frame, just as if it was a matrix, even though the underlying structure is completely different. (This doesn't work on a normal list; there you have to type the whole 'X[[2]][3]'.) The behaviour of the 'as.' functions may sometimes be surprising, at least for me.
For example, 'as.data.frame' on a named vector gives a single-column data frame, instead of a single-row data frame. (I'm not sure what's the recommended way of converting a named vector to row data frame, but 'as.data.frame(t(X))' works, even though both 'X' and 't(X)' looks like a row of numbers.) > The point is that a dataframe is a list, and a matrix isn't. If users > don't understand that, then they'll be confused somewhere. Making > matrices more list-like in one respect will just move the confusion > elsewhere. The solution is to understand the difference. My main problem is not understanding the difference, which is easy, but knowing which type of I have when I get the output a function in a package. If I know the object is a named vector or a matrix with column names, it's easy enough to type 'X[,"colname"]', and if it's a data frame one may use the shortcut 'X$colname'. Usually, it *is* documented what the return value of a function is, but just looking at the output is much faster, and *usually* gives the correct answer. For example, 'mean' applied on a data frame gives a named vector, not a data frame, which is somewhat surprising (given that the columns of a data frame may be of different types, while the elements of a vector may not). (And yes, I know that it's *documented* that it returns a named vector.) On the other hand, perhaps it is surprising that 'mean' works on data frames at all. :-) -- Karl Ove Hufthammer __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] two questions for R beginners
On Mon, Mar 1, 2010 at 11:49 PM, Liviu Andronic wrote: > On 3/1/10, Keo Ormsby wrote: >> Perhaps my biggest problem was that I couldn't (and still haven't) seen >> *absolute beginners* documents. >> > there was once a link posted on r-sig-teaching that would probably fit > your needs, but I cannot find it now. > OK, I found it. Below is an excerpt of that r-sig-teaching e-mail. Liviu On Thu, Jul 2, 2009 at 2:19 PM, Robert W. Hayden wrote: > I think such a website would be a real asset. It would be most useful > if it either were restricted to intro. stats. OR organized so that > materials for real beginners were easy to extract from all the > materials for programmers and Ph.D. statisticians. As a relative > beginner myself, I find the usual resources useless. In self defense, > I created materials for my own beginning students: > > http://courses.statistics.com/software/R/Rhome.htm __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] two questions for R beginners
On Mon, 01 Mar 2010 10:00:07 -0500 Duncan Murdoch wrote: > Suppose X is a dataframe or a matrix. What would you expect to get from > X[1]? What about as.vector(X), or as.numeric(X)? All this of course depends on type of object one is speaking of. There are plenty of surprises available, and it's best to use the most logical way of extracting. E.g., to extract the top-left element of a 2D structure (data frame or matrix), use 'X[1,1]'. Luckily, R provides some shortcuts. For example, you can write 'X[2,3]' on a data frame, just as if it was a matrix, even though the underlying structure is completely different. (This doesn't work on a normal list; there you have to type the whole 'X[[2]][3]'.) The behaviour of the 'as.' functions may sometimes be surprising, at least for me. For example, 'as.data.frame' on a named vector gives a single-column data frame, instead of a single-row data frame. (I'm not sure what's the recommended way of converting a named vector to row data frame, but 'as.data.frame(t(X))' works, even though both 'X' and 't(X)' looks like a row of numbers.) > The point is that a dataframe is a list, and a matrix isn't. If users > don't understand that, then they'll be confused somewhere. Making > matrices more list-like in one respect will just move the confusion > elsewhere. The solution is to understand the difference. My main problem is not understanding the difference, which is easy, but knowing which type of I have when I get the output a function in a package. If I know the object is a named vector or a matrix with column names, it's easy enough to type 'X[,"colname"]', and if it's a data frame one may use the shortcut 'X$colname'. Usually, it *is* documented what the return value of a function is, but just looking at the output is much faster, and *usually* gives the correct answer. 
For example, 'mean' applied on a data frame gives a named vector, not a data frame, which is somewhat surprising (given that the columns of a data frame may be of different types, while the elements of a vector may not). (And yes, I know that it's *documented* that it returns a named vector.) On the other hand, perhaps it is surprising that 'mean' works on data frames at all. :-) -- Karl Ove Hufthammer __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
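The indexing and 'as.' behaviour described above can be tried in a few lines. This is only a sketch with made-up variable names. One assumption to flag: as far as I know, recent R versions no longer support 'mean' on a whole data frame, so 'sapply(df, mean)' is used here as a stand-in that also returns a named vector.

```r
x <- c(a = 1, b = 2, c = 3)

col_df <- as.data.frame(x)      # one column, three rows
row_df <- as.data.frame(t(x))   # one row, columns a, b, c

df  <- data.frame(u = 1:3, v = c(10, 20, 30))
lst <- list(u = 1:3, v = c(10, 20, 30))

from_df   <- df[2, 2]       # matrix-style indexing works on a data frame
from_list <- lst[[2]][2]    # a plain list needs the long form

# Matrix-style indexing on a plain list is an error; capture the message.
list_error <- tryCatch(lst[2, 2], error = function(e) conditionMessage(e))

# Column means as a named vector (stand-in for mean() on a data frame).
col_means <- sapply(df, mean)
```

The two conversions of 'x' differ only in shape, which is easy to miss when both print as a short row or column of numbers.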
Re: [R] two questions for R beginners
On Tue, 2 Mar 2010 08:58:25 +1300 Peter Alspach wrote:
> This brings up another confusion for new users. Simply typing the object name at the command line gives just one view of the object (that provided by print()).

Good point. Any good introduction to R should include a brief discussion of 'str'. But sometimes even 'str' can keep you from discovering the real underlying structure of an object, e.g. for data frames. The solution is to use 'unclass' first.

-- Karl Ove Hufthammer
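A short sketch of the 'str' versus 'unclass' suggestion, using a small made-up data frame (the column names are arbitrary):

```r
df <- data.frame(id = 1:3, score = c(2.5, 3.5, 4.5))

print(df)            # the tabular view provided by print()
str(df)              # summarised as a 'data.frame' with 3 obs. of 2 variables
str(unclass(df))     # the underlying structure: a list of columns plus attributes

bare <- unclass(df)
bare_is_list  <- is.list(bare)   # TRUE: a data frame is a list underneath
bare_dim      <- dim(bare)       # NULL: the plain list has no dimensions
bare_has_rows <- !is.null(attr(bare, "row.names"))
```

After 'unclass', the row names show up as an ordinary attribute rather than as part of a table.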
Re: [R] two questions for R beginners
>I would love to see a text oriented towards someone who has never used anything but Excel, but realizes
>that to do science today you have to go beyond the "Data analysis" toolbar from Excel.
>(Please tell me if you know of any)
>Best to all,
>Keo.

Please look at *R through Excel*, the book that Erich Neuwirth and I published last summer. http://www.springer.com/978-1-4419-0051-7

Erich's RExcel seamlessly integrates the entire set of R's statistical and graphical tools into Excel. Our book shows how to use the system in many ways. You can place any R command within the Excel automatic recalculation mode, you can run Rcmdr from the Excel menu bar, and you can run arbitrary R scripts from an Excel spreadsheet. The full R command window is also available.

While RExcel can be downloaded from CRAN via the RExcelInstaller package, it is much easier to download all of R, including RExcel and Rcmdr, in a single installer from http://rcom.univie.ac.at

Rich
Re: [R] two questions for R beginners
Background: During my uni days, I was taught to use MAPLE, MATLAB, SPSS, SAS, C++ and Java. Then after uni, several years went by without me ever using any of them again and was told to just use Excel. Then I started my PhD and was told I should start using R instead (something I'd never even heard of before). I would class myself as being just above a beginner, but not by very much. Probably within walking distance. > * What were your biggest misconceptions or > stumbling blocks to getting up and running > with R? (1) I read a lot about R having awesome graphics capabilities, but when i looked at the the graphs on the R home page I was a little underwhelmed. I thought Excel graphs looked better (though, to be fair, since that first time, i have seen some pretty awesome graphics produced with R, and even a cool animation someone posted on youtube synchronised to music). (2) The whole *apply family of functions just confused me and looking at the examples didn't really help me to be honest. I understood the idea of vectorisation but I couldn't work out how to get what I wanted as the end result. The plyr package has solved that issue for me though and I now appreciate how cool these functions are. (3) There are a lot of cool sounding packages on CRAN. Sometimes I can read the ref manual and still have no idea how they work. A short tutorial on how the author sees the package being used would be helpful. (4) Also, trivial examples are great for conveying the basics of how a function works. Complicated examples give me a headache. (5) I use to have issues trying to find R related material on the web (then i discovered rseek etc). (6) "cannot allocate vector of size..." -- i think this has to be the most asked question ever on r-help. Not so much of a stumbling block for me anymore, but i always got annoyed whenever i saw it. > * What documents helped you the most in this > initial phase? 
(1) R cheat sheets are fantastic because I can never remember most of them off the top of my head. (2) Rwiki has save me many precious hours by have easy to follow examples. (3) r-help is great for trying to find answers to questions for the most part. I've learned loads just reading responses people have kindly contributed. Some threads can get long and it would be nice if the origin author would summarise at the end once a suitable solution is found (some other lists do this). (4) Random little blog posts that describe how to do a fun* task in R. These short posts are usually the best way for me to learn, because they don't require too much effort, are sort of easy to understand and follow through from beginning to end, and give you a cool** end result. (5) I prefer 'cookbooks' that show you how to do fun stuff (and hence learn from) as opposed to looking at the official R guides (confession time: I haven't looked at the intro to R guide since my 1st month of using R... which was a couple years ago now). > * where did you look for help expecting answers, but did not find them? Often times, the ?[function name] help pages just didn't make sense to me, even after trying to understand the examples. Sometimes it'd be nice to have something like they do on thottbot for World of Warcraft where each quest has a thread for people to discuss how it works and little tricks. So the R equivalent I guess would be to have a link at the bottom of each help page which links to a thread dedicated to a specific function and where users talk discuss it and offer their own examples and points of view about it. Of course that is probably overkill. I just wanted to see if i could mention WoW in my post. > I especially want to hear from people who are > lazy I did a degree in Maths. > and impatient. I sometimes produce graphics in Excel. Cheers, Tony * = fun is a relative term. I still get a buzz out of seeing ascii art. ** = cool is also relative term. 
I still think Babylon 5 was cooler than Star Trek DS9. Though nowhere near as cool as Doctor Who. On 25 Feb, 17:31, Patrick Burns wrote: > * What were your biggest misconceptions or > stumbling blocks to getting up and running > with R? > > * What documents helped you the most in this > initial phase? > > I especially want to hear from people who are > lazy and impatient. > > Feel free to write to me off-list. Definitely > write off-list if you are just confirming what > has been said on-list. > > -- > Patrick Burns > pbu...@pburns.seanet.comhttp://www.burns-stat.com > (home of 'The R Inferno' and 'A Guide for the Unwilling S User') > > __ > r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://ww
Re: [R] two questions for R beginners
Liviu Andronic wrote:
> On 3/1/10, Keo Ormsby wrote:
>> Perhaps my biggest problem was that I couldn't find (and still haven't seen) *absolute beginners* documents.
>
> Perhaps http://www.r-tutor.com/? Also recently a webinar on R [2] was held and it hosts complete course notes and recordings. Otherwise, there was once a link posted on r-sig-teaching that would probably fit your needs, but I cannot find it now.
> [2] http://www.fort.usgs.gov/brdscience/learnR.htm
> [..snip..]
> Maybe Rcmdr could help here? It allows performing entry level statistical analyses, while displaying the complete syntax.
> Liviu

Yes, thanks. I will check the webinars. In the beginning I did bump into Rcmdr, but it was a *package* that had to be downloaded from *CRAN*, and during installation it asked for a whole lot of options; at the time I had no idea what was meant by all that. help() always gives a description, which is very useful if you know what you want to do, but words like *Generic function*, or the ellipsis (...) *further arguments to be passed to other functions*, can be daunting to the uninitiated.

What I wanted to convey is that if you start from your own non-programmer, non-statistician area and try to go directly into R, you will quickly find yourself immersed in a lot of terminology and usability logic that is kind of alien even to proficient users of web browsers and Office suites. Of course there is a lot of information out there, much of it very simple indeed, but perhaps what I was looking for then was not an "R for dummies", but an "R for LAZY dummies".

By the way, what I meant by the 6 months was the time it took to become completely R dependent, to not even consider using other software.

Thanks for the links!
Best,
Keo.
Re: [R] two questions for R beginners
On 3/1/10, Keo Ormsby wrote: > Perhaps my biggest problem was that I couldn't (and still haven't) seen > *absolute beginners* documents. > Perhaps http://www.r-tutor.com/? Also recently a webinar on R [2] was held and it hosts complete course notes and recordings. Otherwise, there was once a link posted on r-sig-teaching that would probably fit your needs, but I cannot find it now. [2] http://www.fort.usgs.gov/brdscience/learnR.htm > It took me about six months to start using R > for my everyday data analysis, and now I can't imagine life without it! > My problem was that I knew some programming (Java) and had never used a > command line for statistics. All my statistical needs had been accomplished > through the graphical interface of SPSS or similar software (even Excel!). I > have a feeling that almost all "Introduction to R" documents are made for > making the switch from SPSS and SAS scripting, to R. But I have had a very > difficult time using R as an *entry level* statistical scripting language to > Maybe Rcmdr could help here? It allows performing entry level statistical analyses, while displaying the complete syntax. Liviu > help my colleagues (none of us are either programmers nor statisticians, > mostly biology PhDs and a couple of MDs). > I would love to see a text oriented towards someone who has never used > anything but Excel, but realizes that to do science today you have to go > beyond the "Data analysis" toolbar from Excel. > (Plese tell me if you know of any) > Best to all, > Keo. > > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Do you know how to read? http://www.alienetworks.com/srtest.cfm http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader Do you know how to write? 
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] two questions for R beginners
Patrick Burns wrote:
> * What were your biggest misconceptions or stumbling blocks to getting up and running with R?
> * What documents helped you the most in this initial phase?
> I especially want to hear from people who are lazy and impatient.
> Feel free to write to me off-list. Definitely write off-list if you are just confirming what has been said on-list.

Perhaps my biggest problem was that I couldn't find (and still haven't seen) *absolute beginners* documents. It took me about six months to start using R for my everyday data analysis, and now I can't imagine life without it! My problem was that I knew some programming (Java) and had never used a command line for statistics. All my statistical needs had been accomplished through the graphical interface of SPSS or similar software (even Excel!). I have a feeling that almost all "Introduction to R" documents are made for making the switch from SPSS and SAS scripting to R. But I have had a very difficult time using R as an *entry level* statistical scripting language to help my colleagues (none of us are programmers or statisticians, mostly biology PhDs and a couple of MDs).

I would love to see a text oriented towards someone who has never used anything but Excel, but realizes that to do science today you have to go beyond the "Data analysis" toolbar from Excel. (Please tell me if you know of any)

Best to all,
Keo.
Re: [R] two questions for R beginners
On 01-Mar-10 22:44:22, Jim Lemon wrote:
> On 03/02/2010 02:02 AM, Karl Ove Hufthammer wrote:
>>...
>> Of course I agree that 'the idea of a list is so fundamental to R that it needs to be something learned pretty early', but is there any harm in slightly 'blur[ing] the distinction between dataframes and matrices', as a convenience to the user? Or, in other words, what does one *gain* by having '$' on named matrices and vectors give a confusing error message instead of the expected results? Distinction for distinction's own sake is of little use.
>
> Matrices are all one type, from finish back to start,
> while lists can manage many just by keeping them apart.
> Like foxes, geese and corn that must be ferried 'cross the Styx,
> it's not a good idea to let the separate columns mix.
> So yield not to temptation when the two appear to match,
> for appearances mislead and in the substance lies the catch.
> Convenience is a quicksand that can suck the user down.
> 'Tis better to avoid the stuff and know one's way around.
>
> Jim

That HAS to be a Fortune!

Ted.

E-Mail: (Ted Harding)
Fax-to-email: +44 (0)870 094 0861
Date: 01-Mar-10 Time: 23:08:20
-- XFMail --
Re: [R] two questions for R beginners
On 03/02/2010 02:02 AM, Karl Ove Hufthammer wrote:
> ...
> Of course I agree that 'the idea of a list is so fundamental to R that it needs to be something learned pretty early', but is there any harm in slightly 'blur[ing] the distinction between dataframes and matrices', as a convenience to the user? Or, in other words, what does one *gain* by having '$' on named matrices and vectors give a confusing error message instead of the expected results? Distinction for distinction's own sake is of little use.

Matrices are all one type, from finish back to start,
while lists can manage many just by keeping them apart.
Like foxes, geese and corn that must be ferried 'cross the Styx,
it's not a good idea to let the separate columns mix.
So yield not to temptation when the two appear to match,
for appearances mislead and in the substance lies the catch.
Convenience is a quicksand that can suck the user down.
'Tis better to avoid the stuff and know one's way around.

Jim
Re: [R] two questions for R beginners
> On 01/03/2010 9:19 AM, John Sorkin wrote: > If it looks like a duck and quacks like a duck, it ought to > behave like a duck. > This brings up another confusion for new users. Simply typing the object name at the command line gives just one view of the object (that provided by print()). Real ducks fooled by decoy ducks get shot. The consequences of thinking a matrix and dataframe are the same are not quite so severe :-) Hei kona ra . Peter Alspach __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] two questions for R beginners
On 01/03/2010 11:33 AM, hadley wickham wrote:
> Suppose X is a dataframe or a matrix. What would you expect to get from
> X[1]? What about as.vector(X), or as.numeric(X)?
>
> The point is that a dataframe is a list, and a matrix isn't. If users don't
> understand that, then they'll be confused somewhere. Making matrices more
> list-like in one respect will just move the confusion elsewhere. The
> solution is to understand the difference.

What I find more confusing is the behaviour of $ with vectors. In my mind x$a is a shortcut for writing x[["a"]], but:

And I still remember being surprised by the x[["a"]] behaviour!

Duncan Murdoch

> x <- list(a = 1)
> x$a
[1] 1
> x <- c(a = 1)
> x$a
Error in x$a : $ operator is invalid for atomic vectors
> x[["a"]]
[1] 1

Hadley
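The session above can be rerun as a script without stopping at the error. The sketch below wraps the failing call in 'tryCatch' and adds one related detail that the thread does not mention: on a list, '$' allows partial matching of names while '[[' is exact by default, so the two are not quite interchangeable. The variable names are made up for the example.

```r
lst <- list(alpha = 1)
vec <- c(alpha = 1)

from_list_dollar <- lst$alpha        # `$` is defined for lists
from_vec_bracket <- vec[["alpha"]]   # `[[` accepts a name on an atomic vector

# `$` on an atomic vector is an error; keep the message instead of stopping.
vec_dollar <- tryCatch(vec$alpha, error = function(e) conditionMessage(e))

# Partial matching: `$` finds "alpha" from "al", `[[` does not.
partial_dollar  <- lst$al
partial_bracket <- lst[["al"]]
```

Writing 'lst[["al", exact = FALSE]]' gives the same partial match as '$'.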
Re: [R] two questions for R beginners
I thought duck typing was about type independence? I could feed the bird object bread() or carrots(), or any other method, and that's okay as long as the bird doesn't die. And since ducks don't like carrots [at least, afaik]: Quaaack! ;-)

Albert-Jan

~~ In the face of ambiguity, refuse the temptation to guess. ~~

--- On Mon, 3/1/10, Patrick Burns wrote:
From: Patrick Burns
Subject: Re: [R] two questions for R beginners
To: r-help@r-project.org
Date: Monday, March 1, 2010, 5:08 PM

> If it looks like a duck and quacks like a duck, you ought
> to treat it like a duck.
>
> That is, use two subscripts:
>
>     x[i, j]
>
> If you are an ornithologist, then you will know more precisely
> what can be done.
>
> Pat
>
> On 01/03/2010 14:19, John Sorkin wrote:
> > If it looks like a duck and quacks like a duck, it ought to behave like a
> > duck.
> >
> > To the user a matrix and a dataframe look alike . . . except a dataframe can
> > hold non-numeric values. Thus to the users, a matrix looks like a special
> > case of a DF, or perhaps conversely. If you can address elements of one
> > structure using a given syntax, you should be able to address elements of the
> > other structure using the same syntax. To do otherwise leads to confusion and
> > is counter intuitive.
> > John
>
> <snip>
Re: [R] two questions for R beginners
> Suppose X is a dataframe or a matrix. What would you expect to get from
> X[1]? What about as.vector(X), or as.numeric(X)?
>
> The point is that a dataframe is a list, and a matrix isn't. If users don't
> understand that, then they'll be confused somewhere. Making matrices more
> list-like in one respect will just move the confusion elsewhere. The
> solution is to understand the difference.

What I find more confusing is the behaviour of $ with vectors. In my mind x$a is a shortcut for writing x[["a"]], but:

    > x <- list(a = 1)
    > x$a
    [1] 1
    > x <- c(a = 1)
    > x$a
    Error in x$a : $ operator is invalid for atomic vectors
    > x[["a"]]
    [1] 1

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
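For anyone who wants to try the `$` versus `[[` difference without hitting an error at the prompt, here is a minimal base-R sketch. The object names (`as_list`, `as_vector`, `dollar_result`) are made up for illustration and are not from the thread.

```r
# The same value stored once in a list and once in a named atomic vector.
as_list   <- list(a = 1)
as_vector <- c(a = 1)

# `$` extracts a named component from a list ...
from_list <- as_list$a

# ... but signals an error on an atomic vector; tryCatch() turns the
# error into a marker string so the script keeps running.
dollar_result <- tryCatch(as_vector$a, error = function(e) "error")

# `[[` with a name works on both kinds of object.
from_list_brackets   <- as_list[["a"]]
from_vector_brackets <- as_vector[["a"]]

print(dollar_result)
print(from_vector_brackets)
```

One further difference worth knowing: on lists `$` allows partial matching of names, whereas `[[` matches exactly by default, so the two are close but not perfectly interchangeable.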
Re: [R] two questions for R beginners
If it looks like a duck and quacks like a duck, you ought to treat it like a duck.

That is, use two subscripts:

    x[i, j]

If you are an ornithologist, then you will know more precisely what can be done.

Pat

On 01/03/2010 14:19, John Sorkin wrote:
> If it looks like a duck and quacks like a duck, it ought to behave like a duck.
>
> To the user a matrix and a dataframe look alike . . . except a dataframe can
> hold non-numeric values. Thus to the users, a matrix looks like a special
> case of a DF, or perhaps conversely. If you can address elements of one
> structure using a given syntax, you should be able to address elements of the
> other structure using the same syntax. To do otherwise leads to confusion and
> is counter intuitive.
>
> John

<snip>
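To make the "use two subscripts" advice concrete, here is a small sketch using the iris columns that appear elsewhere in this thread. The names `d`, `m`, `col_d` and `col_m` are chosen for illustration only.

```r
d <- head(iris[1:4])   # a data frame with four numeric columns
m <- as.matrix(d)      # a numeric matrix with the same column names

# The same two-subscript syntax works on both objects.
cell_d <- d[2, "Sepal.Width"]
cell_m <- m[2, "Sepal.Width"]

# Extracting a whole column by name also works on both.
col_d <- d[, "Sepal.Width"]
col_m <- m[, "Sepal.Width"]

print(cell_d)
print(cell_m)
```

Code written with x[i, j] does not need to know which of the two kinds of object it was given, which is presumably the point of treating both like a duck.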
Re: [R] two questions for R beginners
On Mon, 01 Mar 2010 14:50:57 - (GMT) ted.hard...@manchester.ac.uk wrote:
> as.character(pi)
> # [1] "3.14159265358979"
>
> That raises a few questions about "expectations" too!

Expectations can indeed be dangerous. I have been bitten by this one:

    as.numeric(as.character(pi))

It works fine in the US, but not in Europe. :)

Hint: Try options(OutDec=",")

--
Karl Ove Hufthammer
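A hedged sketch of the round-trip problem hinted at above. Note the swap: it uses format() rather than as.character(), because format() takes its decimal mark from the OutDec option, whereas how as.character() treats OutDec may depend on the R version (that part is an assumption on my side, so check your own installation). The variable names are illustrative.

```r
# Switch the output decimal mark to a comma, remembering the old setting.
old <- options(OutDec = ",")

# format() honours OutDec, so the text now contains a decimal comma.
as_text <- format(pi)

# as.numeric() only parses "." as the decimal mark, so the round trip
# fails with NA (the coercion warning is suppressed here).
round_trip <- suppressWarnings(as.numeric(as_text))

# Restore the previous setting so later code is not affected.
options(old)

print(as_text)
print(round_trip)
```

The general lesson is the same as in the post: converting numbers to text and back is only safe when the text representation does not depend on locale-style output options.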
Re: [R] two questions for R beginners
On Mon, 01 Mar 2010 12:25:20 - (GMT) ted.hard...@manchester.ac.uk wrote:
> > A similar type of overloading is used in the 'sp' class functions,
> > where you can basically treat a 'SpatialPointsDataFrame', a
> > 'SpatialLinesDataFrame' or a 'SpatialPolygonsDataFrame' as a data
> > frame, with '$colname' indexing and '[' subsetting, even though the
> > *internals* of the objects have a completely different (and very
> > complex) structure.
> > But as a convenience to the user, you don't need to bother with the
> > internals, and can handle the object *as if* it were a data frame. It's
> > a very comfortable way of working.
>
> I'm not sure that "SpatialPointsDataFrame" is a dataframe (despite
> the name)! Is it not simply a list? In which case, using "$" is
> what you have to do to get at its components.

That it's not a data frame is the point. :-) And it is not simply a list; it's an S4 object with the data (frame) stored in a 'data' slot, and '$' overloaded so you can use it *as if* it was a data frame.

Example:

    library(sp)
    example("SpatialPolygonsDataFrame-class")

    # Internal structure (warning: not pretty!)
    str(ex_1.7$x)

    # Extracting columns from the data frame
    ex_1.7$z

    # Both 'nrow' and '[' are overloaded, so you can use '['
    # for normal subsetting. For example, to plot 10 random
    # polygons, you can type
    ex.sub=ex_1.7[sample(nrow(ex_1.7), 10), ]
    plot(ex.sub)

In most cases you don't have to worry about how everything is stored internally; things just work like you expect them to.

--
Karl Ove Hufthammer
Re: [R] two questions for R beginners
On Mon, Mar 1, 2010 at 4:02 PM, Karl Ove Hufthammer wrote:
> On Mon, 01 Mar 2010 09:09:11 -0500 Duncan Murdoch wrote:
>> >> The reason for the difference is that data.frames are lists organized
>> >> into columns (so the $ handling comes from the list, where it means
>> >> "extract the component") whereas a matrix is a single vector displayed
>> >> in columns.
>> >
>> > Sure, I know that. But is there a reason why the '$' can't be
>> > overloaded to handle the extraction, as a *convenience* to the user?
>>
>> See the second paragraph of my response.
>
> OK. So I take it that there are no *technical* reasons why '$' can't be
> made to work for matrices and named vectors? I tried redefining it for
> matrices with
>
>   `$.matrix`=function(x, name) ... something ...
>
> but I still get an error message when trying to use it.
>
> Of course I agree that 'the idea of a list is so fundamental to R that
> it needs to be something learned pretty early', but is there any harm in
> slightly 'blur[ing] the distinction between dataframes and matrices', as
> a convenience to the user? Or, in other words, what does one *gain* by
> having '$' on named matrices and vectors give a confusing error message
> instead of the expected results? Distinction for distinction's own
> sake is of little use.
>
> In case anyone is wondering about the vector case (of which matrices are
> of course only a special case), here is an example:
>
>> d=iris[,1:4]
>> d1=head(d,1)
>> d2=mean(d)
>>
>> d1
>   Sepal.Length Sepal.Width Petal.Length Petal.Width
> 1          5.1         3.5          1.4         0.2
>> d2
> Sepal.Length  Sepal.Width Petal.Length  Petal.Width
>     5.843333     3.057333     3.758000     1.199333
>>
>> d1$Sepal.Width
> [1] 3.5
>> d2$Sepal.Width
> Error in d2$Sepal.Width : $ operator is invalid for atomic vectors
>
> --
> Karl Ove Hufthammer

As a technical exercise, I wrote the following function:

    '%W%' <- function(e1, e2) e1[, which(colnames(e1) %in% e2)]
    temp <- matrix(1:6, nrow=2, dimnames=list(a=1:2, b=c("a","b","c")))
    temp %W% "b"

I assume that the reason you can't use $.matrix is that $ is a primitive function and doesn't use the UseMethod function.

/Gustaf

--
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address: Essingetorget 40, 112 66 Stockholm, SE
skype: gustaf_rydevik
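A small refinement of the dispatch question, as a sketch rather than a recommendation. As documented under ?InternalMethods, `$` is an internal generic and only dispatches S3 methods for objects that carry an explicit class attribute; a plain matrix has only an implicit class, which would explain why a `$.matrix` method is never called. The class name "named_matrix" and the method below are hypothetical, invented for this example.

```r
m <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("a", "b", "c")))

# Hypothetical S3 method: extract a column by name.
# unclass() drops the class attribute (dim and dimnames stay), so the
# inner `[` is ordinary matrix indexing.
`$.named_matrix` <- function(x, name) unclass(x)[, name]

# On the plain matrix no method is looked up, so `$` fails as usual.
plain <- tryCatch(m$b, error = function(e) "error")

# After setting an explicit class attribute, `$` dispatches to the method.
class(m) <- "named_matrix"
with_class <- m$b

print(plain)
print(with_class)
```

Whether wrapping matrices in a class just to get `$` is a good idea is a separate matter; m[, "b"] already works without any of this.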
Re: [R] two questions for R beginners
> One of the things about R which many (and that certainly includes
> me) have to find out the hard way is that you have to *learn*
> what to expect! You can't just import it from prior experience in
> other contexts. So, by the time you have learned that a matrix
> is such that all its elements must have the same type, whereas
> the components of a list (or as special case the columns of a
> dataframe) can be of different types, you expect the first result
> (your "cbind(1:2,letters[1:2])"): R can coerce the numerical
> elements to character type, but letters have too much character
> to allow themselves be coerced into numerical.

cbind is not the best example, because it has rather complex behaviour:

    cbind(1:2, letters[1:2])
    cbind(1:2, letters[1:2], data.frame(1:2))
    cbind(cbind(1:2, letters[1:2]), data.frame(1:2))

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
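To see what "rather complex behaviour" means in practice, one can inspect the class of each result. This is only a sketch; the names `a`, `b` and `c2` are mine, and the exact column classes of the data frame results may differ between R versions (character versus factor).

```r
# Only atomic vectors: the result is a matrix, coerced to one common type.
a <- cbind(1:2, letters[1:2])

# As soon as one argument is a data frame, the data frame method is used
# and each vector keeps its own type as a separate column.
b <- cbind(1:2, letters[1:2], data.frame(1:2))

# Here the inner cbind() has already produced a character matrix, so the
# numbers in its first column are no longer numeric in the final result.
c2 <- cbind(cbind(1:2, letters[1:2]), data.frame(1:2))

print(class(a))
print(sapply(b, class))
print(sapply(c2, class))
```

So the order in which things are bound together matters, not just what is being bound.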
Re: [R] two questions for R beginners
On Mon, 01 Mar 2010 09:09:11 -0500 Duncan Murdoch wrote:
> >> The reason for the difference is that data.frames are lists organized
> >> into columns (so the $ handling comes from the list, where it means
> >> "extract the component") whereas a matrix is a single vector displayed
> >> in columns.
> >
> > Sure, I know that. But is there a reason why the '$' can't be
> > overloaded to handle the extraction, as a *convenience* to the user?
>
> See the second paragraph of my response.

OK. So I take it that there are no *technical* reasons why '$' can't be made to work for matrices and named vectors? I tried redefining it for matrices with

    `$.matrix`=function(x, name) ... something ...

but I still get an error message when trying to use it.

Of course I agree that 'the idea of a list is so fundamental to R that it needs to be something learned pretty early', but is there any harm in slightly 'blur[ing] the distinction between dataframes and matrices', as a convenience to the user? Or, in other words, what does one *gain* by having '$' on named matrices and vectors give a confusing error message instead of the expected results? Distinction for distinction's own sake is of little use.

In case anyone is wondering about the vector case (of which matrices are of course only a special case), here is an example:

    > d=iris[,1:4]
    > d1=head(d,1)
    > d2=mean(d)
    >
    > d1
      Sepal.Length Sepal.Width Petal.Length Petal.Width
    1          5.1         3.5          1.4         0.2
    > d2
    Sepal.Length  Sepal.Width Petal.Length  Petal.Width
        5.843333     3.057333     3.758000     1.199333
    >
    > d1$Sepal.Width
    [1] 3.5
    > d2$Sepal.Width
    Error in d2$Sepal.Width : $ operator is invalid for atomic vectors

--
Karl Ove Hufthammer
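A version of the example above that should run on current R, offered as a sketch. One assumption to flag: the post calls mean() on a data frame, which, as far as I know, no longer returns column means in R 3.0.0 and later, so colMeans() is swapped in here. The variable names follow the post.

```r
d  <- iris[, 1:4]
d1 <- head(d, 1)    # a one-row data frame, i.e. still a list of columns
d2 <- colMeans(d)   # a named numeric (atomic) vector

# `$` works on the data frame because it is a list underneath.
from_df <- d1$Sepal.Width

# For the named vector, use `[[` (or `[`) with the name instead.
from_vec <- d2[["Sepal.Width"]]

# If `$` access is really wanted, convert the vector to a list first.
from_vec_as_list <- as.list(d2)$Sepal.Width

print(from_df)
print(from_vec)
```

The two objects print almost identically, which is exactly why the error message comes as a surprise.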
Re: [R] two questions for R beginners
On 01/03/2010 9:19 AM, John Sorkin wrote:
> If it looks like a duck and quacks like a duck, it ought to behave like a duck.
>
> To the user a matrix and a dataframe look alike . . . except a dataframe can
> hold non-numeric values. Thus to the users, a matrix looks like a special
> case of a DF, or perhaps conversely. If you can address elements of one
> structure using a given syntax, you should be able to address elements of the
> other structure using the same syntax. To do otherwise leads to confusion and
> is counter intuitive.

Suppose X is a dataframe or a matrix. What would you expect to get from X[1]? What about as.vector(X), or as.numeric(X)?

The point is that a dataframe is a list, and a matrix isn't. If users don't understand that, then they'll be confused somewhere. Making matrices more list-like in one respect will just move the confusion elsewhere. The solution is to understand the difference.

Duncan Murdoch

<snip>
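The questions Duncan asks can be answered by trying them on a small case. A minimal sketch follows; `X_df` and `X_mat` are illustrative names, and as.vector() is left out because I am not certain how it treats data frames across R versions.

```r
X_df  <- data.frame(a = 1:3, b = 4:6)
X_mat <- as.matrix(X_df)

# One subscript on a data frame selects list components (columns) ...
first_df <- X_df[1]

# ... but on a matrix it selects an element of the underlying vector.
first_mat <- X_mat[1]

# length() counts columns for the data frame, elements for the matrix.
len_df  <- length(X_df)
len_mat <- length(X_mat)

# as.numeric() flattens the matrix column by column.
flat <- as.numeric(X_mat)

print(first_df)
print(first_mat)
print(flat)
```

Same-looking printout, different answers to the same question, which is the confusion being described.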
Re: [R] two questions for R beginners
On 01-Mar-10 13:57:08, Petr PIKAL wrote:
> Hi
> r-help-boun...@r-project.org wrote on 01.03.2010 13:03:24:
> <snip>
>> > I understand that 2 dimensional rectangular matrix looks quite
>> > similar to data frame however it is only a vector with dimensions.
>> > As such it can have items of only one type (numeric, character,
>> > ...).
>> > And you can easily change dimensions of matrix.
>> >
>> > matrix<-1:12
>> > dim(matrix) <- c(2,6)
>> > matrix
>> > dim(matrix) <- c(2,2,3)
>> > matrix
>> > dim(matrix) <-NULL
>> > matrix
>> >
>> > So rectangular structure of printed matrix is a kind of coincidence
>> > only, whereas rectangular structure of data frame is its main
>> > feature.
>> >
>> > Regards
>> > Petr
>> >>
>> >> --
>> >> Karl Ove Hufthammer
>>
>> Petr, I think that could be confusing! The way I see it is that
>> a matrix is a special case of an array, whose "dimension" attribute
>> is of length 2 (number of "rows", number of "columns"); and "row"
>> and "column" refer to the rectangular display which you see when
>> R prints to matrix. And this, of course, derives directly from
>> the historic rectangular view of a matrix when written down.
>>
>> When you went from "dim(matrix)<-c(2,6)" to "dim(matrix)<-c(2,2,3)"
>> you stripped it of its special title of "matrix" and cast it out
>> into the motley mob of arrays (some of whom are matrices, but
>> "matrix" no longer is).
>>
>> So the "rectangular structure of printed matrix" is not a coincidence,
>> but is its main feature!
>
> Ok. Point taken. However I feel that possibility to manipulate
> matrix/array dimensions by simple changing them as I showed above
> together with perceiving matrix as a **vector with dimensions**
> prevented me especially in early days from using matrices instead
> of data frames and vice versa.
>
> Consider cbind and rbind confusing results for vectors with unequal
> mode.
> Far too often we can see something like that
>
>> cbind(1:2,letters[1:2])
>      [,1] [,2]
> [1,] "1"  "a"
> [2,] "2"  "b"
>
> instead of
>
>> data.frame(1:2,letters[1:2])
>   X1.2 letters.1.2.
> 1    1            a
> 2    2            b
>
> and then a question why does not the result behave as expected.
> Each type of object has some features which is good for some
> type of manipulation/analysis/plotting but quite detrimental
> for others.
>
> Regards
> Petr
>
>> [the rest from my previous reply stripped]

Well, it depends what one means by "as expected"!

One of the things about R which many (and that certainly includes me) have to find out the hard way is that you have to *learn* what to expect! You can't just import it from prior experience in other contexts. So, by the time you have learned that a matrix is such that all its elements must have the same type, whereas the components of a list (or as special case the columns of a dataframe) can be of different types, you expect the first result (your "cbind(1:2,letters[1:2])"): R can coerce the numerical elements to character type, but letters have too much character to allow themselves be coerced into numerical.

What can be confusing is that the on-screen output of your "data.frame(1:2,letters[1:2])" does not exhibit the "" quotes which identify character type, so that what appears on screen does not inform the user of what is going on. In particular, one cannot learn from the display that the 1 and 2 are numbers and not characters. Since a and b are displayed without "", and so are the 1 and 2, the 1 and 2 could be either, until you check it out by other means.

I think it is not that one should object to the behaviours of the different types of objects. What really matters is that one needs to learn what they are and why they behave as they do, and not be misled by appearances. And this includes the sometimes unpredictable behaviours of the print methods.

In the context of this thread, I think the issues that have been raised by the current discussion on matrices and dataframes should receive sympathetic attention in Patrick Burns' project.

To close:

    cbind(c(pi,pi),letters[1:2])
    #      [,1]               [,2]
    # [1,] "3.14159265358979" "a"
    # [2,] "3.14159265358979" "b"

    data.frame(c(pi,pi),letters[1:2])
    #   c.pi..pi. letters.1.2.
    # 1  3.141593            a
    # 2  3.141593            b

    pi
    # [1] 3.141593

    as.character(pi)
    # [1] "3.14159265358979"

That raises a few questions about "expectations" too!

Ted.

E-Mail: (Ted Harding)
Fax-to-email: +44 (0)870 094 0861
Date: 01-Mar-10 Time: 14:50:50
-- XFMail --
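Since the printed output does not say which values are numbers and which are characters, the "other means" of checking are worth spelling out. A short sketch using only base R; the column names `x` and `y` are added for readability and are not in the original post.

```r
m  <- cbind(c(pi, pi), letters[1:2])
df <- data.frame(x = c(pi, pi), y = letters[1:2])

# A matrix has a single type for all of its elements.
matrix_type <- typeof(m)

# A data frame has one class per column; sapply() lists them all.
column_classes <- sapply(df, class)

print(matrix_type)
print(column_classes)

# str() gives a compact summary that includes each column's type.
str(df)
```

Calling str() or sapply(x, class) before trusting a printout is a cheap habit that avoids most of the surprises discussed here.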
Re: [R] two questions for R beginners
If it looks like a duck and quacks like a duck, it ought to behave like a duck.

To the user a matrix and a dataframe look alike . . . except a dataframe can hold non-numeric values. Thus to the users, a matrix looks like a special case of a DF, or perhaps conversely. If you can address elements of one structure using a given syntax, you should be able to address elements of the other structure using the same syntax. To do otherwise leads to confusion and is counter intuitive.

John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

>>> Petr PIKAL 3/1/2010 8:57 AM >>>

<snip>
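Petr's dim() experiment from earlier in this thread can be annotated with what each step does to the object. This is a sketch in base R; it uses `x` instead of `matrix` as the variable name, since `matrix` is also the name of a base function.

```r
x <- 1:12

# Two dimensions: R now treats the vector as a matrix.
dim(x) <- c(2, 6)
is_matrix_2d <- is.matrix(x)

# Three dimensions: still an array, but no longer a matrix.
dim(x) <- c(2, 2, 3)
is_matrix_3d <- is.matrix(x)
is_array_3d  <- is.array(x)

# Removing the attribute gives back the plain vector, data untouched.
dim(x) <- NULL
back_to_vector <- identical(x, 1:12)

print(c(is_matrix_2d, is_matrix_3d, is_array_3d, back_to_vector))
```

Nothing comparable can be done to a data frame, because its rectangular shape comes from being a list of equal-length columns rather than from an attribute on one vector.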
Re: [R] two questions for R beginners
Karl Ove Hufthammer wrote:
> On Mon, 01 Mar 2010 06:37:30 -0500 Duncan Murdoch wrote:
> > > Some functions output matrices where you would expect them to output
> > > data frames, and then this problem occurs. (Is there a reason why '$'
> > > could/should not be made to 'work' on matrices too?)
> >
> > The reason for the difference is that data.frames are lists organized
> > into columns (so the $ handling comes from the list, where it means
> > "extract the component") whereas a matrix is a single vector displayed
> > in columns.
>
> Sure, I know that. But is there a reason why the '$' can't be
> overloaded to handle the extraction, as a *convenience* to the user?

See the second paragraph of my response.

Duncan Murdoch

> After all, it *is* possible to extract columns by name from matrices too
> (e.g., using d[,"Sepal.Width"]).
>
> A similar type of overloading is used in the 'sp' class functions,
> where you can basically treat a 'SpatialPointsDataFrame', a
> 'SpatialLinesDataFrame' or a 'SpatialPolygonsDataFrame' as a data frame,
> with '$colname' indexing and '[' subsetting, even though the *internals*
> of the objects have a completely different (and very complex) structure.
> But as a convenience to the user, you don't need to bother with the
> internals, and can handle the object *as if* it were a data frame. It's
> a very comfortable way of working.
Re: [R] two questions for R beginners
Hi

r-help-boun...@r-project.org wrote on 01.03.2010 13:03:24:

<snip>

> > I understand that 2 dimensional rectangular matrix looks quite
> > similar to data frame however it is only a vector with dimensions.
> > As such it can have items of only one type (numeric, character, ...).
> > And you can easily change dimensions of matrix.
> >
> > matrix<-1:12
> > dim(matrix) <- c(2,6)
> > matrix
> > dim(matrix) <- c(2,2,3)
> > matrix
> > dim(matrix) <-NULL
> > matrix
> >
> > So rectangular structure of printed matrix is a kind of coincidence
> > only, whereas rectangular structure of data frame is its main feature.
> >
> > Regards
> > Petr
> >>
> >> --
> >> Karl Ove Hufthammer
>
> Petr, I think that could be confusing! The way I see it is that
> a matrix is a special case of an array, whose "dimension" attribute
> is of length 2 (number of "rows", number of "columns"); and "row"
> and "column" refer to the rectangular display which you see when
> R prints to matrix. And this, of course, derives directly from
> the historic rectangular view of a matrix when written down.
>
> When you went from "dim(matrix)<-c(2,6)" to "dim(matrix)<-c(2,2,3)"
> you stripped it of its special title of "matrix" and cast it out
> into the motley mob of arrays (some of whom are matrices, but
> "matrix" no longer is).
>
> So the "rectangular structure of printed matrix" is not a coincidence,
> but is its main feature!

Ok. Point taken. However, I feel that the possibility of manipulating matrix/array dimensions by simply changing them, as I showed above, together with perceiving a matrix as a **vector with dimensions**, prevented me, especially in the early days, from using matrices instead of data frames and vice versa.

Consider the confusing results of cbind and rbind for vectors of unequal mode. Far too often we see something like

> cbind(1:2,letters[1:2])
     [,1] [,2]
[1,] "1"  "a"
[2,] "2"  "b"

instead of

> data.frame(1:2,letters[1:2])
  X1.2 letters.1.2.
1    1            a
2    2            b

and then a question about why the result does not behave as expected.

Each type of object has some features which are good for some types of manipulation/analysis/plotting but quite detrimental for others.

Regards
Petr

>
> To come back to Karl's query about why "$" works for a dataframe
> but not for a matrix, note that "$" is the extractor for getting
> a named component of a list. So, Karl, when you did
>
> d=head(iris[1:4])
>
> you created a dataframe:
>
> str(d)
> # 'data.frame': 6 obs. of 4 variables:
> # $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4
> # $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9
> # $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7
> # $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4
>
> (with named components "Sepal.Length", ... , "Petal.Width"),
> and a dataframe is a special case of a general list. In a
> general list, the separate components can each be anything.
> In a dataframe, each component is a vector; the different
> vectors may be of different types (logical, numeric, ... )
> but of course the elements of any single vector must be
> of the same type; and, in a dataframe, all the vectors must
> have the same length (otherwise it is a general list, not
> a dataframe).
>
> So, when you print a dataframe, R chooses to display it
> as a rectangular structure. On the other hand, when you
> print a general list, R displays it quite differently:
>
> d
> #   Sepal.Length Sepal.Width Petal.Length Petal.Width
> # 1          5.1         3.5          1.4         0.2
> # 2          4.9         3.0          1.4         0.2
> # 3          4.7         3.2          1.3         0.2
> # 4          4.6         3.1          1.5         0.2
> # 5          5.0         3.6          1.4         0.2
> # 6          5.4         3.9          1.7         0.4
>
> d3 <- list(C1=c(1.1,1.2,1.3), C2=c(2.1,2.2,2.3,2.4))
> d3
> # $C1
> # [1] 1.1 1.2 1.3
> # $C2
> # [1] 2.1 2.2 2.3 2.4
>
> Notice the similarity (though not identity) between the print
> of d3 and the output of str(d). There is a bit more hard-wired
> stuff built into a dataframe which makes it more than simply
> a "list with all components vectors of equal length).
However, > one could also say that "the rectangular structure is its > main feature". > > As to why "$" will not work on matrices: a matrix, as Petr > points out, is a vector with a "dimensions" attribute which > has length 2 (as opposed to a general array where the length > of the dimensions attribute could be anything). Hence it is > not a list of named components in the sense of "list". > > Hence "$" will not work with a matrix, since "$" will not > be able to find any list-components. which is basically what > the error message > > d2$Sepal.Width > # Error in d2$Sepal.Width : $ operator is invalid for atomic vectors > > is telling you: d2 is an atomic vector with a length-2 dimensions > attribute. It has no list-type components for "$" to get its
Re: [R] two questions for R beginners
On 01-Mar-10 12:07:52, Karl Ove Hufthammer wrote: > On Mon, 01 Mar 2010 06:37:30 -0500 Duncan Murdoch > > wrote: >> > Some functions output matrices where you would expect them to output >> > data frames, and then this problem occurs. (Is there a reason why >> > '$' >> > could/should not be made to 'work' on matrices too?) >> > >> The reason for the difference is that data.frames are lists organized >> into columns (so the $ handling comes from the list, where it means >> "extract the component") whereas a matrix is a single vector displayed >> in columns. > > Sure, I know that. But is there are reason why the '$' can't be > overloaded to handle the extraction, as a *convenience* to the user? > After all, it *is* possible to extract columns by name from matrices > too (e.g., using d[,"Sepal.Width"]). > > A similar type of overloading is used in the 'sp' class functions, > where you can basically treat a 'SpatialPointsDataFrame', a > 'SpatialLinesDataFrame' or a 'SpatialPolygonsDataFrame' as a data > frame, > with '$colname' indexing and '[' subsetting, even though the > *internals* > of the objects have a completely different (and very complex) > structure. > But as a convenience to the user, you don't need to bother with the > internals, and can handle the object *as if* it were a data frame. It's > a very comfortable way of working. > > -- > Karl Ove Hufthammer I'm not sure that "SpatialPointsDataFrame" is a dataframe (despite the name)! Is it not simply a list? In which case, using "$" is what you have to do to get at its components. Ted. E-Mail: (Ted Harding) Fax-to-email: +44 (0)870 094 0861 Date: 01-Mar-10 Time: 12:25:17 -- XFMail -- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] two questions for R beginners
Jack Siegrist wrote: My biggest impediment, as a scientist without previous programming experience, is that the R help is not beginner-friendly. I think it is probably great for experienced programmers and for the people who helped to create the software, to help them remember what they did, but I think it is very difficult for a newcomer without a strong programming background to learn about a new function or to discover the name of a function that you are pretty sure should already exist. Maybe this wouldn’t matter for most programming languages, but as free statistics software R is obviously going to attract many scientists who want to get an analysis done and have varying levels of experience with programming. Hi Jack, A problem more or less is that the R community consists primarily of volunteers. People who answer questions to the help list in their spare time or during company time. This also holds for many of the online material. A program like Mathematica has a company providing the online material, they hire people to do this work. I don't use this as an excuse for R, but it might explain why the R community is what it is. In reply to the 'bashing' of new users. I agree that sometimes the experts answering the questions can be blunt, but most often it is in response to questions that are very hard to answer. As I said earlier in this mail thread, asking the right question already involves some of the knowledge to answer the question. So to get good, informative responses a user needs some level already. I do want to point out that there is a posting guide for the mailing list that gives a quite detailed instructions, like give the exact error (don't just say, R crashes). Provide traceback() and sessionInfo() etc, etc. And a lot of posters do not adhere to these rules. cheers, Paul I found it much easier to learn how to use Mathematica, using only the online help. 
With R I had to buy several books to get a handle on it, which is fine, but even the books that I have found to be most useful tend to be didactically lacking—either too cursory or mired in unexplained programming jargon. They are OK just not great. What I think would be very helpful is an introduction to programming using R, preferably a big thick college textbook that takes at least a semester to go through, which should be a prerequisite for going through the Introduction to R available on CRAN. Also to do any analysis on real data you have to use the apply family of functions to perform different functions by groups. A long introduction to these functions, with lots of comparisons and contrasts between them would be very helpful. A few random examples concerning the R help: In my version of R (2.7.0 on Windows XP) typing ?+ doesn’t do anything, but then if you type in the next line + ?sum you get the “Arithmetic Operators” help page. If you had just typed ?sum in the first place you get the “Sum of Vector Elements” help page. Most examples in the R help pages use way to many other functions to be useful to a beginner. If an example uses 10 other functions besides the one being described, chances are a beginner won’t know what one of them does, which can set off a chain of having to look up other irrelevant functions. Some function names in the base package are goofy, such as “rowsum” which is used to “compute column sums across rows”, not to be confused with “rowSums” which computes row sums. -- Drs. Paul Hiemstra Department of Physical Geography Faculty of Geosciences University of Utrecht Heidelberglaan 2 P.O. Box 80.115 3508 TC Utrecht Phone: +3130 274 3113 Mon-Tue Phone: +3130 253 5773 Wed-Fri http://intamap.geo.uu.nl/~paul __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
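A minimal sketch of the rowsum/rowSums distinction Jack describes, plus one by-group summary of the kind the apply family is used for. The matrix m and the grouping vector grp are made up for illustration and do not come from the thread.

```r
# Hypothetical 3 x 2 matrix: column "a" holds 1,2,3 and column "b" holds 4,5,6
m <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("a", "b")))

# rowSums(): one total per row (1+4, 2+5, 3+6)
row_totals <- rowSums(m)
print(row_totals)

# rowsum(): column sums computed within groups of rows.
# Rows 1 and 3 belong to group "x", row 2 to group "y".
grp <- c("x", "y", "x")
group_totals <- rowsum(m, group = grp)
print(group_totals)

# tapply(): apply a function to a vector split by a grouping factor
by_species <- tapply(iris$Sepal.Length, iris$Species, mean)
print(by_species)
```

The first call collapses columns, the second collapses rows by group, which is why the two similar names give results of different shapes.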
Re: [R] two questions for R beginners
On Mon, 01 Mar 2010 06:37:30 -0500 Duncan Murdoch wrote: > > Some functions output matrices where you would expect them to output > > data frames, and then this problem occurs. (Is there a reason why '$' > > could/should not be made to 'work' on matrices too?) > > > The reason for the difference is that data.frames are lists organized > into columns (so the $ handling comes from the list, where it means > "extract the component") whereas a matrix is a single vector displayed > in columns. Sure, I know that. But is there are reason why the '$' can't be overloaded to handle the extraction, as a *convenience* to the user? After all, it *is* possible to extract columns by name from matrices too (e.g., using d[,"Sepal.Width"]). A similar type of overloading is used in the 'sp' class functions, where you can basically treat a 'SpatialPointsDataFrame', a 'SpatialLinesDataFrame' or a 'SpatialPolygonsDataFrame' as a data frame, with '$colname' indexing and '[' subsetting, even though the *internals* of the objects have a completely different (and very complex) structure. But as a convenience to the user, you don't need to bother with the internals, and can handle the object *as if* it were a data frame. It's a very comfortable way of working. -- Karl Ove Hufthammer __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
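The kind of overloading Karl has in mind can be sketched at user level with an S3 method. This is only an illustration under stated assumptions: the class name "colmatrix" and the helper as_colmatrix() are hypothetical, and this is not how the 'sp' classes are implemented, nor a proposal for base R.

```r
# Hypothetical wrapper: tag a matrix with column names as class "colmatrix"
as_colmatrix <- function(m) {
  stopifnot(is.matrix(m), !is.null(colnames(m)))
  structure(m, class = "colmatrix")
}

# `$` is an internal generic, so an S3 method can be defined for our class.
# `name` arrives as a character string; no partial matching is attempted.
`$.colmatrix` <- function(x, name) {
  unclass(x)[, name]
}

d2 <- as_colmatrix(as.matrix(head(iris[1:4])))
sw <- d2$Sepal.Width   # now works, by dispatching to `$.colmatrix`
print(sw)
```

The price is the one Duncan points to elsewhere in the thread: the object now looks even more like a data frame while still being a single atomic vector underneath.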
Re: [R] two questions for R beginners
On 01-Mar-10 11:09:51, Petr PIKAL wrote: > Hi > r-help-boun...@r-project.org napsal dne 01.03.2010 11:26:40: >> On Mon, 1 Mar 2010 11:02:59 +0100 Karl Ove Hufthammer >> >> wrote: >> > > * What were your biggest misconceptions or >> > > stumbling blocks to getting up and running >> > > with R? >> > >> > Also I found it quite confusing that >> >> One more thing that still trips me up sometimes. '$' works >> on data frames but not on matrices (with dimnames/colnames). >> Even though the two objects *look* exactly the same, '$' on >> one of them works while '$' on the other gives a *very* >> confusing error message. Example: >> >> d=head(iris[1:4]) >> d2=as.matrix(d) >> >> d >> d2 >> >> d$Sepal.Width >> d2$Sepal.Width >> >> Some functions output matrices where you would expect them to >> output data frames, and then this problem occurs. (Is there a >> reason why '$' could/should not be made to 'work' on matrices too?) > > I understand that 2 dimensional rectangular matrix looks quite > similar to data frame however it is only a vector with dimensions. > As such it can have items of only one type (numeric, character, ...). > And you can easily change dimensions of matrix. > > matrix<-1:12 > dim(matrix) <- c(2,6) > matrix > dim(matrix) <- c(2,2,3) > matrix > dim(matrix) <-NULL > matrix > > So rectangular structure of printed matrix is a kind of coincidence > only, whereas rectangular structure of data frame is its main feature. > > Regards > Petr >> >> -- >> Karl Ove Hufthammer Petr, I think that could be confusing! The way I see it is that a matrix is a special case of an array, whose "dimension" attribute is of length 2 (number of "rows", number of "columns"); and "row" and "column" refer to the rectangular display which you see when R prints to matrix. And this, of course, derives directly from the historic rectangular view of a matrix when written down. 
When you went from "dim(matrix)<-c(2,6)" to "dim(matrix)<-c(2,2,3)" you stripped it of its special title of "matrix" and cast it out into the motley mob of arrays (some of whom are matrices, but "matrix" no longer is). So the "rectangular structure of printed matrix" is not a coincidence, but is its main feature! To come back to Karl's query about why "$" works for a dataframe but not for a matrix, note that "$" is the extractor for getting a named component of a list. So, Karl, when you did d=head(iris[1:4]) you created a dataframe: str(d) # 'data.frame': 6 obs. of 4 variables: # $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 # $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 # $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 # $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 (with named components "Sepal.Length", ... , "Petal.Width"), and a dataframe is a special case of a general list. In a general list, the separate components can each be anything. In a dataframe, each component is a vector; the different vectors may be of different types (logical, numeric, ... ) but of course the elements of any single vector must be of the same type; and, in a dataframe, all the vectors must have the same length (otherwise it is a general list, not a dataframe). So, when you print a dataframe, R chooses to display it as a rectangular structure. On the other hand, when you print a general list, R displays it quite differently: d # Sepal.Length Sepal.Width Petal.Length Petal.Width # 1 5.1 3.5 1.4 0.2 # 2 4.9 3.0 1.4 0.2 # 3 4.7 3.2 1.3 0.2 # 4 4.6 3.1 1.5 0.2 # 5 5.0 3.6 1.4 0.2 # 6 5.4 3.9 1.7 0.4 d3 <- list(C1=c(1.1,1.2,1.3), C2=c(2.1,2.2,2.3,2.4)) d3 # $C1 # [1] 1.1 1.2 1.3 # $C2 # [1] 2.1 2.2 2.3 2.4 Notice the similarity (though not identity) between the print of d3 and the output of str(d). There is a bit more hard-wired stuff built into a dataframe which makes it more than simply a "list with all components vectors of equal length). 
However, one could also say that "the rectangular structure is its main feature". As to why "$" will not work on matrices: a matrix, as Petr points out, is a vector with a "dimensions" attribute which has length 2 (as opposed to a general array where the length of the dimensions attribute could be anything). Hence it is not a list of named components in the sense of "list". Hence "$" will not work with a matrix, since "$" will not be able to find any list-components. which is basically what the error message d2$Sepal.Width # Error in d2$Sepal.Width : $ operator is invalid for atomic vectors is telling you: d2 is an atomic vector with a length-2 dimensions attribute. It has no list-type components for "$" to get its hands on. Ted. E-Mail: (Ted Harding) Fax-to-email: +44 (0)870 094 0861 Dat
Re: [R] two questions for R beginners
Karl Ove Hufthammer wrote: On Mon, 1 Mar 2010 11:02:59 +0100 Karl Ove Hufthammer wrote: * What were your biggest misconceptions or stumbling blocks to getting up and running with R? Also I found it quite confusing that One more thing that still trips me up sometimes. '$' works on data frames but not on matrices (with dimnames/colnames). Even though the two objects *look* exactly the same, '$' on one of them works while '$' on the other gives a *very* confusing error message. Example: d=head(iris[1:4]) d2=as.matrix(d) d d2 d$Sepal.Width d2$Sepal.Width Some functions output matrices where you would expect them to output data frames, and then this problem occurs. (Is there a reason why '$' could/should not be made to 'work' on matrices too?) The reason for the difference is that data.frames are lists organized into columns (so the $ handling comes from the list, where it means "extract the component") whereas a matrix is a single vector displayed in columns. Of course, the problem is that a beginner only knows that they both look the same. But I think the idea of a list is so fundamental to R that it needs to be something learned pretty early, so I'd rather not blur the distinction between dataframes and matrices. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
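A short sketch of the distinction Duncan draws, reusing Karl's d and d2. The variable names from_df, from_mat, via_df and res are introduced here for illustration only.

```r
d  <- head(iris[1:4])   # data frame: a list of equal-length columns
d2 <- as.matrix(d)      # matrix: one atomic vector with a dim attribute

print(is.list(d))       # a data frame is a list
print(is.list(d2))      # a matrix is not

# Extraction by name works on both, but with different operators
from_df  <- d$Sepal.Width          # list-style extraction
from_mat <- d2[, "Sepal.Width"]    # matrix-style indexing by column name

# Converting back gives access to `$` again
via_df <- as.data.frame(d2)$Sepal.Width

# `$` on the matrix is an error; capture it instead of stopping
res <- tryCatch(d2$Sepal.Width, error = function(e) e)
print(inherits(res, "error"))
```

When a function returns a matrix where a data frame was expected, either `[, "name"]` or as.data.frame() gets at the column.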
Re: [R] two questions for R beginners
Karl Ove Hufthammer wrote: On Fri, 26 Feb 2010 11:56:10 -0800 (PST) Jack Siegrist wrote: What I think would be very helpful is an introduction to programming using R Here you are: A First Course in Statistical Programming with R http://www.cambridge.org/uk/catalogue/catalogue.asp?isbn=9780521694247 Jack also asked for it to be "a big thick college textbook that takes at least a semester to go through, which should be a prerequisite for going through the Introduction to R available on CRAN". That book (of which I am an author) is not big or thick. But it is aimed at an audience who don't have programming experience. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] two questions for R beginners
Hi r-help-boun...@r-project.org napsal dne 01.03.2010 11:26:40: > On Mon, 1 Mar 2010 11:02:59 +0100 Karl Ove Hufthammer > wrote: > > > * What were your biggest misconceptions or > > > stumbling blocks to getting up and running > > > with R? > > > > Also I found it quite confusing that > > One more thing that still trips me up sometimes. '$' works on data > frames but not on matrices (with dimnames/colnames). Even though the two > objects *look* exactly the same, '$' on one of them works while '$' on > the other gives a *very* confusing error message. Example: > > d=head(iris[1:4]) > d2=as.matrix(d) > > d > d2 > > d$Sepal.Width > d2$Sepal.Width > > Some functions output matrices where you would expect them to output > data frames, and then this problem occurs. (Is there a reason why '$' > could/should not be made to 'work' on matrices too?) I understand that 2 dimensional rectangular matrix looks quite similar to data frame however it is only a vector with dimensions. As such it can have items of only one type (numeric, character, ...). And you can easily change dimensions of matrix. matrix<-1:12 dim(matrix) <- c(2,6) matrix dim(matrix) <- c(2,2,3) matrix dim(matrix) <-NULL matrix So rectangular structure of printed matrix is a kind of coincidence only, whereas rectangular structure of data frame is its main feature. Regards Petr > > -- > Karl Ove Hufthammer > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
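Petr's two points (a matrix is a vector with a dim attribute, and it holds only one type) can be checked directly. This is a minimal sketch; the names x, m and df are chosen here, with x used instead of "matrix" to avoid masking the function of that name.

```r
x <- 1:12

dim(x) <- c(2, 6)          # two dimensions: a matrix
was_matrix <- is.matrix(x)

dim(x) <- c(2, 2, 3)       # three dimensions: an array, no longer a matrix
still_matrix <- is.matrix(x)
now_array <- is.array(x)

dim(x) <- NULL             # no dimensions: a plain vector again
back_to_vector <- is.null(dim(x))

# Mixing types: cbind() builds a matrix, so everything is coerced to character
m <- cbind(1:2, letters[1:2])
print(mode(m))

# data.frame() keeps each column's own type
df <- data.frame(1:2, letters[1:2])
print(is.integer(df[[1]]))
```

The data never changes in the first part; only the dim attribute does, and with it how R treats and prints the object.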
Re: [R] two questions for R beginners
On Mon, 1 Mar 2010 11:02:59 +0100 Karl Ove Hufthammer wrote: > > * What were your biggest misconceptions or > > stumbling blocks to getting up and running > > with R? > > Also I found it quite confusing that One more thing that still trips me up sometimes. '$' works on data frames but not on matrices (with dimnames/colnames). Even though the two objects *look* exactly the same, '$' on one of them works while '$' on the other gives a *very* confusing error message. Example: d=head(iris[1:4]) d2=as.matrix(d) d d2 d$Sepal.Width d2$Sepal.Width Some functions output matrices where you would expect them to output data frames, and then this problem occurs. (Is there a reason why '$' could/should not be made to 'work' on matrices too?) -- Karl Ove Hufthammer __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] two questions for R beginners
On Fri, 26 Feb 2010 11:56:10 -0800 (PST) Jack Siegrist wrote: > What I think would be very helpful is an introduction to programming using > R Here you are: A First Course in Statistical Programming with R http://www.cambridge.org/uk/catalogue/catalogue.asp?isbn=9780521694247 -- Karl Ove Hufthammer __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] two questions for R beginners
On Thu, 25 Feb 2010 17:31:19 + Patrick Burns wrote:

> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?

I didn't have any major stumbling blocks, but even after years of using R I didn't have a clear concept of what exactly a vector, a list and a data frame were, and what the differences and similarities between them were (and stuff like why x[i] returns a different result than x[[i]]).

One thing that has tripped me up is reassigning the value of T or F and getting very strange results afterwards (I now use only TRUE and FALSE).

FAQ 7.31 and 7.22 have also been troublesome at times, especially 7.31 when used in 'for' loops.

Also, I found it quite confusing that ?ifelse works, but not ?if (you have to type ?"if").

Also, why ?plot didn't give me the information I was looking for but ?plot.default did was rather confusing. I still experience similar problems with other functions. Usually 'methods' helps, but some packages use S4 methods, which makes finding the correct help page quite challenging at times.

> * What documents helped you the most in this
> initial phase?

In the initial phase I found the Rtips (http://pj.freefaculty.org/R/Rtips.html) extremely useful.

For understanding the difference between the various data types in R, Phil Spector's wonderful book 'Data Manipulation with R' was a great help. When reading it I finally understood things I had been wondering about for years. I really like the book. It's short, crystal clear and immensely useful.

Another very useful document of a more advanced nature is the R Inferno. Best read after you've been using R for some time, though.

I'm over the initial phase now, but two resources which continue to be of great help are http://www.rseek.org/ (mainly for searching the mailing list) and the 'sos' package (for finding the functions and packages I need).

'sos' really is great. There have been other packages/functions trying to do the same thing, but they have been too time-consuming and difficult to use (and learn), typically requiring you to first do a search, and then do some advanced subsetting to get useful results. This is similar to older search engines requiring many boolean terms to give the needed search results. With 'sos' I just choose some simple search terms describing what I'm looking for, and immediately get relevant results. 'sos' really is the Google of the R world. It has made a great impact on the discoverability of the various R functions and packages.

Lastly, the 'demo' function is seldom mentioned, and easy to overlook, but gives a nice (and sometimes impressive) overview of what types of graphics it is possible to create with a given package. I wish more packages had well-written demos. Also, I think some of the examples from the 'example' sections of help pages for functions could very well be copied to the demo of the corresponding package, e.g. a few of the examples of the 'xyplot' function in 'lattice'.

--
Karl Ove Hufthammer
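Three of the stumbling blocks Karl lists can be reproduced in a few lines. This is a small sketch; the list x and the other variable names are invented for illustration.

```r
# 1. x[i] versus x[[i]] on a list
x <- list(a = 1:3, b = "text")
sub_list <- x[1]     # single brackets: still a list, of length 1
element  <- x[[1]]   # double brackets: the element itself
print(class(sub_list))
print(class(element))

# 2. T and F are ordinary variables and can be overwritten; TRUE and FALSE cannot
t_result <- local({
  T <- 0             # shadows the built-in T inside this block only
  isTRUE(T)
})
print(t_result)

# 3. FAQ 7.31: floating-point numbers that "should" be equal often are not
exact_equal <- (0.1 + 0.2 == 0.3)
near_equal  <- isTRUE(all.equal(0.1 + 0.2, 0.3))
print(c(exact_equal, near_equal))
```

For the help-page point, help("if") is an equivalent way of writing ?"if".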
Re: [R] two questions for R beginners
On Fri, Feb 26, 2010 at 8:00 AM, Robert Baer wrote: [...] > The things that led from "frustration" to "independence" was understanding > the difference between data types like matrix and dataframe and learning > there were commands to tell what you were working with at any given time. > Did the data read in as character, numeric, or factor, etc. Commands > like: str, class, mode, ls, search, help, help.search, etc can help you > figure out what you are doing. Yes! I think this is really key. When I started R I had no programming experience and thought of projects in terms of statistical procedures and printed output (cut teeth w/ Minitab --> SPSS --> SAS). If I wanted to analyze data using R I looked for examples of using an analysis function of interest (e.g, lm, princomp, rpart...) and did my best to adapt to my project. What was of interest was the printed output rather than understanding the objects that I was passing and creating. It wasn't until I buckled down and read the (admittedly quite dry and often dense) materials describing the language that the sailing became smooth (or at least much more rapid and took me to more interesting places). Important resources I recall using were An Introduction to R (which I avoided for about the first 6mo because of language I wasn't yet familiar with), r-help archives, man pages, and particularly the early chapters of MASS and S Programming by V&R. But I think the real 'a-ha' moments came by interactively exploring objects within R. This was vastly facilitated by the use of str and indexing tools ([, [[, $, @). 
A mantra for R beginners might be "In R we work with objects, and str reveals their essence" ;-) Kingsford Jones > > Rob > > > > > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] > On Behalf Of Patrick Burns > Sent: Thursday, February 25, 2010 11:31 AM > To: r-help@r-project.org > Subject: [R] two questions for R beginners > > * What were your biggest misconceptions or > stumbling blocks to getting up and running > with R? > > * What documents helped you the most in this > initial phase? > > I especially want to hear from people who are > lazy and impatient. > > Feel free to write to me off-list. Definitely > write off-list if you are just confirming what > has been said on-list. > > -- > Patrick Burns > pbu...@pburns.seanet.com > http://www.burns-stat.com > (home of 'The R Inferno' and 'A Guide for the Unwilling S User') > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
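The "explore objects with str" habit described above might look like this in practice. It is a sketch using the built-in iris data; the model fit is just an example object, not something from the thread.

```r
d <- head(iris)

# What kind of object is this, and what are its pieces?
str(d)
print(class(d))                 # the object as a whole
print(class(d$Species))         # one column, reached with $
print(mode(d$Sepal.Length))     # storage mode of another column

# The same tools work on the result of an analysis function
fit <- lm(Sepal.Length ~ Sepal.Width, data = iris)
print(class(fit))
print(is.list(fit))             # a fitted model is a list underneath
print(names(fit))               # so its components can be listed...
coefs <- fit$coefficients       # ...and extracted with $ or [[
print(coefs)
```

Seen this way, printed output is only one view of the object; str() and names() show what can be pulled out and reused.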
Re: [R] two questions for R beginners
Hi,

I don't think you should split the list for beginners.

On the SAS list we get questions from novices such as secretaries, janitorial services, human resources and even top executives. They often approach SAS from a very intuitive standpoint. These questions often shake the experts to the core. They ask themselves, why didn't I allow R to do this.

For instance, a novice might ask of the SAS datastep language:

Why can't I just

Array X[3] ("A",1."ROGER",26)

You can do the above in several other integrated SAS languages (MACRO,SCL,SAS-C,IML-sort of), at ~$5000+ per year for each except macro.

A user asked recently

array x[2,3,4,5] x1-x120;
Do i=1 to 2;
  Do j=1 to 3;
    Do k=1 to 4;
      Do l=1 to 5;
        X&i&j&k&l = i*j*k*l;
      End;
    End;
  End;
End;

R can do this nicely with lists but SAS can do it with SCL, Macro, IML and C. I think SAS-IML has the most intuitive solution.

I read Nabble, perl and SAS lists religiously. What I would like to see is one list that somehow integrated R, SAS and perl solutions. SAS users are trying to create integrated 'DROP DOWN' capabilities that allow programmers to switch languages mid stream to get the best solution. I often want to respond with SAS solutions, just so R and perl can think about adding functionality.

ie

data new;
  set data;
  perl on;
    perl code;
    ...
  perl off;
  sas code;
  .
  R on;
    R code;
    ;
  R off;
run;

I am trying to get SAS users to do some of their processing in R (within SAS).

I am toying with a set of tips that show SAS intuitive code beside R code, so SAS users can become more comfortable with R. SAS is much more intuitive than R, for instance R 'for' loops with funny '}s' next to the more intuitive SAS do/ends.

I could expound on the type of problems perl handles better than SAS or R, problems R handles better than SAS or perl and problems SAS handles better than R or perl.

--
View this message in context: http://n4.nabble.com/two-questions-for-R-beginners-tp1569384p1572165.html
Sent from the R help mailing list archive at Nabble.com.
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
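For comparison with the SAS loop in the post above, here is one way the same four-dimensional product table might be written in R. This is a sketch using an array rather than a list, which is an assumption about what the poster had in mind; x and y are names chosen here.

```r
# Loop version, mirroring the nested Do/End structure
x <- array(0, dim = c(2, 3, 4, 5))
for (i in 1:2) {
  for (j in 1:3) {
    for (k in 1:4) {
      for (l in 1:5) {
        x[i, j, k, l] <- i * j * k * l
      }
    }
  }
}

# Vectorized version: repeated outer products build the same 2 x 3 x 4 x 5 array
y <- outer(outer(outer(1:2, 1:3), 1:4), 1:5)

print(dim(y))
print(x[2, 3, 4, 5])
```

Both forms give 120 cells indexed as x[i, j, k, l], so no generated variable names such as x1-x120 are needed.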
Re: [R] two questions for R beginners
I don't think I am a tyro but neither am I a wizard. This being said, R has a number of aspects that make it difficult.

Error messages that are not helpful
Manual pages that are written in Martian.
Lack of examples on some manual pages
Lack of comments in code

There are other hurdles. The concept of vectorization and its related syntax took a long time to understand.

John

John Sorkin
jsor...@grecc.umaryland.edu

-Original Message-
From: Saeed Abu Nimeh
Cc:
To:
Sent: 2/26/2010 11:36:38 PM
Subject: Re: [R] two questions for R beginners

Hi Ivan,

On 2/26/10 6:30 AM, Ivan Calandra wrote:
> You are definitely right...
> What to do with bad beginner's questions is not a simple issue.
>
> If a "beginner's mailing list" is created, who will answer to such
> questions?

If I subscribe to the beginners mailing list, then I have to expect novice questions and I should be willing to help. Otherwise, I should not be there.

> And moreover, the beginners won't take advantage of the other
> questions (I've personally learned a lot trying to understand the
> questions and answers to other's problems).

They can still subscribe to the advanced, but they will know that they are here to observe and learn, not to ask novice questions. You want to ask basic stuff, go to the beginners list :)

Not sure if you guys have been on some of the linux mailing lists out there, but man let me tell you, some of these lists have a RTFM attitude and they will fry you if you ask novice questions. Frankly, that is understandable, as most of the members are geeks and they have higher expectations. This mailing list is different, I have seen posts from different disciplines; biology, biostats, stats, computer science, oceanography, etc. So, IMO, there should be a beginners list to cope with such broad committee.

Thanks,
Saeed

> And also, as you said, the
> problems might persist.
> The beginner's mailing list might be good in one aspect though: the > "experts" who subscribe to it would be willing to help the beginners to > get started with R, knowing that the questions might not be clearly stated. > > As you pointed out, the mailing list is not the best for basic stuff > (the question is of course "what is basic?"). Not everybody knows some > colleagues who work with R (I'm personally the 1st one to use R in my lab). > I think, somehow and I have no idea how, documentation and guidance to > search for help should be more accessible as soon as you start with R. > Maybe a _*clear*_ section on the R homepage or in the "introduction to > R" manual like "where to find help", including all of the most common > and useful resources available (from "?" and RSiteSearch() to R Wiki and > Crantastic). > > I hope that this whole discussion might help to make the R world better. > Thank you Patrick for initiating it! > Regards, > Ivan > > Le 2/26/2010 15:09, Paul Hiemstra a écrit : >> Ivan Calandra wrote: >>> Since you want input from beginners, here are some thoughts >>> >>> I had and still have two big problems with R: >>> - this vectorization thing. I've read many manuals (including R >>> inferno), but I'm still not completely clear about it. In simple >>> examples, it's fine. But when it gets a bit more complex, then... >>> Related to it, the *apply functions are still a bit difficult to >>> understand. When I have to use them, I just try one and see what >>> happens. I don't understand them well enough to know which one I need. >>> - the second problem is where to find the functions/packages I need. >>> There are many options, and that's actually the problem. R Wiki, >>> Rseek, RSiteSearch, Crantastic, etc... When you start with R, you >>> discover that the capabilities of R are almost unlimited and you >>> don't really know where to start, where to find what you need. 
>>> >>> As noted in earlier posts, the mailing list is really great, but some >>> people are really hard with beginners. It was noted in a discussion a >>> few days ago, but it looks like some don't realize how difficult it >>> is at the beginning to formulate a good question, clear, with >>> self-contained example and so on. Moreover, not everybody speaks >>> English natively. I don't mean that you must help, even when the >>> question is really vague and not clear and whatever. I'm just saying >>> that if you don't want to help (whatever the reason), you don't have >>> to say it badly. But in any cases, the mailing list is still really >>> helpful. As someone noted (sorry I erased the email so
Re: [R] two questions for R beginners
Dieter Menne [Fri, Feb 26, 2010 at 08:39:14AM CET]:
> Patrick Burns wrote:
> >
> > * What were your biggest misconceptions or
> > stumbling blocks to getting up and running
> > with R?
>
> (This derives partly from teaching)
> [...]
>
> The concept of environment. With S it was worse, though.

Agreed, though a beginner shouldn't be exposed to this aspect. In the beginning you can analyse away before you are drowning in variable names if you start with simple examples.

Which plotting parameters can be passed with basic plot functions, and which ones have to be declared with par()? How do I set the min and max values for the x and y axis? (This aspect is drowned among all the options under ?par.)

Generally, the help pages are built like man pages, where all options are given more or less equal consideration, even if one option is used almost always and one only for esoteric purposes.

Given that help() is the most intuitive thing to look for, it may be nice to include references to other sources like rwiki if the respective page is good, even if it may be disruptive wrt display device.

--
Johannes Hüsing
mailto:johan...@huesing.name
http://derwisch.wikidot.com

There is something fascinating about science. One gets such wholesale returns of conjecture from such a trifling investment of fact. (Mark Twain, "Life on the Mississippi")
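The plotting question above (what goes into the plot() call, what needs par()) can be illustrated with a small sketch. The data are made up for illustration; xlim, ylim, mfrow and mar are standard base-graphics names, but this is only one way to organise such a call, not a rule from the help pages.

```r
# Toy data, purely illustrative
x <- 1:10
y <- x^2

# Axis limits are ordinary arguments of plot(): xlim and ylim
plot(x, y, xlim = c(0, 12), ylim = c(0, 120), main = "Limits set in the call")

# Layout settings such as mfrow (panel grid) are set through par().
# par() returns the previous values, so they can be restored afterwards.
old <- par(mfrow = c(1, 2), mar = c(4, 4, 2, 1))
plot(x, y, type = "l", col = "blue")   # col, lty, pch, cex also work inline
hist(y)
par(old)                               # back to the previous settings
```

Saving and restoring the old par() values keeps one plot's layout from leaking into the next.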
Re: [R] two questions for R beginners
Lazy and impatient? That's me!

I find it hard to say what my biggest misconceptions were. Here's one thing:

What I realized very early on:
- many data analysis functions return a bunch of stuff, not all of which you see when you print() it

What I *failed* to realize:
- the bunch of stuff such functions return is just a *list*

That has follow-on implications:
- even if you're just doing some simple analysis like a linear regression, if you want to be able to see/get all the information, you really need to learn how to examine what's in a list and how to operate on the list.

I had seen lists as "potentially useful but not something I need to worry about right now, since I'm having enough trouble just grokking why dataframes look different to matrices", whereas I needed to know that lists were absolutely central to what I was trying to achieve.

While I have no doubt this information can be found in a dozen places, I read a bunch of introductory documents at the time, and I don't recall it being stated explicitly like that in any of the places I looked.

It made a big difference to me when I realized that so many functions just return a list. I mean, it's obvious, and I should have seen that's all it was the first time, but I didn't.

--
View this message in context: http://n4.nabble.com/two-questions-for-R-beginners-tp1569384p1571715.html
Sent from the R help mailing list archive at Nabble.com.
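The point about fitted models being lists can be seen with a short sketch. It assumes the built-in cars data set and a plain lm() fit; the components shown are the ones lm() documents, but which ones matter depends on the analysis.

```r
# Fit a simple linear regression on the built-in 'cars' data
fit <- lm(dist ~ speed, data = cars)

print(fit)                 # shows only the call and the coefficients
is.list(fit)               # the fitted model is a list with a class attribute
names(fit)                 # all components, e.g. "coefficients", "residuals"
str(fit, max.level = 1)    # compact overview of what each component holds

fit$coefficients           # extract a component with $ ...
fit[["residuals"]][1:5]    # ... or with [[ ]]

s <- summary(fit)          # the summary is a list as well
names(s)
s$r.squared
```

names() and str() are usually the quickest way to find out what a function really returned.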
Re: [R] two questions for R beginners
On Fri, Feb 26, 2010 at 1:28 PM, Saeed Abu Nimeh wrote:
> Pat,
> Off the bat, beginners and advanced. In addition, splitting by domain
> would be very helpful -- something along the lines of:
> http://cran.r-project.org/web/views/. But we should be careful, we do
> not want to create 20 other mailing lists :) We have to group things.

Note that there are already 24 mailing lists here:
http://www.r-project.org/mail.html

> This will help splitting the volume of the list and will help in
> targeting lists by expertise.
> Thanks,
> Saeed
Re: [R] two questions for R beginners
sorry meant community not committee

On 2/26/10 8:36 PM, Saeed Abu Nimeh wrote:
> So, IMO, there should be a beginners list to cope with such broad committee.
Re: [R] two questions for R beginners
Hi Ivan,

On 2/26/10 6:30 AM, Ivan Calandra wrote:
> You are definitely right...
> What to do with bad beginner's questions is not a simple issue.
>
> If a "beginner's mailing list" is created, who will answer to such questions?

If I subscribe to the beginners mailing list, then I have to expect novice questions and I should be willing to help. Otherwise, I should not be there.

> And moreover, the beginners won't take advantage of the other questions (I've personally learned a lot trying to understand the questions and answers to other's problems).

They can still subscribe to the advanced list, but they will know that they are there to observe and learn, not to ask novice questions. You want to ask basic stuff, go to the beginners list :)

Not sure if you guys have been on some of the Linux mailing lists out there, but man let me tell you, some of these lists have an RTFM attitude and they will fry you if you ask novice questions. Frankly, that is understandable, as most of the members are geeks and they have higher expectations. This mailing list is different; I have seen posts from different disciplines: biology, biostats, stats, computer science, oceanography, etc. So, IMO, there should be a beginners list to cope with such broad committee.

Thanks,
Saeed

> And also, as you said, the problems might persist.
> The beginner's mailing list might be good in one aspect though: the "experts" who subscribe to it would be willing to help the beginners to get started with R, knowing that the questions might not be clearly stated.
>
> As you pointed out, the mailing list is not the best for basic stuff (the question is of course "what is basic?"). Not everybody knows some colleagues who work with R (I'm personally the 1st one to use R in my lab).
> I think, somehow and I have no idea how, documentation and guidance to search for help should be more accessible as soon as you start with R. Maybe a _*clear*_ section on the R homepage or in the "introduction to R" manual like "where to find help", including all of the most common and useful resources available (from "?" and RSiteSearch() to R Wiki and Crantastic).
>
> I hope that this whole discussion might help to make the R world better.
> Thank you Patrick for initiating it!
> Regards,
> Ivan
>
> On 2/26/2010 15:09, Paul Hiemstra wrote:
>> Ivan Calandra wrote:
>>> Since you want input from beginners, here are some thoughts
>>>
>>> I had and still have two big problems with R:
>>> - this vectorization thing. I've read many manuals (including R inferno), but I'm still not completely clear about it. In simple examples, it's fine. But when it gets a bit more complex, then... Related to it, the *apply functions are still a bit difficult to understand. When I have to use them, I just try one and see what happens. I don't understand them well enough to know which one I need.
>>> - the second problem is where to find the functions/packages I need. There are many options, and that's actually the problem. R Wiki, Rseek, RSiteSearch, Crantastic, etc... When you start with R, you discover that the capabilities of R are almost unlimited and you don't really know where to start, where to find what you need.
>>>
>>> As noted in earlier posts, the mailing list is really great, but some people are really hard with beginners. It was noted in a discussion a few days ago, but it looks like some don't realize how difficult it is at the beginning to formulate a good question, clear, with self-contained example and so on. Moreover, not everybody speaks English natively. I don't mean that you must help, even when the question is really vague and not clear and whatever. I'm just saying that if you don't want to help (whatever the reason), you don't have to say it badly. But in any cases, the mailing list is still really helpful. As someone noted (sorry I erased the email so I don't remember who), it might be a good idea to split it.
>>
>> Hi everyone,
>>
>> My 2ct about the mailing list :). I understand that beginners have a hard time formulating a good question. But the problem is that we can't answer the question when it is unclear. So either I:
>> - Don't bother answering
>> - Try to discuss with the author of the question, taking lots of time to find out what exactly is the question.
>> - Send a "read the posting guide" answer
>> I mostly do the first, as I have to get things done during my PhD :). So this leaves us with kind of a problem, the person mailing the list doesn't have the knowledge to ask the right question, the list can't answer properly and consequently, the person mailing the list still doesn't get the information he/she needs. We could start an R-beginner mailing list, but this would also suffer from this problem. What do you guys think?
>>
>> Maybe the mailing list is not the right medium for really basic stuff. For that I would recommend a good R-book or (better) a course in R or (even better) some colleagues who work with R that you can ask questions to.
>>
>> cheers,
>> Paul
>>
>>> Hope that's what you wanted
>>> Ivan
>>>
>>> Le 2/26/2010 08:39, Di
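Since vectorization and the *apply functions come up repeatedly in this thread, here is a minimal sketch of how they relate. The toy vectors and the list are made up for illustration; the functions are all from base R, and this is only one way to look at them.

```r
x <- c(1.5, 2.0, 3.5)

# Explicit loop: one element at a time
out <- numeric(length(x))
for (i in seq_along(x)) out[i] <- x[i]^2

# Vectorized: the operator works on the whole vector at once
sq <- x^2
all.equal(out, sq)

# The *apply functions differ mainly in what they loop over and what they return
lst <- list(a = 1:3, b = 4:6)
lapply(lst, range)                    # over list elements, always returns a list
sapply(lst, sum)                      # same loop, result simplified to a vector
means <- vapply(lst, mean, numeric(1))  # like sapply, but the result type is declared

# tapply(): apply a function to a vector split by a grouping variable
grp_means <- tapply(c(1, 3, 10, 20), c("a", "a", "b", "b"), mean)

# apply(): over the rows (MARGIN = 1) or columns (MARGIN = 2) of a matrix
m <- matrix(1:6, nrow = 2)
col_tot <- apply(m, 2, sum)
```

A rough rule of thumb: list in, list out is lapply(); matrix margins are apply(); "by group" is tapply() or aggregate().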
Re: [R] two questions for R beginners
My biggest impediment, as a scientist without previous programming experience, is that the R help is not beginner-friendly. I think it is probably great for experienced programmers and for the people who helped to create the software, to help them remember what they did, but I think it is very difficult for a newcomer without a strong programming background to learn about a new function or to discover the name of a function that you are pretty sure should already exist. Maybe this wouldn’t matter for most programming languages, but as free statistics software R is obviously going to attract many scientists who want to get an analysis done and have varying levels of experience with programming.

I found it much easier to learn how to use Mathematica, using only the online help. With R I had to buy several books to get a handle on it, which is fine, but even the books that I have found to be most useful tend to be didactically lacking: either too cursory or mired in unexplained programming jargon. They are OK, just not great.

What I think would be very helpful is an introduction to programming using R, preferably a big thick college textbook that takes at least a semester to go through, which should be a prerequisite for going through the Introduction to R available on CRAN.

Also, to do any analysis on real data you have to use the apply family of functions to perform different functions by groups. A long introduction to these functions, with lots of comparisons and contrasts between them, would be very helpful.

A few random examples concerning the R help:

In my version of R (2.7.0 on Windows XP) typing

    > ?+

doesn’t do anything, but then if you type in the next line

    + ?sum

you get the “Arithmetic Operators” help page. If you had just typed

    > ?sum

in the first place you get the “Sum of Vector Elements” help page.

Most examples in the R help pages use way too many other functions to be useful to a beginner. If an example uses 10 other functions besides the one being described, chances are a beginner won’t know what one of them does, which can set off a chain of having to look up other irrelevant functions.

Some function names in the base package are goofy, such as “rowsum” which is used to “compute column sums across rows”, not to be confused with “rowSums” which computes row sums.

--
View this message in context: http://n4.nabble.com/two-questions-for-R-beginners-tp1569384p1571243.html
Sent from the R help mailing list archive at Nabble.com.
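The rowsum/rowSums confusion is easier to see on a tiny matrix. The matrix and the grouping vector below are made up for illustration. As an aside, help on an operator is normally requested with the operator quoted, e.g. ?"+" or help("+"); an unquoted ?+ leaves the parser waiting for more input, which is why the next line was read as part of the same expression.

```r
m <- matrix(1:6, nrow = 3, dimnames = list(NULL, c("a", "b")))
#      a b
# [1,] 1 4
# [2,] 2 5
# [3,] 3 6

rs <- rowSums(m)      # one sum per row:    5 7 9
cs <- colSums(m)      # one sum per column: 6 15

# rowsum() sums the rows that share a group label, column by column
g  <- c("x", "y", "x")
gs <- rowsum(m, group = g)
#   a  b
# x 4 10
# y 2  5
```

So rowSums() collapses each row to one number, while rowsum() collapses groups of rows into one row per group.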
Re: [R] two questions for R beginners
Patrick Burns wrote:
> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?

I came into R from SAS, with its powerful data step language and very simplified data types. Most of my work is data manipulation prior to a variety of univariate statistical calculations. The vector-based nature of R, and thus the variety of indexing schemes used, was a big conceptual hurdle.

The often unhelpful attitude of several list respondents, while not unique to this list, was and continues to be another block to advancement. This does not occur on the list for SAS, in which asking 'dumb' questions is generally supported as an inevitable part of learning.

Having aggregate() pointed out to me by one kind soul, hidden amidst the assortment of by()/apply() functions, became the basis for much success.

I am currently trying to wrap my mind around how missing values are handled; the defaults are quite different than SAS, and mostly in a good way. However, the handling of NA values in slicing statements does not seem quite proper, even if it is addressed in the R documents.

    aa <- data.frame('id'=letters[1:5], 'x'=1:5, stringsAsFactors=FALSE)
    aa[aa$x == 3,]$x <- NA
    aa[aa$x == '4',]     # 2 rows instead of 1.
    aa[aa$x %in% '4',]   # 1 row as expected.

I am also looking for concise methods for building up dataframes for our unit tests. While there are several ways to accomplish this, depending on what is needed, none are elegant, though expand.grid() comes close.

Next: The R Inferno. I *will* understand more than the first few pages. And all those apply()-ish functions, as I'm already good friends with aggregate().

> * What documents helped you the most in this
> initial phase?

RSeek.org was and continues to be a big source of help. I've looked at several texts aimed at beginners, and all provided simple examples that were useful. The most consistent source of instruction has been to make up my own small projects that were either fun or slightly relevant to my job.

The ability to make up toy problems, or simplify a complex process, has been an unexpectedly important skill. Developing unit tests for functions, initially seen as an irritant by some, has become an important tool for honing our advances.

> I especially want to hear from people who are
> lazy and impatient.

And, I hope, incompetent. I've found incompetence to be as professionally important as hubris. I wouldn't want one without the other.

cur
--
Curt Seeliger, Data Ranger
Raytheon Information Services - Contractor to ORD
seeliger.c...@epa.gov 541/754-4638
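The NA behaviour in the slicing example above follows from how logical indexing treats NA: a comparison with NA yields NA, and an NA in a logical row index produces a row of NAs. A small sketch of the usual workarounds, using the same toy data frame (the alternatives shown are common base R idioms, not the only options):

```r
aa <- data.frame(id = letters[1:5], x = 1:5, stringsAsFactors = FALSE)
aa$x[aa$x == 3] <- NA          # set one value to missing

idx <- aa$x == 4               # FALSE FALSE NA TRUE FALSE
r_raw <- aa[idx, ]             # 2 rows: an all-NA row plus the real match

# Ways to get only the real match:
r_which  <- aa[which(aa$x == 4), ]              # which() drops the NA
r_in     <- aa[aa$x %in% 4, ]                   # %in% never returns NA
r_subset <- subset(aa, x == 4)                  # subset() treats NA as FALSE
r_guard  <- aa[!is.na(aa$x) & aa$x == 4, ]      # explicit NA guard
```

Which form to prefer is a matter of taste; which() and %in% are short, while the explicit is.na() guard makes the intent visible in the code.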
Re: [R] two questions for R beginners
Pat, Off the bat, beginners and advanced. In addition, splitting by domain would be very helpful -- something along the lines of: http://cran.r-project.org/web/views/. But we should be careful, we do not want to create 20 other mailing lists :) We have to group things. This will help splitting the volume of the list and will help in targeting lists by expertise. Thanks, Saeed On Fri, Feb 26, 2010 at 2:08 AM, Patrick Burns wrote: > Saeed, > > If the R-help list were split, what do you > see as the pieces? > > Pat > > On 26/02/2010 01:53, Saeed Abu Nimeh wrote: >> >> On Thu, Feb 25, 2010 at 9:31 AM, Patrick Burns >> wrote: >>> >>> * What were your biggest misconceptions or >>> stumbling blocks to getting up and running >>> with R? >> >> 1- Compared to other programming languages it is hard to learn R by >> example, because it is hard to find code on the web that will do the >> exact thing you are looking for, sometimes you might get lucky though. >> By contrast, take Perl for example, it is an easy language to learn by >> example. >> >> 2- The R mailing list. Beginners get frustrated after they struggle >> for a long time to solve a problem and the easiest thing then is to >> send an email to the R mailing list. I did this in the past. The best >> thing that happened was that my request was neglected and I had to >> spend more time on the problem and find a solution by myself >> eventually. Do not get me wrong, I am not saying that the mailing list >> is bad, but it should be more organized. Maybe broken down into couple >> of other mailing lists. This might bring up a good discussion thread. >> >>> >>> * What documents helped you the most in this >>> initial phase? 
>> >> An Introduction to R by Venables >> simpleR – Using R for Introductory Statistics by Verzani >> > > -- > Patrick Burns > pbu...@pburns.seanet.com > http://www.burns-stat.com > (home of 'The R Inferno' and 'A Guide for the Unwilling S User') > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] two questions for R beginners
Dear Patrick (and all),

I've now been working with R for a couple of years; before that I worked mostly in Matlab. Lazy & impatient are both true for me :-)

> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?
> * What documents helped you the most in this
> initial phase?
>
> I especially want to hear from people who are
> lazy and impatient.
>
> Feel free to write to me off-list. Definitely
> write off-list if you are just confirming what
> has been said on-list.

Stumbling:

* It took me long to remember getwd () and setwd () (instead of pwd and cd / chdir or the like).

* I still discover very useful functions that I would have needed for a long time. Latest discoveries: mapply and ave. I knew aggregate, and was always a little angry that it needs a grouping list. I even decided that the aggregate method for my hyperSpec class should work with factors as well as with lists. Some day I read in this mailing list that ave does what I need... I like the crosslinks in the help (see also) very much. Maybe I rely too much on them. So: not lazy today, I attach a patch for aggregate.Rd that adds the seealso to ave. Reading this mailing list once in a while gives me nice new ideas. However, more than 50 emails a day is somewhat scary for me, so I read only occasionally.

* Vectorization: I like the *apply functions, but I'd really appreciate a comprehensive page/vignette here. I remember that it took me a while to realize that the rule for MARGIN in sweep is "use the same number as in the apply that created the STATS".

* I never found the pdf manuals helpful (help pages are easier to access, and there is nothing in the pdf that the help doesn't have). At the beginning I expected the pdf manual to be something that the vignettes are.

* I did not arrive at a comfortable debugging cycle for a long time. But now there's the debug package and setBreakpoint and I'm happy.

* As I now start teaching I notice that many students react to error messages with "uhh! an error!" (panic). Few realize that the error message actually gives information on what went wrong. A list with common causes of different error messages would be helpful here, I think. In case someone agrees: I started one at the Wiki: http://rwiki.sciviews.org/doku.php?id=tips:errormessages

Cheers,
Claudia
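The functions named above (aggregate, ave, sweep, mapply) can be compared on a toy example. The data frame and matrix are made up for illustration; this is a sketch of the differences, not a complete guide.

```r
d <- data.frame(g = c("a", "a", "b", "b"), v = c(1, 3, 10, 20))

# aggregate(): one row per group; the grouping is passed as a list
agg <- aggregate(d$v, by = list(g = d$g), FUN = mean)

# ave(): takes the grouping vector directly and returns the group
# statistic repeated for every row (default FUN is mean)
d$gmean <- ave(d$v, d$g)               # 2 2 15 15

# sweep(): MARGIN is the same number as in the apply() that created STATS
m <- matrix(1:6, nrow = 2)
col_means <- apply(m, 2, mean)         # statistics over columns (MARGIN = 2)
centered  <- sweep(m, 2, col_means)    # default FUN is "-", so this centers columns

# mapply(): walk along several arguments in parallel
rep_str <- mapply(function(n, ch) paste(rep(ch, n), collapse = ""),
                  1:3, c("a", "b", "c"))
```

Because ave() returns a vector as long as its input, its result can be stored straight back into the data frame, which is often what is needed when aggregate() feels like too much.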
Re: [R] two questions for R beginners
Thomas Adams wrote:
> Paul,
>
> I think your point "you need [to] spend at least a few hours a week on it" is key. Since I am not doing statistics daily, more in fits & starts as my latest project -may- require, my approach has been more task oriented. A less-than-ideal approach. So, I think your suggestion is on-the-mark.
>
> Tom

I also see co-workers who would like to work with R, see the benefit of R etc., but don't have the time to learn and maintain R. But I'm not really sure how to fix this; it seems impossible to be both easy and intuitive to use and to have power and flexibility.

cheers,
Paul

> Paul Hiemstra wrote:
>> Ivan Calandra wrote:
>>> You are definitely right...
>>> What to do with bad beginner's questions is not a simple issue.
>>>
>>> If a "beginner's mailing list" is created, who will answer to such questions? And moreover, the beginners won't take advantage of the other questions (I've personally learned a lot trying to understand the questions and answers to other's problems). And also, as you said, the problems might persist.
>>> The beginner's mailing list might be good in one aspect though: the "experts" who subscribe to it would be willing to help the beginners to get started with R, knowing that the questions might not be clearly stated.
>>>
>>> As you pointed out, the mailing list is not the best for basic stuff (the question is of course "what is basic?"). Not everybody knows some colleagues who work with R (I'm personally the 1st one to use R in my lab).
>>> I think, somehow and I have no idea how, documentation and guidance to search for help should be more accessible as soon as you start with R. Maybe a _*clear*_ section on the R homepage or in the "introduction to R" manual like "where to find help", including all of the most common and useful resources available (from "?" and RSiteSearch() to R Wiki and Crantastic).
>>
>> Hi Ivan (and list),
>>
>> I think the main problem is not as much that there isn't structure in the way R provides documentation / tutorials, but that people have a hard time finding the structure. There are task views for certain specific fields, but I think a lot of beginners do not know that they exist. There are separate mailing lists for specific fields, but I often see geographical (my field of expertise) oriented questions on R-help that would fit much better on R-sig-geo.
>>
>> So I think a "O my God, I've downloaded R and what now" tutorial might be a good idea to put very close to the download button of R on CRAN. This tutorial would focus not on how to do things in R, but would provide guidance to the most obvious sources of information such as Task views, specific mailing lists, ways to search list archives, information for beginners how to write a good e-mail etc. I think for a lot of beginners it is not as much the answer to a specific question that they need, but more guidance how to look for answers themselves.
>>
>> But at the end of the day, R is still not very easy to learn when coming from GUI oriented stats programs. In addition, to become reasonably fluent in R, you need to spend at least a few hours a week on it. So I think we can ease the pain for beginners, but not take away that it takes quite some time to become fluent in R.
>>
>> cheers,
>> Paul
>>
>>> I hope that this whole discussion might help to make the R world better.
>>> Thank you Patrick for initiating it!
>>> Regards,
>>> Ivan
Re: [R] two questions for R beginners
Paul,

I think your point "you need [to] spend at least a few hours a week on it" is key. Since I am not doing statistics daily, more in fits & starts as my latest project -may- require, my approach has been more task oriented. A less-than-ideal approach. So, I think your suggestion is on-the-mark.

Tom
Re: [R] two questions for R beginners
Hi again Paul,

> Hi Ivan (and list),
>
> I think the main problem is not as much that there isn't structure in the way R provides documentation / tutorials, but that people have a hard time finding the structure. There are task views for certain specific fields, but I think a lot of beginners do not know that they exist.

You're definitely right... What are they?! Where does one find them?

> So I think a "O my God, I've downloaded R and what now" tutorial might be a good idea to put very close to the download button of R on CRAN. This tutorial would focus not on how to do things in R, but would provide guidance to the most obvious sources of information such as Task views, specific mailing lists, ways to search list archives, information for beginners how to write a good e-mail etc. I think for a lot of beginners it is not as much the answer to a specific question that they need, but more guidance how to look for answers themselves.

I think that would indeed help a lot. I can only agree with your last sentence. Is someone already working on this kind of manual? Is it planned?

> cheers,
> Paul

Regards,
Ivan
Re: [R] two questions for R beginners
Hi Ivan (and list),

I think the main problem is not so much that there isn't structure in the way R provides documentation / tutorials, but that people have a hard time finding the structure. There are task views for certain specific fields, but I think a lot of beginners do not know that they exist. There are separate mailing lists for specific fields, but I often see geographically oriented questions (my field of expertise) on R-help that would fit much better on R-sig-geo.

So I think an "O my God, I've downloaded R and what now" tutorial might be a good idea to put very close to the download button of R on CRAN. This tutorial would focus not on how to do things in R, but would provide guidance to the most obvious sources of information, such as Task views, specific mailing lists, ways to search list archives, information for beginners on how to write a good e-mail, etc. I think for a lot of beginners it is not so much the answer to a specific question that they need, but more guidance on how to look for answers themselves.

But at the end of the day, R is still not very easy to learn when coming from GUI-oriented stats programs. In addition, to become reasonably fluent in R, you need to spend at least a few hours a week on it. So I think we can ease the pain for beginners, but not take away the fact that it takes quite some time to become fluent in R.

cheers,
Paul
Re: [R] two questions for R beginners
I don't want to sound harsh, but the first thing beginners should do is look at the manual "An Introduction to R", because most of the simple questions are answered in it. In the same vein, before posting to this mailing list, people should (must?) follow the posting guide. Indeed, it tells you to use functions like help.search() and RSiteSearch(), or to read "An Introduction to R", before posting. Too often I find myself wishing that people would do their homework before posting.

I would add that I don't consider myself an R expert, but I don't like to waste my time answering questions whose answers you can find easily if you follow the posting guide.

Regards,
Alain
Re: [R] two questions for R beginners
Honestly, what I remember as the most difficult thing when I 'first' started using R was figuring out how to read in my own datasets. I eventually discovered the R import/export manual, but somehow this eluded me initially. All the R "tutorials" I was working from simply "generated" data or used the built-in datasets, and "I" was ready to work on my own datasets.

What led from "frustration" to "independence" was understanding the difference between data types like matrix and data frame, and learning that there were commands to tell what you were working with at any given time: did the data read in as character, numeric, factor, etc.? Commands like str, class, mode, ls, search, help, help.search, etc. can help you figure out what you are doing.

Rob

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Patrick Burns
Sent: Thursday, February 25, 2010 11:31 AM
To: r-help@r-project.org
Subject: [R] two questions for R beginners

* What were your biggest misconceptions or stumbling blocks to getting up and running with R?
* What documents helped you the most in this initial phase?

I especially want to hear from people who are lazy and impatient. Feel free to write to me off-list. Definitely write off-list if you are just confirming what has been said on-list.

--
Patrick Burns
pbu...@pburns.seanet.com
http://www.burns-stat.com (home of 'The R Inferno' and 'A Guide for the Unwilling S User')
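The inspection commands Rob lists can be tried on any freshly imported dataset. The following is only a minimal sketch, assuming a small made-up CSV (the column names `id`, `group` and `score` are hypothetical) that is read from a string so that it runs without any file on disk:

```r
# Hypothetical example data, read from an in-memory string so the sketch is
# self-contained; with a real file you would pass its path to read.csv().
csv_text <- "id,group,score
1,a,3.5
2,b,4.0
3,a,NA"

dat <- read.csv(text = csv_text, stringsAsFactors = FALSE)

class(dat)          # what kind of object did we get?
str(dat)            # compact overview: one line per column, with its type
sapply(dat, class)  # the class of every column
mode(dat$score)     # storage mode of a single column

# A matrix holds a single type; a data frame can mix types across columns.
m <- as.matrix(dat)
class(m)
mode(m)

ls()      # objects in the current workspace
search()  # attached packages and environments
```

Converting the data frame with as.matrix() is included only to show the matrix/data frame difference: because one column is character, the whole matrix becomes character.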
Re: [R] two questions for R beginners
Ivan Calandra writes:

> Related to it, the *apply functions are still a bit difficult to
> understand. When I have to use them, I just try one and see what
> happens. I don't understand them well enough to know which one I
> need.

Ditto. I have ended up with a small collection of "black magic" invocations copied from other folks' code, designed to do things like

"I wrote a function to read a file and generate a data frame. Now I want to iterate (vectorize) this over many files, and get a much larger data frame."

This may be one specific case of the larger challenge of "transforming R data structures". A somewhat pedantic set of recipes might usefully be evolved on e.g. the wiki.

- Allen S. Rout
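One common recipe for the case Allen describes is lapply() followed by do.call(rbind, ...). The sketch below is an illustration under stated assumptions, not anyone's actual code from the thread: `read_one` is a hypothetical reader, and the three CSV files are made up and written to a temporary directory so the example is self-contained.

```r
# Hypothetical reader: turns one file into a data frame and records its source.
read_one <- function(path) {
  df <- read.csv(path, stringsAsFactors = FALSE)
  df$source <- basename(path)
  df
}

# Create three small example files in a temporary directory.
dir <- tempfile("demo_")
dir.create(dir)
for (i in 1:3) {
  write.csv(data.frame(x = 1:2, y = c(i, i * 10)),
            file.path(dir, paste0("part", i, ".csv")),
            row.names = FALSE)
}

files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)

pieces <- lapply(files, read_one)   # a list with one data frame per file
combined <- do.call(rbind, pieces)  # stack them into a single data frame
```

lapply() is used rather than sapply() because it always returns a list, which is the form do.call() expects; rbind() then requires that every piece has the same columns.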
Re: [R] two questions for R beginners
You are definitely right... What to do with bad beginners' questions is not a simple issue. If a "beginner's mailing list" is created, who will answer such questions? Moreover, the beginners won't benefit from the other questions (I've personally learned a lot trying to understand the questions and answers to others' problems). And also, as you said, the problems might persist.

The beginner's mailing list might be good in one respect though: the "experts" who subscribe to it would be willing to help the beginners get started with R, knowing that the questions might not be clearly stated.

As you pointed out, the mailing list is not the best for basic stuff (the question is of course "what is basic?"). Not everybody has colleagues who work with R (I'm personally the first one to use R in my lab).

I think, somehow and I have no idea how, documentation and guidance on searching for help should be more accessible as soon as you start with R. Maybe a _*clear*_ section on the R homepage or in the "Introduction to R" manual like "where to find help", including all of the most common and useful resources available (from "?" and RSiteSearch() to R Wiki and Crantastic).

I hope that this whole discussion might help to make the R world better. Thank you Patrick for initiating it!

Regards,
Ivan
Re: [R] two questions for R beginners
Ivan Calandra wrote:
> Since you want input from beginners, here are some thoughts.
>
> I had and still have two big problems with R:
> - this vectorization thing. I've read many manuals (including R inferno), but I'm still not completely clear about it. In simple examples, it's fine. But when it gets a bit more complex, then... Related to it, the *apply functions are still a bit difficult to understand. When I have to use them, I just try one and see what happens. I don't understand them well enough to know which one I need.
> - the second problem is where to find the functions/packages I need. There are many options, and that's actually the problem. R Wiki, Rseek, RSiteSearch, Crantastic, etc... When you start with R, you discover that the capabilities of R are almost unlimited and you don't really know where to start, where to find what you need.
>
> As noted in earlier posts, the mailing list is really great, but some people are really hard on beginners. It was noted in a discussion a few days ago, but it looks like some don't realize how difficult it is at the beginning to formulate a good question, clear, with a self-contained example and so on. Moreover, not everybody speaks English natively. I don't mean that you must help, even when the question is really vague and not clear and whatever. I'm just saying that if you don't want to help (whatever the reason), you don't have to say it badly. But in any case, the mailing list is still really helpful. As someone noted (sorry I erased the email so I don't remember who), it might be a good idea to split it.

Hi everyone,

My 2ct about the mailing list :). I understand that beginners have a hard time formulating a good question. But the problem is that we can't answer the question when it is unclear. So either I:

- Don't bother answering
- Try to discuss with the author of the question, taking lots of time to find out what exactly the question is.
- Send a "read the posting guide" answer

I mostly do the first, as I have to get things done during my PhD :). So this leaves us with kind of a problem: the person mailing the list doesn't have the knowledge to ask the right question, the list can't answer properly and, consequently, the person mailing the list still doesn't get the information he/she needs. We could start an R-beginner mailing list, but this would also suffer from this problem. What do you guys think?

Maybe the mailing list is not the right medium for really basic stuff. For that I would recommend a good R book or (better) a course in R or (even better) some colleagues who work with R that you can ask questions.

cheers,
Paul

> Hope that's what you wanted
> Ivan
>
> On 2/26/2010 08:39, Dieter Menne wrote:
>> Patrick Burns wrote:
>>> * What were your biggest misconceptions or stumbling blocks to getting up and running with R?
>>
>> (This derives partly from teaching)
>>
>> The fact that this xapply-stuff was not idempotent (worse: not always) and that you need a monster like do.call() to straighten this out. Nowadays, plyr comes close.
>>
>> The concept of environment. With S it was worse, though.
>>
>> That you cannot change values "passed by reference". I noted that the latter is no problem for students who have not worked with C(++/#) before. That there is only one return-result in functions.
>>
>> "[" and the like as an operator.
>>
>> 10 years ago, when I started, the message was: S4 is the future, S3 is legacy. So I learned S4. Only to never use it in self-written code later. Might be different for BioConductor people.
>>
>> That sometimes you can use vectors not in data= (lattice), and sometimes not (ggplot2). Still a VERY confusing inconsistency.
>>
>> The "why-does-this-not-print" FAQ.
>>
>> Why does par(oma..) not work with lattice?
>>
>> Dieter

--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone: +3130 274 3113 Mon-Tue
Phone: +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul
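Several of the stumbling blocks named in this message (not knowing which *apply function to pick, the changing shape of results, do.call(), and "[" as an operator) can be seen in a few lines of base R. This is only a sketch with made-up inputs; it illustrates the behaviour and does not claim to be what any poster had in mind.

```r
# sapply() simplifies when it can, so the shape of the result depends on the data.
same_len <- sapply(1:3, function(i) c(i, i^2))   # every result has length 2: a matrix
diff_len <- sapply(1:3, function(i) seq_len(i))  # lengths differ: a list

# lapply() always returns a list; vapply() checks each result against a template.
always_list <- lapply(1:3, function(i) c(i, i^2))
checked <- vapply(1:3, function(i) i^2, numeric(1))

# do.call() passes the list elements as separate arguments, here to rbind().
rows <- do.call(rbind, always_list)

# "[" is itself a function, so it can be handed to the apply family.
x <- list(c(a = 1, b = 2), c(a = 3, b = 4))
second <- sapply(x, "[", 2)
identical(x[[1]][2], `[`(x[[1]], 2))
```

As a rough rule of thumb, lapply() or vapply() give predictable results in code that must not break on unusual input, while sapply() is convenient at the prompt.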
Re: [R] two questions for R beginners
On 25/02/2010 20:42, Greg Snow wrote:
> Patrick,
>
> I would add one more question:
>
> * where did you look for help expecting answers, but did not find them?

Yes, an excellent additional question.

Pat

> If you add hubris to laziness and impatience, you have Larry Wall's 3 virtues of a programmer.

[...]

--
Patrick Burns
pbu...@pburns.seanet.com
http://www.burns-stat.com (home of 'The R Inferno' and 'A Guide for the Unwilling S User')
Re: [R] two questions for R beginners
Saeed,

If the R-help list were split, what do you see as the pieces?

Pat

On 26/02/2010 01:53, Saeed Abu Nimeh wrote:
> On Thu, Feb 25, 2010 at 9:31 AM, Patrick Burns wrote:
>> * What were your biggest misconceptions or stumbling blocks to getting up and running with R?
>
> 1- Compared to other programming languages, it is hard to learn R by example, because it is hard to find code on the web that does exactly what you are looking for, although sometimes you might get lucky. By contrast, Perl, for example, is an easy language to learn by example.
>
> 2- The R mailing list. Beginners get frustrated after they struggle for a long time to solve a problem, and the easiest thing then is to send an email to the R mailing list. I did this in the past. The best thing that happened was that my request was neglected and I had to spend more time on the problem and eventually find a solution by myself. Do not get me wrong, I am not saying that the mailing list is bad, but it should be more organized. Maybe broken down into a couple of other mailing lists. This might bring up a good discussion thread.
>
>> * What documents helped you the most in this initial phase?
>
> An Introduction to R by Venables
> simpleR – Using R for Introductory Statistics by Verzani

--
Patrick Burns
pbu...@pburns.seanet.com
http://www.burns-stat.com (home of 'The R Inferno' and 'A Guide for the Unwilling S User')
Re: [R] two questions for R beginners
My difficulties:

1) Statistics :-) Well, I'm learning.
2) Understanding what is available *per subject area*. Something like the task views for packages should be compiled for basic commands/functions: all things related to string manipulation, all things related to number formatting, all *apply things, and so on. Something similar is available for the C runtime library functions (as in http://msdn.microsoft.com/en-us/library/2aza74he(VS.71).aspx ) and is really useful, also for expanding the number of functions one knows.
3) The diktat-like "avoid for loops!" without clear examples of alternatives. I found them later on the mailing list, but at the beginning it is not simple, especially coming from C/C++.
4) The for statement behaves differently from C/C++: for(i in 1:0) counts backward instead of stopping.
5) Missing small things like ++var.

On the positive side:

- It is not too difficult to set up something simple to create a decent chart.
- It is possible to use for loops without feeling guilty. :-)
- The documentation is very well done. Maybe some pages are still clear only to those who already know the subject.
- There are zillions of courses/papers/tutorials to read.
- After studying R by myself, I'm now becoming the local R expert, which from a workplace point of view is not bad...

Hope it helps.
Ciao!
mario

--
Ing. Mario Valle
Data Analysis and Visualization Group | http://www.cscs.ch/~mvalle
Swiss National Supercomputing Centre (CSCS) | Tel: +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82
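The two loop complaints in Mario's message (for(i in 1:0) counting backward, and "avoid for loops" without alternatives) can be illustrated with a small sketch. It is not from the original post, and the vectors x and y are made up for the example. 1:0 is the descending sequence c(1, 0), so the loop runs twice, whereas seq_len(0) and seq_along() of an empty vector are empty, so the loop is skipped.

```r
# 1:0 is the descending sequence c(1, 0), so this loop runs twice.
count_colon <- 0
for (i in 1:0) count_colon <- count_colon + 1

# seq_len(0) is an empty vector, so this loop body never runs.
count_seq <- 0
for (i in seq_len(0)) count_seq <- count_seq + 1

# seq_along() gives the same protection when looping over a vector
# that may be empty.
x <- numeric(0)              # hypothetical, possibly empty input
count_along <- 0
for (i in seq_along(x)) count_along <- count_along + 1

# A loop that squares each element ...
y <- c(1.5, 2, 4)            # made-up data
squares_loop <- numeric(length(y))
for (i in seq_along(y)) squares_loop[i] <- y[i]^2

# ... and its vectorized equivalent: arithmetic works element-wise.
squares_vec <- y^2
```

Using seq_len(n) or seq_along(x) instead of 1:n is one common way to get the "stop when there is nothing to do" behaviour that a C programmer expects.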
Re: [R] two questions for R beginners
Since you want input from beginners, here are some thoughts.

I had and still have two big problems with R:

- This vectorization thing. I've read many manuals (including The R Inferno), but I'm still not completely clear about it. In simple examples, it's fine. But when it gets a bit more complex, then... Related to it, the *apply functions are still a bit difficult to understand. When I have to use them, I just try one and see what happens. I don't understand them well enough to know which one I need.
- The second problem is where to find the functions/packages I need. There are many options, and that's actually the problem: R Wiki, Rseek, RSiteSearch, Crantastic, etc. When you start with R, you discover that the capabilities of R are almost unlimited and you don't really know where to start or where to find what you need.

As noted in earlier posts, the mailing list is really great, but some people are really hard on beginners. It was noted in a discussion a few days ago, but it looks like some don't realize how difficult it is at the beginning to formulate a good, clear question with a self-contained example and so on. Moreover, not everybody speaks English natively. I don't mean that you must help, even when the question is really vague and unclear. I'm just saying that if you don't want to help (whatever the reason), you don't have to say so harshly. But in any case, the mailing list is still really helpful. As someone noted (sorry, I erased the email so I don't remember who), it might be a good idea to split it.

Hope that's what you wanted.
Ivan
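For the question of which *apply function to reach for, a minimal side-by-side sketch may help. It is not from the thread; the list scores and the matrix m are made-up data, and only base R functions are used.

```r
scores <- list(a = c(1, 2, 3), b = c(10, 20))   # made-up data

# lapply(): always returns a list, one element per input element.
means_list <- lapply(scores, mean)

# sapply(): like lapply(), but simplifies the result when it can
# (here to a named numeric vector).
means_vec <- sapply(scores, mean)

# vapply(): like sapply(), but you declare the expected type and length
# of each result, so the output shape never changes silently.
means_checked <- vapply(scores, mean, numeric(1))

# apply(): for matrices; MARGIN = 1 means rows, MARGIN = 2 means columns.
m <- matrix(1:6, nrow = 2)
row_sums <- apply(m, 1, sum)
col_sums <- apply(m, 2, sum)
```

A rough rule of thumb under these assumptions: lapply() when a list is wanted, vapply() when a vector of known type is wanted, apply() for the rows or columns of a matrix.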
Re: [R] two questions for R beginners
Patrick Burns wrote:
> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?

(This derives partly from teaching.)

The fact that this xapply-stuff was not idempotent (worse: not always) and that you need a monster like do.call() to straighten this out. Nowadays, plyr comes close.

The concept of environment. With S it was worse, though.

That you cannot change values "passed by reference". I noted that the latter is no problem for students who have not worked with C(++/#) before. That there is only one return-result in functions.

"[" and the like as an operator.

10 years ago, when I started, the message was: S4 is the future, S3 is legacy. So I learned S4, only to never use it in self-written code later. Might be different for BioConductor people.

That sometimes you can use vectors not in data= (lattice), and sometimes not (ggplot2). Still a VERY confusing inconsistency.

The "why-does-this-not-print" FAQ.

Why does par(oma..) not work with lattice?

Dieter
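One reading of the xapply and do.call() remark, sketched in base R. This is an illustration, not Dieter's own code: the helper range_and_mean and all data are hypothetical. It shows sapply() returning different shapes depending on the function, do.call() binding a list of pieces, and a list used to return several values from a function.

```r
# sapply() changes its return type depending on what the function returns:
same_len <- sapply(1:3, function(i) c(i, i^2))   # equal lengths: a 2 x 3 matrix
diff_len <- sapply(1:3, function(i) seq_len(i))  # unequal lengths: a list

# A common pattern: build one small data frame per group with lapply(),
# then bind them with do.call(). do.call(f, list(a, b, c)) calls f(a, b, c).
pieces <- lapply(1:3, function(i) data.frame(id = i, square = i^2))
combined <- do.call(rbind, pieces)

# Functions return a single object; return a list to hand back several values.
range_and_mean <- function(x) list(range = range(x), mean = mean(x))
res <- range_and_mean(c(2, 4, 9))
```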
Re: [R] two questions for R beginners
On Thu, Feb 25, 2010 at 9:31 AM, Patrick Burns wrote:
> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?

1- Compared to other programming languages, it is hard to learn R by example, because it is hard to find code on the web that does exactly what you are looking for, although sometimes you might get lucky. By contrast, Perl, for example, is an easy language to learn by example.

2- The R mailing list. Beginners get frustrated after they struggle for a long time to solve a problem, and the easiest thing then is to send an email to the R mailing list. I did this in the past. The best thing that happened was that my request was neglected and I had to spend more time on the problem and eventually find a solution by myself. Do not get me wrong, I am not saying that the mailing list is bad, but it should be more organized. Maybe it could be broken down into a couple of other mailing lists. This might bring up a good discussion thread.

> * What documents helped you the most in this
> initial phase?

An Introduction to R by Venables
simpleR – Using R for Introductory Statistics by Verzani
Re: [R] two questions for R beginners
My biggest blocker was my misconception that R is extremely difficult to start with. It is powerful and one can do very complicated things (which consequently make things complicated), but it comes with very nice defaults and one can produce great results with standard tasks in very little time, especially if one has done programming and/or scripting before. I pushed it away for too long because of that. I wish I had used it years ago and avoided SPSS altogether; I must have wasted hundreds of hours doing repetitive tasks by click and partial scripts in SPSS. Not to mention a horrible license policy and a visualization unit that is simply embarrassing for a product that is in its 18th or 19th version.

Ralf
Re: [R] two questions for R beginners
On Thu, Feb 25, 2010 at 5:39 PM, Carl Witthoft wrote:
> Well, here goes...
>
> I still wish there were a really good monograph on the use and
> implementation of factors.

To get a good handle on factors, and the sets of contrasts they encode, it is really necessary to study a good statistics book. I recommend mine:

Statistical Analysis and Data Display: An Intermediate Course with Examples in S-Plus, R, and SAS, Richard M. Heiberger and Burt Holland, Springer 2004

But I will acknowledge that other books are available.

Rich
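As a small companion to the point about factors and the contrasts they encode, here is a sketch in base R. It is not taken from the book mentioned above; the variable dose is made-up data, and the output assumes the default options("contrasts") setting (treatment contrasts for unordered factors).

```r
# A factor stores integer codes plus a set of level labels.
dose <- factor(c("low", "high", "medium", "low"),
               levels = c("low", "medium", "high"))   # made-up data

codes <- as.integer(dose)   # positions in levels(dose)
labs  <- levels(dose)

# contrasts() shows how the levels are encoded as columns in a model.
# With the default treatment contrasts, the first level is the baseline.
ctr <- contrasts(dose)

# model.matrix() applies that encoding to the data.
mm <- model.matrix(~ dose)
```

Printing ctr and mm side by side shows which level became the baseline and which dummy columns a model would actually see.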
Re: [R] two questions for R beginners
My biggest stumbling block to getting up and running with R was whenever I was lazy and impatient.

The more you love R, the more it loves you back.

Tal

Contact me: tal.gal...@gmail.com | 972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
Re: [R] two questions for R beginners
Patrick Burns wrote:
> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?

R was the first scripting language that I *really* invested time in learning. Prior to R I had a few years' experience programming in Fortran and had worked on a few projects using Matlab. Because most of my programming experience was with Fortran, the toughest thing to get my head around was definitely lexical scoping, and the fact that, unlike Fortran subroutines, R function results had to be assigned to something in order to persist outside of the function.

Patrick Burns wrote:
> * What documents helped you the most in this
> initial phase?

Definitely the "An Introduction to R" manual that ships with the core distribution. It helped me translate my knowledge of programming concepts to the R language very quickly.
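The two stumbling blocks named in this message (results must be assigned to persist, and lexical scoping) can be shown in a few lines. This is a sketch with hypothetical function names (scale_by_two, make_adder), not code from the poster.

```r
scale_by_two <- function(v) {
  v <- v * 2     # modifies a local copy only
  v              # the value of the last expression is returned
}

x <- c(1, 2, 3)
scale_by_two(x)          # result is computed, then discarded
x_unchanged <- x         # x itself was not modified by the call
x <- scale_by_two(x)     # assign the result to keep it

# Lexical scoping: a function looks up free variables in the environment
# where it was defined, not where it is called.
make_adder <- function(n) {
  function(v) v + n      # n is found in make_adder's environment
}
add_five <- make_adder(5)
n <- 100                 # a global n does not affect add_five
result <- add_five(1)
```

Unlike a Fortran subroutine, scale_by_two() cannot alter its caller's x; the only way out is the return value.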
Re: [R] two questions for R beginners
> "The best way to spread information is to tell someone that it is a secret,
> the best way to keep it secret is to put it in a manual."

==> Nice quote. ;-) The problem is not that there's too little information; rather, there's so much. That's probably because R is so powerful, but it makes it tough to sieve out the relevant bits. Some of the info is way too technical to be practical. If I want to drive a car I do not necessarily need to know all the nitty-gritty about engine technology.

> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?

==> That R can't deal very well with large data, which is not entirely untrue. Also, I was learning another language (Python) and I didn't want R to interfere with that. Finally, in a working environment, it's almost impossible to justify the time 'lost' learning a new language. Managers generally don't give a %$# about the beauty and robustness of a language. They just want to get the job done asap.

> * What documents helped you the most in this
> initial phase?

==> Many docs. CRAN documents (pdfs), other tutorials, Bob Muenchen's book. Many docs == many angles == a good way to learn things.

> I especially want to hear from people who are
> lazy and impatient.

==> Lazy? n/a. Impatient? Yup, guilty as charged.

Cheers!!
Albert-Jan

~~ In the face of ambiguity, refuse the temptation to guess. ~~
Re: [R] two questions for R beginners
Well, here goes...

I still wish there were a really good monograph on the use and implementation of factors.

I had to do a certain amount of digging to learn that {assign, get, eval, expression, call, parse, deparse} all existed and how they play together. Sometimes they look like the C language's indirect addressing, *foo and &foo, and sometimes they don't. :-)

Remembering exactly what " y~x " can do and what it can't took a while.

Learning about, and watching for, 'lazy evaluation', especially in variables passed to a function, was a bit of a surprise.

And to echo others, "R-inferno" has been invaluable, along with the Zoonek manual.

Carl
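A few of the functions in that list, and the lazy-evaluation surprise, can be seen together in a short base R sketch. The names foo, ignore_arg and arg_name are hypothetical, and quote() and substitute() are added here for illustration although the message does not name them.

```r
# assign()/get(): refer to a variable by a name held in a string.
assign("foo", 42)
foo_value <- get("foo")

# parse() turns text into an unevaluated expression; eval() runs it.
expr <- parse(text = "foo + 1")
expr_value <- eval(expr)

# quote() captures code without running it; deparse() turns it back into text.
cl <- quote(sqrt(foo))
cl_text <- deparse(cl)

# Lazy evaluation: an argument is only evaluated when it is first used.
ignore_arg <- function(a, b) a      # b is never touched
lazy_ok <- ignore_arg(1, stop("never evaluated"))

# substitute() sees the unevaluated argument, which is how a function can
# recover the expression the caller typed.
arg_name <- function(v) deparse(substitute(v))
label <- arg_name(foo_value)
```

The call to ignore_arg() does not raise an error because stop() sits in an argument that is never used.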
Re: [R] two questions for R beginners
Patrick Burns wrote:
> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?
>
> * What documents helped you the most in this
> initial phase?
>
> I especially want to hear from people who are
> lazy and impatient.

Can't be bothered with questionnaires and can't wait to see your next book... ;-)

-pd

--
   O__       Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)                      FAX: (+45) 35327907
Re: [R] two questions for R beginners
Patrick,

I would add one more question:

* Where did you look for help expecting answers, but did not find them?

If you add hubris to laziness and impatience, you have Larry Wall's 3 virtues of a programmer.

To new users of R who may not understand why Patrick is asking: Patrick Burns is the author of some great tutorials/references on S/R and is probably looking for questions to answer in his next contribution.

Lately there have been a large number of questions on some fairly basic issues (and some rather complex issues that people expected to be simple/basic). My initial response (and probably that of others as well) to some of these requests was to quickly think that the answer is obvious and that the obvious place to look is ..., but then I realize that I am a high school dropout who has been using S/R for over 20 years, majored in statistics but reads Shakespeare for fun, and have been known to saw people in half for the entertainment of others; so I am probably not representative of most beginners. Fortune(89) probably applies here.

If R beginners will share their frustrations, where they looked but did not find answers (and why they looked there), what would have helped them, etc., then we (well, probably Patrick mostly) can do more to help the next set of beginners. It does not matter how good our answers are if they answer the wrong questions or are in places where the questioner never sees them.

"The best way to spread information is to tell someone that it is a secret, the best way to keep it secret is to put it in a manual."

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111
Re: [R] two questions for R beginners
On 2/25/10, Patrick Burns wrote:
> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?
>
> * What documents helped you the most in this
> initial phase?
>
> I especially want to hear from people who are
> lazy and impatient.

I'm quite resilient so I don't think I got to the point of frustration, but getting up to speed was a lengthy process. The biggest stumbling block was getting onto the console and not knowing what to do next. (My first encounter with stats was SPSS, so it was similar to getting onto a UNIX virtual console after a lifelong experience with point-and-click windows: it's not very reassuring to know that there are man pages.) I stayed in the what-do-I-do-next state of mind for about 6-12 months (I learned R by myself, and my professors were quite reticent when I first introduced them to R).

Of particular help in making progress were JGR (argument suggestions, editor with syntax highlighting, object browser, etc.), Rcmdr (quick access to examples for performing specific tasks, etc.) and Sweave + LyX (for easy results transfer and report creation, without the burden of learning LaTeX). For graphics, playwith, latticist and rggobi come in very handy.

From the documentation, right now I can recall Quick-R and "R for SAS and SPSS users". And of course, RSiteSearch (also via the sos package), Rseek and the vignettes are a must.

Regards
Liviu
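For readers stuck at the what-do-I-do-next console stage, the built-in search tools mentioned above can be tried directly. This is a minimal sketch using base and utils functions only; the search strings are arbitrary examples, and the interactive calls are guarded because they open a pager or a browser. Whether the site search behind RSiteSearch() still responds is not something the sketch checks.

```r
# apropos(): find objects on the search path whose names match a pattern.
# Here: everything whose name ends in "apply".
apply_family <- apropos("apply$")

# The remaining tools open help pages, a pager or a web browser,
# so they are only run in an interactive session.
if (interactive()) {
  help.search("regression")    # search installed help pages (same as ??regression)
  vignette()                   # list vignettes of installed packages
  RSiteSearch("mixed model")   # send a query to the R site search
}
```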
Re: [R] two questions for R beginners
Apparently I need to explain the "lazy and impatient" comment. No offence was intended (quite the contrary). The meaning of it is that the higher your level of frustration, the more valuable your comments are likely to be to me.

On 25/02/2010 17:31, Patrick Burns wrote:
> * What were your biggest misconceptions or
> stumbling blocks to getting up and running
> with R?
>
> * What documents helped you the most in this
> initial phase?
>
> I especially want to hear from people who are
> lazy and impatient.
>
> Feel free to write to me off-list. Definitely
> write off-list if you are just confirming what
> has been said on-list.

--
Patrick Burns
pbu...@pburns.seanet.com
http://www.burns-stat.com
(home of 'The R Inferno' and 'A Guide for the Unwilling S User')
Re: [R] two questions for R beginners
I started using statistical software with the commercial product S+ when I obtained a new HP735 workstation. We kept the S+ license going for a number of years until I heard about R. It was an easy transition, and because I have been proficient in Fortran and Perl, the scripting came naturally, except for some syntax similarities/differences between Perl and R interacting with a natural tendency towards dyslexia.

I especially like that I can slice and dice the data to ferret out relationships, e.g., concentration by hour of day, by month, by wind speed, by wind direction (love those boxplots). I also find that even the default settings produce some pretty attractive plots that are usable in many settings. I've also produced some pretty awful ones.

And the price always reminds me that I need to find every way possible to contribute to the overall good. I've forgotten too much of my Fortran and C programming skills to contribute directly to the R Project.

Clint

--
Clint Bowman             INTERNET: cl...@ecy.wa.gov
Air Quality Modeler      INTERNET: cl...@math.utah.edu
Department of Ecology    VOICE: (360) 407-6815
PO Box 47600             FAX: (360) 407-7534
Olympia, WA 98504-7600