Re: [R] Re: Thanks Frank, setting graph parameters, and why social scientists don't use R
There are answers that could and should be applied in specific situations.

At least in academia and in substantial research teams, statisticians ought to have a prominent part in many of the research teams. Senior statisticians should have a prominent role in deciding the teams to which this applies. Why should it be OK to combine high levels of chemical expertise with truly appalling statistical misunderstandings, to the extent that the supposed chemical insights are not what they appear to be?

There should be a major focus on training application area students to understand important ideas, to recognize when they are out of their depth, and to work with statisticians.

There should be much more use of statisticians in the refereeing of published papers. Editors need to seek advice from experienced statisticians (some do) on what sorts of papers are candidates for statistical refereeing.

Publication in an archive of the data that have been used for a paper could be a huge help, so that others can check whether the data really do support the conclusion. Even better, as Robert Gentleman has argued, would/will be papers that can be processed through Sweave or its equivalent.

Really enlightened people (in the statistical sense) in the applied communities will latch onto R, as some are doing, because the limitations inherent in much other software so often lead to crippled and/or misleading analyses. Increasingly, we can hope that it will become difficult for statistics in various applied area communities to proceed on its merry way, ignorant of or ignoring most of what has happened in the mainstream statistical community in the past 20 years.

The statistical community needs to be a lot more aggressive in demanding adequate standards of data analysis in applied areas, at the same time suggesting ways in which it can work with application area people to improve standards.

It is also fair to comment that the situation is very uneven. There are some areas where the standards are pretty reasonable, at least for the types of problems that typically come up in those areas.

John Maindonald.

John Maindonald             email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Bioinformation Science, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

On 18 Aug 2004, Bert Gunter wrote:

> So we see fairly frequently indications of misunderstanding and confusion in using R. But the problem isn't R -- it's that users don't know enough statistics.
> . . . .
> I wish I could say I had an answer for this, but I don't have a clue. I do not think it's fair to expect a mechanical engineer or psychologist or biologist to have the numerous math and statistical courses and experience in their training that would provide the base they need. For one thing, they don't have the time in their studies for this; for another, they may not have the background or interest -- they are, after all, mechanical engineers or biologists, not statisticians. Unfortunately, they could do their jobs as engineers and scientists a lot better if they did know more statistics. To me, it's a fundamental conundrum, and no one is to blame. It's just the reality, but it is the source for all kinds of frustrations on both sides of the statistical divide, which both you and Roger expressed in your own ways.
> . . . .

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Re: Thanks Frank, setting graph parameters, and why social scientists don't use R
On Tuesday 17 August 2004 09:20, Berton Gunter wrote:
> A few comments:

It has been decades since I used SPSS. At that time, to really work with it you edited a text file program that identified the data file and variable columns you wanted to work with. You assembled the flow of work commands after carefully going through the SPSS documentation. After you were ready, you ran the program and crossed your fingers.

R IS complex, yet usability at a basic level is readily achievable. What it lacks is simply the Stat 1 and Stat 101 packages that lead users from the very basics covered in introductory statistics texts into the more profound analyses that so many R users are interested in.

There are some texts, such as Peter Dalgaard's Introductory Statistics with R, which is a very useful book. However, from a student's viewpoint, Chapter 1 focuses on R, everything from the R language to R programming. The statistics chapters that follow almost seem to be used as an adjunct to teaching R rather than vice versa.

For some social science students, a package that leads more gradually into R would probably be a big help in learning the language while getting their feet wet in statistics.

John
Re: [R] Re: Thanks Frank, setting graph parameters, and why social scientists don't use R
On Tuesday 17 August 2004 06:14, Roger D. Peng wrote:
> I'm just curious, but how do social scientists, or anyone else for
> that matter, learn SPSS, besides taking a class?

They sit down with a book, a computer, and data they desperately need to analyze and start working. SPSS documentation and some of the third-party works are fairly thorough and pretty gentle, and the writing fits the expectations of someone who has had only the introductory stats courses.

Your teacher emphasizes checking the normality of the data, so you look for the means of measuring it and the tests that tell you whether it is significant or not, after very carefully considering the nature of your data in the light of the assumptions the SPSS tests make. You are far less concerned with the real mathematical mechanics than you are with meeting the expectations of the professor. SPSS, SYSTAT, NCSS and similar programs all support this kind of work.

Many social science professors don't really know enough to judge your work beyond similar expectations THEY learned from their own professors. It's sad, but that's the way it works in many schools.

J
Re: [R] Re: Thanks Frank, setting graph parameters, and why social scientists don't use R
A few comments:

First, your remarks are interesting and, I would say, mainly well founded. However, I think they are in many respects irrelevant, although they do point to the much bigger underlying issue, which Roger Peng also hinted at in his reply.

I think they are sensible because R IS difficult; the documentation is often challenging, which is not surprising given (a) the inherent complexity of R; (b) the difficulty in writing good documentation, especially when many of the functions being documented are inherently technical, so subject matter knowledge (CS, statistics, numerical analysis, ...) must be assumed; (c) the documentation has been written by a variety of mostly statistical types as a sidelight of their main professional activities -- none of these writers are **professional documenters** (whatever that may mean) and some of them even speak English as a second or third language. My own take is that the documentation for Core R and many of the packages is remarkably well done given these realities, and my hat is off to those who have produced it. Nevertheless, I agree, it is challenging -- it MUST be.

But they are irrelevant because the fundamental issue **is** that there is an inherent tension between ease of use and power/flexibility. Writing good GUIs for anything is hard, very hard. For a project such as R, it doesn't make sense, although it may make sense to write GUIs for small subsets of R targeted at specific audiences (as in BioConductor, RCommander, etc.). But even this is hard to do well and takes a lot of time and effort. So, IMHO, there never will be nor ever should/could be an overall GUI for R: it is too complex and needs to be too extensible and flexible to constrain it in that way.
However, I believe the larger question that both you and Roger Peng hint at is more important: not "How does a social scientist learn to use R," but how does any scientist/technologist for whom experimental design and data analysis form a large component of their work gain the necessary technical background in statistics and related disciplines (linear algebra, numerical analysis, ...) to **know how to use the statistical tools they need that R provides.** Software like SPSS must assume a limited collection of methods to present to its customers in an effective GUI. Their strategy **must** be (this is NOT a criticism) to "dumb it down" so that they can provide coherent albeit limited data analysis strategies. As you have explicitly stated, users who wish to venture outside those narrow paradigms are simply out of luck. R was designed from the outset not to be so constrained, but the cost is that you must know a good deal to use it effectively. It is obvious from the questions posted to this list that even something as "simple" as lm() often demands from users technical statistical understanding far beyond what they have. So we see fairly frequently indications of misunderstanding and confusion in using R. But the problem isn't R -- it's that users don't know enough statistics.

I wish I could say I had an answer for this, but I don't have a clue. I do not think it's fair to expect a mechanical engineer or psychologist or biologist to have the numerous math and statistical courses and experience in their training that would provide the base they need. For one thing, they don't have the time in their studies for this; for another, they may not have the background or interest -- they are, after all, mechanical engineers or biologists, not statisticians. Unfortunately, they could do their jobs as engineers and scientists a lot better if they did know more statistics. To me, it's a fundamental conundrum, and no one is to blame.
It's just the reality, but it is the source for all kinds of frustrations on both sides of the statistical divide, which both you and Roger expressed in your own ways.

Obviously, all of this is just personal ranting, so I would love to hear alternative views. And thanks again for your clear and interesting comments.

Cheers,
Bert

[EMAIL PROTECTED] wrote:
> First, many thanks to Frank Harrell for once again helping me out. This actually
> relates to the next point, which is my contribution to the 'why don't social
> scientists use R' discussion. I am a hybrid social scientist (child psychiatrist)
> who trained on SPSS. Many of my difficulties in coming to terms with R have been to
> do with trying to apply the logic underlying SPSS, with dire results. You do not
> want to know how long I spent looking for a 'recode' command in R, to change factor
> names and classes.
>
> I think the solution is to combine a graphical interface that encourages command
> line use (such as Rcommander) with the analyse(this) paradigm suggested, but also
> explaining how one can a) display the code on a separate window ('page' is only an
> obvious command once you know it), and b) how one can then save one's modification,
> make it generall
Re: [R] Re: Thanks Frank, setting graph parameters, and why social scientists don't use R
I'm just curious, but how do social scientists, or anyone else for that matter, learn SPSS, besides taking a class?

-roger

[EMAIL PROTECTED] wrote:
> First, many thanks to Frank Harrell for once again helping me out. This actually relates to the next point, which is my contribution to the 'why don't social scientists use R' discussion. I am a hybrid social scientist (child psychiatrist) who trained on SPSS. Many of my difficulties in coming to terms with R have been to do with trying to apply the logic underlying SPSS, with dire results. You do not want to know how long I spent looking for a 'recode' command in R, to change factor names and classes.
>
> I think the solution is to combine a graphical interface that encourages command line use (such as Rcommander) with the analyse(this) paradigm suggested, but also explaining how one can a) display the code on a separate window ('page' is only an obvious command once you know it), and b) how one can then save one's modification, make it generally available, and not overwrite the unmodified version (again, thanks, Frank). Finally, one would need to change the emphasis in basic statistical teaching from 'the right test' to 'the right model'. That should get people used to R's logic.
>
> If a rabbit starts to use R, s/he is likely to head for the help files associated with each function, which can assume that the reader can make sense of gnomic utterances like "Omit 'var' to impute all variables, creating new variables in 'search' position 'where'". I still don't know what that one means (as I don't understand search positions, or why they're important). This can be very off-putting, and could lead the rabbit to return to familiar SPSS territory.
>
> Finally, friendlier error messages would also help. It took me 3 days, and opening every function I could, to work out that '...cannot find function xxx.data.frame...' meant that MICE was unable to make a polychotomous logistic imputation model converge for the variable immediately preceding it.
>
> I am now off to the help files and FAQs to find out how to change graph parameters, as the plot.mids function in MICE a) doesn't allow one to select a subset of variables, and b) tells me that the graph it wants to produce on the whole of my 26 variable dataset is too big to fit on the (windows) plotting device. Unless anyone wants to tell me how/where? (which of course is why, in the end, R is EASIER to use than SPSS)
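
For readers arriving from SPSS, the 'recode' step mentioned in the original post (changing factor names and classes) can be sketched in base R roughly as follows. This is only an illustrative sketch, not what any poster in this thread used: the data frame `d` and the columns `dx` and `score` are hypothetical, and base R has no single function called 'recode'; assigning to levels() and converting with as.numeric() are simply one common way to get the same effect.

```r
# Hypothetical example data; the names 'd', 'dx' and 'score' are made up.
d <- data.frame(
  dx    = factor(c("a", "b", "a", "c")),
  score = c("1", "2", "3", "4"),
  stringsAsFactors = FALSE
)

# Rename factor levels ("recode" the labels) by assigning to levels().
# The new names are matched to the old levels in order: a, b, c.
levels(d$dx) <- c("anxiety", "depression", "other")

# Merge two levels into one by giving them the same new name.
d$dx2 <- d$dx
levels(d$dx2)[levels(d$dx2) %in% c("anxiety", "depression")] <- "emotional"

# Change the class of a column: character -> numeric.
d$score <- as.numeric(d$score)

# A factor whose labels are numbers must go through as.character() first;
# as.numeric() applied directly would return the internal integer codes.
f   <- factor(c("10", "20", "10"))
num <- as.numeric(as.character(f))

# Inspect the resulting classes and levels.
str(d)
```

The point of the sketch is that in R the labels of a factor live in levels(), separate from the data values, which is the main difference from the SPSS way of thinking about recoding.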