[R] R Online Workshops October 7-11

2013-09-11 Thread Muenchen, Robert A (Bob)
Learn R and/or data management at home, October 7 through 11.

http://r4stats.com/2013/09/11/learn-r-andor-data-management-from-home-october-7-11/

==
  Bob Muenchen (pronounced Min'-chen)
  Accredited Professional Statistician(tm)   
  Manager, Research Computing Support
  Voice: (865) 974-5230  
  Email: muenc...@utk.edu
  UT Web Site:   http://oit.utk.edu/research
  Personal Web Site: http://r4stats.com 
  News:  http://itc2.utk.edu/newsletter_monthly/
==

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How Rcmdr or na.exclude blocks TukeyHSD

2012-10-23 Thread Muenchen, Robert A (Bob)
Dear R-Helpers,

I was calling the TukeyHSD function and not getting confidence intervals or 
p-values. It turns out this was caused by missing data and the fact that I had 
previously turned on R Commander (Rcmdr). John Fox knew that Rcmdr sets 
na.action to na.exclude, which causes the problem. If you have this problem, 
you can either exit Rcmdr before calling TukeyHSD or you can set na.action to 
na.omit. The code below demonstrates the situation.

Cheers,
Bob

data(warpbreaks)
head(warpbreaks)

# Introduce a missing value:
warpbreaks$breaks[1] <- NA
head(warpbreaks)

# Fit a model:
fm1 <- aov(breaks ~ tension, data = warpbreaks)
TukeyHSD(fm1, "tension", ordered = TRUE)

# Setting na.exclude or starting Rcmdr will kill the confidence intervals:
options(na.action = "na.exclude")
fm1 <- aov(breaks ~ tension, data = warpbreaks)
TukeyHSD(fm1, "tension", ordered = TRUE)

# Setting na.omit or exiting Rcmdr will get it working again:
options(na.action = "na.omit")
fm1 <- aov(breaks ~ tension, data = warpbreaks)
TukeyHSD(fm1, "tension", ordered = TRUE)
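
A possible alternative, not from the original post: rather than changing the global option, na.action can be supplied for a single fit, since aov() hands extra arguments on to lm(). The sketch below assumes the same warpbreaks example; the names wb, old_opts, fm and tuk are made up for illustration.

```r
# Sketch: set na.action for one model fit instead of globally.
wb <- datasets::warpbreaks      # work on a copy of the built-in data
wb$breaks[1] <- NA              # introduce a missing value, as in the post

# Simulate the Rcmdr situation; options() returns the previous settings.
old_opts <- options(na.action = "na.exclude")

# aov() passes na.action through to lm(), overriding the global option here.
fm <- aov(breaks ~ tension, data = wb, na.action = na.omit)
tuk <- TukeyHSD(fm, "tension", ordered = TRUE)
print(tuk)

# Restore whatever na.action was in effect before.
options(old_opts)
```

This leaves the session-wide setting alone, so it should not interfere with whatever Rcmdr expects.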

=
  Bob Muenchen (pronounced Min'-chen), Manager  
  Research Computing Support
  Voice: (865) 974-5230  
  Email: muenc...@utk.edu
  Web:   http://oit.utk.edu/research, 
  News:  http://itc2.utk.edu/newsletter_monthly/




[R] Programming examples added to r4stats.com

2011-08-31 Thread Muenchen, Robert A (Bob)
Hi All,

I now have programming examples for common research tasks done in R, SAS, SPSS 
and Stata at http://r4stats.com.  The examples fall into the following 
categories:

Data Import & Export
Data Management
Enhancing Output
Graphics, ggplot2
Graphics, Traditional
Selecting Variables and Observations
Statistics

For the graphics examples, I got lazy and only show the R code done two ways. 

All the examples are from the books "R for SAS and SPSS Users" and "R for Stata 
Users". They've been downloadable with their practice data sets for a few 
years, but it's much easier to pop over there and find one thing on a 
web page than it is to download the whole set and sift through them. The second 
edition of "R for SAS and SPSS Users" should print tomorrow. The web site also 
shows the new topics in that edition under "What's New". We'll add those to the 
second edition of "R for Stata Users", but that won't happen until after a few 
other projects wrap up.

Cheers,
Bob

=
  Bob Muenchen (pronounced Min'-chen), Manager  
  Research Computing Support
  Voice: (865) 974-5230  
  Email: muenc...@utk.edu
  Web:   http://oit.utk.edu/research, 
  News:  http://oit.utk.edu/research/news.php




[R] Popularity of R, SAS, SPSS, Stata, Statistica, S-PLUS updated

2011-03-22 Thread Muenchen, Robert A (Bob)
Greetings,

I've just put out the latest version of "The Popularity of Data Analysis 
Software" at http://r4stats.com/popularity. This update includes complete data 
for 2010, the number of blogs for each package, more coverage of 
Statistica, and, where possible, measures regarding the implementations of the 
SAS Language: Carolina and the World Programming System (WPS).

Cheers,
Bob

=
  Bob Muenchen (pronounced Min'-chen), Manager  
  Research Computing Support
  Voice: (865) 974-5230  
  Email: muenc...@utk.edu
  Web:   http://oit.utk.edu/research, 
  News:  http://oit.utk.edu/research/news.php




[R] Teaching R: To quote, or not to quote?

2011-03-07 Thread Muenchen, Robert A (Bob)
Hi All,

When I teach an intro workshop on R, I've been minimizing quote confusion by 
always using quotes around package names in function calls. For example:

install.packages("Hmisc")
update.packages(oldPkgs = "Hmisc")  # named, because the first positional argument is lib.loc
library("Hmisc")
citation("Hmisc")
search()  # displays package names in quotes
detach("package:Hmisc")  # just as search displayed it

all look consistent with quotes. They're optional, of course, with library and 
detach and I tell them that. But for beginners, it's hard to remember when they 
don't need quotes. This perspective continues with function names in help:

help("mean")
?"mean"
help("if")
?"if"

which avoids the fact that some important topics like control-flow words (e.g. 
help(if) ) generate error messages without the quotes. For help, the quotes 
make the string a topic instead of a name, but that doesn't seem to block it 
from finding function names in quotes.
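
A related case where quoting matters, offered as a hedged sketch rather than anything from the workshop material: when the package name is held in a variable, library() and detach() need character.only = TRUE, otherwise they use the variable's name itself. The variable names below are hypothetical, and splines is used only because it ships with R.

```r
# Sketch: attaching and detaching a package whose name is stored in a variable.
pkg <- "splines"                        # any installed package would do

# Without character.only = TRUE, library(pkg) would look for a package named "pkg".
library(pkg, character.only = TRUE)
attached <- paste0("package:", pkg) %in% search()

# detach() takes the same argument; the name matches what search() displays.
detach(paste0("package:", pkg), character.only = TRUE)
still_attached <- paste0("package:", pkg) %in% search()
```

Always quoting package names means beginners already think of them as strings, which makes this step less surprising.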

I'm about to go to press with the second edition of "R for SAS and SPSS Users" 
and I'm wondering if there's a downside to this. No other books I've seen use 
library("package") or help("function") consistently. Is there a reason I should 
avoid it?

Thanks,
Bob

=
  Bob Muenchen (pronounced Min'-chen), Manager  
  Research Computing Support
  Voice: (865) 974-5230  
  Email: muenc...@utk.edu
  Web:   http://oit.utk.edu/research, 
  News:  http://oit.utk.edu/research/news.php




[R] (New) Popularity of R, SAS, SPSS, Stata...

2010-06-28 Thread Muenchen, Robert A (Bob)
Greetings, Listserv Readers,

At http://r4stats.com/popularity I have added plots, data, and/or
discussion of:

1. Scholarly impact of each package across the years
2. The number of subscribers to some of the listservs
3. How popular each package is among Google searches across the years
4. Survey results from a Rexer Analytics poll
5. Survey results from a KDnuggets poll
6. A rudimentary analysis of the software skills that employers are
seeking

Thanks very much to all the folks who helped on this project including:
John Fox, Marc Schwartz, Duncan Murdoch, Martin Weiss, John (Jiangtang)
HU, Andre Wielki, Kjetil Halvorsen, Dario Solari, Joris Meys, Keo
Ormsby, Karl Rexer, and Gregory Piatetsky-Shapiro.

If anyone can think of other angles, please let me know.

Cheers,
Bob

=
  Bob Muenchen (pronounced Min'-chen), Manager  
  Research Computing Support
  Voice: (865) 974-5230  
  Email: muenc...@utk.edu
  Web:   http://oit.utk.edu/research, 
  News:  http://oit.utk.edu/research/news.php
  Feedback: http://oit.utk.edu/feedback/
=



Re: [R] Popularity of R, SAS, SPSS, Stata...

2010-06-26 Thread Muenchen, Robert A (Bob)


-Original Message-
From: Joris Meys [mailto:jorism...@gmail.com]
Sent: Friday, June 25, 2010 10:10 PM
To: Muenchen, Robert A (Bob)
Cc: Dario Solari; r-help@r-project.org
Subject: Re: [R] Popularity of R, SAS, SPSS, Stata...

I had taken the opposite tack with Google Trends by subtracting keywords
like:
SAS -shoes -airlines -sonar...
but never got as good results as that beautiful "X code for" search.
When you see the end-of-semester panic bumps in traffic, you know you're
nailing it!

 I have to eat those words already. The "R code for" search that showed
 a peak every December did not have quotes around it, so it was searching
 for those three words, not the complete phrase. When you add the quotes,
 the peaks vanish.

Don't swallow! You're looking through search terms, not through web
pages. "R code for regression", "regression code R" etc. are all valid
searches, no quotation marks needed.

I wondered why those clear peaks had vanished when I added quotes.
Here's one that combines the search terms without the quotes. It shows
several March/April & October/November peaks: 

http://www.google.com/insights/search/#q=r%20code%20for%2Br%20manual%2Br%20tutorial%2Br%20graph%2Csas%20code%20for%2Bsas%20manual%2Bsas%20tutorial%2Bsas%20graph%2Cspss%20code%20for%2Bspss%20manual%2Bspss%20tutorial%2Bspss%20graph%2Cstata%20code%20for%2Bstata%20manual%2Bstata%20tutorial%2Bstata%20graph%2Cs-plus%20code%20for%2Bs-plus%20manual%2Bs-plus%20tutorial%2Bs-plus%20graph&cmpt=q

I've been trying to make sense of Google Scholar searches. I'm obviously
missing something basic. Here are two searches on www.google.com:

sas - gets 68M hits
sas OR spss - gets 74.3M hits. A bigger number as OR would imply.

But when I do the same searches on scholar.google.com, here's what I
get:

sas - gets 4.6M hits
sas OR spss - gets 1.65M hits

How on earth can an OR get you less??

Thanks,
Bob


http://www.google.com/insights/search/#q=code%20for%20r%2Ccode%20for%20SAS%2Ccode%20for%20SPSS%2Ccode%20for%20matlab&cmpt=q

This one is nice too. You can see that the bump in the autumn semester
for R is replacing the one for Matlab. Then in the spring semester
Matlab stays high but R drops. And both the US and India always have a
very large search index, whereas the rest of the world is essentially
worthless. Which leads me to the conclusion that: 1) The results are
probably coming from google.com, excluding local versions, and 2) in
the US (and India), statistics is mainly taught in the autumn
semester. Given the fact that daylight has a beneficial effect on
emotional well-being, the unpopularity of statistics is likely caused
by unfortunate scheduling.

Forget Excel. Google rocks! ;-)

Cheers
Joris


 Once you go the phrase route, you gain precision but end up with zero
 counts on various phrases. I avoided that by combining them with + to
 get enough to plot. The resulting graph shows SAS dominant until
 mid-2006 when SPSS takes the top position, followed by R, SAS, Stata in
 order:

http://www.google.com/insights/search/#q=%22r%20code%20for%22%2B%22r%20manual%22%2B%22r%20tutorial%22%2B%22r%20graph%22%2C%22sas%20code%20for%22%2B%22sas%20manual%22%2B%22sas%20tutorial%22%2B%22sas%20graph%22%2C%22spss%20code%20for%22%2B%22spss%20manual%22%2B%22spss%20tutorial%22%2B%22spss%20graph%22%2C%22stata%20code%20for%22%2B%22stata%20manual%22%2B%22stata%20tutorial%22%2B%22stata%20graph%22%2C%22s-plus%20code%20for%22%2B%22s-plus%20manual%22%2Bs-plus%20tutorial%22%2B%22s-plus%20graph%22&cmpt=q

 This might be a good one to add to http://r4stats.com/popularity

 Bob


I see that there's a car, the "R Code Mustang", that adding "for" gets
rid of.

Thanks for getting me back on a topic that I had given up on!

Bob

Re: [R] Popularity of R, SAS, SPSS, Stata...

2010-06-25 Thread Muenchen, Robert A (Bob)


-Original Message-
From: Liviu Andronic [mailto:landronim...@gmail.com]
Sent: Friday, June 25, 2010 7:15 AM
To: Muenchen, Robert A (Bob)
Cc: r-help@r-project.org
Subject: Re: [R] Popularity of R, SAS, SPSS, Stata...

On Sun, Jun 20, 2010 at 2:31 PM, Muenchen, Robert A (Bob)
muenc...@utk.edu wrote:
 come up with so far at http://r4stats.com/popularity . I'm sure people
 will have plenty of ideas on how to improve this, so please let me
know
 what you think.

This is not much of a metric, probably not even a ballpark, but I have
a habit of measuring the popularity of a software by the number of
unread messages in my mail account, sent to one of its main mailing
lists. For example, I subscribed to Gentoo, Xfce and LyX MLs much
earlier than to that of R, but R quickly surpassed all in number
of unread messages. At the moment I have the following: R ( 37k), LyX
(10k), Debian (7k), Xfce (3k), Geany (.5k). I dare say that R might
be more popular than Debian, but again, any such estimation seems
farfetched.

Regards
Liviu

Hi Liviu,

E-mail was the thing that got me back to this paper. I had been working on 
variations of measures for several years and was frustrated mostly by how many 
problems I ran into regarding search logic ("SAS" stands for about 15 
scientific topics and of course "R" is far worse). I have all my listserv email 
routed to a set of folders which I always empty at the same time. I noticed 
that recently R-Help had really taken off and that Statalist had surpassed 
SAS-L. So I got the latest monthly data from the listservs and switched the 
program from doing yearly counts to means of the monthly figures so I could add 
2010 to it. Figure 1 at http://r4stats.com/popularity is indeed the number of 
emails sent by each of the listservs. All these measures have their own 
limitations, but I find that graph the most interesting since it includes the 
trends across time.
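
To illustrate why means of monthly figures allow a partial year to be included, here is a rough sketch with made-up numbers (the counts and the names posts, yearly_total and yearly_mean are hypothetical, not the actual listserv data or the program mentioned above).

```r
# Sketch: yearly totals versus yearly means of monthly counts.
# The data are invented: a full 2009 and only five months of 2010.
posts <- data.frame(
  year  = c(rep(2009, 12), rep(2010, 5)),
  count = c(rep(100, 12), rep(120, 5))
)

# Totals make the incomplete year look like a collapse in traffic.
yearly_total <- aggregate(count ~ year, data = posts, FUN = sum)

# Means of the monthly figures stay comparable across years.
yearly_mean <- aggregate(count ~ year, data = posts, FUN = mean)

print(yearly_total)
print(yearly_mean)
```

The mean for the partial year still assumes the observed months are typical, so seasonal swings could bias it.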

Cheers,
Bob


Re: [R] Popularity of R, SAS, SPSS, Stata...

2010-06-25 Thread Muenchen, Robert A (Bob)
I had taken the opposite tack with Google Trends by subtracting keywords
like:
SAS -shoes -airlines -sonar... 
but never got as good results as that beautiful "X code for" search.
When you see the end-of-semester panic bumps in traffic, you know you're
nailing it! 

I see that there's a car, the "R Code Mustang", that adding "for" gets rid
of. 

Thanks for getting me back on a topic that I had given up on!

Bob

-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org]
On Behalf Of Joris Meys
Sent: Thursday, June 24, 2010 7:56 PM
To: Dario Solari
Cc: r-help@r-project.org
Subject: Re: [R] Popularity of R, SAS, SPSS, Stata...

Nice idea, but quite sensitive to search terms, if you compare your
result on "... code" with "... code for":
http://www.google.com/insights/search/#q=r%20code%20for%2Csas%20code%20for%2Cspss%20code%20for&cmpt=q

On Thu, Jun 24, 2010 at 10:48 PM, Dario Solari dario.sol...@gmail.com
wrote:
 First: excuse my English.

 My opinion: a useful source for measuring popularity can be Google
 Insights for Search - http://www.google.com/insights/search/#

 Every person using software like R, SAS, SPSS needs first to learn
 it. So probably he makes a web search for a manual, a tutorial, a
 guide. One can measure the share of this kind of search query.
 This kind of result can be useful to determine trends of
 popularity.

 Example 1: R tutorial/manual/guide, SAS tutorial/manual/guide,
 SPSS tutorial/manual/guide

http://www.google.com/insights/search/#q=%22r%20tutorial%22%2B%22r%20manual%22%2B%22r%20guide%22%2B%22r%20vignette%22%2C%22spss%20tutorial%22%2B%22spss%20manual%22%2B%22spss%20guide%22%2C%22sas%20tutorial%22%2B%22sas%20manual%22%2B%22sas%20guide%22&cmpt=q

 Example 2: R software, SAS software, SPSS software

http://www.google.com/insights/search/#q=%22r%20software%22%2C%22spss%20software%22%2C%22sas%20software%22&cmpt=q

 Example 3: R code, SAS code, SPSS code

http://www.google.com/insights/search/#q=%22r%20code%22%2C%22spss%20code%22%2C%22sas%20code%22&cmpt=q

 Example 4: R graph, SAS graph, SPSS graph

http://www.google.com/insights/search/#q=%22r%20graph%22%2C%22spss%20graph%22%2C%22sas%20graph%22&cmpt=q

 Example 5: R regression, SAS regression, SPSS regression

http://www.google.com/insights/search/#q=%22r%20regression%22%2C%22spss%20regression%22%2C%22sas%20regression%22&cmpt=q

 Some examples are cross-software (learning needs - Example 1), others can
 be biased by the traditional use of that software (in SPSS usually you
 don't manipulate graphs, I think)





--
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php



Re: [R] Popularity of R, SAS, SPSS, Stata...

2010-06-25 Thread Muenchen, Robert A (Bob)


-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org]
On Behalf Of Muenchen, Robert A (Bob)
Sent: Friday, June 25, 2010 3:08 PM
To: Joris Meys; Dario Solari
Cc: r-help@r-project.org
Subject: Re: [R] Popularity of R, SAS, SPSS, Stata...

I had taken the opposite tack with Google Trends by subtracting
keywords like:
SAS -shoes -airlines -sonar...
but never got as good results as that beautiful "X code for" search.
When you see the end-of-semester panic bumps in traffic, you know
you're nailing it!

I have to eat those words already. The "R code for" search that showed a
peak every December did not have quotes around it, so it was searching
for those three words, not the complete phrase. When you add the quotes,
the peaks vanish. 

Once you go the phrase route, you gain precision but end up with zero
counts on various phrases. I avoided that by combining them with + to
get enough to plot. The resulting graph shows SAS dominant until
mid-2006 when SPSS takes the top position, followed by R, SAS, Stata in
order:

http://www.google.com/insights/search/#q=%22r%20code%20for%22%2B%22r%20manual%22%2B%22r%20tutorial%22%2B%22r%20graph%22%2C%22sas%20code%20for%22%2B%22sas%20manual%22%2B%22sas%20tutorial%22%2B%22sas%20graph%22%2C%22spss%20code%20for%22%2B%22spss%20manual%22%2B%22spss%20tutorial%22%2B%22spss%20graph%22%2C%22stata%20code%20for%22%2B%22stata%20manual%22%2B%22stata%20tutorial%22%2B%22stata%20graph%22%2C%22s-plus%20code%20for%22%2B%22s-plus%20manual%22%2Bs-plus%20tutorial%22%2B%22s-plus%20graph%22&cmpt=q

This might be a good one to add to http://r4stats.com/popularity 

Bob


I see that there's a car, the "R Code Mustang", that adding "for" gets
rid of.

Thanks for getting me back on a topic that I had given up on!

Bob


Re: [R] Popularity of R, SAS, SPSS, Stata...

2010-06-24 Thread Muenchen, Robert A (Bob)


-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org]
On Behalf Of Dr. David Kirkby
Sent: Tuesday, June 22, 2010 7:49 PM
To: r-help@r-project.org
Subject: Re: [R] Popularity of R, SAS, SPSS, Stata...
...

I don't know how practical it is with R, but with Mathematica, MATLAB
etc., job adverts are a good bet if you care about outside academia.

I like this idea. About three years ago I got all the job advertisements
from Monster.com and did a content analysis on their software
requirements. The jobs were for "statistician" and "data miner". SAS was
the biggest data analysis package by far. I hope to do this again soon
and add it to http://r4stats.com/popularity. I'll send out a notice when
I do. 

The tough part may be picking the best sources of position descriptions.
The free ones (Monster, etc.) apparently have problems now with lots of
fake postings. Wait...I've got it, R-job-listings, I don't even have to
go collect the data, it gets emailed to me! I wonder if the results will
be different this time. ;-)

You did get me thinking in reverse though. Don't choose a job title,
just search for software. That way the shill job postings, which are
duplicates of real jobs with the employer changed, would (I hope) affect
them all about the same. A quick search shows:

SAS, SPSS >1,000 (that's the most it will report)
Stata 72
S-PLUS 53
Minitab 194
JMP 54
R - well, there's R & D, A / R (accounts receivable)... Even SAS+R gets
more R & D.
  Many of the S-PLUS listings say "R or S-PLUS", so the 53 figure is
probably the best we can get without much more effort.
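
The trouble with searching for a one-letter name can be shown with a small sketch. The job-ad snippets and variable names below are invented for illustration, and the exclusion pattern only covers the two false hits mentioned above, so it is nowhere near a complete filter.

```r
# Sketch: counting ads that mention R the language, with made-up ad text.
ads <- c(
  "Statistician: SAS and R programming required",
  "Manager, R & D department",
  "Clerk for A / R (accounts receivable)",
  "Analyst with S-PLUS or R experience"
)

# A standalone capital R, bounded by non-word characters.
has_r <- grepl("\\bR\\b", ads, perl = TRUE)

# Known false hits: "R & D" and "A / R".
false_hit <- grepl("R\\s*&\\s*D|A\\s*/\\s*R", ads, perl = TRUE)

mentions_r <- has_r & !false_hit
print(sum(mentions_r))
```

Every new abbreviation containing a lone R needs its own exclusion, which is probably why this gets impractical on real postings.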

Bob



I found very few job ads wanting Mathematica skills, but lots wanting
MATLAB. I doubt it is practical to search on the letter R though.

jobsite.com, monster.com etc.

Dave



Re: [R] Popularity of R, SAS, SPSS, Stata...

2010-06-22 Thread Muenchen, Robert A (Bob)
Interesting! I had no idea there were R-help lists in other languages. I don't 
see it on http://www.r-project.org/mail.html, but then that's in English! Is 
there a list of such sites?

Thanks,
Bob

-Original Message-
From: Kjetil Halvorsen [mailto:kjetilbrinchmannhalvor...@gmail.com]
Sent: Monday, June 21, 2010 9:12 AM
To: Muenchen, Robert A (Bob)
Cc: ted.hard...@manchester.ac.uk; r-help@r-project.org
Subject: Re: [R] Popularity of R, SAS, SPSS, Stata...

One should also take into account the other R lists. For example, as of
today the number of subscribers to
R-help-es (R-help for Spanish speakers) is 290, and increasing.

Kjetil Halvorsen

On Sun, Jun 20, 2010 at 6:28 PM, Muenchen, Robert A (Bob)
muenc...@utk.edu wrote:


-Original Message-
From: r-help-boun...@r-project.org
 [mailto:r-help-boun...@r-project.org]
On Behalf Of Ted Harding
Sent: Sunday, June 20, 2010 3:42 PM
To: r-help@r-project.org
Subject: Re: [R] Popularity of R, SAS, SPSS, Stata...


I've given thought in the past to the question of estimating the R
user base, and came to the conclusion that it is impossible to get
an estimate of the number of users that one could trust (or even
put anything like a margin of error to).

I think one could get a number which represented a moderately
informative lower bound -- just count the number of different email
addresses that have ever posted to the R-help list. This will of
course include people who post (or have posted) from more than one
email address, and people who tried R for a while and then dropped
it, but my feeling is that these are likely to be outweighed by the
number of people who have used R but have never posted (for example
students who are getting their R help from their instructors, people
using R in a corporate context who are discouraged from posting to
public lists, etc.).
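
As a rough sketch of the counting idea above, assuming the "From:" header lines of the archive were available as a character vector (the lines and names below are invented; no real archive is read):

```r
# Sketch: count distinct posting addresses from "From:" header lines.
from_lines <- c(
  "From: Ann Example <ann@example.org>",
  "From: ANN@EXAMPLE.ORG",
  "From: Bo Example <bo@example.com>",
  "From: ann.other@example.net (Ann Example)"
)

# Pull out the first thing on each line that looks like an email address.
addr <- regmatches(
  from_lines,
  regexpr("[[:alnum:]._%+-]+@[[:alnum:].-]+", from_lines)
)

# Case differences should not create extra posters.
n_posters <- length(unique(tolower(addr)))
print(n_posters)
```

As noted above, one person posting from two addresses is still counted twice here, so the count is an estimate rather than a strict lower bound.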

 Ted, that's a very interesting suggestion. Do you know of a practical
 way of getting that count?


The number of subscribers to R-help (currently about 10200) is
a definite lower bound for the number of R users, but many users
post to R-help without being subscribed.

 10,200 is quite an amazing number! Here are the numbers of subscribers
 to:

 SAS-L     3,251
 SPSSX-L   2,103
 Statalist 1,847
 S-PLUS - haven't figured out how to get this yet

 How did you get the R-help figure?


I would expect that the total number of different email addresses
that have posted to R-help would be considerably larger than 10200.

I don't think a Mark-Recapture approach is feasible.

Further, I don't know how one might take account of the fact that
some installations of R (e.g. on a corporate or institutional
or departmental server) may each be used by several users.

 The server question in particular intrigues me. Research organizations
 are stuffed with high performance clusters. The cost of all the
 commercial packages is just incredible. Even at the heavily discounted
 rate academia gets, they're still unaffordable. However, if queried
 we'd find the commercial packages on them, but limited to 4 out of 2,500
 nodes! You might see the reverse in industry, with one mainframe copy
 of SAS serving hundreds of users.

 Cheers,
 Bob


Ted.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 20-Jun-10                                       Time: 20:41:43
-- XFMail --



Re: [R] Popularity of R, SAS, SPSS, Stata...

2010-06-21 Thread Muenchen, Robert A (Bob)


-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org]
On Behalf Of Ted Harding
Sent: Sunday, June 20, 2010 9:01 PM
To: r-help@r-project.org
Subject: Re: [R] Popularity of R, SAS, SPSS, Stata...
...

John and I discussed the snowball idea at some length off-list,
and that is when I came to the conclusion (for reasons such as
the above) that although it had some mileage, and could provide
information supplementary to other methods, the extent of its
potential reach into the unknown was, well, unknowable ...
[with acknowledgement to Donald Rumsfeld].

That's a good point. Even when we know the total sampling frame and
contact them all, we rarely see more than a 10% response rate. We often
go on the assumption that whoever responded was a random selection
(occasionally checked via phone interview) and if so, the overall
estimates of ATTITUDES will be accurate. However, wanting to know how
many could have responded is a different problem. Hm. Nets are sounding
better all the time! ;-)


In response to the question from Bob Muenchen, "How did you
get the R-help figure?" (of email addresses subscribed to R-help):
since I am one of the list moderators I can log in and access the
subscribers' list.

As of today, the numbers are:

 4629 Non-digested Members of R-help
 5560 Digested Members of R-help
 (190 private members not shown)

10379

Great. The figures from UGA cover SAS (3,253) and SPSS (2,105). I'm
still waiting to hear from people about Statalist and S-NEWS. I'll post
them on the site as soon as I have those.

Thanks!
Bob 


(A few more than the number I picked up some days ago).

Ted.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 21-Jun-10   Time: 02:00:45
-- XFMail --



Re: [R] Popularity of R, SAS, SPSS, Stata...

2010-06-21 Thread Muenchen, Robert A (Bob)


-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org]
On Behalf Of Patrick Burns
Sent: Monday, June 21, 2010 5:16 AM
To: r-help@r-project.org
Subject: Re: [R] Popularity of R, SAS, SPSS, Stata...

I think there is a problem with the
question:  Not everyone thinks of R
as a statistics program.  Furthermore,
I don't think it should be thought of
as a statistics program.

(Statistics is what stuffy professors
do, I just look at my data and try to
figure out what it means.)

Pat,

Yes, I think that's why the Business Intelligence crowd prefers
"Analytics," "Data Mining," etc. The official reason may be that it combines
methods drawn from statistics, machine learning and artificial
intelligence, but I suspect the marketers really want to avoid
"statistics" as "that one class I barely survived." I debated what to
call that page and ended up using "Analytical Software." I'm not so
happy with that either. -Bob 


On 20/06/2010 23:46, Muenchen, Robert A (Bob) wrote:


 -Original Message-
 From: r-help-boun...@r-project.org
 [mailto:r-help-boun...@r-project.org]
 On Behalf Of Muenchen, Robert A (Bob)
 Sent: Sunday, June 20, 2010 6:43 PM
 To: Hadley Wickham; ted.hard...@manchester.ac.uk
 Cc: r-help@r-project.org
 Subject: Re: [R] Popularity of R, SAS, SPSS, Stata...



 -Original Message-
 From: r-help-boun...@r-project.org
 [mailto:r-help-boun...@r-project.org]
 On Behalf Of Hadley Wickham
 ...  What about snowball
 sampling with R-help as an initial frame?

 That's an interesting idea! I could put together a Two-item web
survey:

 1. What stat package do you use?
 2. What's your main email address

 P.S. the email address was an attempt to keep people from stuffing
the
 ballot box but on the other hand, it could turn people off. I guess
the
 number of blank fields would tell us which.

 Also, stat package choice would have to be a check all that apply
 question.


 If they choose R, I could optionally ask what their favorite
packages
 are. I might be able to get that on a web survey this week if it
 doesn't
 get too crazy.

 Bob


 Hadley

 --
 Assistant Professor / Dobelman Family Junior Chair
 Department of Statistics / Rice University
 http://had.co.nz/



--
Patrick Burns
pbu...@pburns.seanet.com
http://www.burns-stat.com
(home of 'Some hints for the R beginner'
and 'The R Inferno')



Re: [R] Popularity of R, SAS, SPSS, Stata...

2010-06-21 Thread Muenchen, Robert A (Bob)


-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org]
On Behalf Of Joris Meys
Sent: Monday, June 21, 2010 5:32 AM
To: Patrick Burns
Cc: r-help@r-project.org
Subject: Re: [R] Popularity of R, SAS, SPSS, Stata...

On Mon, Jun 21, 2010 at 11:15 AM, Patrick Burns
pbu...@pburns.seanet.com wrote:

 (Statistics is what stuffy professors
 do, I just look at my data and try to
 figure out what it means.)

Often those stuffy professors have a reason to do so. When they want
an objective view on the data for example, or an objective measure of
the significance of a hypothesis. But you're right, who cares about
objectiveness these days? It doesn't sell you a paper, does it?

Joris,

Perhaps we can coin a term that's the statistical equivalent of Stephen
Colbert's "truthiness": when a study totally fails to find anything, but
it just feels significant in your gut!

Bob 


Cheers
Joris


--
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php



[R] Popularity of R, SAS, SPSS, Stata...

2010-06-20 Thread Muenchen, Robert A (Bob)
Hi All,

I've been fiddling around with various ways to estimate the popularity
of R, SAS, SPSS, Stata, JMP, Minitab, Statistica, Systat, BMDP, S-PLUS,
R-PLUS and Revolution R. It's not an easy task. You can see what I've
come up with so far at http://r4stats.com/popularity . I'm sure people
will have plenty of ideas on how to improve this, so please let me know
what you think.

Cheers,
Bob 

=
  Bob Muenchen (pronounced Min'-chen), Manager  
  Research Computing Support
  Voice: (865) 974-5230  
  Email: muenc...@utk.edu
  Web:   http://oit.utk.edu/research, 
  News:  http://oit.utk.edu/research/news.php
  Feedback: http://oit.utk.edu/feedback/
=



Re: [R] Popularity of R, SAS, SPSS, Stata...

2010-06-20 Thread Muenchen, Robert A (Bob)


-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org]
On Behalf Of Stefan Grosse
Sent: Sunday, June 20, 2010 10:25 AM
To: r-help@r-project.org
Subject: Re: [R] Popularity of R, SAS, SPSS, Stata...

On 20.06.2010 15:31, Muenchen, Robert A (Bob) wrote:

 I've been fiddling around with various ways to estimate the
popularity
 of R, SAS, SPSS, Stata, JMP, Minitab, Statistica, Systat, BMDP, S-
PLUS,
 R-PLUS and Revolution R. It's not an easy task. You can see what I've
 come up with so far at http://r4stats.com/popularity . I'm sure
people
 will have plenty of ideas on how to improve this, so please let me
know
 what you think.

Your analysis is quite web-based. But to define what popular means is -
I believe - hard. 

Stefan,

I agree with all your points. What I have so far is nowhere near the big
picture, but it's a start. When you install some software it asks if you
mind it reporting usage stats back to its home site. I know that sort of
thing has been discussed before on R-help. I'd love to see that added so
we would have a better estimate of R's user base. 

Cheers, 
Bob


R is open source and very broad in its different
applications so of course it generates much more e-mail and web traffic
because there are many different uses and users.

SPSS and Stata for example are closed and very specialized. You get
support also directly from the company and do not necessarily need a
mailing list. Does this mean that they are less popular? I'd say no.

So the question I would raise here is whether it is a fair comparison.
I know that in a sufficiently narrow subset of statistics, such as panel
econometrics, Stata is by far the leader, and for time series econometrics
Eviews and Gauss lead in research. I would say that in the industry that I
know, plus in econometrics research, those programs are much more widespread
or popular. To measure their popularity I would say an
industry-and-education-wide questionnaire should be used.

Plus the list is not complete, so I would also name Matlab, Gauss, Ox, and
Eviews, from my area of interest (econometrics), as popular proprietary
software.

I do not deny that R is becoming more popular, but I doubt whether
mailing lists and search requests are enough to prove this hypothesis.

My 2cents
Stefan



Re: [R] Popularity of R, SAS, SPSS, Stata...

2010-06-20 Thread Muenchen, Robert A (Bob)


-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org]
On Behalf Of David Winsemius
Sent: Sunday, June 20, 2010 1:05 PM
To: Stefan Grosse
Cc: r-help@r-project.org
Subject: Re: [R] Popularity of R, SAS, SPSS, Stata...


On Jun 20, 2010, at 10:24 AM, Stefan Grosse wrote:

 On 20.06.2010 15:31, Muenchen, Robert A (Bob) wrote:

 I've been fiddling around with various ways to estimate the
 popularity
 of R, SAS, SPSS, Stata, JMP, Minitab, Statistica, Systat, BMDP, S-
 PLUS,
 R-PLUS and Revolution R. It's not an easy task. You can see what
I've
 come up with so far at http://r4stats.com/popularity . I'm sure
 people
 will have plenty of ideas on how to improve this, so please let me
 know
 what you think.

 Your analysis is quite web-based. But to define what popular means
 is -
 I believe - hard. R is open source and very broad in its different
 applications so of course it generates much more e-mail and web
 traffic
 because there are many different uses and users.

 SPSS and Stata for example are closed and very specialized.

I suspect proponents of their use would actively dispute the very
specialized description.

Here at UT SPSS is dominant across a wide range of departments with
around 3,600 users. The older professors never stopped programming in it
while the many programming-phobic students love its point-and-click
interface. SAS is also used widely with about 800 users, many of them
caused by class requirements. When it comes to dissertation time, many
switch over to SPSS. Stata has around 120 concentrated in just a few
departments. With R it's hard to tell as we don't get local counts and R
users tend to not need much consulting support.

Cheers,
Bob


 You get
 support also directly from the company and do not necessarily need a
 mailing list. Does this mean that they are less popular? I'd say no.

I was under the impression that both SAS and Stata actively support
their two mailing lists, but the SAS FAQ disputes this impression
regarding SAS.


 So the question I would raise here is whether it is a fair
comparison?
 I know that is a sufficient statistics-subset like panel econometrics
 Stata is by far leading and for time series econometrics Eviews,
Gauss
 in research. I would say that in the industry that I know plus in
 econometrics research those programs are much more widespread or
 popular. To measure their popularity I would say a
 industry-and-education-wide-questionnaire should be used.

 Plus it is not sufficient so I would also name Matlab, Gauss, Ox,
 Eviews
 from the areas of my interest (econometrics) as popular
 proprietary
 software.

 I do not deny that R is becoming more popular, but I doubt whether
 mailing lists and search requests are enough to prove this
hypothesis.

Certainly there are additional factors that might influence the
absolute numbers of posting to a particular mailing list. The SAS
mailing list/newsgroup, SAS-L/comp.soft-sys.sas, has a well-
established Internet presence. Each one probably has a particular
culture. (I was stunned to see the low number of daily posts to
comp.soft-sys.sas when I just looked at the last week on
Google Groups.) I didn't think either the SAS or the Stata lists had
any sort of published or informal effort to steer users in the
direction of R-ing the FM, searching-before-posting, or admonishments
to RT-FAQ.  However, now that I look, it does appear that the
Statalist FAQ  makes an effort similar to that of the r-help Posting
Guide. There may be differences in the degree and clarity of the
documentation as well. The Stata distribution includes a medium-sized
library. All of that said, ..., the relative frequency of postings
would seem to be less subject to such influences.

The SAS curve with its peak in 2006-2008 and significantly lower
numbers in more recent years contrasted with the steady increase in R
and Stata would seem to reflect a material shift. Agreed, you cannot
say that R passed SAS in number of active users, or that SAS has the
same number of users as Stata. The flatness of SPSS also appears
meaningful.  And within the R/S world the differences in the activity
on Snews and rhelp are likewise pretty dramatic.

--

David Winsemius, MD
West Hartford, CT



Re: [R] Popularity of R, SAS, SPSS, Stata...

2010-06-20 Thread Muenchen, Robert A (Bob)
I wonder if there are any capture-recapture type methodologies for
estimating open-source software usage?  Another idea would be to
combine with some other known numbers, e.g. book sales, conference
attendance etc. You'd need personal information to link the data sets
together.

Hadley

This totally cracked me up! I'm envisioning going into one of our
computer labs, tossing a net over an unsuspecting student, and then
tagging their ear with a code that represents which stat package they're
using. Then release and later recapture. What percent did we get? That's
what the profs I deal with do with animals to estimate populations.
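
The tag-and-recapture arithmetic described above can be sketched in R. This is only an illustration, assuming the simple two-sample Lincoln-Petersen estimator; the function name and all of the counts are made up and do not come from any real lab survey.

```r
# Two-sample capture-recapture (Lincoln-Petersen) estimate of population size.
# marked:     number tagged in the first sample
# caught:     size of the second sample
# recaptured: how many in the second sample already carried a tag
lincoln_petersen <- function(marked, caught, recaptured) {
  stopifnot(recaptured > 0, recaptured <= min(marked, caught))
  marked * caught / recaptured
}

# Hypothetical counts: tag 50 students, later catch 40, of whom 10 are tagged.
n_hat <- lincoln_petersen(marked = 50, caught = 40, recaptured = 10)
n_hat
```

The estimate relies on the tagged share of the second sample (10 of 40) matching the tagged share of the whole population, an assumption that may be hard to defend for software users.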

Conference attendance might be easy to get if I remember to contact the
people running them. Does anyone know how many we expect at UseR 2010? I
recall SAS conferences with 3,500 but data analysis is a tiny part of
that conference. I also heard someone say that they took it to Hawaii
one year to REDUCE the attendance as it had grown so large. Sounds crazy
to me, but if there are attempts to manage the figures, that could muck
up the interpretation. Well, all these approaches have their own
problems, so that's just another limitation of the study. I think SPSS
Directions has more like 500 but it's all focused on some sort of
analysis.

I did try to count books at Amazon and papers published via Google
Scholar. Those searches are devilishly difficult for SAS, let alone for
the letter R!

An easy one to get should be number of list subscribers. I'll try to get
those figures. Anyone know it for R-help?

Cheers,
Bob


PS.  It would be also interesting to see the contributions of the
R-SIG mailing lists and other specialised R related mailing lists.  My
feeling is that there is not a lot of overlap between the members of
the ggplot2 mailing list and R-help.

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/



Re: [R] Popularity of R, SAS, SPSS, Stata...

2010-06-20 Thread Muenchen, Robert A (Bob)


-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org]
On Behalf Of Ted Harding
Sent: Sunday, June 20, 2010 3:42 PM
To: r-help@r-project.org
Subject: Re: [R] Popularity of R, SAS, SPSS, Stata...


I've given thought in the past to the question of estimating the R
user base, and came to the conclusion that it is impossible to get
an estimate of the number of users that one could trust (or even
put anything like a margin of error to).

I think one could get a number which represented a moderately
informative lower bound -- just count the number of different email
addresses that have ever posted to the R-help list. This will of
course include people who post (or have posted) from more than one
email address, and people who tried R for a while and then dropped
it, but my feeling is that these are likely to be outweighed by the
number of people who have used R but have never posted (for example
students who are getting their R help from their instructors, people
using R in a corporate context who are discouraged from posting to
public lists, etc.).

Ted, that's a very interesting suggestion. Do you know of a practical
way of getting that count?
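
One way such a count might be obtained, assuming the list archive is available as text with one "From:" header per message, is sketched below. The header lines, the regular expression, and the variable names are all hypothetical; a real archive would need more careful parsing (obfuscated addresses, multiple addresses per person, and so on).

```r
# Hypothetical "From:" header lines, as they might appear in a list archive.
from_lines <- c(
  "From: Ted Harding <ted@example.org>",
  "From: bob@example.edu (Bob)",
  "From: TED@example.org",
  "From: Hadley <hadley@example.com>"
)

# A deliberately simple pattern for an email address.
pattern <- "[[:alnum:]._%+-]+@[[:alnum:].-]+\\.[[:alpha:]]{2,}"

# Extract the first address on each line and fold case so that
# TED@example.org and ted@example.org count as one poster.
addresses <- tolower(regmatches(from_lines, regexpr(pattern, from_lines)))

n_posters <- length(unique(addresses))
n_posters
```

As the message above notes, this still counts a person once per address they have posted from, so it is a rough figure rather than a count of individuals.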


The number of subscribers to R-help (currently about 10200) is
a definite lower bound for the number of R users, but many users
post to R-help without being subscribed.

10,200 is quite an amazing number! Here are the subscriber counts for
the other lists:

SAS-L      3,251
SPSSX-L    2,103
Statalist  1,847
S-PLUS     haven't figured out how to get this yet

How did you get the R-help figure? 


I would expect that the total number of different email addresses
that have posted to R-help would be considerably larger than 10200.

I don't think a Mark-Recapture approach is feasible.

Further, I don't know how one might take account of the fact that
some installations of R (e.g. on a corporate or institutional
or departmental server) may each be used by several users.

The server question in particular intrigues me. Research organizations
are stuffed with high performance clusters. The cost of all the
commercial packages is just incredible. Even at the heavily discounted
rate academia gets, they're still unaffordable. However, if queried we'd
find the commercial packages on them, but limited to 4 out of 2,500
nodes! You might see the reverse in industry, with one mainframe copy of
SAS serving hundreds of users. 

Cheers,
Bob


Ted.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 20-Jun-10   Time: 20:41:43
-- XFMail --



Re: [R] Popularity of R, SAS, SPSS, Stata...

2010-06-20 Thread Muenchen, Robert A (Bob)


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Ivan Calandra
Sent: Sunday, June 20, 2010 3:47 PM
To: r-help@r-project.org
Subject: Re: [R] Popularity of R, SAS, SPSS, Stata...

Bob,

I have no idea whether it is realistic, but if you look for the papers
that used R or SAS (or anything), you might get better results by
searching for the way R and SAS are cited.

Hi Ivan, that was what I tried when more generic keywords failed. However, 
almost no one seems to use that citation. For example, in 2009, only 28 papers 
contain R Foundation and 61 contain Bioconductor, which uses R. One single 
paper contains both. I appreciate the idea though!

Thanks,
Bob


It looks to me that what I'm saying is not clear, so here is an example.
To cite R in a paper you have to write it this way:
 > citation("base")
To cite R in publications use:
  R Development Core Team (2009). R: A language and environment for
  statistical computing. R Foundation for Statistical Computing, Vienna,
  Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.

So instead of searching for "R", searching for "R Development Core Team"
might give better results. The same goes for SAS or any other
software.
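
For anyone wanting to pull the exact citation wording to use as a search phrase, a minimal sketch follows. It assumes only the citation() and toBibtex() functions from the utils package; the author string returned depends on the installed R version, so it may not match the 2009 text quoted above.

```r
# Retrieve the recommended citation for R itself.
ref <- citation()

# Show it as plain text, e.g. to copy phrases into a literature search.
print(ref, style = "text")

# The same entry as BibTeX, for a reference manager.
bib <- toBibtex(ref)
bib
```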

If that doesn't help, just forget it!

Ivan



On 20 June 2010 at 21:07, Muenchen, Robert A (Bob) wrote:

 I wonder if there are any capture-recapture type methodologies for
 estimating open-source software usage?  Another idea would be to
 combine with some other known numbers, e.g. book sales, conference
 attendance etc. You'd need personal information to link the data sets
 together.

 Hadley

 This totally cracked me up! I'm envisioning going into one of our
 computer labs, tossing a net over an unsuspecting student, and then
 tagging their ear with a code that represents which stat package
they're
 using. Then release and later recapture. What percent did we get?
That's
 what the profs I deal with do with animals to estimate populations.

 Conference attendance might be easy to get if I remember to contact
the
 people running them. Does anyone know how many we expect at UseR 2010?
I
 recall SAS conferences with 3,500 but data analysis is a tiny part of
 that conference. I also heard someone say that they took it to Hawaii
 one year to REDUCE the attendance as it had grown so large. Sounds
crazy
 to me, but if there are attempts to manage the figures, that could
muck
 up the interpretation. Well, all these approaches have their own
 problems, so that's just another limitation of the study. I think
SPSS
 Directions has more like 500 but it's all focused on some sort of
 analysis.

 I did try to count books at Amazon and papers published via Google
 Scholar. Those searches are devilishly difficult for SAS let alone for
 letter R!

 An easy one to get should be number of list subscribers. I'll try to
get
 those figures. Anyone know it for R-help?

 Cheers,
 Bob


 PS.  It would be also interesting to see the contributions of the
 R-SIG mailing lists and other specialised R related mailing lists.
My
 feeling is that there is not a lot of overlap between the members of
 the ggplot2 mailing list and R-help.

 --
 Assistant Professor / Dobelman Family Junior Chair
 Department of Statistics / Rice University
 http://had.co.nz/




--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Institut und Museum
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php



Re: [R] Popularity of R, SAS, SPSS, Stata...

2010-06-20 Thread Muenchen, Robert A (Bob)


-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org]
On Behalf Of Hadley Wickham
...  What about snowball
sampling with R-help as an initial frame?

That's an interesting idea! I could put together a Two-item web survey:

1. What stat package do you use?
2. What's your main email address

If they choose R, I could optionally ask what their favorite packages
are. I might be able to get that on a web survey this week if it doesn't
get too crazy.

Bob


Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/



Re: [R] Popularity of R, SAS, SPSS, Stata...

2010-06-20 Thread Muenchen, Robert A (Bob)


-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org]
On Behalf Of Muenchen, Robert A (Bob)
Sent: Sunday, June 20, 2010 6:43 PM
To: Hadley Wickham; ted.hard...@manchester.ac.uk
Cc: r-help@r-project.org
Subject: Re: [R] Popularity of R, SAS, SPSS, Stata...



-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org]
On Behalf Of Hadley Wickham
...  What about snowball
sampling with R-help as an initial frame?

That's an interesting idea! I could put together a Two-item web survey:

1. What stat package do you use?
2. What's your main email address

P.S. the email address was an attempt to keep people from stuffing the
ballot box but on the other hand, it could turn people off. I guess the
number of blank fields would tell us which.

Also, stat package choice would have to be a check all that apply
question. 


If they choose R, I could optionally ask what their favorite packages
are. I might be able to get that on a web survey this week if it
doesn't
get too crazy.

Bob


Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/



[R] R for Stata Users

2010-05-22 Thread Muenchen, Robert A (Bob)
Dear R-Helpers,

If you know of any Stata users looking to learn R, our book R for Stata
Users finally shipped this week. A software snag delayed the printing
of all Springer books for quite a few weeks. A description of that book,
and reviews of its predecessor, R for SAS and SPSS Users is at
http://r4stats.com . The example programs and data files used by both
books are also at that site.

Cheers,
Bob Muenchen & Joe Hilbe

=
  Bob Muenchen (pronounced Min'-chen), Manager  
  Research Computing Support
  Voice: (865) 974-5230  
  Email: muenc...@utk.edu
  Web:   http://oit.utk.edu/research, 
  News:  http://oit.utk.edu/research/news.php
  Feedback: http://oit.utk.edu/feedback/
=



Re: [R] SAS for R-users

2010-05-15 Thread Muenchen, Robert A (Bob)
 Thomas Levine wrote:
Bob Muenchen says that 'Ralph O’Brien says that
in a few years there will be so many students
graduating knowing mainly R that [he]’ll need to
write, “SAS for R Users.” That’ll be the day!'

Heh! I quite agree. I've had a few people write me saying they had used my book 
R for SAS and SPSS Users to learn SAS, but I certainly didn't aim for that 
when writing it. For R programmers wanting to learn SAS, here's what I 
recommend:

1. Read the text of the free version of R for SAS and SPSS Users at 
http://r4stats.com. That version has extremely short explanations of the 
differences by topic. Most of the explanation about R is in the form of 
comments in the R programs, which you can skip of course. The SAS programs will 
give you an idea of the basics. The book version adds lots of explanation but 
it's all about R, so skip that.

2. Read The Little SAS Book 
http://www.amazon.com/Little-SAS-Book-Primer-Third/dp/1590473337/ref=sr_1_1?ie=UTF8&s=books&qid=1273963558&sr=8-1
 

This is a quick and easy read that covers the basics well.

3. Read SAS and R 
http://www.amazon.com/SAS-Management-Statistical-Analysis-Graphics/dp/1420070576/ref=sr_1_1?ie=UTF8&s=books&qid=1273963594&sr=1-1

SAS and R is a good book that covers both SAS and R. The explanations are 
very brief but well written. That brevity allows it to cover a lot of ground.

4. For in-depth topics, the SAS documentation is well written and all online: 
http://support.sas.com/documentation/index.html 

Although the SAS manuals are online, knowing what to look up is the challenge 
for an R user. That's where 1 and 3 will help. 

Get ready for a whole different kind of world!

Cheers,
Bob

=
  Bob Muenchen (pronounced Min'-chen), Manager  
  Research Computing Support
  Voice: (865) 974-5230  
  Email: muenc...@utk.edu
  Web:   http://oit.utk.edu/research, 
  News:  http://oit.utk.edu/research/news.php
  Feedback: http://oit.utk.edu/feedback/
=





Re: [R] A primitive OO in R -- where next?

2010-05-15 Thread Muenchen, Robert A (Bob)
Hi All,

This was a very interesting question and I enjoyed reading everyone's
responses. I've played around with it and summarized some of the
variations below.

Cheers,
Bob

# A fun example of how a list can store both a function 
# and data for that function. 

# Create a list that contains both a function and some data:

myList <- list(Fun=mean, x=c(1,2,3,4,5))
myList

# Execute the function on the data
myList$Fun(myList$x)

# Alternately, put the data into the function call:

as.call(myList)

# And evaluate it:

eval( as.call(myList) )

# Why does this not work?

myList <- list(Fun=mean, mydata=c(1,2,3,4,5))
eval( as.call(myList) )

# Because mean has an x argument
# and that does not exist since it's now named mydata

myList <- list(Fun=mean, x=c(1,2,3,4,5))
eval( as.call(myList) )

# And how does order affect it?
# Let's put the data first

myList <- list(x=c(1,2,3,4,5), Fun=mean)

# This works fine:
myList$Fun(myList$x)

# But look at what as.call tries to do:
as.call(myList)

# So evaluating it would be nonsense:
eval( as.call(myList) )
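
A variation that does not depend on element order can be sketched with do.call(). This is an illustrative addition rather than one of the experiments above; the names failed and result are introduced here only for the example.

```r
# Same list as the last example: data first, function second.
myList <- list(x = c(1, 2, 3, 4, 5), Fun = mean)

# as.call() treats the first element (the data) as the function,
# so evaluating the call fails; try() captures the error.
failed <- inherits(try(eval(as.call(myList)), silent = TRUE), "try-error")
failed

# do.call() takes the function and the argument list separately,
# so the position of Fun within myList no longer matters.
result <- do.call(myList$Fun, myList["x"])
result
```

Because myList["x"] keeps the name x, the argument is still matched to mean's x parameter by name.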



-Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org]
On Behalf Of S Ellison
Sent: Thursday, May 13, 2010 8:07 AM
To: Ted Harding; r-h...@stat.math.ethz.ch
Subject: Re: [R] A primitive OO in R -- where next?

R OO is documented for S3 classes under section 5 (Object-oriented
programming) in the R language definition.

I guess the issue is somewhat philosophical as to how you use it.

R philosophy _mostly_ separates data from operations on data, so the OO
model provides classes for data and essentially separate methods that
apply to those classes. This is the kind of model sometimes called a
'visitor pattern'. An alternative is to include operations on the data
within the data object, which sometimes has advantages if you want to
simplify the look of code for things like display (instead of a display
method for each class, one effectively sends a message to any object of
the form "display yourself here"). In practice, of course, one ends up
writing class-specific operations code; the difference is pretty much
where it's stored.

On balance there seems to me a rationale for a statistician to separate
data from the operations performed on it; one collects and curates data
carefully, so it has a kind of lifecycle of its own that is unrelated to
mathematical operations performed on it.

But I have allowed _data_ objects to include functions or at least
function names when it is a necessary part of the description of the
data. For example, in some of our interlaboratory studies labs give
uncertainty information in the form of a variance or interval, but may
additionally tell us what the assumed distribution is (eg Normal, t,
lognormal etc). It then makes sense to have the distribution as part of
the data. For these functions, the root name (norm, t, etc.) suffices in
conjunction with do.call, but to generalise completely, one can consider
allowing a user to specify the distribution as (say) some arbitrary
density function or density/probability family. (It's pretty rare that
we'd need that, but hey - thinking ahead and all that). That would
generate data which in part consisted of a function describing the
(assumed) associated distribution.
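
As a rough illustration of the root-name idea above, here is a minimal base-R sketch. The record layout (estimate, sd, dist) and the name labResult are hypothetical, chosen only for this example; the real interlaboratory data format is not shown in the thread.

```r
# Hypothetical record from one lab: an estimate, its standard
# uncertainty, and the root name of the assumed distribution.
labResult <- list(estimate = 10.2, sd = 0.5, dist = "norm")

# Build the quantile function name ("qnorm") from the root name and
# call it through do.call() to get a central 95% interval.
qfun <- paste0("q", labResult$dist)
interval <- do.call(qfun, list(p = c(0.025, 0.975),
                               mean = labResult$estimate,
                               sd = labResult$sd))
interval
```

The same pattern extends to the "d", "p" and "r" prefixes, although other distributions need their own parameter names (for example df for t), so the argument list would have to be stored with the data as well.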

Steve Ellison

 Ted Harding ted.hard...@manchester.ac.uk 12/05/2010 22:48:17 
Greetings All,

Out of curiosity, I've just done a very primitive experiment:

  Obj <- list(Fun=sum, Dat=c(1,2,3,4))
  Obj$Fun(Obj$Dat)
  # [1] 10

That sort of thing (much more sophisticated) must be documented
mind-blowingly somewhere. Where?

Where I stand right now: The above (and its immediately obvious
generalisations, like Obj$Fun <- cos) is all I know about it so far.

Ted.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 12-May-10   Time: 22:48:14
-- XFMail --



[R] Data Mining Survey

2010-05-12 Thread Muenchen, Robert A (Bob)
Dear R-Helpers,

SAS Institute just mailed out the notice below regarding a survey of
people who do data mining. To help keep the survey from becoming biased
toward commercial software, I thought it would be good to post it here
as well.

Cheers,
Bob

Fourth Annual Data Miner Survey
Rexer Analytics has asked statistical and data mining software vendors
to forward this survey as a courtesy. (SAS is not a sponsor of the
survey.) To participate, click here and enter the access code (KP2970)
in the space provided. Your responses will be confidential. For a copy
of the survey findings, provide your email address at the end of the
questionnaire.

=
  Bob Muenchen (pronounced Min'-chen), Manager  
  Research Computing Support
  Voice: (865) 974-5230  
  Email: muenc...@utk.edu
  Web:   http://oit.utk.edu/research, 
  News:  http://oit.utk.edu/research/news.php
  Feedback: http://oit.utk.edu/feedback/
=



Re: [R] Data Mining Survey

2010-05-12 Thread Muenchen, Robert A (Bob)
Oops! I forgot that R-help strips out HTML. When I checked the link, it
referenced SAS.COM. I've written Karl Rexer for a more appropriate one.
More soon. -Bob



Re: [R] Data Mining Survey

2010-05-12 Thread Muenchen, Robert A (Bob)
OK, here's a link specific to R-Help readers:

http://rexeranalytics.com/Data-Miner-Survey-2010-Intro2.html 

And you can tell people to use this access code:  MD21C9 

Cheers,
Bob



Re: [R] How good is R at making publication quality tables?

2010-03-17 Thread Muenchen, Robert A (Bob)
Hi Paul,

Sorry I didn't get to that subject in the first edition of "R for SAS and SPSS 
Users". Several of the options people have mentioned will be in the second 
edition, although that's about a year off. I did get them added to "R for Stata 
Users", due out in early April.

Cheers,
Bob


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Paul Miller
Sent: Wednesday, March 17, 2010 10:51 AM
To: r-help@r-project.org
Subject: [R] How good is R at making publication quality tables?

Hello Everyone,

I have just started learning R and am in the process of figuring out
what it can and can't do. I must say I am very impressed with R so far
and am amazed that something this good can actually be free.

Recently, I finished reading "R for SAS and SPSS Users" and have begun
reading "SAS and R" and "Data Manipulation with R". Based on what I've read
in these books and elsewhere, I get the impression that R is very good
at drawing high quality graphs but maybe not so good at creating nice
looking tables of the sort I'm used to getting through SAS ODS.

Am I right or wrong about this? If I am wrong, can anyone show me some
examples of how R can be used to create really nice looking tables? I
often make tables of adverse events in clinical trials that have n(%)
values in the cells. I'd love to see an example that does a nice job of
making that sort of table but would be happy to see any examples that
someone might be willing to send to me.
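
For the n (%) layout described above, a minimal base-R sketch with made-up data might look like the following. The variable names and the data are purely illustrative, and the percentages here are shares of reported events within each arm; packages that produce publication-quality output are not shown.

```r
# Made-up adverse-event data: one row per reported event.
arm   <- factor(c("Drug", "Drug", "Drug", "Placebo", "Placebo", "Placebo"))
event <- factor(c("Headache", "Nausea", "Headache", "Headache", "Nausea", "Nausea"))

counts <- table(event, arm)                     # n per cell
pct    <- 100 * prop.table(counts, margin = 2)  # column percentages

# Paste counts and percentages into "n (%)" strings, keeping the layout.
cells <- sprintf("%d (%.1f%%)", as.vector(counts), as.vector(pct))
nPctTable <- matrix(cells, nrow = nrow(counts), dimnames = dimnames(counts))
noquote(nPctTable)
```

The resulting character matrix can then be handed to whatever table-formatting tool is preferred.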

Thanks,

Paul





Re: [R] Updated comparison table for SAS-SPSS Add-ons and R Functions

2010-01-14 Thread Muenchen, Robert A (Bob)
Hi Liviu,

Thanks for those suggestions. I've made the changes and added you to the list 
of contributors. 

Cheers,
Bob

 -Original Message-
 From: Liviu Andronic [mailto:landronim...@gmail.com]
 Sent: Thursday, January 14, 2010 7:06 AM
 To: Muenchen, Robert A (Bob)
 Cc: sa...@listserv.uga.edu; spss...@listserv.uga.edu; r-h...@r-project.org
 Subject: Re: [R] Updated comparison table for SAS-SPSS Add-ons and R Functions
 
 On 1/14/10, Liviu Andronic landronim...@gmail.com wrote:
  Perhaps add latticist and playwith to the list of Graphics,
 Interactive?
 
 .. and remove latticist from Graphics, Static. Also, add rattle to
 Graphical user interfaces?
 
 Liviu


[R] Updated comparison table for SAS-SPSS Add-ons and R Functions

2010-01-13 Thread Muenchen, Robert A (Bob)
Hi All,

I have substantially expanded the table that compares SAS and SPSS
add-on modules to somewhat equivalent R packages. This new version is
at:
http://r4stats.com/add-on-modules 
and I would very much appreciate any feedback you might have on it.

The site http://r4stats.com is the replacement to
http://RforSASandSPSSusers.com and includes the support files for both
R for SAS and SPSS Users and the new R for Stata Users, due out in
March from Springer. I'll phase the older site out eventually and change
the URL to point to the new one.

Thanks,
Bob

=
  Bob Muenchen (pronounced Min'-chen), Manager  
  Research Computing Support
  Voice: (865) 974-5230  
  Email: muenc...@utk.edu
  Web:   http://oit.utk.edu/research, 
  News:  http://oit.utk.edu/research/news.php
=



Re: [R] Updated comparison table for SAS-SPSS Add-ons and R Functions

2010-01-13 Thread Muenchen, Robert A (Bob)
From: b.rowling...@googlemail.com [mailto:b.rowling...@googlemail.com] On 
Behalf Of Barry Rowlingson
Sent: Wednesday, January 13, 2010 7:03 PM
To: Muenchen, Robert A (Bob)
Cc: r-help@r-project.org
Subject: Re: [R] Updated comparison table for SAS-SPSS Add-ons and R Functions

Maybe the first thing you should do is a global search and replace of 'SPSS' 
with 'PASW'

 http://www.spss.com/software/product-name-guide/

Barry

One of the things I updated was to *remove* the now-obsolete PASW! Since IBM 
bought the company, they did away with that and renamed things IBM SPSS 
 See the list at:
http://spss.com/software/statistics/ 
They still have some old web pages to clean up as you point out.

Cheers, 
Bob



Re: [R] R Packages Crack the 3,000 Mark!

2009-11-25 Thread Muenchen, Robert A (Bob)
Hi Liviu,

Yes, I selected all the repositories on the list, including things like CRAN 
(extras), the four Bioconductor (BioC) sites, and R-Forge. 

Cheers,
Bob

-Original Message-
From: Liviu Andronic [mailto:landronim...@gmail.com] 
Sent: Wednesday, November 25, 2009 4:47 AM
To: Muenchen, Robert A (Bob)
Cc: r-help@r-project.org
Subject: Re: [R] R Packages Crack the 3,000 Mark!

Hello


On 11/24/09, Muenchen, Robert A (Bob) muenc...@utk.edu wrote:
  I don't know if this has been reported before, but according to Henrique
  Dallazuanna's program (below) the number of R packages has exceeded the
  3,000 mark. The count today is 3,175. I ran this just a couple of months
  ago and the number was still in the high 2,000s, so it must be fairly
  recent. I think this represents about 50% growth in the last year. Not
  bad!

Performing the same here I get only 2000+ packages.
> myPackageNames <- available.packages()
--- Please select a CRAN mirror for use in this session ---
Loading Tcl/Tk interface ... done
> length(unique( rownames(myPackageNames) ))
[1] 2058

And CRAN [1] reports a similar number. Perhaps you have some
non-standard repositories configured?
Liviu

[1] http://cran.r-project.org/web/packages/


Re: [R] R Packages Crack the 3,000 Mark!

2009-11-25 Thread Muenchen, Robert A (Bob)
I thought that the unique function would eliminate duplicate package names. Is 
there a better way to count the number of packages?

Thanks,
Bob

-Original Message-
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] 
Sent: Wednesday, November 25, 2009 10:40 AM
To: Muenchen, Robert A (Bob)
Cc: r-help@r-project.org
Subject: Re: [R] R Packages Crack the 3,000 Mark!

Note that:

- there are also 199 R packages on google code:
http://code.google.com/hosting/search?q=label:R
- some (many?) of the packages on R-Forge and on google code are also on CRAN



[R] R Packages Crack the 3,000 Mark!

2009-11-24 Thread Muenchen, Robert A (Bob)
Hi All,

 

I don't know if this has been reported before, but according to Henrique
Dallazuanna's program (below) the number of R packages has exceeded the
3,000 mark. The count today is 3,175. I ran this just a couple of months
ago and the number was still in the high 2,000s, so it must be fairly
recent. I think this represents about 50% growth in the last year. Not
bad!

 

Does anyone have a program that graphs the growth of R packages? I don't
know if that historical data is around.

 

Cheers,

Bob

http://RforSASandSPSSusers.com 

 

Henrique's program:

> setRepositories()
> myPackageNames <- available.packages()
--- Please select a CRAN mirror for use in this session --- [I selected
them all]
> length(unique( rownames(myPackageNames) ))
[1] 3175




Re: [R] Frequencies, proportions & cumulative proportions

2009-10-17 Thread Muenchen, Robert A (Bob)
David,

I use CrossTable, so that was my first guess. It'll do
proportions/percents by row, column or total in a 2-way table. For 1-way
tables, it still tries to look like a 2-way table, unless you specify
max.width=1. Then it does one column, but no cumulative proportions (see
below).

I appreciate the idea though!

Thanks,
Bob

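> CrossTable(Score, max.width=1)

   Cell Contents
|-------------------------|
|                       N |
|         N / Table Total |
|-------------------------|

Total Observations in Table:  1000

          |        70 |
          |-----------|
          |        44 |
          |     0.044 |
          |-----------|

          |        71 |
          |-----------|
          |        42 |
          |     0.042 |
          |-----------|

          |        72 |
          |-----------|
          |        40 |
          |     0.040 |
          |-----------|

          |        73 |
          |-----------|
          |        40 |
          |     0.040 |
          |-----------|

          |        74 |
          |-----------|
          |        43 |
          |     0.043 |
          |-----------|

          |        75 |
          |-----------|
          |        45 |
          |     0.045 |
          |-----------|

          |        76 |
          |-----------|
          |        46 |
          |     0.046 |
          |-----------|

          |        77 |
          |-----------|
          |        40 |
          |     0.040 |
          |-----------|

          |        78 |
          |-----------|
          |        46 |
          |     0.046 |
          |-----------|

          |        79 |
          |-----------|
          |        43 |
          |     0.043 |
          |-----------|

...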

-Original Message-
From: David Scott [mailto:d.sc...@auckland.ac.nz] 
Sent: Friday, October 16, 2009 8:42 PM
To: Muenchen, Robert A (Bob)
Cc: ted.hard...@manchester.ac.uk; r-help@r-project.org
Subject: Re: [R] Frequencies, proportions & cumulative proportions

I think CrossTable in gmodels does what Bob is after:

CrossTable(gmodels) R Documentation

Cross Tabulation with Tests for Factor Independence
Description
An implementation of a cross-tabulation function with output similar to 
S-Plus crosstabs() and SAS Proc Freq (or SPSS format) with Chi-square, 
Fisher and McNemar tests of the independence of all table factors.



David Scott

-- 
_
David Scott Department of Statistics
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email:  d.sc...@auckland.ac.nz,  Fax: +64 9 373 7018

Director of Consulting, Department of Statistics



[R] Frequencies, proportions & cumulative proportions

2009-10-16 Thread Muenchen, Robert A (Bob)
Dear R-Helpers,

I've looked high and low for a function that provides frequencies,
proportions and cumulative proportions side-by-side. Below is the table
I need. Is there a function that already does it?

Thanks,
Bob

> # Generate some test scores
> myValues <- c(70:95)
> Score <- ( sample( myValues, size=1000, replace=TRUE) )
> head(Score)
[1] 77 71 81 88 83 93
> 
> # Get frequencies & proportions
> myTable <- data.frame( table(Score) )
> myTable$Prop <- prop.table( myTable$Freq )
> myTable$CumProp <- cumsum( myTable$Prop )
> 
> # Print result
> myTable
   Score Freq  Prop CumProp
1     70   44 0.044   0.044
2     71   42 0.042   0.086
3     72   40 0.040   0.126
4     73   40 0.040   0.166
5     74   43 0.043   0.209
6     75   45 0.045   0.254
7     76   46 0.046   0.300
8     77   40 0.040   0.340
9     78   46 0.046   0.386
10    79   43 0.043   0.429
11    80   37 0.037   0.466
12    81   29 0.029   0.495
13    82   33 0.033   0.528
14    83   39 0.039   0.567
15    84   31 0.031   0.598
16    85   32 0.032   0.630
17    86   31 0.031   0.661
18    87   37 0.037   0.698
19    88   30 0.030   0.728
20    89   33 0.033   0.761
21    90   43 0.043   0.804
22    91   41 0.041   0.845
23    92   37 0.037   0.882
24    93   39 0.039   0.921
25    94   42 0.042   0.963
26    95   37 0.037   1.000


=
Bob Muenchen (pronounced Min'-chen),
Manager, Research Computing Support
U of TN Office of Information Technology 
Stokely Management Center, Suite 200
916 Volunteer Blvd., Knoxville, TN 37996-0520
Voice: (865) 974-5230
FAX: (865) 974-8655
Please help us improve: http://oit.utk.edu/feedback   
Web: http://oit.utk.edu/research
Map to Office: http://www.utk.edu/maps
Newsletter: http://oit.utk.edu/research/news.php
=



Re: [R] Frequencies, proportions & cumulative proportions

2009-10-16 Thread Muenchen, Robert A (Bob)
Ted,

I know how to do that. It's just such a standard display in SAS, SPSS
and Stata that I figured someone had done it and I had just overlooked
it.

Thanks!
Bob



I don't think there is a ready-made one, but it is very little
effort to make your own:

mkMyTable <- function(X){
  Table <- data.frame( table(X) )
  Table$Prop <- prop.table( Table$Freq )
  Table$CumProp <- cumsum( Table$Prop )
  Table
}

myTable <- mkMyTable(Score)

Hoping this helps!
Ted.
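
A quick, self-contained check of the helper above on a small fixed vector, so the result does not depend on random sampling. The function is repeated here so the snippet runs on its own; the vector scores and the name freqTable are made up for illustration.

```r
# Frequency table with proportions and cumulative proportions,
# as in the helper function from the message above.
mkMyTable <- function(X) {
  Table <- data.frame(table(X))              # columns: X, Freq
  Table$Prop <- prop.table(Table$Freq)       # share of each value
  Table$CumProp <- cumsum(Table$Prop)        # running total of shares
  Table
}

# Small fixed input: 70 twice, 71 once, 72 three times, 73 twice.
scores <- c(70, 70, 71, 72, 72, 72, 73, 73)
freqTable <- mkMyTable(scores)
freqTable
```

The last CumProp value should always be 1, which makes a handy sanity check when the function is reused on real data.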


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 16-Oct-09   Time: 22:48:06
-- XFMail --



Re: [R] SPSS Statistics-R Integration Plug-In

2009-09-12 Thread Muenchen, Robert A (Bob)
Hi Michael,

I've used the R plug-in and really like it. You can read my instructions
on how to install and use it by going to Amazon.com, searching for the
book "R for SAS and SPSS Users", and then searching inside the book for
the section "Running R from SPSS". I've only got about 3 pages on it
(the bottom of page 28 through page 31), and Amazon will let you read them all. 

You do need to choose a specific version of R, but the old versions are
kept on CRAN, so they're easy to get. Version 18 of SPSS actually ships
with the version of R it needs on the SPSS DVD.

SPSS makes it really easy to transfer data and results back and forth,
and it formats the results nicely as SPSS pivot tables. I also have an
example of how to make an R program appear on the SPSS menus in "R you
taking advantage of R?", an SPSS Directions 2009 talk. It's at:
http://RforSASandSPSSusers.com, right-hand side.

Cheers,
Bob

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Michael Chajewski
Sent: Tuesday, September 08, 2009 9:43 AM
To: r-help@r-project.org
Subject: [R] SPSS Statistics-R Integration Plug-In

Dear All,

Has anyone tried to use this plug-in? Since I am running R-2.9.1 it will
not even let me install it. Further, since I am running Windows I cannot
use the R provided R-2.7.0 Linux installation file from the archive
(tried to install it through cygwin and it was a mess). Suggestions?
Ideas? Has anybody used this plug-in?

Michael

--
Michael Chajewski, M.A.
Department of Psychology
Fordham University
Dealy Hall Room 239
441 East Fordham Road
Bronx, NY 10458
(718) 817-0654
http://www.chajewski.com



[R] Reading data entered within an R program

2009-07-11 Thread Muenchen, Robert A (Bob)
Dear R-helpers,

I know of two ways to read data within an R program, using
textConnection and stdin (demo program below). I've Googled about and
looked in several books for comparisons of the two approaches but
haven't found anything. Are there any particular advantages or
disadvantages to these two approaches? If you were teaching R beginners,
which would you present?

Thanks,
Bob
http://RforSASandSPSSusers.com 


# R Program to Read Data Within a Program.
# Very similar to SAS datalines or cards statements,
# and SPSS BEGIN DATA / END DATA commands.

# This stores the data as one long text string.

mystring <-
"workshop,gender,q1,q2,q3,q4
01,1,f,1,1,5,1
02,2,f,2,1,4,1
03,1,f,2,2,4,3
04,2, ,3,1, ,3
05,1,m,4,5,2,4
06,2,m,5,4,5,5
07,1,m,5,3,4,4
08,2,m,4,5,5,5"

# The textConnection function allows read.csv to 
# read data from the text string just as it would 
# from a file.
# The leading zero on first column helps show that
# R is storing row names as a character vector.

mydata <- read.csv( textConnection(mystring) )
mydata


mydata <- read.csv( stdin() )
workshop,gender,q1,q2,q3,q4
01,1,f,1,1,5,1
02,2,f,2,1,4,1
03,1,f,2,2,4,3
04,2, ,3,1, ,3
05,1,m,4,5,2,4
06,2,m,5,4,5,5
07,1,m,5,3,4,4
08,2,m,4,5,5,5

#The blank line above tells R to stop reading.
mydata

# Read it again stripping out blanks and setting
# nothing to be missing for gender.

mydata <- read.csv( stdin(), strip.white=TRUE, na.strings="" )
workshop,gender,q1,q2,q3,q4
01,1,f,1,1,5,1
02,2,f,2,1,4,1
03,1,f,2,2,4,3
04,2, ,3,1, ,3
05,1,m,4,5,2,4
06,2,m,5,4,5,5
07,1,m,5,3,4,4
08,2,m,4,5,5,5

mydata



Re: [R] Reading data entered within an R program

2009-07-11 Thread Muenchen, Robert A (Bob)
Since stdin seemed simpler I figured textConnection must have some
advantage. 

 

Thanks!

Bob

 

From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] 
Sent: Saturday, July 11, 2009 6:00 PM
To: Muenchen, Robert A (Bob)
Cc: R-help@r-project.org
Subject: Re: [R] Reading data entered within an R program

 


Both will work if you copy and paste directly into an R session
but textConnection has the advantage that you can place
it in a file and source it and it still works.
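
A minimal sketch of the source()-friendly approach, assuming a recent version of R. It uses a shortened copy of the demo data, and it also shows the text argument of read.csv, which was not mentioned in the thread and is offered here only as a more compact alternative.

```r
# A shortened copy of the demo data, stored as one text string.
mystring <-
"workshop,gender,q1,q2,q3,q4
01,1,f,1,1,5,1
02,2,f,2,1,4,1
03,1,f,2,2,4,3
04,2, ,3,1, ,3"

# Approach 1: wrap the string in a text connection, then close it.
con <- textConnection(mystring)
mydata <- read.csv(con, strip.white = TRUE, na.strings = "")
close(con)

# Approach 2: the text argument of read.csv does the same in one step.
mydata2 <- read.csv(text = mystring, strip.white = TRUE, na.strings = "")

mydata
```

Because the data live in an ordinary character string, both forms behave the same whether the code is pasted into the console or run from a file with source().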



[R] SAS Institute Adding Support for R

2009-02-12 Thread Muenchen, Robert A (Bob)
Hi Folks,

 

SAS Institute is adding official support for R:
http://support.sas.com/rnd/app/studio/Rinterface2.html 

Cheers,
Bob

=
Bob Muenchen (pronounced Min'-chen), 
Manager, Research Computing Support 
U of TN Office of Information Technology
Stokely Management Center, Suite 200
916 Volunteer Blvd., Knoxville, TN 37996-0520
Voice: (865) 974-5230
FAX: (865) 974-4810
Email: muenc...@utk.edu
Web: http://oit.utk.edu/research http://oit.utk.edu/scc 
Map to Office: http://www.utk.edu/maps
Newsletter: http://listserv.utk.edu/archives/rcnews.html
http://listserv.utk.edu/archives/statnews.html 
=

 




Re: [R] The Quality & Accuracy of R

2009-01-26 Thread Muenchen, Robert A (Bob)
That's a great idea. I know of no commercial vendors who provide such
detailed info.

Bob

-Original Message-
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] 
Sent: Monday, January 26, 2009 7:52 PM
To: Muenchen, Robert A (Bob)
Cc: R-help@r-project.org
Subject: Re: [R] The Quality & Accuracy of R

It would be possible to build tools that report code coverage
statistics, quantifying the percentage of the code that the tests
exercise.



[R] The Quality & Accuracy of R

2009-01-23 Thread Muenchen, Robert A (Bob)
Hi All,

 

We have all had to face skeptical colleagues asking if software made by
volunteers could match the quality and accuracy of commercially written
software. Thanks to the prompting of a recent R-help thread, I read "R:
Regulatory Compliance and Validation Issues, A Guidance Document for the
Use of R in Regulated Clinical Trial Environments"
(http://www.r-project.org/doc/R-FDA.pdf). This is an important document,
of interest to the general R community. The question of R's accuracy is
such a frequent one that it would be beneficial to increase the
visibility of the non-clinical information it contains. A document aimed
at a general audience, entitled something like "R: Controlling Quality
and Assuring Accuracy", could be compiled from these sections:

 

1.  What is R? (section 4)

2.  The R Foundation for Statistical Computing (section 3)

3.  The Scope of this Guidance Document (section 2)

4.  Software Development Life Cycle (section 6)

Marc Schwartz, Frank Harrell, Anthony Rossini, Ian Francis and others
did such a great job that very few words would need to change. The only
addition I suggest is to mention how well R did in Keeling & Pavur's
"A comparative study of the reliability of nine statistical software
packages", May 1, 2007, Computational Statistics & Data Analysis, Vol. 51,
pp. 3811-3831.

 

Given the importance of this issue, I would like to see such a document
added to the PDF manuals in R's Help.

 

The document mentions (Sect. 6.3) that a set of validation tests, data
and known results are available. It would be useful to have an option to
run that test suite in every R installation, reporting clear progress:
"Validating accuracy of t-tests... Validating accuracy of linear
regression..." Whether or not people chose to run the tests, they would
at least know that such tests are available. Back in my mainframe
installation days, this step was part of many software installations and
it certainly gave the impression that those were the companies that took
accuracy seriously. Of course the other companies probably just ran
their validation suite before shipping, but seeing it happen had a
tremendous impact.  I don't know how much this would add to download,
but if it was too much, perhaps it could be implemented as a separate
download. 
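
To make the idea of a visible validation run concrete, here is a minimal sketch in R. It is not the test suite described in the guidance document: the function name run_validation, the two checks and the tolerance are assumptions made up for illustration, and each check only compares R's answer with a value that can be worked out by hand.

```r
# Hypothetical mini validation suite (illustration only): each check
# compares R's result with a value worked out by hand for a tiny data set.
run_validation <- function(tolerance = 1e-8) {
  checks <- list(
    "t-tests" = function() {
      # One-sample t statistic for 1..5 against mu = 0 is
      # mean / se = 3 / sqrt(2.5 / 5) = 3 * sqrt(2).
      fit <- t.test(c(1, 2, 3, 4, 5), mu = 0)
      abs(unname(fit$statistic) - 3 * sqrt(2)) < tolerance
    },
    "linear regression" = function() {
      # y = 1 + 2x exactly, so intercept and slope must be 1 and 2.
      x <- c(1, 2, 3, 4, 5)
      y <- 1 + 2 * x
      fit <- lm(y ~ x)
      all(abs(unname(coef(fit)) - c(1, 2)) < tolerance)
    }
  )
  # Run every check, printing a progress line as each one completes.
  passed <- vapply(names(checks), function(label) {
    cat("Validating accuracy of ", label, "... ", sep = "")
    ok <- isTRUE(checks[[label]]())
    cat(if (ok) "OK" else "FAILED", "\n")
    ok
  }, logical(1))
  invisible(passed)
}

validation_results <- run_validation()
```

A real suite would use reference data sets with certified results rather than hand-checked toy values. R's tools package also provides testInstalledBasic() for running the tests shipped with an installation, when those test files were installed.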

 

I hope these suggestions can help mitigate the concerns so many non-R
users have.

 

Cheers,

Bob

 



Re: [R] Articles about comparision between R and others softwares

2008-09-06 Thread Muenchen, Robert A (Bob)
Hi Ricardo,

You can search for comparisons by entering the packages that interest
you at:

http://finzi.psych.upenn.edu/search.html 

Michael Mitchell wrote an interesting comparison of SAS, SPSS, Stata and
R at:

http://www.ats.ucla.edu/stat/technicalreports/  

That report says little about R, but Patrick Burns' excellent rejoinder
to that report fills in much of the missing R material. It is at that
link too.

The accuracy of various stat packages, including R, is in:

Keeling, Kellie B. and Pavur, Robert J. "A comparative study of the
reliability of nine statistical software packages." Computational
Statistics & Data Analysis, Vol. 51, No. 8, May 1, 2007, pp. 3811-3831.

I've got an 80-page comparison of many features of R to SAS and SPSS at:

http://RforSASandSPSSusers.com 

That document focuses more on the language basics and data management
rather than statistics. My book by the same name adds graphics and basic
statistics to the mix. That should finally be printed in a few weeks.

Cheers,
Bob



> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> project.org] On Behalf Of ricardo13
> Sent: Friday, September 05, 2008 3:46 PM
> To: r-help@r-project.org
> Subject: [R] Articles about comparision between R and others softwares
>
> Hi
>
> Do you know some articles, papers, something than tell about comparision
> between R and others softwares statisticals.
>
> Thank You
>
> Ricardo


Re: [R] Dealing with NaN's in data frames

2008-08-16 Thread Muenchen, Robert A (Bob)
Hi Jon,

Here's one way.

> x <- c(1,2,3,4,NaN)
> y <- c(1,2,NaN,4,5)
> 
> myDF <- data.frame(x,y)
> myDF
    x   y
1   1   1
2   2   2
3   3 NaN
4   4   4
5 NaN   5
> 
> myDF[ is.na(myDF) ] <- NA
> myDF
   x  y
1  1  1
2  2  2
3  3 NA
4  4  4
5 NA  5
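
The session above works because is.na() returns TRUE for NaN as well as for NA. As a hedged variation, the sketch below targets NaN explicitly. is.nan() is applied column by column because it has no data frame method; the names myDF and nan_mask are just illustrative.

```r
# Rebuild the example data frame with one NaN in each column.
myDF <- data.frame(x = c(1, 2, 3, 4, NaN),
                   y = c(1, 2, NaN, 4, 5))

# is.nan() works on vectors, so apply it to each column;
# sapply() returns a logical matrix with one column per variable.
nan_mask <- sapply(myDF, is.nan)

# Index the data frame with the logical matrix and overwrite the NaNs.
myDF[nan_mask] <- NA
```

With purely numeric columns both approaches give the same result; the explicit mask is mainly useful when you want to count or inspect the NaN cells before replacing them.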

Cheers,
Bob



> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> project.org] On Behalf Of Peck, Jon
> Sent: Friday, August 15, 2008 10:28 PM
> To: r-help@r-project.org
> Subject: [R] Dealing with NaN's in data frames
>
> I am looking for the most efficient way to replace all occurrences of
> NaN in a data frame with NA.  I can do this with a double loop, but it
> seems that there should be a higher level and more efficient way. With
> is.na, I could use ifelse, but is.nan seems not to have similar
> capabilities.
>
> TIA,
>
> Jon Peck
 
 
 


Re: [R] .Rprofile is being executed twice

2008-05-17 Thread Muenchen, Robert A (Bob)
I think I did that once by accidentally placing the .Rprofile in two
places. In Windows I think that was the directory that contains the R
executable and in My Documents. I think you can also cause this by
setting your working directory in your .Rprofile with setwd() and then
it runs any .Rprofile it finds there too. Whichever way I did it, I had
removed a function from one and still had the second one performing that
function! That blew my mind until I found the second file. Cheers, -Bob
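
One way to hunt for a stray second file is to list the usual candidate locations and see which exist. This is only a sketch: the names profile_candidates and profile_report are made up here, and the search order in the comment follows the ?Startup help page as I understand it.

```r
# Candidate startup files. R reads the site file first, then at most one
# user profile: R_PROFILE_USER if set, else ./.Rprofile, else ~/.Rprofile.
profile_candidates <- c(
  site    = file.path(R.home("etc"), "Rprofile.site"),
  env_var = Sys.getenv("R_PROFILE_USER"),
  working = file.path(getwd(), ".Rprofile"),
  home    = path.expand(file.path("~", ".Rprofile"))
)

# Report which candidates are present; an unset variable gives "".
profile_report <- data.frame(
  source = names(profile_candidates),
  path   = unname(profile_candidates),
  exists = unname(nzchar(profile_candidates) &
                  file.exists(profile_candidates)),
  stringsAsFactors = FALSE
)
print(profile_report, row.names = FALSE)
```

If more than one row shows exists = TRUE, or the profile itself calls setwd() or source(), those are the first places to look for the duplicate execution.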


> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> project.org] On Behalf Of Dan Tenenbaum
> Sent: Friday, May 02, 2008 4:25 PM
> To: r-help@r-project.org
> Subject: [R] .Rprofile is being executed twice
>
> Hi,
>
> After updating to R 2.7, my .Rprofile executes twice on startup. I
> confirmed this by putting in the following line:
> print("starting .Rprofile...")
>
> When I start R, I see:
> [1] "starting .Rprofile..."
> [1] "starting .Rprofile..."
>
> This seems like the obverse of the following FAQ:
> http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-did-my-_002eRprofile-stop-working-when-I-updated-R_003f
>
> But in my case .Rprofile is working, just working twice as much as it
> should.
>
> Even if there is nothing in my .Rprofile except that print() statement,
> it still executes twice.
>
> What could be causing this?
> Thanks
> Dan
 


[R] .Rprofile, date tagging history, loading packages

2008-04-13 Thread Muenchen, Robert A (Bob)
Dear R-Helpers,

I'm fiddling with my .Rprofile in Windows XP & R 2.7.0 Beta. I prefer to
manually save my workspace but automatically save my command history via
the .Rprofile. That is working fine once I found that utils:: was
required before the loadhistory & savehistory functions. What I would
like to do is add a separator line with a date between the histories of
each session. Something like,

=History for Sun Apr 13 09:43:50 2008

Is this possible? I have it print the date at startup, but that doesn't
appear as part of the history.
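
One possible approach, sketched below, is to write the separator into the history file itself rather than printing it. The helper append_session_history is hypothetical, not part of R, and the demonstration uses made-up commands and a temporary file instead of a real session history.

```r
# Hypothetical helper: append one session's commands to a cumulative
# history file, preceded by a dated separator written as an R comment.
append_session_history <- function(session_lines, cumulative_file,
                                   stamp = date()) {
  header <- paste0("#===== History for ", stamp, " =====")
  con <- file(cumulative_file, open = "a")   # open for appending
  on.exit(close(con))
  writeLines(c(header, session_lines), con)
  invisible(header)
}

# Demonstration with made-up commands and a temporary file.
demo_file <- tempfile(fileext = ".Rhistory")
append_session_history(c("x <- 1", "summary(x)"), demo_file,
                       stamp = "Sun Apr 13 09:43:50 2008")
append_session_history("plot(x)", demo_file,
                       stamp = "Mon Apr 14 10:00:00 2008")
history_lines <- readLines(demo_file)
```

In a .Last function one might first call utils::savehistory() on a temporary file, read it back with readLines(), and pass those lines to the helper. Because the separator starts with #, it stays harmless if the cumulative file is later reloaded or re-run.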

Also, I'm loading two packages at startup, which is working fine with
this code:

local({
   myOriginal <- getOption("defaultPackages")
   myAutoLoads <- c("Hmisc","ggplot2")
   myBoth <- c(myOriginal,myAutoLoads)
   options(defaultPackages = myBoth)
 })

But when reading AITR, I noticed it has a .Rprofile example that looks
much simpler. It just loads the one additional package without checking
the defaultPackages. Is this just as good? Does it even need to be in
the .First function definition, or could it simply be a command by
itself in .Rprofile?

.First <- function() {
options(prompt="$ ", continue="+\t") # $ is the prompt
options(digits=5, length=999) # custom numbers and printout
x11() # for graphics
par(pch = "+") # plotting character
source(file.path(Sys.getenv("HOME"), "R", "mystuff.R"))
# my personal functions
library(MASS) # attach a package
}

Thanks,
Bob

My whole .Rprofile:

# Startup Settings

# Place any R commands below.

options(width=64, digits=5, show.signif.stars=TRUE)
set.seed(1234)
setwd("/myRfolder")
myPackages <- c("car","foreign","hexbin",
  "ggplot2","gmodels","gplots", "Hmisc",
  "lattice", "reshape","ggplot2","Rcmdr")
utils::loadhistory(file = "myCumulative.Rhistory")


# Load packages automatically below.

 local({
   myOriginal <- getOption("defaultPackages")

   # Edit next line to include your favorites.
   myAutoLoads <- c("Hmisc","ggplot2")
   myBoth <- c(myOriginal,myAutoLoads)
   options(defaultPackages = myBoth)
 })

# Things put here are done first.
.First <- function() 
  {
cat("\n   Welcome to R!")
cat("\n   ", paste(date()), "\n\n" )
  }


# Things put here are done last.
.Last <- function() 
  {
graphics.off()
cat("\n\n  myCumulative.Rhistory has been saved on  ", paste(date()) ) 
cat("\n\n  Goodbye!\n\n") 
utils::savehistory(file="myCumulative.Rhistory")
  }



[R] NA vs. <NA>

2008-04-04 Thread Muenchen, Robert A (Bob)
Dear R-Helpers,

Why does R show character missing values in vectors as NA and when
stored in a data frame as <NA>? I've searched but did not find an
explanation.

Thanks,
Bob

> gender <- c("f","f","f",NA,"m","m","m","m")
> gender
[1] "f" "f" "f" NA  "m" "m" "m" "m"  #here it lacks brackets.
> 
> q1 <- c(1,2,2,3,4,5,5,4)
> q1
[1] 1 2 2 3 4 5 5 4
> 
> myDF <- data.frame(q1,gender)
> myDF
  q1 gender
1  1      f
2  2      f
3  2      f
4  3   <NA>  #here it has brackets.
5  4      m
6  5      m
7  5      m
8  4      m



Re: [R] NA vs. <NA>

2008-04-04 Thread Muenchen, Robert A (Bob)
Peter & Hadley, thanks for the clarification. The "NA"=North America example 
reminds me of a text analysis problem in which "TO" meant "Take Off" for 
pilots. Of course many text analysis programs toss supposedly low-information 
words like that! Thanks, Bob

> -----Original Message-----
> From: Peter Dalgaard [mailto:[EMAIL PROTECTED]
> Sent: Friday, April 04, 2008 12:18 PM
> To: Muenchen, Robert A (Bob)
> Cc: [EMAIL PROTECTED]
> Subject: Re: [R] NA vs. <NA>
 
 
> It is actually a factor in the latter case
>
>  > data.frame(gender)$gender
> [1] f    f    f    <NA> m    m    m    m
> Levels: f m
>
> However, you have the same effect with
>
>  > data.frame(gender,stringsAsFactors=FALSE)
>   gender
> 1      f
> 2      f
> 3      f
> 4   <NA>
> 5      m
> 6      m
> 7      m
> 8      m
>
> The thing to notice is that the printing is without the quote
> character.
> We also have
>
>  > noquote(gender)
> [1] f    f    f    <NA> m    m    m    m
>
> And the point in either case is that we need some way to distinguish
> between NA (missing) and "NA" ("New Alliance", "Noradrenalin", "North
> America", "Neil Armstrong", etc.)
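
The distinction can be checked directly. The short sketch below uses a made-up vector named codes that holds both the two-letter string and a genuine missing value, and captures how each is printed with and without quotes.

```r
# "NA" is an ordinary two-letter string; NA is a missing value.
codes <- c("NA", NA, "f")
missing_flags <- is.na(codes)      # only the second element is missing

# With quotes, the string is shown as "NA" and the missing value as NA.
shown_quoted <- capture.output(print(codes))

# Without quotes, the missing value must be marked as <NA> so that it
# cannot be confused with the string.
shown_unquoted <- capture.output(print(codes, quote = FALSE))
```

Data frames print character and factor columns without quotes, which is why the bracketed form appears there and not when a bare character vector is printed.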
 


[R] When to quote a package name

2008-03-10 Thread Muenchen, Robert A (Bob)
Dear HelpeRs,

I'm confused about the role of quotes around package names on the
library and detach functions. Books on R use both approaches:

library("Hmisc")
describe(mydata)
detach("package:Hmisc")

and

library(Hmisc)
describe(mydata)
detach(package:Hmisc)

The help file for detach says "quoted or unquoted" and the help file for
library says about the package, "the name of a package, given as a name
or literal character string, or a character string, depending on whether
character.only is FALSE (default) or TRUE)."

Are there conditions under which it matters? Which is best?
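
One case where the difference does matter is when the package name is held in a variable. The sketch below uses the tools package only because it ships with R; the variable names are illustrative. Without character.only = TRUE, library(pkg) would look for a package literally called "pkg".

```r
# The package name is stored in a variable, e.g. when looping over names.
pkg <- "tools"

# character.only = TRUE tells library() to use the value of pkg.
library(pkg, character.only = TRUE)
attached_after_library <- paste0("package:", pkg) %in% search()

# detach() has the same argument for a name built at run time.
detach(paste0("package:", pkg), character.only = TRUE)
attached_after_detach <- paste0("package:", pkg) %in% search()
```

For a literal name typed at the console, the quoted and unquoted forms behave the same, so the choice there is mostly a matter of style.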

Thanks,
Bob



[R] Non-visible functions are asterisked

2008-03-08 Thread Muenchen, Robert A (Bob)
Dear R-Helpers,

I suspect I'm about to ask a FAQ, but I haven't been able to find an
answer in the FAQ, AItR or an R Site Search. When I look at the methods
of summary (below) it says, "Non-visible functions are asterisked". I
looked at the help file for summary.princomp, which did not comment on
it being non-visible. I ran its help file example, which printed visible
output. I did not notice how it differed from other functions, like
summary.data.frame, that are not marked non-visible. What does
"non-visible" mean?

Thanks,
Bob 

> methods(summary)
 [1] summary.aov            summary.aovlist
 [3] summary.connection     summary.data.frame
 [5] summary.Date           summary.default
 [7] summary.ecdf*          summary.factor
 [9] summary.glm            summary.infl
[11] summary.lm             summary.loess*
[13] summary.manova         summary.matrix
[15] summary.mlm            summary.nls*
[17] summary.packageStatus* summary.POSIXct
[19] summary.POSIXlt        summary.ppr*
[21] summary.prcomp*        summary.princomp*
[23] summary.stepfun        summary.stl*
[25] summary.table          summary.tukeysmooth*

   Non-visible functions are asterisked
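
As far as I understand it, "non-visible" here refers to where the function lives, not to what it prints: the method is registered for dispatch inside its package's namespace but is not exported, so typing its name at the prompt may not find it. The sketch below shows two documented ways to retrieve such a method; the variable names are illustrative.

```r
# Whether the bare name is found depends on what the package exports.
visible_by_name <- exists("summary.princomp")

# Two documented ways to retrieve a registered S3 method anyway:
via_s3  <- getS3method("summary", "princomp")
via_any <- getAnywhere("summary.princomp")

# Dispatch through the generic works regardless of visibility.
pc_summary <- summary(princomp(USArrests, cor = TRUE))
```

Printing via_s3 or via_any shows the source of the method, which is usually why one wants to get at a non-visible function in the first place.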



Re: [R] How many R packages?

2008-02-12 Thread Muenchen, Robert A (Bob)
Thanks to all who responded, those were very helpful!

Henrique's solution (below) gets right to it & counts whatever repositories you 
have selected. For all repositories the number is 2,758, so there must have 
been a few duplicates in my manual count.

I'm trying to make the case to users of other stat packages that R offers a lot. 
However, I don't want to overstate the case. Does anyone know about the 
packages with repetitious-sounding names like the 19 bvbovines? I know 
technically they are different packages, but are they minor tweaks or would you 
count them as different packages in this context?

Thanks,
Bob



> -----Original Message-----
> From: Henrique Dallazuanna [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, February 12, 2008 9:58 AM
> To: Muenchen, Robert A (Bob)
> Cc: [EMAIL PROTECTED]
> Subject: Re: [R] How many R packages?
>
> x <- available.packages()
> length(unique(rownames(x)))
 


[R] How many R packages?

2008-02-12 Thread Muenchen, Robert A (Bob)
Hi All,

I searched around to find the number of R packages currently available,
but didn't find anything, so I chose all repositories & told it to
install. The list contained about 2,856 (correcting roughly for those
installed). But the list includes repetitions such as 19 names that
begin with bvbovine.

Selecting only CRAN and CRAN(extras) I get 1,344.

Is there an easier way to determine the total number of R packages
available?

Thanks,
Bob



Re: [R] Form Pairs of Variables for a paired t-test

2008-01-30 Thread Muenchen, Robert A (Bob)
That's a dandy little program but the apply with lapply blew my mind! I had to 
pick it apart to figure out what it was doing. Perhaps others will find this 
expanded version useful:

# Make up some repeated measures data with measures at 4 times.
t1 <- c(1,2,3,4,5)
t2 <- c(2,3,3,5,5)
t3 <- c(3,3,4,4,4)
t4 <- c(5,6,6,7,7)
myTimes <- data.frame(t1,t2,t3,t4)
myTimes

# Get matrix of combinations of 4 things taken two at a time.
myCombos <- combn( ncol(myTimes), 2 )
myCombos

# Generate a list that contains all the pairs of times.
myTimesList <- apply( myCombos, 2, function(y) myTimes[ ,y] )
myTimesList

# Apply the t.test function to each pair of columns in myTimesList.
lapply( myTimesList, 
function(z) t.test( z[ ,1], z[ ,2] ) 
)
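
The call above runs t.test() with its default settings, which treat the two columns as independent samples. Since the thread is about paired tests, here is a hedged variant that passes paired = TRUE and labels each result. The names pair_list, paired_results and p_values are illustrative, and the data are the same made-up values as above.

```r
# Same made-up repeated measures data as above.
myTimes <- data.frame(t1 = c(1, 2, 3, 4, 5),
                      t2 = c(2, 3, 3, 5, 5),
                      t3 = c(3, 3, 4, 4, 4),
                      t4 = c(5, 6, 6, 7, 7))

# All pairs of column names, kept as a list and labelled "t1 vs t2" etc.
pair_list <- combn(names(myTimes), 2, simplify = FALSE)
names(pair_list) <- vapply(pair_list, paste, character(1),
                           collapse = " vs ")

# Paired t-test for each pair of columns.
paired_results <- lapply(pair_list, function(p)
  t.test(myTimes[[p[1]]], myTimes[[p[2]]], paired = TRUE))

# Collect the p-values into one named vector for easy inspection.
p_values <- vapply(paired_results, function(r) r$p.value, numeric(1))
```

Working with column names instead of column numbers keeps the labels attached to the results, which helps when there are many pairs.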

Cheers,
Bob

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> project.org] On Behalf Of Henrique Dallazuanna
> Sent: Tuesday, January 29, 2008 9:15 AM
> To: nalluri pratap
> Cc: r-help@r-project.org
> Subject: Re: [R] Form Pairs of Variables for a paired t-test
>
> Try this:
>
> lapply(apply(combn(ncol(x),2), 2, function(y)x[,y]),
> function(z)t.test(z[,1], z[,2]))
>
> On 29/01/2008, nalluri pratap [EMAIL PROTECTED] wrote:
> > Hi Users,
> >
> >   This is regarding the paired t-test. I have 5 variables (say)
> > Data$v1,Data$v2,Data$v3,Data$v4,Data$v5 in my data frame. Now, I need
> > to perform a paired t-test on all the possible 10 pairs. How do I set up
> > the pairs table directly and pass those variables in to t-test.
> >
> >   Thanks in advance,
> >
> >   Pratap
 
 


Re: [R] [OT] Open source archive program on windows

2008-01-27 Thread Muenchen, Robert A (Bob)
This is a popular one:
http://www.7-zip.org/

Cheers,
Bob

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> project.org] On Behalf Of Duncan Murdoch
> Sent: Sunday, January 27, 2008 6:37 PM
> To: David Scott
> Cc: [EMAIL PROTECTED]
> Subject: Re: [R] [OT] Open source archive program on windows
>
> On 27/01/2008 4:46 PM, David Scott wrote:
> >
> > I am looking for a recommendation for an open source competitor to
> > Winzip.
> > I seem to recall Brian Ripley mentioning one in the last year or so,
> > but
> > couldn't find it in the mail archives. (Searching on Ripley there is
> > somehow not terribly useful.)
>
> The R toolset includes the Info-zip program zip and unzip.  These are
> open source, but not exactly competitors to Winzip, in that they are
> command-line only.
>
> Duncan Murdoch
 


[R] Barplot w/ single stacked bar

2008-01-24 Thread Muenchen, Robert A (Bob)
Hi All,

I can get the barplot function to do many types of plots, stacked or
otherwise. However, I cannot get it to do a *single* stacked bar. I've
searched several books & listserv archives to no avail. I suspect I'm
missing the obvious from the help file!

I can reach my goal in ggplot2, although the relative heights of the
bar's pieces don't seem quite right (it does generate a warning):

library(ggplot2)
x <- factor(1)
y <- factor( c("Male","Male","Female") )
mydata <- data.frame(x,y)
rm(x,y)
mydata

#These are close to my goal:
qplot( x, y, fill=y, geom="bar", data=mydata)

# or
ggplot(mydata, aes(x=x, y=y, fill=y)) + geom_bar()

# But this places the bars beside each other rather than stack them.
barplot( table(mydata$y), beside=FALSE)

Thanks!
Bob



Re: [R] Barplot w/ single stacked bar

2008-01-24 Thread Muenchen, Robert A (Bob)
Marc & Eric,

Thanks so much for the help. That is exactly what I was looking for. 

I should have mentioned that I don't really like this plot, but I'm
writing an explanation of the Grammar of Graphics concept. A very nice
example of that is that a single stacked bar chart converts to a pie
chart when you change from Cartesian to polar coordinates. And yes, that
may well be going from bad to worse!

Thanks!
Bob



> -----Original Message-----
> From: Marc Schwartz [mailto:[EMAIL PROTECTED]
> Sent: Thursday, January 24, 2008 11:32 AM
> To: Muenchen, Robert A (Bob)
> Cc: [EMAIL PROTECTED]
> Subject: Re: [R] Barplot w/ single stacked bar
 
 
 Bob,
 
 Try this:
 
   barplot(as.matrix(table(mydata$y)), beside = FALSE)
 
 Conceptually, for a stacked bar, each bar is a column in a matrix. The
 components in a stacked bar are the row values in the column.
 
 Thus, you need to create a single column matrix from your table.
 
 One might question the value of such a plot however, if the intent is
 to
 provide a visual representation of the difference in
counts/proportions
 between two groups. A side-by-side barplot or a dotchart would seem to
 be better here.
 
 HTH,
 
 Marc Schwartz
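
A self-contained version of Marc's suggestion, written out step by step to show why the matrix conversion matters. The variable names counts and m are chosen here for illustration only.

```r
# Toy data from the question: two males, one female.
y <- factor(c("Male", "Male", "Female"))
counts <- table(y)            # 1-d table: Female = 1, Male = 2

# as.matrix() turns the 1-d table into a 2 x 1 matrix, i.e. one column,
# so barplot() draws one bar with two stacked segments.
m <- as.matrix(counts)
print(dim(m))

# legend.text = TRUE labels the segments with the matrix row names.
barplot(m, beside = FALSE, legend.text = TRUE)
```

Passing the 1-d table directly gives barplot() a vector of heights, which is why the original call drew two separate bars.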



[R] p.adjust on matrix of P-values from correlations

2007-11-20 Thread Muenchen, Robert A (Bob)
Hi All,

I'm stumped on something that must be trivial. I created a correlation
matrix on 4 variables (6 correlations) using Hmisc's rcorr function. I
wanted to correct the P-value matrix for the number of tests done, so I
ran it through the p.adjust function. That function adjusted for the 12
p-values it saw, rather than 6. I added the argument n=6 to p.adjust but
it requires that n be greater than the length of x. I guess its author
assumed you would always be correcting for more tests than it could see.

I changed the matrix into a long vector to see if that would matter. The
help file says it requires a vector, but the result was the same. 

If I were using the conservative Bonferroni correction, I could divide
the corrected P-values by 2 to make n=6 after the fact. However, I'm
using Holm's sequential method, so that's no good. 
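
One possible approach, not taken from this thread, is to adjust only the upper triangle of the symmetric P-value matrix so that p.adjust sees each correlation once. The sketch below uses made-up p-values in a hypothetical 4 x 4 matrix shaped like the P component that rcorr returns (NA on the diagonal); the names p, ut and p_adj are illustrative.

```r
# Hypothetical symmetric P-value matrix for 4 variables, NA on the diagonal.
vars <- paste0("v", 1:4)
p <- matrix(NA_real_, 4, 4, dimnames = list(vars, vars))
p[upper.tri(p)] <- c(0.001, 0.020, 0.030, 0.040, 0.200, 0.500)
p[lower.tri(p)] <- t(p)[lower.tri(p)]   # mirror to make it symmetric

# Adjust the 6 unique p-values only, so Holm's method uses n = 6.
ut <- upper.tri(p)
p_adj <- p
p_adj[ut] <- p.adjust(p[ut], method = "holm")

# Mirror the adjusted values back into the lower triangle.
p_adj[lower.tri(p_adj)] <- t(p_adj)[lower.tri(p_adj)]
print(p_adj)
```

Because the adjustment runs on the six unique values, no after-the-fact correction for the duplicated lower triangle is needed.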

Any ideas?

Thanks,
Bob

P.S. I'm using R 2.6.0 Patched on Windows XP.

=
Bob Muenchen (pronounced Min'-chen), 
Manager, Statistical Consulting Center 
U of TN Office of Information Technology
Stokely Management Center, Suite 200
916 Volunteer Blvd., Knoxville, TN 37996-0520
Voice: (865) 974-5230
FAX: (865) 974-4810
Email: [EMAIL PROTECTED]
Web: http://oit.utk.edu/scc
Map: http://www.utk.edu/maps 
News: http://listserv.utk.edu/archives/statnews.html
=



Re: [R] producing output as *.spo (spss output format)

2007-11-12 Thread Muenchen, Robert A (Bob)
You probably don't want to spend time figuring out the .spo format. From
SPSS 16 on, that format is obsolete and replaced by the Unicode
XML-based .spv file. SPSS 16 users need a separate Legacy Viewer to read
.spo files. -Bob


> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> project.org] On Behalf Of Ivan Uemlianin
> Sent: Saturday, November 10, 2007 6:04 AM
> To: r-help@r-project.org
> Subject: [R] producing output as *.spo (spss output format)
>
> Dear All
>
> I am considering moving from SPSS to R as my stats environment of
> choice.  I have read around and everything looks favourable.  There is
> just one issue on which I have been unable to find information.
>
> Many clients ask me to send them output (tables, graphs, etc) as an spss
> output file (ie .spo).  I haven't asked them why, I've just said yes.  I
> know R can produce graphics as nice as SPSS, and presumably they can be
> output in some portable format for pasting into a word-processor
> document.  I need to find out why the client wants spo.  In the meantime
> let's assume they have a good reason.
>
> Can R write .spo files?  Failing that does any one know of any spo
> writers that I could wire up to R (eg with some python gluecode)?
> Failing that any suggestions for overcoming the output hurdle would be
> welcome, as R looks very attractive (platform independent, easy to use
> and to automate, fast).
>
> Best wishes
>
> Ivan



[R] JGR makes help more helpful

2007-10-17 Thread Muenchen, Robert A (Bob)
Hi All,

A few weeks ago I suggested that it would be nice to be able to submit
lines from the help files for execution. You can cut and paste them into
the console, or enter example(function) to run them all. However, I
often find myself wanting to run just a line or two, or even parts of a
line to see an intermediate result. 

It turns out that this is one of the many nice features of the JGR
interface. You simply select what you want to submit from the help
window and use CTRL-R to run it. JGR is essentially an enhanced R
console, but it doesn't replace your standard console, making it easy to
use either. It is easy to learn and you can figure out most of what it does
without documentation. However, to learn its full capabilities read
"JGR: Java GUI for R" by Markus Helbig, Martin Theus & Simon Urbanek on
page nine of:
http://www.amstat-online.org/sections/graphics/newsletter/Volumes/v162.pdf
JGR itself is a free download at:
http://rosuda.org/JGR/

Cheers,
Bob




[R] Cutting & pasting help examples into script window

2007-09-20 Thread Muenchen, Robert A (Bob)
Hi All,

When I cut & paste help file examples into a script window, about half
the time it pastes as a single long line. 

The steps I follow are:

1. Open a help file e.g. ?data.frame.
2. Select the examples at the bottom.
3. Choose File: Copy.
4. Return to the console.
5. Choose File: New script.
6. Choose File: Paste or do CTRL-V. The examples frequently paste as a
single long line. 

I came across this in 2.6.0 beta on Windows XP & thought it was related
to the cut/paste changes in that version. I went back to 2.5.1 and at
first it worked fine, verifying my suspicion that it was a 2.6.0
problem. I double-checked my steps before posting, and to my surprise
found that in both versions this problem is intermittent.

I thought it might be a menu vs. keyboard CTRL-V difference, but found
it happens with both, and in both versions. I have also fiddled with
sizing or moving the Help window but that doesn't seem to be related. I
did discover that it may be a problem on the paste side of things, as so
far it *always* pastes into Notepad correctly.

Any ideas?

Thanks,
Bob

P.S. What would really be slick would be to select the example in Help,
right-click and choose "Run Line or Selection". Perhaps in version 2.7.
;-) .




Re: [R] Cutting & pasting help examples into script window

2007-09-20 Thread Muenchen, Robert A (Bob)
Stephan Grosse replied:

 
> What I do not understand is why you not just type example(yourcommand)?
>
> Stefan

That's a good question. I want to play around with variations of the
examples rather than run them exactly as they are.
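
One way to get example code into an editable script without the clipboard, sketched under the assumption of a current R version in which example() accepts give.lines = TRUE. The names ex_lines and script_file, and the file name df_examples.R, are arbitrary.

```r
# Fetch the example code for data.frame as a character vector
# instead of running it.
ex_lines <- example("data.frame", package = "base", give.lines = TRUE)

# Write it to a script file that can be opened and edited freely.
script_file <- file.path(tempdir(), "df_examples.R")
writeLines(ex_lines, script_file)

# Look at the first few lines before editing or running any of them.
print(head(ex_lines))
```

The resulting file can be opened in any script window and run a line at a time, with whatever variations are wanted.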

Thanks,
Bob



Re: [R] Cutting & pasting help examples into script window

2007-09-20 Thread Muenchen, Robert A (Bob)
Does this look like a bug? If so, is there a different way to report it?
Thanks, Bob

> -----Original Message-----
> From: Dirk Eddelbuettel [mailto:[EMAIL PROTECTED]
> Sent: Thursday, September 20, 2007 10:17 AM
> To: Muenchen, Robert A (Bob)
> Subject: Re: [R] Cutting & pasting help examples into script window
>
> On Thu, Sep 20, 2007 at 10:01:03AM -0400, Muenchen, Robert A (Bob) wrote:
> > Stephan Grosse replied:
> >
> > > What I do not understand is why you not just type
> > > example(yourcommand)?
> > >
> > > Stefan
> >
> > That's a good question. I want to play around with variations of the
> > examples rather than run them exactly as they are.
>
> In Emacs' wonderful ESS mode, you just press 'l' and the line of
> example code you're on gets sent to R.  You can then 'pick it up' in
> the R buffer and play with it.  I do that all the time ...
>
> Dirk
>
> --
> Three out of two people have difficulties with fractions.



Re: [R] Cutting & pasting help examples into script window

2007-09-20 Thread Muenchen, Robert A (Bob)
Now I'm working in 2.5.1 on a home machine also running XP. It has the
same problem, and I think I finally figured it out. 

I've noticed that if the cursor is directly over the text, it becomes an
I-beam. When hovering over the blank space around the text, the cursor
becomes an arrow. Selections via the arrow almost always paste properly
into a script window. Copies made while selecting with the I-beam cursor
almost always fail.

Regardless of how the selection is done, a paste into Notepad never
fails. Copying from Notepad to a script window never fails, regardless
of how the paste into Notepad was selected.

Very strange!

Bob

P.S. Almost all the testing has been with the ?data.frame and ?summary
examples.

> -----Original Message-----
> From: Duncan Murdoch [mailto:[EMAIL PROTECTED]
> Sent: Thursday, September 20, 2007 7:59 PM
> To: Muenchen, Robert A (Bob)
> Cc: [EMAIL PROTECTED]
> Subject: Re: [R] Cutting & pasting help examples into script window
>
> On 20/09/2007 1:49 PM, Muenchen, Robert A (Bob) wrote:
> > Does this look like a bug? If so, is there a different way to report it?
>
> It sounds like a bug, but I can't reproduce it.  You said it is
> intermittent on your system.  Can you try to work out the conditions
> that reliably trigger it?
>
> It might be something specific to your system; does anyone else see
> this?
>
> Duncan Murdoch



Re: [R] Save to File... option on File menu

2007-09-12 Thread Muenchen, Robert A (Bob)
Hi Talbot,

I just had that question a couple of weeks ago. Here's the thread:

RSiteSearch("Saving results from Linux command line")

Thomas Lumley concluded with:

There could still be functions that divert a copy of all the output to a
file, for example. And indeed there are.

sink("transcript.txt", split=TRUE)

And you're right, you do this at the start, or put it in your .Rprofile
so you don't have to remember it each time. The UNIX tee command does
this as well.
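
A small self-contained sketch of that pattern, using a temporary file instead of transcript.txt so it can be run anywhere. The names log_file and transcript are illustrative. Note that sink() diverts printed output only, not the commands that produced it, and that it captures nothing printed before it was called.

```r
# Start diverting a copy of all output to a file; split = TRUE keeps
# the output visible on the console as well.
log_file <- tempfile(fileext = ".txt")
sink(log_file, split = TRUE)

print(summary(c(1, 2, 3, 4)))

# Calling sink() with no arguments stops the diversion.
sink()

# The file now holds the printed output.
transcript <- readLines(log_file)
print(transcript)
```

Placing the opening sink() call in .Rprofile, as suggested above, would start the transcript automatically with each session.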

Cheers,
Bob



> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> project.org] On Behalf Of Talbot Katz
> Sent: Wednesday, September 12, 2007 3:05 PM
> To: [EMAIL PROTECTED]
> Subject: [R] Save to File... option on File menu
>
> Hi.
>
> There was an interesting thread about a year ago, called 'Command
> equivalent of rgui File, Save to File?'
> (http://tolstoy.newcastle.edu.au/R/e2/help/06/09/0553.html) started by
> Michael Prager, and contributed to by Duncan Murdoch (I didn't notice
> anything beyond the four entries they posted).  The question was how to
> replicate programmatically the "Save to File..." option on the File menu.
> The closest answers given involved either running in batch or using the
> sink() command.  Perhaps I don't understand the sink() command well enough,
> but it appears to me that you have to set it up before you run commands, and
> that it can't be used to save command output from commands that were already
> run; am I right about this?  Whereas the "Save to File..." command scoops up
> everything that's still in the console.  Here is my problem.  I am running R
> on Linux in a VNC window.  I'd like to save my console output, but there
> doesn't appear to be a File menu available and I didn't start out with the
> sink() command.  Is there any way to replicate "Save to File..." in this
> situation?  Thanks!
>
> --  TMK  --
> 212-460-5430  home
> 917-656-5351  cell
 
