from:"Robert Wilkins"

[R] Clinical Trial data sets in public domain?

2018-01-13 Thread Robert Wilkins

Is anybody using R to analyse clinical trial datasets that have been put
in the public domain (which are super hard to find)? I mean not just a
single data table, but the actual database, with a handful of data tables
with one-to-one or many-to-one relationships.

[ For example, "Adverse Events" and "Patient Info" are two datasets with a
many-to-one relationship; the "Patient Info" dataset has precisely one row
for each patient who received a dose of study drug. ]
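
A minimal sketch of the two-table layout described above, in base R. The
table names, column names and values are all invented for illustration;
real trial databases will differ.

```r
# Hypothetical two-table layout; all names and values are invented.
patient_info <- data.frame(
  patient_id = c(101, 102, 103),
  treatment  = c("Placebo", "DrugA", "DrugA"),
  stringsAsFactors = FALSE
)
adverse_events <- data.frame(
  patient_id = c(101, 101, 103),
  event      = c("Headache", "Nausea", "Headache"),
  stringsAsFactors = FALSE
)

# "Patient Info" must have exactly one row per patient for the
# many-to-one relationship to hold.
stopifnot(!anyDuplicated(patient_info$patient_id))

# Many-to-one join: each adverse event picks up its patient's treatment.
ae_with_trt <- merge(adverse_events, patient_info, by = "patient_id")

# Patients with no events (here 102) only appear if all.y = TRUE.
all_patients <- merge(adverse_events, patient_info,
                      by = "patient_id", all.y = TRUE)
```

The second merge() matters for denominators: percentages in safety tables
are usually taken over all dosed patients, not only those with an event.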

Robert Wilkins

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Data cleaning & Data preparation, what do R users want?

2017-12-11 Thread Robert Wilkins

Dominik (and others)

If it is indeed still the biggest pain point, even in 2017, then maybe we
can do something about that, with more effort on different user interface
designs and try-outs of them on specialized datasets.
[ The fact that in some specialties, such as clinical trials, it is hard to
get access to public domain datasets (so that one has to use a tiny "toy"
dataset, which nobody will pay attention to) does make it harder. ]

It would help if academia (both comp-sci and statistics departments) would
support those who invest resources in drafting and test-driving new product
designs. If, in the year 2017, it is still a big pain point, doesn't that
make sense? More speculative work in statistical programming language
design has not been a priority in academia since before 1980.

On Thu, Nov 30, 2017 at 4:11 AM, Dominik Schneider <
dominik.schnei...@colorado.edu> wrote:

> I would agree that getting data into R from various sources is the biggest
> pain point. Even if there is an api, the results are not always consistent
> and you have to do lots of dimension checking to get it right. Or there
> isn't an open api at all and you have to hack it by web scraping or
> otherwise- http://enpiar.com/2017/08/11/one-hour-package/
>
> On Thu, Nov 30, 2017 at 1:00 AM, Jim Lemon  wrote:
>
>> Hi again,
>> Typo in the last email. Should read "about 40 standard deviations".
>>
>> Jim
>>
>> On Thu, Nov 30, 2017 at 10:54 AM, Jim Lemon  wrote:
>> > Hi Robert,
>> > People want different levels of automation in the software they use.
>> > What concerns many of us is the desire for the function
>> > "figure-out-what-this-data-is-import-it-and-get-rid-of-bad-values".
>> > Such users typically want something that justifies its use by being
>> > written by someone who seems to know what they're doing and lots of
>> > other people use it. One advantage of many R functions is their
>> > modular construction. This encourages users to at least consider the
>> > steps that are taken rather than just accept what comes out of that
>> > long tube.
>> >
>> > Take the contentious problem of outlier identification. If I just let
>> > the black box peel off some values, I don't know what I have lost. On
>> > the other hand, if I import data and examine it with a summary
>> > function, I may find that one woman has a height of 5.2 meters. I can
>> > range check by looking up the Guinness Book of Records. It's an
>> > outlier. I can estimate the probability of such a height.  Hmm, about
>> > 4 standard deviations above the mean. It's an outlier. I can attempt a
>> > Sherlock Holmes. "Watson, I conclude that an imperial measure (5'2")
>> > has been recorded as a metric value". It's not an outlier.
>> >
>> > The more R gravitates toward "black box" functions, the more some
>> > users are encouraged to let them do the work. You pays your money and
>> > you takes your chances.
>> >
>> > Jim
>> >
>> >
>> > On Thu, Nov 30, 2017 at 3:37 AM, Robert Wilkins 
>> wrote:
>> >> R has a very wide audience, clinical research, astronomy, psychology,
>> and
>> >> so on and so on.
>> >> I would consider data analysis work to be three stages: data
>> preparation,
>> >> statistical analysis, and producing the report.
>> >> This regards the process of getting the data ready for analysis and
>> >> reporting, sometimes called "data cleaning" or "data munging" or "data
>> >> wrangling".
>> >>
>> >> So as regards tools for data preparation, speaking to the highly
>> diverse
>> >> audience mentioned, here is my question:
>> >>
>> >> What do you want?
>> >> Or are you already quite happy with the range of tools that is
>> currently
>> >> before you?
>> >>
>> >> [BTW,  I posed the same question last week to the r-devel list, and was
>> >> advised that r-help might be a more suitable audience by one of the
>> >> moderators.]
>> >>
>> >> Robert Wilkins
>> >>

Re: [R] Data cleaning & Data preparation, what do R users want?

2017-11-29 Thread Robert Wilkins

Christopher,

OK, well what about a range of functions in an R package that
automatically, with very little syntax, pulls in data from a variety of
formats (CSV, SQLite, and so on) and converts them to an R data frame? You
seem to be pointing to something like that.
Something like that, in some form or another, probably already exists,
though it might be either imperfect (not as user-friendly as possible) or
not well publicised, or both.
Or another tangent: your co-workers are not going to stop using Excel,
whether you like it or not, and many end-users are stuck in the exact same
position as you (co-workers who deliver the data in Excel). I will guess
that data stored in Excel tends to be dirty in somewhat predictable ways.
(And again, those other end-users' co-workers are not going to change their
behaviour.) And so: a data munging tool that makes it as easy as possible
to clean up the data in Excel spreadsheets and export them to R data
frames. One prerequisite: an understanding of what tends to go wrong with
data in Excel (the data in Excel tends to be dirty, but dirty in what
way?).
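
To make "dirty in predictable ways" concrete, here is a small base R
sketch. The CSV text, the column names and the particular kinds of dirt
(padding spaces, "N/A" placeholders, thousands separators, mixed date
formats) are assumptions chosen for illustration, not a survey of what
Excel exports actually contain. For SQLite sources the DBI and RSQLite
packages are the usual route; they are not used here.

```r
# Invented example of an Excel-style export; none of this is real data.
raw_text <- 'id,cost,visit_date
1, 170 ,2017-11-01
2,N/A,2017-11-02
3,"1,650",11/03/2017
'

# Read everything as character first so nothing is silently coerced.
raw <- read.csv(text = raw_text, colClasses = "character",
                strip.white = TRUE)

# Treat common placeholder strings as missing, and drop thousands separators.
cost <- raw$cost
cost[cost %in% c("", "N/A", "NA", "-")] <- NA
raw$cost <- as.numeric(gsub(",", "", cost))

# Dates in mixed formats: try ISO first, then US-style month/day/year.
d  <- as.Date(raw$visit_date, format = "%Y-%m-%d")
us <- as.Date(raw$visit_date, format = "%m/%d/%Y")
d[is.na(d)] <- us[is.na(d)]
raw$visit_date <- d
```

Reading as character and converting explicitly keeps each cleaning decision
visible, which is the opposite of a black-box import.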

Thank you for your response Christopher. What state are you in?

On Wed, Nov 29, 2017 at 11:52 AM, Christopher W. Ryan 
wrote:

> Great question. What do I want? I want my co-workers to stop using Excel
> spreadsheets for data entry, storage, and sharing! I want them to
> understand the value of data discipline. But alas . . . .
>
> I work in a county health department in the US. Between dplyr, stringr,
> grep, grepl, and the base R read() functions, I'm doing OK.
>
> I need to learn more about APIs, so I can see if I can make R directly
> grab data from, e.g. our state health department sources. My biggest
> hassle is having to download a data file, save it somewhere, and then
> open R and read it in. I'd like to be able to do it all in R. Would make
> the generation of recurring reports easier.
>
> --Chris Ryan
>
> Robert Wilkins wrote:
> > R has a very wide audience, clinical research, astronomy, psychology, and
> > so on and so on.
> > I would consider data analysis work to be three stages: data preparation,
> > statistical analysis, and producing the report.
> > This regards the process of getting the data ready for analysis and
> > reporting, sometimes called "data cleaning" or "data munging" or "data
> > wrangling".
> >
> > So as regards tools for data preparation, speaking to the highly diverse
> > audience mentioned, here is my question:
> >
> > What do you want?
> > Or are you already quite happy with the range of tools that is currently
> > before you?
> >
> > [BTW,  I posed the same question last week to the r-devel list, and was
> > advised that r-help might be a more suitable audience by one of the
> > moderators.]
> >
> > Robert Wilkins
> >

[R] Data cleaning & Data preparation, what do R users want?

2017-11-29 Thread Robert Wilkins

R has a very wide audience, clinical research, astronomy, psychology, and
so on and so on.
I would consider data analysis work to be three stages: data preparation,
statistical analysis, and producing the report.
This regards the process of getting the data ready for analysis and
reporting, sometimes called "data cleaning" or "data munging" or "data
wrangling".

So as regards tools for data preparation, speaking to the highly diverse
audience mentioned, here is my question:

What do you want?
Or are you already quite happy with the range of tools that is currently
before you?

[BTW,  I posed the same question last week to the r-devel list, and was
advised that r-help might be a more suitable audience by one of the
moderators.]

Robert Wilkins

[R] Best way to study internals of R ( mix of C, C++, Fortran, and R itself)?

2017-11-21 Thread Robert Wilkins

How difficult is it to get a good feel for the internals of R, if you want
to learn the general code base, but also the CPU-intensive stuff (much of
it in C or Fortran?) and the ways in which the general code and the
CPU-intensive stuff are connected together?
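
One low-effort starting point, sketched here in base R, is to look at how
ordinary R functions hand off to compiled code. This only shows the seam
between the two layers; it is not a guide to the C sources themselves.

```r
# Closures written in R: printing them shows the R-level source.
is.function(lm)
lm_is_primitive <- is.primitive(lm)

# Primitives are implemented directly in C inside the interpreter.
sum_is_primitive <- is.primitive(sum)

# Many numeric routines are thin R wrappers around compiled code;
# the body shows the hand-off via .Call().
body(stats::dnorm)
calls_compiled <- any(grepl(".Call", deparse(body(stats::dnorm)),
                            fixed = TRUE))
```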

R has a very large audience, but my understanding is that only a small
group have a good understanding of the internals (and some of those will
eventually move on to something else in their career, or retire
altogether).

While I'm at it, a second question: 15 years ago, nobody would ever offer a
job based on R skills ( SAS, yes, SPSS, maybe, but R skills, year after
year, did not imply job offers). How much has that changed, both for R and
for NumPy/Pandas/SciPy ?

thanks in advance

Robert

[R] How would you program an Adverse Events statistical table using R code?

2012-02-24 Thread Robert Wilkins

A graph != A table.
I'm talking about a page full of summary statistics and advanced
statistics, with lots of cross categories on the top and left margin
of the table, as opposed to a visual display with x-axis and y-axis,
which is totally different.

(An example of how this is done in another language is available  at
http://fivetimesfaster.blogspot.com )

For an AE table, you have an N and % column for every treatment group,
and for all patients combined. On the right side, a categorical
p-value (chi-sq or Fisher's) for every preferred term (every row!
Forget multiple testing issues, this is what the boss is asking
for; it's ad-hoc safety analysis).
There's a row for grand total N for each group.
A row for N and % of patients with any event (regardless of body
system and preferred term).
For each body system, there's a section of rows that includes:
  A row for N and % of patients with any event (this body system)
  A row for N and % of patients who do NOT have an event (this body system)
  And, of course, within body system, a row for each preferred term
(again N and % for each group, and also the p-value)

Body system and preferred term are, of course, the broad medical category
and the specific medical category, respectively.
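
For orientation, a rough base R sketch of the numbers behind such a table.
The data, the column names and the helper ae_row() are all invented; the
grand-total row, the "no event" rows and the page layout are left out, so
this is the analysis half only, not a finished report.

```r
# Invented data: one row per patient, and one row per (patient, event).
patients <- data.frame(
  id  = 1:8,
  trt = rep(c("Placebo", "DrugA"), each = 4),
  stringsAsFactors = FALSE
)
events <- data.frame(
  id          = c(1, 5, 6, 6, 7, 8),
  body_system = c("Nervous", "Nervous", "Nervous", "GI", "GI", "Nervous"),
  pref_term   = c("Headache", "Headache", "Headache", "Nausea", "Nausea",
                  "Dizziness"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one table row with N and % of patients per
# treatment group, for all patients, and a Fisher exact p-value.
ae_row <- function(label, ids_with_event) {
  has_event <- factor(patients$id %in% ids_with_event,
                      levels = c(TRUE, FALSE))
  trt <- factor(patients$trt)
  tab <- table(has_event, trt)
  n   <- tab["TRUE", ]
  pct <- round(100 * n / colSums(tab), 1)
  out <- data.frame(row = label, stringsAsFactors = FALSE)
  for (g in levels(trt)) {
    out[[paste0(g, "_N")]]   <- unname(n[g])
    out[[paste0(g, "_pct")]] <- unname(pct[g])
  }
  out$All_N   <- sum(n)
  out$All_pct <- round(100 * sum(n) / nrow(patients), 1)
  out$p_value <- fisher.test(tab)$p.value
  out
}

# Any event, then per body system, then per preferred term within it.
rows <- list(ae_row("Any event", unique(events$id)))
for (bs in unique(events$body_system)) {
  in_bs <- events[events$body_system == bs, ]
  rows <- c(rows, list(ae_row(paste0(bs, ": any event"), unique(in_bs$id))))
  for (pt in unique(in_bs$pref_term)) {
    rows <- c(rows, list(ae_row(paste0("  ", pt),
                                unique(in_bs$id[in_bs$pref_term == pt]))))
  }
}
ae_table <- do.call(rbind, rows)
```

Counting is by patient, not by event, so a patient with two headaches still
contributes one to the Headache row.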


In the Pharma industry, they use the SAS programming language. Each
table often needs several hundred lines of code. Essentially it's a
combination of analysis and (visual)-reporting mixed together, with
some prerequisite data transformation. (And yes, with this new
language, it can be done in under 20 lines of code).

I have not seen people discuss attempts to do such things with the R
programming language, and how successful such attempts have been. How
hard is it, how much code is it?

In general, we are talking about a variety of complex,
somewhat-nonhomogeneous statistical tables with a variety of different
row sections and row categories, and different column sections and
column categories, and a mixture of summary statistics and advanced
statistics (p-value , least square mean, etc), and sometimes
statistics from different statistical procedures on the same page.

Robert Wilkins

[R] Boston/Cambridge -- Statistical Programming Language Technology Breakthroughs

2011-12-11 Thread Robert Wilkins

If you are a statistician or researcher working in Boston/Cambridge,
and you have a strong interest in breakthroughs in statistical
programming language technology, contact me.

Robert

[R] Statistical Tables Really Fast

2011-04-14 Thread Robert Wilkins

A new language that can produce complex statistical tables far faster,
with much less code and effort, than any previous statistical
programming language.

A version that outsources (gives work to do) to vilno data
transformation and R is already in beta mode; a version that
outsources to SAS/BASE and SAS/STAT is not yet in beta mode.

http://fivetimesfaster.blogspot.com


Robert

[R] how to loop thru a matrix or data frame , and append calculations to a new data frame?

2009-10-30 Thread Robert Wilkins

How do you do a double loop through a matrix or data frame, and
within each iteration, do a calculation and append it to a new,
second data frame?
(So, if your original matrix or data frame is 4 x 5, then 20
calculations are done, and the new data frame, which had 0 rows to
start with, now has 20 rows.)
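
One way this could be written in base R, as a sketch: the calculation
(squaring each cell) and the column names row, col and value are
placeholders for whatever is actually needed.

```r
m <- matrix(1:20, nrow = 4, ncol = 5)

# Straightforward version: collect one small data frame per cell in a
# list, then bind once at the end. (Calling rbind() on the result inside
# the loop also works, but it copies the whole frame on every iteration.)
pieces <- vector("list", length(m))
k <- 0
for (i in seq_len(nrow(m))) {
  for (j in seq_len(ncol(m))) {
    k <- k + 1
    pieces[[k]] <- data.frame(row = i, col = j, value = m[i, j]^2)
  }
}
result <- do.call(rbind, pieces)

# Loop-free alternative for element-wise calculations.
result2 <- data.frame(row = as.vector(row(m)), col = as.vector(col(m)),
                      value = as.vector(m)^2)
```

The two results hold the same 20 rows in a different order: the loop walks
the matrix row by row, while as.vector() unrolls it column by column.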

[R] syntax for estimable(gmodels package) and glht(multcomp package)

2009-10-26 Thread Robert Wilkins

Hello,

I have a question as to how the syntax for glht(package multcomp) and
estimable (gmodels) works, since I'm not getting everything from the
documents I've googled so far, especially with models with 2nd order
terms.

A modestly complex model:
2-way anova with one continuous covariate, no random effects(and no
repeated measures) to keep it modestly complex:
Y = treatmentgroup + sex  + treatmentgroup*sex + weight

treatment has 3 levels : "Placebo" , "DrugA" , "DrugB"
sex has 2 levels

I want to do pairwise comparison(s) for one of the main effects, say
"DrugB" - "Placebo"

And a pairwise comparison at the cell-wise level, for example:
"Female:DrugA" - "Female:Placebo" or
"Female:DrugA" - "Male:DrugA"

The second request is not ambiguous since it's a difference of two
cells, (although the syntax for this request might be simplified if
the main first-order effects are constrained to zero ).

And suppose the marginal sums of the 2nd order terms sum to zero,
both down and across; that should make the first request unambiguous.

Two things:
1: people in the mail list are having difficulties dealing with
interaction terms with both functions ( I see from googling ) and the
available PDFs don't explicitly deal with these cases.

2: specifying the desired estimate with actual categorical levels in
the calling syntax would be really nice:
i.e. "Female:DrugA" - "Male:DrugA", instead of something like ( 0 0
1 -1 0 0 ), which to me is less intuitive and more prone to error.
One of the PDFs on the internet seems to suggest that estimable can do
this sort of thing for first order terms, but whether this extends to
two-way is not clear.
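
As a sketch of one workaround in base R (not the glht() or estimable()
syntax itself): build the contrast vector from named cells by subtracting
design-matrix rows. The simulated data and the helper cell() are
assumptions for illustration. The resulting vector K should be usable as a
one-row contrast matrix in either function, though that is not shown here.

```r
set.seed(1)
n <- 120
d <- data.frame(
  treatmentgroup = factor(sample(c("Placebo", "DrugA", "DrugB"), n, TRUE),
                          levels = c("Placebo", "DrugA", "DrugB")),
  sex    = factor(sample(c("Female", "Male"), n, TRUE)),
  weight = rnorm(n, 70, 10)
)
d$Y <- rnorm(n) + 0.5 * (d$treatmentgroup == "DrugA") + 0.02 * d$weight

fit <- lm(Y ~ treatmentgroup * sex + weight, data = d)

# Hypothetical helper: the design-matrix row for one named cell, at a
# fixed covariate value (the covariate cancels in any cell difference).
cell <- function(trt, sx, wt = 70) {
  nd <- data.frame(
    treatmentgroup = factor(trt, levels = levels(d$treatmentgroup)),
    sex    = factor(sx, levels = levels(d$sex)),
    weight = wt
  )
  drop(model.matrix(delete.response(terms(fit)), nd))
}

# Cell-wise comparison, written with level names instead of 0/1/-1.
K <- cell("DrugA", "Female") - cell("DrugA", "Male")
estimate <- sum(K * coef(fit))
std_err  <- sqrt(drop(t(K) %*% vcov(fit) %*% K))

# A "main effect" comparison has to say how it averages over sex;
# here: a simple unweighted average of the two sexes.
K_main <- (cell("DrugB", "Female") + cell("DrugB", "Male")) / 2 -
          (cell("Placebo", "Female") + cell("Placebo", "Male")) / 2
```

Printing K shows which coefficients the named comparison actually touches,
which is a useful check before trusting any hand-typed contrast vector.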

thanks for your time.

Re: [R] installing R on Ubuntu, can ignore warning messages?

2009-10-14 Thread Robert Wilkins

It does, thank you. I was able to understand enough of it to do the
install successfully. Still trying to understand the later paragraphs,
such as install.packages() and the r-cran-foo build dependencies. (The
site you pointed me to is the same site I did a printout of yesterday
to try to do an install; the readme file prints to 3 pages.)

Is there an easy way to:
1: List the R-related packages and add-ons that are already installed?
no point in trying to install what you already got!
2: List the R-related packages and add-ons that are available?
Probably a big number of them?
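
Both lists can be had from inside R. A minimal sketch follows; the
repository URL and the wrapper name list_available() are assumptions, and
the second part needs a network connection, so it is defined but not run.

```r
# 1: What is already installed (name and version, across all libraries).
inst <- installed.packages()[, c("Package", "Version")]
head(inst)

# Quick check for one package before trying to install it.
have_mass <- "MASS" %in% rownames(installed.packages())

# 2: What is available from a repository. Hypothetical wrapper; the
# default URL is an assumed CRAN mirror.
list_available <- function(repos = "https://cloud.r-project.org") {
  rownames(available.packages(repos = repos))
}
```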

Also, people who try Ubuntu out for the first time could be thrown
for a loop by the weird way it handles the root account:
https://help.ubuntu.com/community/RootSudo

thanks again.

On Wed, Oct 14, 2009 at 10:38 PM, Ista Zahn  wrote:
> Hi,
> Instructions for authenticating the cran repositories are here:
> http://cran.r-project.org/bin/linux/ubuntu/
>
> r-base comes with whatever the base R libraries are (stats, graphics
> etc.). I don't know if MASS in particular is in base because I don't
> use it directly.
>
> As far as I know it's safe to ignore the warnings, but they annoy me
> so I always follow the instructions linked above.
>
> The list of packages regularly updated in the cran repo are also
> listed on the webpage linked above.
>
> A couple of further tips:
> 1) I usually install packages with sudo aptitude install r-cran-xxx
> and then make sure they are up-to date by running update.packages() in
> R.
> 2) You can also install packages using the regular install.packages()
> in an R session.
>
> Hope that helps,
> -Ista
>
> On Wed, Oct 14, 2009 at 10:11 PM, robstdev  wrote:
>> Installing R on Ubuntu 8.10,
>> ( using sudo apt-get install r-base , and using one of the cran sites
>> (cran.cnr.berkeley.edu))
>>
>> the installation process says something about not having some gpg
>> public key and
>> "are you sure you want to download non-authenticated stuff [y/n]"  (to
>> which I answered yes).
>> I'm assuming this warning can be ignored?
>>
>> Also: even though the Ubuntu install and online update did a GCC
>> install the other day, the R installation did an update of some GCC
>> files, which I thought was odd. Probably I can ignore that too.
>>
>> Once you've installed R, does that automatically include some data
>> examples ( such as that MASS library ? )?
>> Or does that require further downloads?
>>
>> Also, thanks for the previous tips
>>
>
>
>
> --
> Ista Zahn
> Graduate student
> University of Rochester
> Department of Clinical and Social Psychology
> http://yourpsyche.org
>

[R] Installing R on Ubuntu ( 8.10 ) ?

2009-10-13 Thread Robert Wilkins

installing on Ubuntu, how to do it and have people found it to be glitchy?

which is easier , binary install or from source ?

With the source install, are you less likely to have a dependencies issue ?

( Ubuntu does the GCC install seamlessly, but has no mention of R )

[R] easy way to find all extractor functions and the datatypes of what they return

2009-10-10 Thread Robert Wilkins

Am I asking for too much:
for any object that a stat proc returns ( y <- lm( y~x ) , etc ), is there
a super convenient function like give_all_extractors( y ) that lists all
extractor functions, the datatype returned, and a text descriptor
field ("pairwisepval" "lsmean" etc)?

That would just be so convenient.

What are my options for querying an object so that I can quickly learn
the extractor functions to pull out the data and manipulate it?
Will the datatypes returned usually be named vectors and named
matrices, indexed by categorical values in the data
( "Male" "Female"  "Placebo" "DrugB" etc )? If they are indexed by 1 ,
2 , 3 , 4 , it's easier to lose track.
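
There is no give_all_extractors() as such, but a few base R calls get
close. A sketch using the built-in PlantGrowth data (the choice of dataset
and model is just for illustration):

```r
fit <- lm(weight ~ group, data = PlantGrowth)   # built-in example data

# Generic functions that have a method for this class of object;
# many of these are the "extractors" (coef, residuals, vcov, ...).
methods(class = "lm")

# The raw components stored in the object, and their types.
names(fit)
str(fit, max.level = 1)

# Extractors usually return named vectors/matrices, labelled by
# term and factor level rather than by position.
coef(fit)
coefs <- coef(summary(fit))   # matrix: rows = terms, cols = statistics
```

What this does not give is a text description of each extractor; for that
the help page of the fitting function (here ?lm) is still the place to look.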

thanks a bunch in advance

Re: [R] Installing R on Suse 11.1 , cannot figure it out

2009-10-07 Thread Robert Wilkins

I believe I did last night: a CRAN site in Pittsburgh, with an
"Install" link that I believe you are referring to. It just didn't
work, unless you are referring to a different web site. The R site
eventually leads to a list of install choices (download from
different locations, such as Michigan, Pittsburgh, etc.). Your example
is Iowa?

Not having GCC preinstalled (C and Fortran) might be a factor. When
you install Linux, it should just install GCC, just like that. I
mean, that's just wrong.

this blog post sheds some light maybe:
http://www.flexbeta.net/main/articles.php?action=show&id=70&perpage=1&pagenum=5

When you install from source ( which I can't , because I can't figure
out how to install GCC) , does the source install have binary
dependencies?

On Wed, Oct 7, 2009 at 9:10 PM, Cedrick W. Johnson
 wrote:
> see below:
>
> Robert Wilkins wrote:
>>
>> Can't figure out how the install works, it is certainly not automatic.
>> Also , the "Install" option on the R web site for Suse 11.1 does not work.
>> And the install software native to Suse, cannot figure out.
>>
>> Does Suse have more problems installing software than Fedora or Ubuntu?
>>
>
> Did you try installing the RPMS listed under your favorite CRAN mirror?
> (note, I didn't use the "install" links for the Readme file, I think you
> could grab these using 'wget'
>
> http://streaming.stat.iastate.edu/CRAN/bin/linux/suse/11.1/RPMS/i586/R-base-2.9.0-2.1.i586.rpm
> -and devel-
> http://streaming.stat.iastate.edu/CRAN/bin/linux/suse/11.1/RPMS/i586/R-base-devel-2.9.0-2.1.i586.rpm
>
> You may also have to compile R from source...
>
>> Or is this a hassle for any Linux distro? And Windows?
>>
>
> My windows installs have been relatively hassle-free (5 workstations). I
> just finished setting up a small cluster (3) of ubuntu R instances in under
> 30 minutes. So, your mileage may vary. I've found Ubuntu to be rather simple
> to install R instances.
>
> Hope this helps
> c
>
>
> =
> *Cedrick W. Johnson*
> aolim) cedrickjcvgr
> www.cedrickjohnson.com
> *New York - Chicago*
>
>

[R] To hell with OpenSuse, ditch it and go to Ubuntu

2009-10-07 Thread Robert Wilkins

this blog entry

http://www.viggie.com/blog/software/opensuse-ubuntu-usage-experience

, if credible , would seem to suggest that there is no good reason to
choose Suse.
I really don't have time for such nonsense, maybe I'll just reinstall as Ubuntu.

Also, noticed that GCC was not installed when Suse installed. That's
just weird, GCC is part of Linux.
And if you try to install GCC, the available options , are , again ,
thoroughly confusing.
It appears the Software-Install features of Suse are just not very robust.

Is Fedora any better?
Do you think that blog post is accurate in comparing Ubuntu and Suse?
Since Mandriva apparently has little market share or support in the
US, I guess I won't go with that.
So it's Ubuntu or Fedora.

[R] Installing R on Suse 11.1 , cannot figure it out

2009-10-07 Thread Robert Wilkins

Can't figure out how the install works, it is certainly not automatic.
Also , the "Install" option on the R web site for Suse 11.1 does not work.
And the install software native to Suse, cannot figure out.

Does Suse have more problems installing software than Fedora or Ubuntu?
Or is this a hassle for any Linux distro? And Windows?

Rob

[R] R on Linux, and R on Windows , any difference in maturity+stability?

2009-10-05 Thread Robert Wilkins

Will R have more glitches on one operating system as opposed to
another, or is it pretty much the same?

robert

[R] Gentleman and Ihaka's integrity in question

2009-01-19 Thread Robert Wilkins

It does look like Gentleman and Ihaka not only lied to the New York
Times, but also to the New Zealand Herald and who knows who else. This
is disgusting. The R programming language is the S programming
language, and Gentleman and Ihaka are not the ones who designed it.

http://thenewyorktimesissloppy.blogspot.com/

[R] What does R have for age-adjusted survey analysis?

2009-01-08 Thread Robert Wilkins

A procedure that , after adjusting for sampling weights, also
explicitly does an age adjustment to conform with an age distribution
of an older census?
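
For reference, the arithmetic of direct age standardization can be sketched
in base R as below. The records, weights and standard-population shares are
invented, and no standard errors are computed; for design-based standard
errors the survey package (svydesign(), svystandardize()) is the place to
look, and it is not used here.

```r
# Invented survey records: outcome (0/1), sampling weight, age group.
svy <- data.frame(
  y      = c(1, 0, 0, 1, 1, 0, 1, 1),
  wt     = c(10, 20, 10, 10, 30, 10, 20, 20),
  agegrp = c("18-44", "18-44", "18-44", "45-64", "45-64", "45-64",
             "65+", "65+")
)

# Invented standard population shares (e.g. from an older census).
std_share <- c("18-44" = 0.5, "45-64" = 0.3, "65+" = 0.2)

# Weighted rate within each age group.
rate_by_age <- sapply(split(svy, svy$agegrp),
                      function(g) weighted.mean(g$y, g$wt))

# Direct standardization: average the age-specific rates using the
# standard population's age distribution instead of the sample's.
adjusted_rate <- sum(rate_by_age[names(std_share)] * std_share)
crude_rate    <- weighted.mean(svy$y, svy$wt)
```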

[R] survey statistics, rate/proportions with standard errors

2009-01-08 Thread Robert Wilkins

what does R have to compare with , say , proc surveymeans, estimate survey
means/proportions with standard errors, using Taylor methods?

[R] AT&T Researchers and the New York Times

2009-01-08 Thread Robert Wilkins

Is anyone in the leadership of the R-project going to contact the New
York Times and clarify that the article gave remarkably short shrift
to the people who designed the user interface for R, to a large extent
AT&T researchers from an earlier generation? It would be the
appropriate thing to do.

The R team did not develop the user interface for R, the designers of
the S programming language did. The layman reader of Vance's article
will get the impression that R is a brand new invention, which is
misleading and unfair. Gentleman and Ihaka should try harder to give
credit where credit is due.


And by the way, ARE YOU GUYS EVER GOING TO FIX your mailing list
platform? It is extremely user-unfriendly and a technological clunk.
The mailing lists for SAS, Python , and others (UseNet) may not be a
user-interface-work-of-genius, but they are far superior to the R
mailing list. What a clunk.

[R] The AT&T researchers and the New York Times

2009-01-08 Thread Robert Wilkins

Is anyone in the leadership of the R-project going to contact the New York
Times and clarify that the article gave remarkably short shrift to the
people who designed the user interface for R, to a large extent AT&T
researchers from an earlier generation? It would be the appropriate thing to
do.

The R team did not develop the user interface for R, the designers of the S
programming language did. The layman reader of Vance's article will get the
impression that R is a brand new invention, which is misleading and unfair.
Gentleman and Ihaka should try harder to give credit where credit is due.

[R] Ashlee Vance's article on R in the New York Times

2009-01-07 Thread Robert Wilkins

Ashlee Vance's article on R in the New York Times.

This is typical of the New York Times. Because they get to coast on the
prestige and reputation of their brand , they have a history of just this
sort of journalistic sloppiness. Whether it's the author or the editor at
fault doesn't really matter, they do this screw-up all the time.

Look, if you write an article on the first page of the business section,
you're not just presenting yourself as a writer or entertainer, you're
presenting yourself as a journalist, and that implies two commitments:

1: I believe that my writing is true , and as fair and balanced as
appropriate in the context.
2: I've invested the time in research and fact-checking so that point #1
actually has credibility.

Vance clearly fails on point #2. He just didn't do his homework. And as I've
seen over the years, this is typical for NYT contributors. That's
complacency. A bit like SAS Institute - NYT is overly reliant on its brand
name.

First of all, the third paragraph is a falsehood. I'm not saying Vance is
lying. I'm saying he's lazy. A couple of hours of research, and he could
have corrected that.

If you find computer programming to be tedious, unpleasant, or quite
difficult, then R is the wrong software for you. R has a reputation for
having a tougher learning curve than the SAS programming language. Even if
you disagree, neither is appropriate for people who don't have the time and
patience to study programming languages.

Vance's article is also deeply misleading: he gives the wrong impression of
where R actually came from, and who deserves credit for what. It's
especially glaring given that he does briefly mention R's precursor, S. Yet,
funny that, he neglects to mention that S and R basically use the same user
interface (the same programming language). Hey Vance, um, that's a big
oversight.

R is a quality software package, with years of development and debugging,
and substantial documentation, and diverse and reliable statistical function
libraries. The R project team deserves a great deal of credit for this. But
they don't deserve all of the credit. A great deal of the R software product
was already achieved before the R team ever came along. There is a tendency
to pooh-pooh the blood and sweat that go into the design of the user
interface.

The choices made when designing the user interface of any data analysis tool
are critical, whether GUI or language. Assuming the CPU is not overloaded,
and usually it is not, it is the user interface that makes the difference
between a piece of cake and hours lost coding what should have been a
routine task.

Well, Gentleman and Ihaka did not design the user interface for R. AT&T
researchers did, during the Cold War. It's possible that a few employees at
proprietary software companies also contributed. It might have been largely
financed by American taxpayers, because there were a lot of backroom deals
during the Cold War, and AT&T was typically in the thick of it. The user
interface for R, otherwise known as the S programming language, has the same
origins as C and Unix.

Some R promoters point out that R has lexical scope and lots of Scheme
goodness. (And what widespread programming language today does not have
lexical scope?) But other R promoters point out that programs in S-Plus
usually work in R, and vice versa. Well, in that case, then it's the same
damn programming language!

Quite likely, the R founders were careful to point this out in their
interviews with Vance. Even if they forgot, minutes of research on Vance's
part would have told him that. The New York Times - sloppy as usual. More
like an advertisement than a bona fide article.

And the upshot of this, in the outlook for statistical software, is that
regarding the strengths and (considerable) limitations of the three
classical statistical programming languages (S, SAS, SPSS), R really
doesn't change anything at all. I definitely like the price tag though. And
that does not mean that R cannot achieve a quality and reliability
comparable to S-Plus and SAS, notwithstanding Milley's snide comment. But
if you want to attack the chronic and painful productivity problems with
data preparation and statistical table production, you need to go beyond R
and SAS. You have to develop new user interfaces, and that is very risky,
and takes years of technical work and marketing.

And, to be honest, that is not what open source developers are willing to
do. In the majority of software categories, including specialized languages
(such as statistical ones), open source developers are not motivated to develop
user interfaces that make a ground-breaking difference in the user's
productivity level. One big and crucial exception is the category of
all-purpose programming languages. Thousands of open source developers go to
bed dreaming of being the next Larry Wall. Thankfully, we have Ruby and
Python as a result.

[R] Estimating the standard error when you have sampling weights.

2008-11-24 Thread Robert Wilkins

Hi,

Where can I find information (freely available on the Internet, and also
books or other sources) on how having sampling weights changes the
calculation of the standard error (of means and proportions)?

How good is R for this type of procedure? And SAS?

thanks

Robert

[R] Clinical Trial data sets in public domain?

Re: [R] Data cleaning & Data preparation, what do R users want?

Re: [R] Data cleaning & Data preparation, what do R users want?

[R] Data cleaning & Data preparation, what do R users want?

[R] Best way to study internals of R ( mix of C, C++, Fortran, and R itself)?

[R] How would you program an Adverse Events statistical table using R code?

[R] Boston/Cambridge -- Statistical Programming Language Technology Breakthroughs

[R] Statistical Tables Really Fast

[R] how to loop thru a matrix or data frame , and append calculations to a new data frame?

[R] syntax for estimable(gmodels package) and glht(multcomp package)

Re: [R] installing R on Ubuntu, can ignore warning messages?

[R] Installing R on Ubuntu ( 8.10 ) ?

[R] easy way to find all extractor functions and the datatypes of what they return

Re: [R] Installing R on Suse 11.1 , cannot figure it out

[R] To hell with OpenSuse, ditch it and go to Ubuntu

[R] Installing R on Suse 11.1 , cannot figure it out

[R] R on Linux, and R on Windows , any difference in maturity+stability?

[R] Gentleman and Ihaka's integrity in question

[R] What does R have for age-adjusted survey analysis?

[R] survey statistics, rate/proportions with standard errors

[R] AT&T Researchers and the New York Times

[R] The AT&T researchers and the New York Times

[R] Ashlee Vance's article on R in the New York Times

[R] Estimating the standard error when you have sampling weights.

24 matches
