[R] mgcv::gam() scale parameter estimates for quasibinomial error models

2021-04-14 Thread John Maindonald
For both glm() and mgcv::gam() quasibinomial error models, the summary
object includes a dispersion value that plays the same role as sigma^2
plays in the summary object for lm() model fits.

Where some fitted probabilities are small, the `gam()` default scale parameter
estimates, returned as `scale` (and `sig2`) in the gam object and as "dispersion"
in the summary object, can differ wildly from the Pearson values that `glm()`
works with, and that can be obtained by calling `gam()` with a suitable control
setting (see the code that follows).

The following demonstrates the issue:

  ## 'mgcv' version 1.8-34
  Cholera <- HistData::Cholera
  library(mgcv)
  form <- cbind(cholera_deaths, popn - cholera_deaths) ~
     water + elevation + poor_rate
  default.gam <- gam(form, data=Cholera, family=quasibinomial)
  pearson.gam <- update(default.gam, control=gam.control(scale.est="pearson"))

  c(Pearson=pearson.gam$scale, Default=default.gam$scale)
  ##   Pearson   Default
  ## 33.545829  2.919535

My own calculation (from either fitted model) returns 30.07 for the
Fletcher (2012) version of the dispersion estimate that, according to
Wood's "Generalized Additive Models" (2nd edn, 2017, p. 111), was
returned as the GAM scale estimate at the time when the book was
written.
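
A sketch of a direct calculation of the two estimates from the fitted
object follows.  The unweighted form of Fletcher's s-bar is assumed
here; how mgcv weights the binomial denominators internally is
precisely the detail I am asking about:

  dispersions <- function(fit) {
    mu  <- fitted(fit)        # fitted probabilities
    y   <- fit$y              # observed proportions
    rdf <- df.residual(fit)
    pearson <- sum(residuals(fit, type="pearson")^2)/rdf
    V <- mu*(1 - mu)          # binomial variance function
    sbar <- mean((1 - 2*mu)*(y - mu)/V)  # average of (y-mu)*V'(mu)/V(mu)
    c(Pearson=pearson, Fletcher=pearson/(1 + sbar))
  }
  dispersions(default.gam)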

In data simulated to have comparable dispersion, the default scale
estimates returned by `gam()` vary wildly, while the "pearson" estimates
remain relatively stable.

For the Cholera data, it would make good sense to fit a model with
quasipoisson errors to the death counts, using log(popn) as an offset.
With the default setting for `scale.est`, the GAM model then uses a
scale parameter that is close to that returned by `pearson.gam`.
SEs (as well as coefficients) are similar to those returned by
`pearson.gam`.

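A sketch of that quasipoisson fit, with variable names as in the code
above (the default scale.est is then in use):

  pois.gam <- gam(cholera_deaths ~ water + elevation + poor_rate +
                    offset(log(popn)),
                  data=Cholera, family=quasipoisson)
  pois.gam$scale   ## compare with pearson.gam$scale above
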
The detailed calculations, documented in the following files, may be
of some general interest.
  https://www.dropbox.com/s/vl9usat07urbgel/quasibin-gam.pdf?dl=0
  or https://www.dropbox.com/s/s83mh1mut5xc3gk/quasibin-gam.html?dl=0

I am posting this here, before filing a bug report, in case I have
missed something important.  It would be useful to be directed to the
mgcv code used for calculation of what is returned as the Fletcher
statistic.

  Rmd file: https://www.dropbox.com/s/ghmsdcvgxp068bs/quasibin-gam.Rmd?dl=0
  .bib file: https://www.dropbox.com/s/r1yjqx0sni2pzjy/quasi.bib?dl=0

John Maindonald email: john.maindon...@anu.edu.au




Re: [R] R-help Digest, Vol 217, Issue 20

2021-03-20 Thread John Maindonald
No, it is not distribution free.  Independent random sampling is assumed.
That is a non-trivial assumption, and one that is very often not true or not
strictly true.


John Maindonald email: john.maindon...@anu.edu.au


On 21/03/2021, at 00:00, r-help-requ...@r-project.org wrote:

From: Jiefei Wang <szwj...@gmail.com>
Subject: Re: [R] about a p-value < 2.2e-16
Date: 20 March 2021 at 04:41:33 NZDT
To: Spencer Graves <spencer.gra...@effectivedefense.org>
Cc: Bogdan Tanasa <tan...@gmail.com>, Vivek Das <vd4mm...@gmail.com>,
 r-help <r-help@r-project.org>


Hi Spencer,

Thanks for your test results.  I do not know the answer, as I haven't
used wilcox.test for many years.  I do not know whether it is possible to
compute the exact distribution of the Wilcoxon rank sum statistic, but I
think it very likely, as the documentation for `Wilcoxon` says:

This distribution is obtained as follows. Let x and y be two random,
independent samples of size m and n. Then the Wilcoxon rank sum statistic
is the number of all pairs (x[i], y[j]) for which y[j] is not greater than
x[i]. This statistic takes values between 0 and m * n, and its mean and
variance are m * n / 2 and m * n * (m + n + 1) / 12, respectively.
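
Base R does expose this exact distribution, via dwilcox() and pwilcox()
(see ?Wilcoxon).  A quick check of the quoted mean and variance
formulas, for m = 4 and n = 5:

m <- 4; n <- 5
x <- 0:(m*n)
p <- dwilcox(x, m, n)        # exact point probabilities
sum(x*p)                     # m*n/2 = 10
sum((x - m*n/2)^2*p)         # m*n*(m+n+1)/12 = 16.67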

A nice feature of a non-parametric statistic is that it is usually
distribution-free, so you can pick any distribution you like and still
compute the same statistic.  I wonder if that is the case here, but I
might be wrong.

Cheers,
Jiefei






Re: [R] Regarding R licensing usage guidance

2019-07-23 Thread John Maindonald
I think this unfair, certainly the 'odious' comment.  Very many of us
have been able to contribute because we were supported to do such work
while occupying relatively comfortable academic positions.  In the
process of getting there, we have benefited enormously from the
work of those who have gone before.  I do not know what Anamika's
position is, but he may well be trying to build a business from a
much less privileged starting point than others of us may have
enjoyed.  In any case, the comment strikes a tone that is out of
place in respectful discourse.


John Maindonald email: john.maindon...@anu.edu.au


On 23/07/2019, at 22:00, r-help-requ...@r-project.org wrote:

From: Jeff Newmiller <jdnew...@dcn.davis.ca.us>
Subject: Re: [R] Regarding R licensing usage guidance
Date: 23 July 2019 at 07:00:26 NZST
To: <r-help@r-project.org>, ANAMIKA KUMARI <anamika1...@gmail.com>


Your internet skills are pathetic. Search Google for "proprietary use gpl" and 
the first hit is

https://opensource.stackexchange.com/questions/7078/is-it-legal-to-use-gpl-code-in-a-proprietary-closed-source-program-by-putting-i

Note that there are (at least) three obvious alternatives if there is any 
question in your case: release the code under GPL but also sell it with support 
(a la RStudio); only use it yourself (don't distribute it at all); or only use 
R for setting up your models but re-engineer implementations of the run-time 
prediction calculations yourself (often much easier than the initial algorithm 
development).

I think your desperation to steal the hard work of the various R contributors 
seems quite odious.

On July 22, 2019 6:55:31 AM PDT, ANAMIKA KUMARI <anamika1...@gmail.com> wrote:
Hello Team,

This mail is in reference to understanding the R license, and the use
of the R language to develop a commercial product.

I am working on a product of mine that uses the R and Python languages.
I am trying to understand any licensing issues related to R if I
commercialise my product and keep my work proprietary.

I need your help to understand this.  R comes under GNU GPL-2.0.  If I
move my product to production, do I need to share the source code of my
product, or can I keep my code proprietary?

Please note that I am just using R and its packages to develop my own
statistical tool and API, and have not made any changes to the existing
R code.
Please refer below for all R packages used in my code:

 1. R-3.4.4
 2. 'spacyr'
 3. 'jsonlite'
 4. 'lubridate'
 5. 'data.table'
 6. 'png'
 7. 'maps'
 8. 'countrycode'
 9. 'humaniformat'
 10. 'ngram'
 11. 'stringr'
 12. 'slam'
 13. 'tm'
 14. 'lsa'
 15. 'RTextTools'
 16. 'stringi'
 17. 'plumber'
 18. 'Rook'
 19. 'pdftools'
 20. 'tokenizers'
 21. 'zoo'
 22. 'tidyr'
 23. 'reqres'
 24. 'rJava'
 25. 'tiff'
 26. 'splitstackshape'
 27. 'stringdist'
 28. 'RJSONIO'
 29. 'ropensci/tabulizer'
 30. 'staplr'
 31. 'SparseM'
 32. 'randomForest'
 33. 'e1071'
 34. 'ipred'
 35. 'caTools'
 36. R CMD INSTALL maxent_1.3.3.1.tar.gz
 37. R CMD INSTALL tree_1.0-39.tar.gz
 38. R CMD INSTALL RTextTools_1.4.2.tar.gz

Any help from you will be highly appreciated as I am literally stuck
at a dead end.

Regards,
Anamika Kumari


--
Sent from my phone. Please excuse my brevity.




Re: [R] p values from GLM

2016-04-03 Thread John Maindonald
How small does a p-value need to be to warrant attention, however?

Witness Fisher’s comment that:
“. . . we may, if we prefer it, draw the line at one in fifty (the 2 per cent 
point), or one in a hundred (the 1 per cent point). Personally, the writer 
prefers to set a low standard of significance at the 5 per cent point, and 
ignore entirely all results which fail to reach this level. A scientific fact 
should be regarded as experimentally established only if a properly designed 
experiment rarely fails to give this level of significance.”
[Fisher RA (1926), “The Arrangement of Field Experiments,” Journal of the 
Ministry of Agriculture of Great Britain, 33, 503-513.]

See the selection of Fisher quotes at http://www.jerrydallal.com/lhsp/p05.htm .

In contexts where a p <= 0.05 becomes more likely than under the NULL
(not the case if the experiment might just as well have been a random
number generator), small p-values shift the weight of evidence.  An
alternative that is a priori highly unlikely takes a lot of shifting.


John Maindonald email: john.maindon...@anu.edu.au


On 3/04/2016, at 22:00, r-help-requ...@r-project.org wrote:

From: Heinz Tuechler <tuech...@gmx.at>
Subject: Re: [R] p values from GLM
Date: 3 April 2016 11:00:50 NZST
To: Bert Gunter <bgunter.4...@gmail.com>, Duncan Murdoch <murdoch.dun...@gmail.com>
Cc: r-help <R-help@r-project.org>



Bert Gunter wrote on 01.04.2016 23:46:
... of course, whether one **should** get them is questionable...

http://www.nature.com/news/statisticians-issue-warning-over-misuse-of-p-values-1.19503#/ref-link-1

This paper repeats the commonplace statement that a small p-value does not
necessarily indicate an important finding.  Agreed, but maybe I have
overlooked examples of important findings with large p-values.
If there are some, I would be happy to learn of them.  Otherwise, a
small p-value is no guarantee of importance, but a prerequisite.

best regards,

Heinz


Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Apr 1, 2016 at 3:26 PM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
On 01/04/2016 6:14 PM, John Sorkin wrote:
How can I get the p values from a glm?  I want to get the p values so I
can add them to a custom report.


  fitwean <- glm(data[,"JWean"] ~ data[,"Group"], data=data,
                 family=binomial(link="logit"))
  summary(fitwean)         # lists the coefficients, SEs, z and p values,
                           # but I can't isolate the p values
  names(summary(fitwean))  # I see the coefficients, but not the p values
  names(fitwean)           # p values are not found here

Doesn't summary(fitwean) give a matrix? Then it's
colnames(summary(fitwean)$coefficients) you want, not names(fitwean).

Duncan Murdoch

P.S. If you had given a reproducible example, I'd try it myself.
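
For completeness, a minimal sketch of the extraction (toy data, not the
poster's):

set.seed(1)
d <- data.frame(y=rbinom(40, 1, 0.5), g=gl(2, 20))
fit <- glm(y ~ g, data=d, family=binomial(link="logit"))
coef(summary(fit))[, "Pr(>|z|)"]   # named vector of p values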




Thank you!
John

John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and
Geriatric Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Confidentiality Statement:
This email message, including any attachments, is for the sole use of the
intended recipient(s) and may contain confidential and privileged
information. Any unauthorized use, disclosure or distribution is prohibited.
If you are not the intended recipient, please contact the sender by reply
email and destroy all copies of the original message.

Re: [R] SWEAVE - a gentle introduction

2015-11-18 Thread John Maindonald
What’s more, for pdf output one can use R Markdown and judiciously
sneak in html and/or LaTeX (consider however what the processing
steps might do to such markup).


John Maindonald email: john.maindon...@anu.edu.au


On 19/11/2015, at 00:00, r-help-requ...@r-project.org wrote:

From: Duncan Murdoch <murdoch.dun...@gmail.com>
Subject: Re: [R] SWEAVE - a gentle introduction
Date: 18 November 2015 08:09:34 NZDT
To: Marc Schwartz <marc_schwa...@me.com>, John Sorkin <jsor...@grecc.umaryland.edu>
Cc: R-help <r-help@r-project.org>


On 17/11/2015 10:42 AM, Marc Schwartz wrote:

On Nov 17, 2015, at 9:21 AM, John Sorkin <jsor...@grecc.umaryland.edu> wrote:

I am looking for a gentle introduction to SWEAVE, and would appreciate 
recommendations.
I have an R program that I want to run and have the output and plots in one 
document. I believe this can be accomplished with SWEAVE. Unfortunately I don't 
know HTML, but am willing to learn. . . as I said I need a gentle introduction 
to SWEAVE.
Thank you,
John



John,

A couple of initial comments.

First, you will likely get some recommendations to also consider using Knitr:

  http://yihui.name/knitr/

which I do not use myself (I use Sweave), but to be fair, is worth considering 
as an alternative.

He did, and I'd agree with them.  I've switched to knitr for all new projects 
and some old ones.  knitr should be thought of as Sweave version 2.

Duncan Murdoch



Re: [R] R won't accept my zero count values in the GLM with quasi_poisson distribution

2015-07-28 Thread John Maindonald
A further point to note: with a log link, SEs for comparisons with any factor
level where the counts are all zero will be huge and meaningless.  This
phenomenon has the name Hauck-Donner effect, though it is more commonly
identified in comparisons involving categories with very low or very high
estimated proportions in binomial data.  For the poisson or quasipoisson
family, use of the sqrt link avoids this problem.
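
A toy illustration (made-up data, not from the original thread):

set.seed(42)
d <- data.frame(y=c(rpois(10, 5), rep(0, 10)), g=gl(2, 10))
## log link: the comparison with the all-zero level has a huge SE
coef(summary(glm(y ~ g, family=quasipoisson(link="log"), data=d)))
## sqrt link: the same comparison has a finite, usable SE
coef(summary(glm(y ~ g, family=quasipoisson(link="sqrt"), data=d)))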


John Maindonald email: john.maindon...@anu.edu.au


On 28/07/2015, at 22:00, r-help-requ...@r-project.org wrote:

From: Andrew Robinson <a.robin...@ms.unimelb.edu.au>
Subject: Re: [R] R won't accept my zero count values in the GLM with
quasi_poisson distribution
Date: 28 July 2015 18:59:51 NZST
To: Charlotte <charlotte.hu...@griffithuni.edu.au>
Cc: "R help (r-help@r-project.org)" <r-help@r-project.org>


You have selected the binomial family in the call to glm.  You should
instead try something like

family=quasipoisson(link = "log")

I hope this helps

Andrew
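
Putting that together with the script quoted below, the corrected call
would be (a sketch; data frame and column names as in the original
post):

fit <- glm(abundance ~ Gender, data=teminfest,
           family=quasipoisson(link="log"))
summary(fit)   # dispersion now estimated from the data, not fixed at 1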

On Tue, Jul 28, 2015 at 4:33 PM, Charlotte
<charlotte.hu...@griffithuni.edu.au> wrote:

Hello

I have count values for abundance which follow a pattern of over-dispersion
with many zero values.  I have read a number of documents which suggest
that I should not transform the data, but rather run the GLM with the
quasipoisson distribution.  So I have written my script, and R is telling
me that the y values must lie between 0 and 1.

Everything I read tells me to do it this way, but I can't get R to agree.
Do I need to add something else to my script to get it to work and keep my
data untransformed?  The script I wrote is as follows:

fit <- glm(abundance~Gender,data=teminfest,family=binomial())

then I get this error
Error in eval(expr, envir, enclos) : y values must be 0 <= y <= 1

I don't use R a lot so I am having trouble figuring out what to do next.

I would appreciate some help

Many Thanks
Charlotte











--
Andrew Robinson
Deputy Director, CEBRA, School of Biosciences
Reader & Associate Professor in Applied Statistics   Tel: (+61) 0403 138 955
School of Mathematics and Statistics                 Fax: +61-3-8344 4599
University of Melbourne, VIC 3010 Australia
Email: a.robin...@ms.unimelb.edu.au
Website: http://www.ms.unimelb.edu.au/~andrewpr

MSME: http://www.crcpress.com/product/isbn/9781439858028
FAwR: http://www.ms.unimelb.edu.au/~andrewpr/FAwR/
SPuR: http://www.ms.unimelb.edu.au/spuRs/




[R] R: Re: Differences in output of lme() when introducing interactions

2015-07-23 Thread John Maindonald
Do you have legal advice that suing the University (if that is the right
context) would actually be a fruitful way forward - that it would achieve
anything useful within a reasonable time, and without exposing the student
to severe financial risk?

What may work in that context is for students to collectively complain that
important aspects of their training and support are being neglected.
With the rapidity of recent technological change, the issue is widespread.
To an extent, able post-docs and PhDs have to lead the charge in getting
training and support updated and brought into the modern world.


John Maindonald email: john.maindon...@anu.edu.au

On 22/07/2015, at 22:00, r-help-requ...@r-project.org wrote:

From: li...@dewey.myzen.co.uk
Date: 21 July 2015 11:58
To: "angelo.arc...@virgilio.it" <angelo.arc...@virgilio.it>,
 <bgunter.4...@gmail.com>
Cc: <r-help@r-project.org>
Subject: Re: R: Re: [R] R: Re: Differences in output of lme() when introducing
interactions

Dear Angelo

I suggest you do an online search for marginality which may help to
explain the relationship between main effects and interactions. As I
said in my original email this is a complicated subject which we are not
going to retype for you.

If you are doing this as a student I suggest you sue your university for
failing to train you appropriately and if it is part of your employment
I suggest you find a better employer.

On 21/07/2015 10:04, angelo.arc...@virgilio.it wrote:
Dear Bert,
thank you for your feedback.  Can you please provide some references
online so I can improve "my ignorance"?
Anyway, please note that it is not true that I do not know statistics
and regression at all, and I am strongly convinced that my question may
be of interest to someone else in the future.

That is what forums are for, isn't it?  That is why people help each
other, isn't it?

Moreover, don't you think that I would not have asked this R forum if
I had the possibility to ask or pay a statistician?
Don't you think I have already done my best to study and learn before
posting this message?  Trust me, I have read various online tutorials
on lme and lmer, and I am confident that I have got the basic concepts.
Still, I have not found the answer to my problem, so if you know the
answer can you please give me some suggestions that might help?

I do not have a book from which to learn, and unfortunately I have to
analyze the results soon.  Any help?  Any to-the-point online reference
that can help me solve this problem?

Thank you in advance

Best regards

Angelo




Re: [R] Error in lm() with very small (close to zero) regressor

2015-03-29 Thread John Maindonald
There are two issues:

1) Damage was done to accuracy in the calculation of t(X) %*% X.
Where the mean to SD ratio of a column is large, maybe even of the
order of 100 or so, centre that column.

2) pseudoinverse() goes awry when columns of X that are of the order
of large negative (or, I expect, +ve) powers of ten are supplied to it.
I'd guess that this has to do with the way that a check for singularity
is implemented.

Ben's example:

set.seed(101)
n_obs <- 1000
y  <- rnorm(n_obs, 10,2.89)
x1 <- rnorm(n_obs, mean=1.235657e-14,sd=4.5e-17)
x2 <- rnorm(n_obs, 10,3.21)
X  <- cbind(rep(1,n_obs),x1,x2)

coef(lm(y~x1+x2))
##   (Intercept)x1x2
##  1.155959e+01 -1.658420e+14  3.797342e-02

library(corpcor)
t(cbind(coef(lm(y~x1+x2)),
   as.vector(pseudoinverse(t(X) %*% X) %*% t(X) %*% y)))
##  (Intercept)x1 x2
## [1,]   11.559587 -1.658420e+14 0.03797342
## [2,] 9.511348  1.151908e-13  0.03788714

## Examine mean to SD ratios
round(c(mean(x1)/sd(x1), mean(x2)/sd(x2)), 2)
## [1] 263.65   3.28
## Notice that the coefficients for x2, where the mean/sd ratio
## is smaller, roughly agree.

## Damage was done to accuracy in the calculation of t(X) %*% X
## (but there is more to it than that, as we will see).

## Try
xc1 <- scale(x1,center=TRUE, scale=FALSE)
xc2 <- scale(x2,center=TRUE, scale=FALSE)
Xc <- cbind(rep(1,n_obs),x1,x2)   ## NB: presumably xc1 and xc2 were intended here
as.vector(pseudoinverse(t(Xc) %*% Xc) %*% t(Xc) %*% y)
## [1] 9.511348e+00 1.151908e-13 3.788714e-02

## Note now, however, that one should be able to dispense with
## the column of 1s, with no change to the coefficients of x1 & x2
Xc0 <- cbind(xc1,xc2)
as.vector(pseudoinverse(t(Xc0) %*% Xc0) %*% t(Xc0) %*% y)
## [1] 1.971167e-20 3.788714e-02

##
## Now try a more sensible scaling for xc1
Xcs <- cbind(rep(1,n_obs), xc1*1e14, xc2)
Xcs0 <- cbind(xc1*1e14, xc2)

t(cbind(coef(lm(y~I(1e14*xc1)+xc2)),
as.vector(pseudoinverse(t(Xcs) %*% Xcs) %*% t(Xcs) %*% y)))
##  (Intercept) I(1e+14 * xc1)xc2
## [1,]9.899249   -1.65842 0.03797342
## [2,]9.899249   -1.65842 0.03797342

## Eureka!

## cf also
as.vector(pseudoinverse(t(Xcs0) %*% Xcs0) %*% t(Xcs0) %*% y)
## [1] -1.65842037  0.03797342


John Maindonald email: john.maindon...@anu.edu.au

Centre for Mathematics & Its Applications,
Australian National University, Canberra ACT 0200.


On 29/03/2015, at 23:00, r-help-requ...@r-project.org wrote:

From: Ben Bolker <bbol...@gmail.com>
Subject: Re: [R] Error in lm() with very small (close to zero) regressor
Date: 29 March 2015 6:28:22 NZDT
To: <r-h...@stat.math.ethz.ch>


peter dalgaard <...@gmail.com> writes:



On 28 Mar 2015, at 00:32, RiGui <...@business.uzh.ch> wrote:

Hello everybody,

I have encountered the following problem with lm():

When running lm() with a regressor close to zero - of the order of
1e-10 - the value of the estimate is of huge absolute value, of the
order of millions.

However, if I write the formula of the OLS estimator in matrix notation,
pseudoinverse(t(X) %*% X) %*% t(X) %*% y, the results are correct,
meaning the estimate has value 0.

 How do you know this answer is "correct"?

here is the code:

y  <- rnorm(n_obs, 10,2.89)
x1 <- rnorm(n_obs, 0.01235657,0.45)
x2 <- rnorm(n_obs, 10,3.21)
X  <- cbind(x1,x2)

Eh?  You want
X  <- cbind(rep(1,n_obs),x1,x2)

bFE <- lm(y ~ x1 + x2)
bFE

bOLS <- pseudoinverse(t(X) %*% X) %*% t(X) %*% y
bOLS


Note: I am applying a deviation-from-the-mean projection to the data;
that is why I have some regressors with such small values.

Thank you for any help!

Raluca

 Is there a reason you can't scale your regressors?


Example not reproducible:


 I agree that the OP's question was not reproducible, but it's
not too hard to make it reproducible. I bothered to use
library("sos"); findFn("pseudoinverse") to find pseudoinverse()
in corpcor:

It is true that we get estimates with very large magnitudes,
but their

set.seed(101)
n_obs <- 1000
y  <- rnorm(n_obs, 10,2.89)
x1 <- rnorm(n_obs, mean=1.235657e-14,sd=4.5e-17)
x2 <- rnorm(n_obs, 10,3.21)
X  <- cbind(x1,x2)

bFE <- lm(y ~ x1 + x2)
bFE
coef(summary(bFE))

Estimate   Std. Error t value  Pr(>|t|)
(Intercept)  1.155959e+01 2.312956e+01  0.49977541 0.6173435
x1  -1.658420e+14 1.872598e+15 -0.08856254 0.9294474
x2   3.797342e-02 2.813000e-02  1.34992593 0.1773461

library("corpcor")
bOLS <- pseudoinverse(t(X) %*% X) %*% t(X) %*% y
bOLS

[,1]
[1,] 9.807664e-16
[2,] 8.880273e-01

And if we scale the predictor:

bFE2 <- lm(y ~ I(1e14*x1) + x2)
coef(summary(bFE2))

Estimate Std. Error t value  Pr(>|t|)

Re: [R] the making of _R_ eBooks

2015-03-24 Thread John Maindonald
Thanks for that.  Useful to have that question asked and to get that
information.


John Maindonald email: john.maindon...@anu.edu.au


On 25/03/2015, at 0:00, r-help-requ...@r-project.org wrote:

From: John McKown <john.archie.mck...@gmail.com>
Subject: Re: [R] the making of _R_ eBooks
Date: 24 March 2015 1:36:38 NZDT
To: "Dr. Wolfgang Lindner" <lindn...@t-online.de>
Cc: Help R <r-help@r-project.org>


On Mon, Mar 23, 2015 at 3:50 AM, Dr. Wolfgang Lindner
<lindn...@t-online.de> wrote:
Dear list members,

I like the look and feel of the eBook versions of the R manuals very much.
So I would like to generate eBooks (teaching material etc) in that look.

I am not an expert. But I have looked at the source, so I can give you
some information.


Q1: is there a description of how the _R_ eBooks have been produced?

Looking at the source, it appears that the source manuals are in a
document markup language called "GNU Texinfo".
https://www.gnu.org/software/texinfo/
You can think of this as something akin to, but different from, HTML
or "markdown" encoding. Texinfo is an evolution by the system first
designed by Richard Stallman of MIT. He is the driving force behind
the GPL and most of the GNU software which forms the basis of the user
space commands for Linux and the *BSD operating systems. Texinfo is
then converted to TeX. TeX is the typesetting language designed by Dr.
Donald Knuth. TeX, nominally, is converted into a DVI printer control
language (DeVice Independent). But in the case of creating a PDF file,
there is a processor called "pdftex",
http://en.wikipedia.org/wiki/PdfTeX, which produces a PDF file as
output . A good site for TeX is https://tug.org/

Texinfo has the plus of also having processor which will convert it to
UNIX "man" (manual) pages and HTML web pages. So one "source" document
can generate three different types of output document file types.

Most people use an enhanced TeX called LaTeX instead of "plain TeX"
when using TeX. LaTeX can be read up on here:
http://www.latex-project.org/ A good TeX document processor is
TeXstudio at http://texstudio.sourceforge.net/ . I use this one myself
(which is not necessary a strong endorsement because I'm nobody
special).

I feel the need to warn you that TeX is very powerful and, at least to
me, quite difficult, with a fairly steep learning curve. Which may be
why the R project uses Texinfo, because it is quite a bit easier to
learn.


Q2: which (free) software was used for them?

See the links above. On Fedora Linux, I get the TeX oriented software
from a bunch of packages which start with "texlive". More information,
including the processors for Linux, Windows, and Mac are at
https://www.tug.org/texlive/

Q3: any other recommendations?

You might consider LyX.
http://www.lyx.org/
LyX is a document processor. It would likely be easier to use than the
above if you are used to MS Word or other word processing system. It
is cross platform: Linux, Windows, and Mac. It stores files in its own
textual format, which is somewhat human readable. LyX, like Texinfo,
translates its format into TeX as an intermediate on its way to its
ultimate destination. I am still learning LyX, but I personally like
it.

Your mention of LibreOffice is also a fairly good one. I, personally,
use LibreOffice. But I don't use it for big documents. I have a
learned aversion for word processors because it is so easy for them to
be misused. In my opinion, a good document needs good metadata in it
as well as just "looking pretty". Word processor users tend to focus
on the format and not the content. That's just my opinion, based on
what I've seen where I work.


Seaching the internet gives me e.g.
[1]
https://sites.google.com/site/richardbyrnepdsite/ebooks-and-audiobooks/create-your-own-ebooks
[2] http://opensource.com/life/13/8/how-create-ebook-open-source-way
[3] http://scottnesbitt.net/ubuntublog/creating-a-ebook-with-libreoffice-writer/

but I am not sure if there are better possibilities...

Thanks for any hint or link by expert R users.

Oh, well, that excludes me. I'm not an expert. But maybe it was helpful anyway.


Wolfgang Lindner
Leichlingen, Germany

--
If you sent twitter messages while exploring, are you on a textpedition?

He's about as useful as a wax frying pan.

10 to the 12th power microphones = 1 Megaphone

Maranatha! <><
John McKown





Re: [R] rpart and randomforest results

2014-04-08 Thread John Maindonald
The function rpart may well overfit if the value of the CP statistic
is left at its default.  Use the functions printcp() and plotcp() to
check how the cross-validation estimate of relative 'error' (xerror)
changes with the number of splits (NB that the CP that leads to a
further split changes monotonically with the number of splits).
The 'rel error' column from printcp() can be hopelessly optimistic.
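
A sketch of that check, using a dataset that ships with rpart (not the
poster's data; the cp value in the last line is illustrative only):

library(rpart)
fit <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
printcp(fit)   # compare 'rel error' (resubstitution) with 'xerror' (CV)
plotcp(fit)    # choose a cp value near the minimum of xerror
pruned <- prune(fit, cp=0.02)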


John Maindonald email: john.maindon...@anu.edu.au

phone : +61 2 (6125)3473fax  : +61 2(6125)5549

Centre for Mathematics & Its Applications, Room 1194,

John Dedman Mathematical Sciences Building (Building 27)

Australian National University, Canberra ACT 0200.


On 8/04/2014, at 8:00 pm, r-help-requ...@r-project.org wrote:

From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Schillo, Sonja
Sent: Thursday, April 03, 2014 3:58 PM
To: Mitchell Maltenfort
Cc: r-help@r-project.org
Subject: Re: [R] rpart and randomforest results

Hi,

the random forest should do that, you're totally right.  As far as I know it
does so by randomly selecting the variables considered for each split (but
here we set the option for how many variables to consider at each split to
the number of variables available, so that the random forest does not have
the chance to make a random selection).  The next thing that randomForest
does is bootstrapping, but here again we set the option to the number of
cases in the data set, so that no bootstrapping should be done.
We tried to take all the "randomness" out of the random forest.

Is that plausible and does anyone have another idea?

Thanks
Sonja




Re: [R] Drawing a line in xyplot

2012-04-09 Thread John Maindonald
PS: The following shows possibilities that are available using latticeExtra 
layering:

## Best make type and attend factors
xdat = data.frame(mortality =c(5, 
8,7,5,8,10,11,6,4,5,20,25,27,30,35,32,28,21,20,34,11,15,18,12,15,12,10,15,19,20),
 type=
factor(c(1, 1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3)),
attend = factor(c(1, 
0,1,1,1,1,0,0,0,1,0,1,0,0,0,0,1,0,0,0,1,1,1,0,1,0,1,0,0,0)))


gph <- xyplot (mortality ~ attend|type, pch=16,
 data = xdat, aspect=2:1,layout=c(3,1))

one4all <- layer(avy <- median(xdat$mortality),
  panel.segments(0.1, avy, 0.3, avy, col="red",lwd=4),
  panel.segments(0.7, avy, 1, avy, col="red",lwd=4))

medbytype <- layer(avy <- median(y),
  panel.segments(0.1, avy, 0.3, avy, col="red",lwd=4),
  panel.segments(0.7, avy, 1, avy, col="red",lwd=4))

interact <- layer(panel.average(x, y, fun=median, col='red', lwd=4))

Compare

gph + one4all
(shows overall median lines in all 3 panels)

gph + medbytype
(shows separate median lines for the separate panels)

gph+interact
(gives a form of interaction plot)

NB x (if its values are used) and y are local to the individual panel

NB also that layer() accepts a data argument.  The following is an
alternative way to calculate one4all:

one4all <- layer(data=xdat, avy <- median(mortality),
  panel.segments(0.1, avy, 0.3, avy, col="red",lwd=4),
  panel.segments(0.7, avy, 1, avy, col="red", lwd=4))

John Maindonald.



Re: [R] Drawing a line in xyplot

2012-04-08 Thread John Maindonald
This can be simplified by using the layering abilities that Felix Andrews
made available in latticeExtra.  These are too little known.  They pretty
much make it unnecessary to resort to trellis.focus(), at least in such
cases as this:

library(latticeExtra)
x11(height=8,width=11)
xdat = data.frame(mortality =c(5, 
8,7,5,8,10,11,6,4,5,20,25,27,30,35,32,28,21,20,34,11,15,18,12,15,12,10,15,19,20),
 type=
c(1, 1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3), attend = c(1, 
0,1,1,1,1,0,0,0,1,0,1,0,0,0,0,1,0,0,0,1,1,1,0,1,0,1,0,0,0))
gph <- xyplot ( mortality ~ attend|type,
panel=function(x,y)
{
panel.grid(lty=5)
panel.xyplot(x,y,pch=16,jitter.x=TRUE,col=1)
for(i in 1:3)
{
panel.segments(x0=c(0.7, 1.7),
x1=c(1.3, 2.3),
y0=with(xdat, c(tapply(mortality[type==i], attend[type==i], median))),
y1=with(xdat, c(tapply(mortality[type==i], attend[type==i],
median))),col="red",lwd=4)
}   
},
data = xdat, aspect=2:1,layout=c(3,1))

## Now create a layer object that will add the further segments.
addlayer <- layer(panel.segments(x0=1,y0=20,x1=1.2, y1=20),
panel.segments(x0=2,y0=30,x1=2.2, y1=30))

gph+addlayer

The code that produces the object gph would also be simpler
and easier to follow if some relevant part was separated out into 
a separate layer.

See also my notices on layering of lattice objects at:
http://www.maths.anu.edu.au/%7Ejohnm/r-book/add-graphics.html

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm

On 08/04/2012, at 8:00 PM, r-help-requ...@r-project.org wrote:

> From: David Winsemius 
> Subject: Re: [R] Drawing a line in xyplot
> Date: 8 April 2012 2:17:22 PM AEST
> To: wcheckle 
> Cc: r-help@r-project.org
> 
> 
> 
> On Apr 7, 2012, at 10:29 PM, wcheckle wrote:
> 
>> Thank you David, the bwplot option does what I need:
>> 
>> x11(height=8,width=11)
>> bwplot ( mortality~ attend|type,
>> pch=95,cex=5,col=2,
>> par.settings=list(
>> box.rectangle = list(col = "transparent"),
>> box.umbrella = list(col = "transparent"),
>> plot.symbol = list(col = "transparent")
>> ),
>> panel=function(x,y,...){
>> panel.grid(lty=5)
>> panel.xyplot(x,y,pch=16,jitter.x=TRUE,col=1)
>> panel.bwplot(x,y,...)
>> },
>> data = x,aspect=2:1,layout=c(3,1))
>> 
>> 
>> However, I am interested in also learning how to do it in xyplot as well.  I
>> wasn't able to follow the last two sets of instructions (condition on
>> packet.number and loop over segments), wondering if I can ask for your help
>> for the correct code (my attempt resulted in all three mean lines within
>> each panel):
>> 
>> x11(height=8,width=11)
>> xyplot ( mortality ~ attend|type,
>> panel=function(x,y)
>> {
>> panel.grid(lty=5)
>> panel.xyplot(x,y,pch=16,jitter.x=TRUE,col=1)
>> for(i in 1:3)
>> {
>> panel.segments(x0=c(0.7, 1.7),
>> x1=c(1.3, 2.3),
>> y0=c(tapply(x$mortality[x$type==i], x$attend[x$type==i], median)),
>> y1=c(tapply(x$mortality[x$type==i], x$attend[x$type==i],
>> median)),col="red",lwd=4)
>> }
>> },
>> data = x,aspect=2:1,layout=c(3,1))
>> 
>> thank you again. I also found your info on data.frame useful.
> 
> I haven't figured out how to do it with the interaction of 'attend' and 
> 'type' from inside xyplot. I'm thinking I might be able to succeed using 
> trellis.focus() to address separate "columns" within a particular panel.
> 
> This will draw  segments at (1,20) and (2,30) without resorting to low level 
> grid/viewport stuff.
> 
> trellis.unfocus(); trellis.focus("panel", 1, 1)
> do.call("panel.segments", list(x0=1,y0=20,x1=1.2, y1=20))
> trellis.unfocus()
> trellis.unfocus(); trellis.focus("panel", 1, 1)
> do.call("panel.segments", list(x0=2,y0=30,x1=2.2, y1=30))
> trellis.unfocus()
> 
> -- 
> David
> 
> 
> David Winsemius, MD
> West Hartford, CT




[R] RC33 8th Int Conf on Social Science Methodology -- The R System ...

2011-11-02 Thread John Maindonald
I wish to draw attention to an R-related session that is planned for the 
RC33 Eighth International Conference on Social Science Methodology, 
to be held from July 9 to July 13, 2012, at the University of Sydney.

"
The focus of the conference is on innovations and current best practice in all 
aspects of social science research methodology. It provides an opportunity to 
reflect on contemporary methods, as applied in a range of settings and 
disciplinary contexts, to hear about emerging methods, tools, techniques and 
technologies, and to discover what resources are available to social science 
researchers and users of research.
"

The title for the planned session is:
"The R System as a Platform for Analysis and Development of Analysis 
Methodology"

http://conference.acspri.org.au/index.php/rc33/2012/schedConf/trackPolicies

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm



Re: [R] cv.lm syntax error

2011-03-09 Thread John Maindonald
It means that dat$v1, dat$v2, . . . are not columns in the data frame df.

Specify the formula as v1 ~ v2 + v3 + v4 + v5 + factor, and name the
columns of dfmod.1 so that they match the names used in the formula.
Then (assuming that factor is a column object of the right length)
you should be fine.
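
A sketch of the corrected call (argument names follow recent DAAG
versions, which use data= rather than df=; adjust for your version):

library(DAAG)
dfmod.1 <- data.frame(v1=dat$v1, v2=dat$v2, v3=dat$v3,
                      v4=dat$v4, v5=dat$v5, factor=factor)
CVlm(data=dfmod.1, form.lm = v1 ~ v2 + v3 + v4 + v5 + factor,
     m=3, seed=29, plotit=TRUE, printit=TRUE)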

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm

> From: agent dunham 
> Date: 9 March 2011 3:47:10 AM AEDT
> To: r-help@r-project.org
> Subject: Re: [R] cv.lm syntax error
> 
> 
> Thanks for your answer, but I type the following: 
> 
>> dfmod.1 <- data.frame(dat$v1,dat$v2,dat$v3,dat$v4, dat$v5,factor)
> 
>> CVlm(df= dfmod.1, form.lm = formula(dat$v1 ~ dat$v2+dat$v3+ dat$v4+
>> dat$v5+ factor), m=3, seed=29, plotit=TRUE, printit =TRUE) 
> 
> Error en `[.data.frame`(df, , ynam) : undefined columns selected
> 
> What does it mean?
> 
> u...@host.com
> 




Re: [R] R-help Digest, Vol 97, Issue 11

2011-03-09 Thread John Maindonald
You might look at the function confusion() in the DAAGxtras package.
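
A base-R alternative: table() drops only levels it does not know about,
so coerce both vectors to factors with a common level set (here
`predicted` and `observed` stand for the two input vectors):

lev <- sort(unique(c(predicted, observed)))
table(factor(predicted, levels=lev),
      factor(observed, levels=lev))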

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm

On 09/03/2011, at 10:00 PM, r-help-requ...@r-project.org wrote:

> From: "Shi, Tao" 
> Date: 9 March 2011 7:22:49 AM AEDT
> To: r-help@r-project.org
> Subject: [R] confusion matrix
> 
> 
> Hi list,
> 
> Is there already a function somewhere to output the confusion matrix from two 
input vectors?  "table" always automatically deletes rows or columns with all 
> 0's.  For example, I would like the columns for "10" and "30" added back.
> 
> Thanks!
> 
> ...Tao
> 
> 
> 
>  20 40 50 60 70 80 90 100
>  10   0  0  1  0  0  0  1   0
>  20   1  2  0  4  0  0  0   1
>  30   0  0  0  0  0  0  1   1
>  40   0  0  1  1  2  0  0   0
>  50   0  1  0  0  0  1  0   0
>  60   0  0  0  0  0  0  1   1
>  70   0  0  1  1  1  1  1   1
>  80   0  1  0  0  0  0  0   1
>  90   0  0  0  0  0  0  0   3
>  100  0  0  0  0  1  0  0   3
> 




Re: [R] \Sweaveopts error

2010-11-27 Thread John Maindonald
Actually, until I created this file I had everything on one line,
or used separate \SweaveOpts commands.

Putting everything on one line ensures that the graphics file
goes to the subdirectory snArt, but the file test1.tex is unchanged.

Note however the difference between the files test1.tex and
test.tex, irrespective of the one-line-or-two issue.  The first of the
\SweaveOpts lines had an effect, unless I am missing something.
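
For reference, here are the two statements from 4lmetc.Rnw combined
into the single line that Sweave() does recognize:

\SweaveOpts{engine=R, keep.source=TRUE, eps=FALSE, prefix.string=snArt/4lmetc}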

John.

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm

On 28/11/2010, at 12:40 PM, Duncan Murdoch wrote:

> On 27/11/2010 7:57 PM, John Maindonald wrote:
>> Actually, I spoke too soon.  The files process without obvious error,
>> but keep.source=TRUE is ignored.  I have posted small files
>> test1.Rnw and test2.Rnw that can be used to demonstrate the
>> problems at:
>>   http://wwwmaths.anu.edu.au/~johnm/r/issues/
> 
> In the test1.Rnw file, your \SweaveOpts statement spans two lines. Sweave() 
> doesn't recognize it, so it's completely ignored.
> 
> I don't think this is new behaviour; it's a consequence of the way Sweave 
> looks for \SweaveOpts via regular expression.
> 
> Not quite sure what's going on in the second one yet...
> 
> Duncan Murdoch
> 
>> 
>> Sweave("test1")  ## includes SweaveOpts settings
>> 
>> Sweave("test2", keep.source=TRUE)
>>   ## SweaveOpts settings have been removed.
>> 
>> Comments do not appear in the LaTeX file that results, the code
>> is reformatted and the graph from test1 goes into the working
>> directory.
>> 
>> Notice also the NA that mysteriously appears in the second
>> line of code in output .tex file test2.tex
>> 
>> John.
>> 
>> John Maindonald email: john.maindon...@anu.edu.au
>> phone : +61 2 (6125)3473fax  : +61 2(6125)5549
>> Centre for Mathematics&  Its Applications, Room 1194,
>> John Dedman Mathematical Sciences Building (Building 27)
>> Australian National University, Canberra ACT 0200.
>> http://www.maths.anu.edu.au/~johnm
>> 
>> On 26/11/2010, at 4:26 PM, John Maindonald wrote:
>> 
>>> Yes, that has fixed the problem. (2010-11-24 r53659)
>>> 
>>> Thanks.
>>> 
>>> John Maindonald email: john.maindon...@anu.edu.au
>>> phone : +61 2 (6125)3473fax  : +61 2(6125)5549
>>> Centre for Mathematics&  Its Applications, Room 1194,
>>> John Dedman Mathematical Sciences Building (Building 27)
>>> Australian National University, Canberra ACT 0200.
>>> http://www.maths.anu.edu.au/~johnm
>>> 
>>> On 25/11/2010, at 10:56 PM, Duncan Murdoch wrote:
>>> 
>>>> On 25/11/2010 6:34 AM, John Maindonald wrote:
>>>>> I have a file 4lmetc.Rnw, intended for inclusion in a LaTeX document,
>>>>> that starts:
>>>> 
>>>> I think this may have been fixed in the patched version.  Could you give 
>>>> it a try to confirm?  If not, please send me a simplified version of the 
>>>> file, and I'll see what's going wrong.
>>>> 
>>>> Duncan Murdoch
>>>> 
>>>> 
>>>>> 
>>>>> \SweaveOpts{engine=R, keep.source=TRUE}
>>>>> \SweaveOpts{eps=FALSE, prefix.string=snArt/4lmetc}
>>>>> 
>>>>> The attempt to process the file through Sweave generates the error:
>>>>> 
>>>>>> Sweave("4lmetc")
>>>>> Writing to file 4lmetc.tex
>>>>> Processing code chunks ...
>>>>> 1 : keep.source term verbatim
>>>>> Error in file(srcfile$filename, open = "rt", encoding = encoding) :
>>>>> cannot open the connection
>>>>> In addition: Warning message:
>>>>> In file(srcfile$filename, open = "rt", encoding = encoding) :
>>>>> cannot open file '4lmetc': No such file or directory
>>>>> 
>>>>> The same file processes through Stangle() without problems.
>>>>> If I comment out the \Sweaveopts lines, there is no problem,
>>>>> except that I do not get the options that I want.
>>>>> 
>>>>> This processed fine in R-2.11.1
>>>>> 
>>>>>> sessionInfo()
>>>>> R version 2.12.0 (201

Re: [R] \Sweaveopts error

2010-11-27 Thread John Maindonald
Actually, I spoke too soon.  The files process without obvious error, 
but keep.source=TRUE is ignored.  I have posted small files 
test1.Rnw and test2.Rnw that can be used to demonstrate the 
problems at:
  http://wwwmaths.anu.edu.au/~johnm/r/issues/

Sweave("test1")  ## includes SweaveOpts settings

Sweave("test2", keep.source=TRUE)   
  ## SweaveOpts settings have been removed.

Comments do not appear in the LaTeX file that results, the code 
is reformatted and the graph from test1 goes into the working 
directory.

Notice also the NA that mysteriously appears in the second
line of code in output .tex file test2.tex

John.

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm

On 26/11/2010, at 4:26 PM, John Maindonald wrote:

> Yes, that has fixed the problem. (2010-11-24 r53659)
> 
> Thanks.
> 
> John Maindonald email: john.maindon...@anu.edu.au
> phone : +61 2 (6125)3473fax  : +61 2(6125)5549
> Centre for Mathematics & Its Applications, Room 1194,
> John Dedman Mathematical Sciences Building (Building 27)
> Australian National University, Canberra ACT 0200.
> http://www.maths.anu.edu.au/~johnm
> 
> On 25/11/2010, at 10:56 PM, Duncan Murdoch wrote:
> 
>> On 25/11/2010 6:34 AM, John Maindonald wrote:
>>> I have a file 4lmetc.Rnw, intended for inclusion in a LaTeX document,
>>> that starts:
>> 
>> I think this may have been fixed in the patched version.  Could you give it 
>> a try to confirm?  If not, please send me a simplified version of the file, 
>> and I'll see what's going wrong.
>> 
>> Duncan Murdoch
>> 
>> 
>>> 
>>> \SweaveOpts{engine=R, keep.source=TRUE}
>>> \SweaveOpts{eps=FALSE, prefix.string=snArt/4lmetc}
>>> 
>>> The attempt to process the file through Sweave generates the error:
>>> 
>>>> Sweave("4lmetc")
>>> Writing to file 4lmetc.tex
>>> Processing code chunks ...
>>> 1 : keep.source term verbatim
>>> Error in file(srcfile$filename, open = "rt", encoding = encoding) :
>>> cannot open the connection
>>> In addition: Warning message:
>>> In file(srcfile$filename, open = "rt", encoding = encoding) :
>>> cannot open file '4lmetc': No such file or directory
>>> 
>>> The same file processes through Stangle() without problems.
>>> If I comment out the \Sweaveopts lines, there is no problem,
>>> except that I do not get the options that I want.
>>> 
>>> This processed fine in R-2.11.1
>>> 
>>>> sessionInfo()
>>> R version 2.12.0 (2010-10-15)
>>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>>> 
>>> locale:
>>> [1] C
>>> 
>>> attached base packages:
>>> [1] stats graphics  grDevices utils datasets  grid  methods
>>> [8] base
>>> 
>>> other attached packages:
>>> [1] lattice_0.19-13 DAAG_1.02   randomForest_4.5-36
>>> [4] rpart_3.1-46MASS_7.3-8  reshape_0.8.3
>>> [7] plyr_1.2.1  proto_0.3-8
>>> 
>>> loaded via a namespace (and not attached):
>>> [1] ggplot2_0.8.8   latticeExtra_0.6-14
>>> 
>>> Is there a workaround?
>>> 
>>> John Maindonald email: john.maindon...@anu.edu.au
>>> phone : +61 2 (6125)3473fax  : +61 2(6125)5549
>>> Centre for Mathematics&  Its Applications, Room 1194,
>>> John Dedman Mathematical Sciences Building (Building 27)
>>> Australian National University, Canberra ACT 0200.
>>> http://www.maths.anu.edu.au/~johnm
>>> 
>> 
> 



Re: [R] \Sweaveopts error

2010-11-25 Thread John Maindonald
Yes, that has fixed the problem. (2010-11-24 r53659)

Thanks.

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm

On 25/11/2010, at 10:56 PM, Duncan Murdoch wrote:

> On 25/11/2010 6:34 AM, John Maindonald wrote:
>> I have a file 4lmetc.Rnw, intended for inclusion in a LaTeX document,
>> that starts:
> 
> I think this may have been fixed in the patched version.  Could you give it a 
> try to confirm?  If not, please send me a simplified version of the file, and 
> I'll see what's going wrong.
> 
> Duncan Murdoch
> 
> 
>> 
>> \SweaveOpts{engine=R, keep.source=TRUE}
>> \SweaveOpts{eps=FALSE, prefix.string=snArt/4lmetc}
>> 
>> The attempt to process the file through Sweave generates the error:
>> 
>>> Sweave("4lmetc")
>> Writing to file 4lmetc.tex
>> Processing code chunks ...
>> 1 : keep.source term verbatim
>> Error in file(srcfile$filename, open = "rt", encoding = encoding) :
>>  cannot open the connection
>> In addition: Warning message:
>> In file(srcfile$filename, open = "rt", encoding = encoding) :
>>  cannot open file '4lmetc': No such file or directory
>> 
>> The same file processes through Stangle() without problems.
>> If I comment out the \Sweaveopts lines, there is no problem,
>> except that I do not get the options that I want.
>> 
>> This processed fine in R-2.11.1
>> 
>>> sessionInfo()
>> R version 2.12.0 (2010-10-15)
>> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>> 
>> locale:
>> [1] C
>> 
>> attached base packages:
>> [1] stats graphics  grDevices utils datasets  grid  methods
>> [8] base
>> 
>> other attached packages:
>> [1] lattice_0.19-13 DAAG_1.02   randomForest_4.5-36
>> [4] rpart_3.1-46MASS_7.3-8  reshape_0.8.3
>> [7] plyr_1.2.1  proto_0.3-8
>> 
>> loaded via a namespace (and not attached):
>> [1] ggplot2_0.8.8   latticeExtra_0.6-14
>> 
>> Is there a workaround?
>> 
>> John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
>> Centre for Mathematics&  Its Applications, Room 1194,
>> John Dedman Mathematical Sciences Building (Building 27)
>> Australian National University, Canberra ACT 0200.
>> http://www.maths.anu.edu.au/~johnm
>> 
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] \Sweaveopts error

2010-11-25 Thread John Maindonald
I have a file 4lmetc.Rnw, intended for inclusion in a LaTeX document,
that starts:

\SweaveOpts{engine=R, keep.source=TRUE}
\SweaveOpts{eps=FALSE, prefix.string=snArt/4lmetc}

The attempt to process the file through Sweave generates the error:

> Sweave("4lmetc")
Writing to file 4lmetc.tex
Processing code chunks ...
 1 : keep.source term verbatim
Error in file(srcfile$filename, open = "rt", encoding = encoding) : 
  cannot open the connection
In addition: Warning message:
In file(srcfile$filename, open = "rt", encoding = encoding) :
  cannot open file '4lmetc': No such file or directory

The same file processes through Stangle() without problems.
If I comment out the \SweaveOpts lines, there is no problem,
except that I do not get the options that I want.

This processed fine in R-2.11.1

> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] C

attached base packages:
[1] stats graphics  grDevices utils datasets  grid  methods  
[8] base 

other attached packages:
[1] lattice_0.19-13 DAAG_1.02   randomForest_4.5-36
[4] rpart_3.1-46MASS_7.3-8  reshape_0.8.3  
[7] plyr_1.2.1  proto_0.3-8

loaded via a namespace (and not attached):
[1] ggplot2_0.8.8   latticeExtra_0.6-14

Is there a workaround?
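
One workaround may be to pass the same options to Sweave() directly,
bypassing the \SweaveOpts lines in the file.  A sketch, untested
against this setup (keep.source, eps and prefix.string are all
standard RweaveLatex driver options):

  Sweave("4lmetc.Rnw", keep.source=TRUE, eps=FALSE,
         prefix.string="snArt/4lmetc")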

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] L-shaped boxes with lattice graphs?

2010-11-14 Thread John Maindonald
Can anyone suggest an equivalent, for lattice graphs,
of the base graphics argument bty="l"? 

NB that I am leaving off the box around the strip,
with a strip function:
stripfun <- function(which.given, which.panel,
                     factor.levels=as.expression(levlist), ...){
    panel.text(x=0, y=0.5,
               lab=as.expression(levlist[which.panel[which.given]]), adj=0)
}

e.g.
levlist <- list("A", "B")
xyplot(11:14 ~ 1:4 | rep(1:2,2), scales=list(x=list(alternating=c(1,1), 
relation="sliced")), strip=stripfun, layout=c(1,2)) 


John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Why software fails in scientific research

2010-02-28 Thread John Maindonald
I came across this notice of an upcoming webinar.  The issues
identified in the first paragraph below seem to me exactly those that
the R project is designed to address.  The claim that "most research
software is barely fit for purpose compared to equivalent systems in
the commercial world" seems to me not quite accurate!  Comments?


WEBINAR SERIES 
A Crack in the Code: Why software fails in scientific research, and how to fix 
it. 
Thursday, March 25, 2010, 3:00 PM GMT 
http://physicsworld.com/cws/go/webinar9

"
In the 60 years since the invention of the digital computer, millions of lines 
of code have been developed to support scientific research. Although an 
increasingly important part of almost all research projects, most research 
software is barely fit for purpose compared to equivalent systems in the 
commercial world. The code is hard to understand or maintain, lacking 
documentation and version control, and is continually ‘re-invented’ as the code 
writers move on to new jobs. This represents a tremendous waste of the already 
inadequate resources that are put into its development. We will investigate how 
this situation has come about, why it is important to the future of research, 
and what can be done about it. 

Robert McGreevy will draw on his extensive experience at the STFC ISIS 
Facility, and explain how these issues are being addressed for the benefit of 
research science globally. Nicholas Draper, consultant at Tessella, will then 
expand on this, using the example of the Mantid project at ISIS. 
"

Tessella (www.tessella.com) is a technology and consultancy firm, based in 
Oxford.

ISIS in this context is the STFC's pulsed neutron and muon source at
the Rutherford Appleton Laboratory, not (as a quick web search might
suggest) the International Species Information System (www.isis.org),
which concerns itself with data on animals in zoos and aquariums.

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] parsing pdf files

2010-01-10 Thread John Maindonald
Oblivious to the problems that Barry notes, I have used pdftotext,
from Xpdf at http://www.foolabs.com/xpdf/download.html
without apparent problem; this under MacOS X.  For my purposes,
I need to retain the form feeds (CTRL/L, "\014") that indicate page
throws.  Other converters that I have investigated seem not to retain such
information.  I use it with command line options thus:

pdftotext -layout -eol unix rnotes.pdf rnotes.txt
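
(From within R, the same conversion can be run with, e.g.,
system("pdftotext -layout -eol unix rnotes.pdf rnotes.txt").)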

Given a pdf for a manuscript,  I have an R function that can then
create an index of functions that appear with their opening and 
closing parentheses.  (It may pick up a few strays, which can be
weeded out.)

The only issue I've found has been for use with the listings package.
In the LaTeX source, I set the option 'columns' to 'fullflexible', in order 
to generate a pdf file that does not have the unwanted hidden spaces,
which are then carried across to the text file.  

{\lstset{language=R, xleftmargin=2pt,
 basicstyle=\ttfamily,
 columns=fullflexible,   % Omit for final manuscript
 showstringspaces=false}}
{}

The setting 'columns=fullflexible'  messes up the formatting
in places where I use tabbing, so that I need to change
'columns' back to the default for the final pdf.

Here is the function that does most of the work. Like the function
that follows, it could no doubt be greatly tidied up:

locatefun <-
function(txlines){
    ## match an R function name followed by an opening parenthesis
    idtxt <- "\\.?[a-zA-Z][a-zA-Z0-9]*(\\.[a-zA-Z]+)*\\("
    ## form feeds ("\014") mark page throws; cumulate to get page numbers
    z <- regexpr("\014", txlines)
    z[z>0] <- 1
    z[z<=0] <- 0
    page <- cumsum(z)+1
    findfun <- function(tx){
        ## start and end position of the first match on each line
        mn <- t(sapply(tx, function(x)
                   {m <- regexpr(idtxt, x);
                    c(m, attr(m, "match.length"))}))
        mn[,2] <- mn[,1]+mn[,2]
        rownames(mn) <- paste(1:dim(mn)[1])
        mn
    }
    for(i in 1:100){   # repeat until no line holds a further match
        mn <- findfun(txlines)
        if(all(mn[,1]==-1))break
        here <- mn[,1]>0
        page <- page[here]
        txlines <- txlines[here]
        mn <- mn[here, , drop=FALSE]
        m1 <- regexpr("\\(", txlines)-1
        tx1 <- substring(txlines, mn[,1], m1)
        if(i==1)xy <- data.frame(nam=I(tx1), page=page) else
            xy <- rbind(xy, data.frame(nam=I(tx1), page=page))
        ## move past the match; continue with the rest of each line
        txlines <- substring(txlines, mn[,2])
        here2 <- nchar(txlines)>0
        txlines <- txlines[here2]
        page <- page[here2]
        if(length(txlines)==0)break
    }
    ## weed out common false positives
    zz <- !xy[,1]%in% c("","al","Pr","T", "F", "n","P", "y", "A",
                        "transformation", "left","f","site.","a","b", "II",
                        "ARCH", "ARMA", "MA")
    xy <- xy[zz,]
    nam <- xy$nam
    ## drop a leading "=", " " or "," picked up with the name
    ch <- substring(nam,1,1)
    nam[ch%in%c("="," ",",")] <- substring(nam[ch%in%c("="," ",",")],2)
    xy$nam <- nam
    ord <- order(xy[,2])
    xy[ord,]
}

Here is the function that calls locatefun:

 makeFunIndex <-
function(sourceFile="rnotes.txt",
 frompath="~/_notes/rnotes/", fileout=NULL,
 availfun=funpack,
 offset=0){
## pdftotext -layout -eol unix rnotes.pdf rnotes.txt
len <- nchar(sourceFile)
lfrom <- nchar(frompath)
if(substring(frompath, lfrom, lfrom)=="/")frompath <-
substring(frompath, 1, lfrom-1)
if(is.null(fileout)){
if (substring(sourceFile,len - 3, len-3) == ".") 
fnam <- substring(sourceFile, 1, len - 4) else fnam <- 
sourceFile
fileout <- paste(fnam, ".fdx", sep = "")
fdxfile <- paste(fileout, sep="/")
fndfile <- paste(fnam, ".fnd", sep = "")
}
sourceFile <- paste(frompath, sourceFile, sep="/")
print(paste("Send output to", fndfile))
tx <- readLines(sourceFile, warn=FALSE)
entrymat <- locatefun(tx)
backn <- regexpr("\\n",entrymat[,1],fixed=TRUE)
entrymat <- entrymat[backn < 0,]
entrymat[,2] <- entrymat[,2] - offset
entrymat[,1] <- gsub("_","\\_",entrymat[,1], fixed=TRUE)
nmatch <- match(entrymat[,1], availfun[,2], nomatch=0)
use <- nmatch > 0
print("Unmatched functions:")
print(unique(entrymat[!use,1]))
entrymat[use,1] <- paste(entrymat[use

Re: [R] R on Linux, and R on Windows , any difference in maturity+stability?

2009-10-06 Thread John Maindonald
I had a large job some time ago that ran fine under MacOS X.
I'd expect the same to be true under Linux.  It would run
under Windows XP only if XP had been freshly rebooted.

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.


On 06/10/2009, at 9:00 PM, r-help-requ...@r-project.org wrote:

> From: Liviu Andronic 
> Date: 6 October 2009 6:46:33 PM AEDT
> To: Robert Wilkins 
> Cc: r-help@r-project.org
> Subject: Re: [R] R on Linux, and R on Windows , any difference in  
> maturity+stability?
>
>
> On 10/6/09, Robert Wilkins  wrote:
>> Will R have more glitches on one operating system as opposed to
>> another,
>>
> Probably not.
>
>> or is it pretty much the same?
>>
> Depending on the complexity of the code, it is pretty much the same. I
> recently had a (relatively simple) group project, with the three of us
> on different OSs: Win, Mac and Linux. We did not encounter one
> platform specific issue.
> Liviu


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Use of R in Schools

2009-09-18 Thread John Maindonald

I am looking for information on experimentation with the use
of R in the teaching of statistics and science in schools.  Any
leads would be very welcome.  I am certain that there is such
experimentation.

I've made this inquiry on r-sig-teaching, with no response.
John.

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R User Group in Canberra

2009-07-31 Thread John Maindonald
Please add

AUSTRALIA

Canberra: Canberra R Users Group (CANRUG) http://canrug.togaware.com/

Cheers
john.

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.


On 31/07/2009, at 8:00 PM, r-help-requ...@r-project.org wrote:

> listings


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Strip labels: use xyplot() to plot columns in parallel with outer=TRUE

2009-05-09 Thread John Maindonald

The following tinkers with the strip labels, where the
different panels are for different levelf of a conditioning
factor.

tau <- (0:5)/2.5; m <- length(tau); n <- 200; SD <- 2
x0 <- rnorm(n, mean=12.5, sd=SD)
matdf <- data.frame(
    x = as.vector(sapply(tau, function(s) x0 + rnorm(n, sd=SD*s))),
    y = rep(15 + 2.5*x0, m),
    taugp = factor(rep(tau, rep(n, m))))
names(matdf) <- c("x", "y", "taugp")
lab <- c(list("0 (No error in x)"),
         lapply(tau[-1], function(x) substitute(A*s[z], list(A=x))))
xyplot(y ~ x | taugp, data=matdf,
       strip=strip.custom(strip.names=TRUE,
           var.name="Add error with SD", sep=expression(" = "),
           factor.levels=as.expression(lab)))
Is there any way to get custom labels when the same is done by
plotting variables in parallel?

df <- unstack(matdf, x ~ taugp)
df$y <- 15+2.5*x0
lab2 <- c(list("0 (No error in x)"),
          lapply(tau[-1], function(x)
              substitute("Add error with SD" == A*s[z], list(A=x))))
form <- formula(paste("y ~", paste(paste("X", tau, sep=""),
                                   collapse="+")))
xyplot(form, data=df, outer=TRUE)

I'd hoped that the following would do the trick, but the first label
is repeated in each panel, and the variable names are added:

xyplot(form, data=df, outer=TRUE, strip=strip.custom(strip.names=TRUE,
  var.name=as.expression(lab2)))

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Accuracy of R and other platforms

2009-03-23 Thread John Maindonald
These comparisons are very simplistic.  In most contexts, it would  
make much better sense to measure "accuracy" in standard error units,  
rather than in number of digits.

There doubtless are specialist applications where the 15th digit (or  
even the 10th!) are important.  But the check of accuracy really has  
then to be tuned to that application.  Results in cases where the  
"accuracy" beyond (optimistically) the sixth digit is of no  
consequence are unlikely to give much clue on performance in such  
specialist applications.

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.


On 23/03/2009, at 10:00 PM, r-help-requ...@r-project.org wrote:

> From: Karl Ove Hufthammer 
> Date: 23 March 2009 2:58:23 AM
> To: r-h...@stat.math.ethz.ch
> Subject: Re: [R] Accuracy of R and other platforms
>
>
> Alejandro C. Frery:
>
>> @ARTICLE{AlmironSilvaMM:2009,
>> author = {Almiron, M. and Almeida, E. S. and Miranda, M.},
>> title = {The Reliability of Statistical Functions in Four Software
>> Packages Freely used in Numerical Computation},
>> journal = {Brazilian Journal of Probability and Statistics},
>> year = {in press},
>> volume = {Special Issue on Statistical Image and Signal Processing},
>> url = {http://www.imstat.org/bjps/}}
>>
>> is freely available under the "Future Papers" link. It makes a nice
>> comparison of the numerical properties of R, Ox, Octave and Python.
>
> Thanks for posting this. I’m happy to see that the results for R were
> generally excellent, and almost always better than for the three other
> software packages.
>
> But there were a few cases where R did not turn out to be the winner.
> Rather surprising that Ox was better than R for computing the
> autocorrelation coefficient for two of the datasets, given its  
> terrible
> results for the standard deviation. Anybody have any ideas why?
>
> -- 
> Karl Ove Hufthammer


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] using a noisy variable in regression (not an R question)

2009-03-08 Thread John Maindonald
One can "just include it in the regression", but the potential problems
for interpretation are surely greater than those indicated.  Inclusion  
of
X1 = T1+E1 may cause X2 to appear significant when in fact it is having
no effect at all.  Or the true effect can be reversed in sign.  This  
happens
because X1 and X2 are correlated.  Maybe this is implicit in what Jon
is saying.

See Carroll, Ruppert and Stefanski:
Measurement Error in Nonlinear Models (2004, pp.52-55).  The error in E1
may need to be fairly large relative to SD(T1) for this to be an  
issue.  My notes
at http://www.maths.anu.edu.au/%7Ejohnm/r-book/2edn/xtras/xtras.pdf
have brief comments, and code that can be used to illustrate the point.
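
In outline, the effect can be demonstrated along the following lines
(a hypothetical simulation, not the code from those notes):

  set.seed(17)
  n <- 500
  T1 <- rnorm(n)
  X2 <- 0.7*T1 + rnorm(n, sd=0.7)   # correlated with T1; no effect on y
  y  <- T1 + rnorm(n)               # y depends on T1 only
  X1 <- T1 + rnorm(n, sd=1.5)       # X1 measures T1 with much error
  summary(lm(y ~ X1 + X2))          # X2 will often appear 'significant'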

I support Stephen Kolassa's suggestions re using simulation for
sensitivity analysis, though I think this can also be done analytically.

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.


On 08/03/2009, at 10:00 PM, r-help-requ...@r-project.org wrote:

> From: Jonathan Baron 
> Date: 8 March 2009 5:21:55 AM
> To: Juliet Hannah 
> Cc: r-help@r-project.org
> Subject: Re: [R] using a noisy variable in regression (not an R  
> question)
>
>
> If you form categories, you add even more error, specifically, the
> variation in the distance between each number and the category
> boundary.
>
> What's wrong with just including it in the regression?
>
> Yes, the measure X1 will account for less variance than the underlying
> variable of real interest (T1, each individual's mean, perhaps), but
> X1 could still be useful in two ways.  One, it might be a significant
> predictor of the dependent variable Y despite the error.  Two, it
> might increase the sensitivity of the model to other predictors (X2,
> X3...) by accounting for what would otherwise be error.
>
> What you cannot conclude in this case (when you measure a predictor
> with error) is that the effect of (say) X2 is not accounted for by its
> correlation with T1.  Some people try to conclude this when X2 remains
> a significant predictor of Y when X1 is included in the model.  The
> trouble is that X1 is an error-prone measure of T1, so the full effect
> of T1 is not removed by inclusion of X1.
>
> Jon
>
> On 03/07/09 12:49, Juliet Hannah wrote:
>> Hi, This is not an R question, but I've seen opinions given on non R
>> topics, so I wanted
>> to give it a try. :)
>>
>> How would one treat a variable that was measured once, but is known  
>> to
>> fluctuate a lot?
>> For example, I want to include a hormone in my regression as an
>> explanatory variable. However, this
>> hormone varies in its levels throughout a day. Nevertheless, its  
>> levels differ
>> substantially between individuals so that there is information  
>> there to use.
>>
>> One simple thing to try would be to form categories, but I assume
>> there are better ways to handle this. Has anyone worked with such  
>> data, or could
>> anyone suggest some keywords that may be helpful in searching for  
>> this
>> topic. Thanks
>> for your input.
>>
>> Regards,
>>
>> Juliet
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> -- 
> Jonathan Baron, Professor of Psychology, University of Pennsylvania
> Home page: http://www.sas.upenn.edu/~baron
> Editor: Judgment and Decision Making (http://journal.sjdm.org)


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] The Origins of R

2009-02-03 Thread John Maindonald
In another thread on this list, various wild allegations have been  
made, relating to the New York Times article on R.  I object both to  
the subject line and to the content of several of the messages, and  
will not repeat or quote any of that content.  It smacks to me of  
mischief making.


Discussion has centered around the following quote from the NY Times  
article:
“According to them, the notion of devising something like R sprang up  
during a hallway conversation. They both wanted technology better  
suited for their statistics students, who needed to analyze data and  
produce graphical models of the information. Most comparable software  
had been designed by computer scientists and proved hard to use.”
The comment that "the notion of devising something like R sprang up  
during a hallway conversation" is strictly true.  Certainly, this  
seems like a very plausible account.  I'd have more difficulty  
believing that the notion was communicated to them in separate  
dreams.  Part of the wanted technology was freedom for students to  
take the software home, or copy it down from the web.
There was a further story to be told, about the origins of the  
language that Ross and Robert implemented and adapted.  The NY writer  
pretty much left out that part of the story (S did get a mention, but  
its connection with R did not), but did remedy this omission in a  
follow-up.
Nor did the article do much to acknowledge the workers and work that  
has gone into R's continuing development. Getting the attributions  
"right" is difficult.  Even if "right" according to common conventions  
(and one can argue as to just what the conventions are, especially in  
the matter of computer language development), they are unlikely to be  
totally fair.  Stigler's Law of Eponymy has wide sway!


In the preface to the first and second edition of "Data Analysis and  
Graphics Using R", we have:
"The R system implements a dialect of the S language that was  
developed at AT&T Bell Laboratories by Rick Becker, John Chambers and  
Allan Wilks".
The only 1st edition attribution to Ihaka and Gentleman was in Chapter  
12: "For citing R in a publication, use Ihaka and Gentleman (1996)".   
[NB: Type citation() to see the form of citation that should now be  
used.]
That was as it now strikes me unfair to Ross and Robert, but no-one  
complained.  Perhaps no-one ever read that far through the preface!


There's an excellent brief summary of the history of R, and its  
connections with S, in Section 1.4 of John Chambers' "Software for  
Data Analysis".Appendix A has further details on the development  
of S, a kind of pre-history of R.


John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Sweave style file

2008-12-26 Thread John Maindonald
It would, I consider, be useful to note, on the help pages for Sweave  
and RweaveLatex, that the path to Sweave.sty in the R installation can  
be extracted as paste(R.home(),"share/texmf/", sep="/") or its  
equivalent.
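
For example, assuming (as in contemporary installations) that
Sweave.sty sits directly under that directory:

  sty <- paste(R.home(), "share/texmf/Sweave.sty", sep="/")
  file.copy(sty, "Sweave.sty")   # copy into the working directory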


John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] adding a new dataset to the default R distribution

2008-12-04 Thread John Maindonald
Making data, especially data that have been the subject of published  
papers, widely available, can be a useful spinoff from the R project,  
another gift to the scientific community beyond the provision of  
computing and analytic tools.  Nowadays, in a complete publication of  
a scientific result, there is every reason for the data to be part of  
that publication.  The Gentleman and Lang 2004 paper "Statistical  
Analyses and Reproducible Research" takes this further still, making a  
compelling case for opening the analysis to ready view
(http://www.bepress.com/bioconductor/paper2/).
How else can critics know what analysis was done, and whether the  
data do really support the claimed conclusions?

As I see it, the first recourse should be use of archives that  
individual communities may establish.  Instructions on how to input  
the data into R would be a useful small item of ancillary  
information.  Links to such archives (under Data Archives, maybe)  
might be included on CRAN.  The Open Archaeology project would seem a  
good umbrella for the archiving of archaeology data.

Where there is no available repository, or there are reasons for  
putting the data into an R package, one possibility is to advertise on  
this list: "Orphan dataset, looking for a good home".  In this case I  
have offered to include the data in the DAAGxtras package, and I am  
open to further such requests.  Perhaps, however, there should be a
"miscdata" or suchlike package to which such datasets can be
submitted? All it would require is for someone to offer to act as
"Keeper of the Miscellaneous Data".

John Maindonald email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.


On 04/12/2008, at 10:00 PM, [EMAIL PROTECTED] wrote:

> From: Stefano Costa <[EMAIL PROTECTED]>
> Date: 3 December 2008 9:29:13 PM
> To: r-help@r-project.org
> Subject: [R] adding a new dataset to the default R distribution
>
>
> Hi,
> I am a student in archaeology with some interest in statistics and R.
>
> Recently I've obtained the permission to distribute in the public  
> domain
> a small dataset (named "spearheads" with 40 obs. of  14 variables)  
> that
> was used in an introductory statistics book for archaeologists
> (published in 1994).
>
> I've rewritten most of the exercises of that book in R and made them
> available at http://wiki.iosa.it/diggingnumbers:start along with the
> original dataset, but I was wondering if there is a standard procedure
> for getting a new dataset included in the default R distribution, like
> "cars" and others.
>
> Please add me to CC if replying because I'm not subscribed to the  
> list.
>
> Best regards,
> Stefano
>
> -- 
> Stefano Costa
> http://www.iosa.it/ Open Archaeology


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Congratulations, Ross!

2008-11-11 Thread John Maindonald
"Associate Professor Ihaka, of the Department of Statistics, was  
recognised with the Royal Society’s Pickering medal for his part in  
developing the statistical computing software R. R, created over 15  
years ago by Dr Ihaka and colleague Robert Gentleman (now at the Fred  
Hutchinson Cancer Research Center, Seattle), is free software used by  
academics, industry and government worldwide to analyse numerical data  
and present it in graphical forms. Computer programmers globally have  
contributed to R, with over 1000 industry-specific add-ons created."


from
http://www.auckland.ac.nz/uoa/about/news/articles/2008/11/rsnz_honours.cfm
12 November 2008.

John Maindonald email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R seven years ago

2008-10-09 Thread John Maindonald
So what is the answer to the question: "Can success continue"?

I suspect that R is now so firmly entrenched that it will
inevitably continue, in one or other incarnation, for a long
time to come.  The negative factors that John Fox lists
will surely, in time, make some changes inevitable.  Will
these come from force of circumstance rather than from
conscious planning?

In an August 12 message I posted details of R citation rates that
I had gleaned, following a lead from Simon Blomberg, from Web
of Science.  This, or some such measure, seems to me important
as giving a handle on the penetration of R into statistical
application areas.

The numbers I obtained [I&G = Ihaka & Gentleman 1996; RSTAT is
the citation suggested by citation()] were:

I&G only:
  1998 = 4,  1999 = 15,  2000 = 17,  2001 = 39,  2002 = 119,  2003 = 276
RSTAT+I&G:
  2004: 68+455   = 523
  2005: 433+512  = 945
  2006: 1049+426 = 1475
  2007: 1605+410 = 2015
  2008 (to ~Aug 10): 1389+255 = 1644

cit <- c("1998" = 4, "1999" = 15, "2000" = 17, "2001" = 39, "2002" =  
119,
"2003" = 276, "2004" = 523,"2005" = 945,"2006" = 1475, "2007" =  
2015,
    "2008"=1644)

These will not be all that accurate; there will be omissions
and duplications.

Growth is close to exponential.
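
A quick check of that claim, using the cit vector above:

  year <- as.numeric(names(cit))
  plot(year, log(cit))        # near-linear => near-exponential growth
  coef(lm(log(cit) ~ year))   # slope = average growth rate, log scale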

John Maindonald email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.


On 09/10/2008, at 9:00 PM, [EMAIL PROTECTED] wrote:

> From: Peter Dalgaard <[EMAIL PROTECTED]>
> Date: 9 October 2008 5:42:19 AM
> To: [EMAIL PROTECTED]
> Cc: "r-help@R-project.org" 
> Subject: Re: [R] R seven years ago
>
>
> (Ted Harding) wrote:
>> On 08-Oct-08 18:00:27, Liviu Andronic wrote:
>>
>>> Hello everyone,
>>>
>>> As some may know, today Google unveiled its 2001 search index [1]. I
>>> was curious to see how was R like at that time, and was not
>>> disappointed. Compared to today's main page [2], seven years ago the
>>> page looked [3] a bit rudimentary, especially the graphic. (It is  
>>> worth
>>> noting that structurally the pages are very similar.) What  
>>> definitely
>>> changed is the `Contributed packages' section. Then R featured 29
>>> contributed packages [4], while now it features 1500+ [5]. It was
>>> surprising to realize the growth of R during the past seven years.
>>>
>>> Regards,
>>> Liviu
>>>
>>> [1] http://www.google.com/search2001.html
>>> [2] http://www.r-project.org/
>>> [3] http://web.archive.org/web/20010722202756/www.r-project.org/
>>> [4]
>>> http://web.archive.org/web/20010525004023/cran.r-project.org/bin/macos/contrib/src/
>>> [5] http://cran.at.r-project.org/web/packages/
>>>
>>
>> Many thanks for this, Liviu! One might also compare the mailing list
>> usage:
>>
>> [R-help 1997]:   484 messages
>> [R-help 2001]:  4309 messages
>> [R-help 2007]: 26250
>>   1721+1909+2196+2145+2210+2309+
>>   2142+2246+2028+2711+2602+2031
>>
>> So we now get more posts in a week than we did in the whole of 1997!
>>
>>
> Those not present at the useR in Dortmund might want to skim John  
> Fox's talk
>
> http://www.statistik.uni-dortmund.de/useR-2008/slides/Fox.pdf
>
> (Actually, he did something at the end to avoid ending on a negative
> note. Flipped back to one of the increasing graphs, I suppose.)
>
> -- 
>   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
>  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
> (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45)  
> 35327918
> ~~ - ([EMAIL PROTECTED])  FAX: (+45)  
> 35327907


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R Citation rates

2008-08-11 Thread John Maindonald

Following some discussion with Simon Blomberg, I've done
a few Web of Science citation searches.  A topic search for
R LANG* STAT* seems to turn up most of the references to
"R: A Language and Environment for Statistical Computing"
"R Development Core Team" gets transformed into an
astonishing variety of variations.  Searching for citations
of the 1996 Ihaka and Gentleman paper (most references
up to and including 2004) turns up many fewer quirks.

What other forms of reference should be investigated?

Anyway, here are the numbers by year (there may be some
duplication).
1998-2003, I&G only:  4, 15, 17, 39, 119, 276
2004-2008, RSTAT+I&G: 68+455 = 523, 433+512 = 945, 1049+426 = 1475,
                      1605+410 = 2015, 1389+255 = 1644

cit <- c("1998" = 4, "1999" = 15, "2000" = 17, "2001" = 39,
         "2002" = 119, "2003" = 276, "2004" = 523, "2005" = 945,
         "2006" = 1475, "2007" = 2015, "2008" = 1644)

[~4550 references to R LANG* STAT*; ~2530 to I&G]

On a rate per year basis, the 2008 figure scales up to 2691.
This does not, however, allow for growth over the course of the year.
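
The pro-rating is simple, e.g. (assuming ~223 days elapsed to Aug 10):

  1644 * 365/223   ## = 2691, to the nearest whole number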

The number of references grew by 37% from 2006 to 2007.
On current trends, the 2007-2008 increase seems likely to
be much larger than that.

The figures probably underestimate the contribution from
Bioconductor related work.  A direct search for
Bioconductor-related papers did not, however, turn up
enough papers to make too much difference to the numbers.

Here are some other summary figures, for graphing using
whatever form of presentation appeals most (the second
number is for the I&G paper)

country <- c(usa=1540+903, germany=539+304, england=507+328,
             france=468+337, canada=345+147, australia=329+169,
             switzerland=279+121)

subj <- c(ecology=924+349, statsANDprob=488+270,
          geneticsANDheredity=488+279, envScience=298+119,
          CSapplications=269+108, zoology=267+111,
          plantSciences=250+108, biochemANDmolbio=229+200,
          mathANDcompBIO=224+143, biotechANDappliedmicrobiology=223+159,
          evolutionaryBIO=210+117)

There's a great deal more summary information that might
be extracted.  What is a good way, with readily available data,
to standardize the country data?
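
One obvious candidate is citations per million of population, along
the lines of the following (rough 2008 population figures, in
millions; these would need checking):

  pop <- c(usa=305, germany=82, england=51, france=62, canada=33,
           australia=21, switzerland=8)
  round(country/pop, 1)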

Environmental Science no doubt comes up tops because it
is a coarser grouping than many other areas.

John Maindonald email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Dividing by 0

2008-07-26 Thread John Maindonald

I suspect you should be smoothing the series in a manner that
replaces zeros by some (usually small) larger number before
you start.  Without more details on what you are trying to do, it
is impossible to know what is sensible.  You are proposing to
leave all smoothing ("rolling"?) till later; why not do some
smoothing at the start?
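
For instance, a minimal sketch, with x standing in for the series in
question:

  x <- c(0, 0, 3, 5, 0, 8)
  lx <- log(x + 0.5)      # an offset avoids log(0) = -Inf
  ## or repair after the fact, as described below:
  y <- log(x)
  y[!is.finite(y)] <- 0   # replace NA/NaN/Inf by 0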

John Maindonald email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.


On 26 Jul 2008, at 8:00 PM, [EMAIL PROTECTED] wrote:


From: nmarti <[EMAIL PROTECTED]>
Date: 26 July 2008 1:42:09 AM
To: r-help@r-project.org
Subject: Re: [R] Dividing by 0


I'm well aware these are not errors; I guess I mis-wrote.
I understand your concern.  Thanks for passionately looking out for
my well-being; you saved my life.

My variable has about 10,000 elements, and sometimes for the first 100
to 500 elements there are lots of 0's, so I end up with lots of
NA/NaN/Inf's.
However, when I try to use "Rolling" calculations I receive error  
messages
because the "Rolling" functions reject the NA/NaN/Inf's.  So, I need  
0's in
place of the NA/NaN/Inf's so I can run the "Rolling" calculations.   
I can't
just delete these observations, because it messes up lots of other  
other

things within these dataframes.

I'm well aware these "Rolling" calculations will be wrong in the  
beginning
of the dataframe, so I just throw these out.  The rolling window is  
only
about 50 observations, so out of 10,000, I still end up with ample  
correct

data and calculations.

So is this still idiotic?
Thanks again for your concern.  Now that you understand my situation a
little better, you might be less distracted today and be able to sleep
better tonight.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-ME] lme nesting/interaction advice

2008-05-15 Thread John Maindonald

I've been looking back over this discussion.
Another model that one can fit using lme is:

> lme(score~Machine, random=list(Worker=pdIdent(~0+Machine)),
+    weights=varIdent(form=~1|Machine), data=Machines)
Linear mixed-effects model fit by REML
 Data: Machines
 Log-restricted-likelihood: -108.9547
 Fixed: score ~ Machine
(Intercept)    MachineB    MachineC
      52.36        7.97   13.916667

Random effects:
Formula: ~0 + Machine | Worker
Structure: Multiple of an Identity
   MachineA MachineB MachineC Residual
StdDev:  6.06209  6.06209  6.06209 1.148581

Variance function:
Structure: Different standard deviations per stratum
Formula: ~1 | Machine
Parameter estimates:
   A B C
1.000 0.8713263 0.5859709
Number of Observations: 54
Number of Groups: 6


This insists (I think) that, conditional on the random effect for the
worker, the worker/machine random effects are independent, but allows
them to have different variances.  I am wondering whether it is
possible to fit such a model using lmer().

[In this example the large estimated correlations suggest that it is not
a sensible model.]
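
For what it is worth, the pdIdent part does have a direct lmer()
analogue (a sketch; so far as I can see the varIdent part does not):

  library(lme4)
  data(Machines, package="nlme")
  lmer(score ~ Machine + (1 | Worker:Machine), data=Machines)
  ## the same iid worker:machine effects, but a single residual
  ## variance, so the per-machine residual SDs are not reproduced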

John Maindonald email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.


On 14 May 2008, at 2:49 AM, Douglas Bates wrote:

On Mon, May 12, 2008 at 5:39 PM, Rolf Turner  
<[EMAIL PROTECTED]> wrote:


On 13/05/2008, at 4:09 AM, Douglas Bates wrote:


I'm entering this discussion late so I may be discussing issues that
have already been addressed.

As I understand it, Federico, you began by describing a model for  
data
in which two factors have a fixed set of levels and one factor has  
an

extensible, or "random", set of levels and you wanted to fit a model
that you described as

y ~ effect1 * effect2 * effect3

The problem is that this specification is not complete.


 At *last* (as Owl said to Rabbit) we're getting somewhere!!!

 I always knew that there was some basic fundamental point
 about this business that I (and I believe many others) were
 simply missing.  But I could not for the life of me get anyone
 to explain to me what that point was.  Or to put it another
 way, I was never able to frame a question that would illuminate
 just what it was that I wasn't getting.

 I now may be at a stage where I can start asking the right
 questions.


An interaction of factors with fixed levels and a factor with random
levels can mean, in the lmer specification,

lmer(y ~ effect1 * effect2 + (1| effect3) + (1| 
effect1:effect2:effect3),

...)

or

lmer(y ~ effect1 * effect2 + (effect1*effect2 | effect3), ...)

or other variations.  When you specify a random effect or an random
interaction term you must, either explicitly or implicitly, specify
the form of the variance-covariance matrix associated with those
random effects.

The "advantage" that other software may provide for you is that it
chooses the model for you but that, of course, means that you only
have the one choice.


 Now may I start asking what I hope are questions that will lift
 the fog a bit?

 Let us for specificity consider a three-way model with two
 fixed effects and one random effect from the good old  
Rothamsted

style
 agricultural experiment context:  Suppose we have a number of
 species/breeds of wheat (say) and a number of fertilizers.
 These are fixed effects.  And we have a number of fields  
(blocks)

 --- a random effect.  Each breed-fertilizer combination is
 applied a number of times in each field.  We ***assume*** that
 that the field or block effect is homogeneous throughout.  This
 may or may not be a ``good'' assumption, but it's not completely
 ridiculous and would often be made in practice.  And probably
 *was* made at Rothamsted.  The response would be something like
 yield in bushels per acre.

 The way that I would write the ``full'' model for this setting,
 in mathematical style is:

 Y_ijkl = mu + alpha_i + beta_j + (alpha.beta)_ij + C_k +  
(alpha.C)_ik

 + (beta.C)_jk + (alpha.beta.C)_ijk + E_ijkl

 The alpha_i and beta_j are parameters corresponding to breed and
fertilizer
 respectively; the C_k are random effects corresponding to  
fields or

blocks.
 Any effect ``involving'' C is also random.

 The assumptions made by the Package-Which-Must-Not-Be-Named  
are (I

think)
 that

 C_k ~ N(0,sigma_C^2)
 (alpha.C)_ik ~ N(0,sigma_aC^2)
 (beta.C)jk ~ N(0,sigma_bC^2)
 (alpha.beta.C)_ijk ~ N(0,sigma_abC^2)
 E_ijkl ~ N(0,sigma^2)

 and these random variables are *all independent*.

 A ... perhaps I'm on the wa

[R] predict lmer

2008-05-10 Thread John Maindonald
The following function is designed to work with a logit link.  It can
easily be generalized to work with any link.  The SEs and CIs are
evaluated accounting for all sources of random variation. The plot
may not be much help unless there is just one explanatory variate.

`ciplot` <-
    function(obj=glmm2, data=data.site, xcol=2, nam="litter"){
        cilim <- function(obj, xcol){
            b <- fixef(obj)
            vcov <- summary(obj)@vcov   # covariance matrix of the fixed effects
            X <- unique(model.matrix(obj))
            hat <- X%*%b                # fitted values, on the link scale
            U <- chol(as.matrix(vcov))
            ## SEs of fitted values: sqrt of the diagonal of X vcov X'
            se <- sqrt(apply(X%*%t(U), 1, function(x)sum(x^2)))
            list(hat=hat, se=se, x=X[,xcol])
        }
        limfo <- cilim(obj, xcol)
        hat <- limfo$hat
        se <- limfo$se
        x <- limfo$x
        upper <- hat+2*se               # approximate 95% limits
        lower <- hat-2*se
        ord <- order(x)
        plot(x, hat, yaxt="n", type="l", xlab=nam, ylab="")
        rug(x)
        lines(x[ord], lower[ord])
        lines(x[ord], upper[ord])
        ## label the y-axis on the probability scale (logit link assumed)
        ploc <- c(0.01, 0.05, 0.1, 0.2, 0.5, 0.8, 0.9)
        axis(2, at=log(ploc/(1-ploc)), labels=paste(ploc), las=2)
    }

## Usage
glmm2 <- lmer(rcr ~ litter + (1 | Farm), family=binomial, data=data.site)
ciplot(obj=glmm2)
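
To generalize to another link, only the y-axis relabelling needs to
change; e.g., for a probit link (a sketch), replace the final axis()
call by

  axis(2, at=qnorm(ploc), labels=paste(ploc), las=2)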

John Maindonald email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.


On 8 May 2008, at 8:00 PM, [EMAIL PROTECTED] wrote:
> From: "May, Roel" <[EMAIL PROTECTED]>
> Date: 8 May 2008 12:23:15 AM
> To: r-help@r-project.org
> Subject: [R] predict lmer
>
>
> Hi,
>
> I am using lmer to analyze habitat selection in wolverines using the
> following model:
>
> (me.fit.of <-
> lmer(USED~1+STEP+ALT+ALT2+relM+relM:ALT+(1|ID)+(1| 
> ID:TRKPT2),data=vdata,
> control=list(usePQL=TRUE),family=poisson,method="Laplace"))
>
> Here, the habitat selection is calaculated using a so-called discrete
> choice model where each used location has a certain number of
> alternatives which the animal could have chosen. These sets of  
> locations
> are captured using the TRKPT2 random grouping. However, these sets are
> also clustered over the different individuals (ID). USED is my binary
> dependent variable which is 1 for used locations and zero for unused
> locations. The other are my predictors.
>
> I would like to predict the model fit at different values of the
> predictors, but does anyone know whether it is possible to do this? I
> have looked around at the R-sites and in help but it seems that there
> doesn't exist a predict function for lmer???
>
> I hope someone can help me with this; point me to the right  
> functions or
> tell me to just forget it
>
> Thanks in advance!
>
> Cheers Roel
>
> Roel May
> Norwegian Institute for Nature Research
> Tungasletta 2, NO-7089 Trondheim, Norway


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] REvolution

2008-01-26 Thread John Maindonald
Does anyone know any more than is in the following press release
about REvolution Computing and their commercialization of R?

http://www.intel.com/capital/news/releases/080122.htm

"Intel Capital, the global investment arm of Intel Corporation, today  
announced that it has invested in the Series A financing of REvolution  
Computing, creator of parallel computing software for computational  
statistics"

"R has become the de-facto statistical language and REvolution  
Computing's R-based solutions provide the necessary scale and support  
for its growing commercial usage,” said Andre M. Boisvert, Revolution  
Computing board member. “With this investment, Revolution Computing  
can deliver the type of performance that has been missing in existing  
computational statistics offerings."

"For more information on Revolution Computing, RPro, and ParallelR,  
visit
www.revolution-computing.com"

At present though, unless you have better success than me, all you  
will get from the Revolution Computing site is a revolutionary song!

John Maindonald email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Symbolic substitution in parallel; use infinity symbol?

2008-01-01 Thread John Maindonald
Actually this works beautifully as it stands:

breaks <- c(-Inf, -1, 1, Inf)

zz <- lapply(breaks, function(x)  if(x==-Inf) quote(-infinity) else
 if (x==Inf) quote(infinity) else  format(x))

lbl <- mapply(function(x,y)
bquote("(" * .(x) * "," * .(y) * "]"),
 zz[-length(zz)], zz[-1], SIMPLIFY=FALSE)

y <- rnorm(40)
x0 <- cut(y, breaks=breaks)
plot(x0, y, xaxt="n")

axis(1, at=1:3, labels=as.expression(lbl))

Thanks, Peter.

John Maindonald email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.


On 31 Dec 2007, at 12:35 PM, Peter Dalgaard wrote:

> John Maindonald wrote:
>> I'd like to be able to modify axlab in (C) below so that 'Inf'
>> is replaced by the infinity symbol.
>>
>> y <- rnorm(40)
>> breaks <- c(-Inf, -1, 1, Inf)
>> x <- cut(y, breaks=breaks)
>> plot(unclass(x), y, xaxt="n", xlab="")
>>
>> ## A: The following gives the axis labels "(-Inf, 1]", etc.
>> axis(1, at=1:3, labels=expression("(-Inf,-1]", "(-1,1]", "(1, Inf]"))
>>
>> ## B: The following replaces Inf by the infinity symbol
>> axis(1, at=1:3, labels=expression("(" * -infinity * ", " * -1 * "]",
>> "(" * -1 * ", 1]", "(1, " *  
>> infinity  * "]"),
>>line=1.25, lty=0)
>>
>> ## C: The following gives the axis labels "(-Inf, 1]", etc.,
>> ## in a more automated manner.
>> axlab <- lapply(levels(x), function(x)substitute(a, list(a=x)))
>> # Can alternatively use bquote()
>> axis(1, at=1:3, labels=as.expression(axlab), line=2.25, lty=0)
>>
>> However I have been unable to modify axlab so that the infinity
>> symbol appears in place of 'Inf'.  Is there is some relatively
>> straightforward way to do this?  The issue is of course more
>> general than this specific example.
>>
> Here's an idea, leaving some tinkering for you:
>
> breaks <- c(-Inf, -1, 1, Inf)
>
> zz <- lapply(breaks, function(x)  if(x==-Inf) quote(-infinity)  
> else
> if (x==Inf) quote(infinity) else  format(x))
>
> lbl <- mapply(function(x,y)
>bquote("(" * .(x) * "," * .(y) * "]"),
> zz[-4], zz[-1], SIMPLIFY=FALSE)
>
>
> -- 
>  O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
> c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
> (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45)  
> 35327918
> ~~ - ([EMAIL PROTECTED])  FAX: (+45)  
> 35327907
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Symbolic substitution in parallel; use infinity symbol?

2007-12-30 Thread John Maindonald

I'd like to be able to modify axlab in (C) below so that 'Inf'
is replaced by the infinity symbol.

y <- rnorm(40)
breaks <- c(-Inf, -1, 1, Inf)
x <- cut(y, breaks=breaks)
plot(unclass(x), y, xaxt="n", xlab="")

## A: The following gives the axis labels "(-Inf, 1]", etc.
axis(1, at=1:3, labels=expression("(-Inf,-1]", "(-1,1]", "(1, Inf]"))

## B: The following replaces Inf by the infinity symbol
axis(1, at=1:3, labels=expression("(" * -infinity * ", " * -1 * "]",
 "(" * -1 * ", 1]", "(1, " * infinity  
* "]"),
line=1.25, lty=0)

## C: The following gives the axis labels "(-Inf, 1]", etc.,
## in a more automated manner.
axlab <- lapply(levels(x), function(x)substitute(a, list(a=x)))
# Can alternatively use bquote()
axis(1, at=1:3, labels=as.expression(axlab), line=2.25, lty=0)

However I have been unable to modify axlab so that the infinity
symbol appears in place of 'Inf'.  Is there is some relatively
straightforward way to do this?  The issue is of course more
general than this specific example.

John Maindonald email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Reminiscing on 20 years using S

2007-12-27 Thread John Maindonald
My first exposure to S was on an AT&T 3B2 (a 3B2/100,
I think), at the Auckland (Mt Albert) Applied Mathematics
Division Station of the NZ Dept of Scientific and Industrial
Research.  The AMD Head Office in Wellington had one
also.  There may have been one or more others; I cannot
remember. This would have been in 1983, maybe.

It was a superbly engineered machine, but the software
(System V, version 3.2) had its problems.  If you back
deleted too far along the command line, something
unpleasant (losing the line? or worse?) happened.
On typing 1+1 at the S command line, it took a second
to get an answer.

John Maindonald email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.


On 27 Dec 2007, at 10:00 PM, [EMAIL PROTECTED] wrote:

> From: roger koenker <[EMAIL PROTECTED]>
> Date: 27 December 2007 9:56:45 AM
> To: Greg Snow <[EMAIL PROTECTED]>
> Cc: R-help list <[EMAIL PROTECTED]>
> Subject: Re: [R] Reminiscing on 20 years using S
>
> On Dec 26, 2007, at 2:05 PM, Greg Snow wrote:
>
>> I realized earlier this year (2007) that it was in 1987 that I first
>> started using an early version of S (it was ported to VMS and was  
>> called
>> success).  That means that I have been using some variant of S (to
>> various degrees) for over 20 years now (I don't feel that old).
>
> Boxing day somehow seems appropriate for this thread.  R.I.P. to all  
> those old boxes
> of yesteryore and the software that ran on them -- and yet there is  
> always a residual  archaeological curiosity.
>
> I discovered recently that the MIT athena network contains a circa  
> 1989 version
> of S:  http://stuff.mit.edu/afs/athena/astaff/project/Sdev/S/  which  
> made me wonder
> whether there was any likelihood that one could recreate "S Thu Dec   
> 7 16:49:47 EST 1989".
> Curiosity is one thing, time to dig through the layers of ancient  
> civilizations is quite another.
> But if anyone would like to offer a (preferably educated) guess  
> about the feasibility of  such a project, like I said, I would be  
> curious.
>
>
> url:www.econ.uiuc.edu/~rogerRoger Koenker
> email   [EMAIL PROTECTED]   Department of  
> Economics
> vox:217-333-4558University of Illinois
> fax:217-244-6678Champaign, IL 61820

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Packages - a great resource, but hard to find the right one

2007-11-23 Thread John Maindonald
R as a whole could benefit from systematic attention.
There may be scope for several special issues of JSS.

- R overview and philosophy

- R penetration and influence, in the statistics community,
   in machine learning, and in a variety of application areas.

- R as a vehicle for fostering communication between
researchers in diverse areas.
[A great thing about R-help, though nowadays this role
 is passing across to other lists such as R-sig-ME, is that
 it facilitates and even forces communication between
 application area specialists, and between those
 specialists and statistics professionals.  This may be
 temporary; we may see the R community fragment into
 diverse communities that focus on their own specialist
 interests. Scope for a sociological study, perhaps?]

- "Who is using R?", as reflected in published scientific literature.
 (I'd like to see a wiki or somesuch where authors
 are encouraged to give details of published analyses
 that have used R.)

- Where is R headed?  How will the shaping of its direction
 proceed?  Will it be a matter of step-by-step change and
 improvement, or is it (will it be) possible to lay out in
 advance an outline of the directions that its future
 development can be expected to take?

- Traps for new (and old) users.

- Books and papers on R.

- Then on to packages!  I guess what may be in order is
 something like the expansion of a task view into an
 extended paper.


John Maindonald email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.


On 23 Nov 2007, at 10:00 PM, [EMAIL PROTECTED] wrote:

> From: Antony Unwin <[EMAIL PROTECTED]>
> Date: 23 November 2007 7:50:52 PM
> To: [EMAIL PROTECTED]
> Subject: Re: [R] Packages - a great resource, but hard to find the right one
>
>
> Johannes Hüsing wrote
>
>>> Above all there are lots of packages.  As the software editor of the
>>> Journal of Statistical Software I suggested we should review R
>>> packages.
>>
>> You mean: prior to submission?
>
> No.
>
>>> No one has shown any enthusiasm for this suggestion, but I
>>> think it would help.  Any volunteers?
>>
>> Thing is, I may like to volunteer, but not in the "here's a
>> package for you to review by week 32" way. Rather in the way that
>> I search for a package which fits my problem.
>
> That's what I was hoping for.
>
>> One package lets me down
>> and I'd like to let other users and the maintainer know about it.
>> The other one works black magic and I'd like to drop a raving
>> review about it. This needs an infrastructure with a low barrier
>> to entry. A wiki is not the worst idea if the initial infrastructure
>> is geared towards addressing problems rather than packages.
>
> We should differentiate between rave reviews of features that just
> happened to be very useful to someone and reviews of a package as a
> whole.  Both have their place and at the moment we don't have either.
>
> If you are willing to review an R package or aspects of R for JSS
> please let me know.
>
> Antony Unwin
> Professor of Computer-Oriented Statistics and Data Analysis,
> Mathematics Institute,
> University of Augsburg,
> Germany





Re: [R] [R-sig-ME] mixed model testing

2007-11-07 Thread John Maindonald
Whether or not you need a mixed model, e.g. random versus
fixed slopes, depends on how you intend to use the results.

Suppose you have lines of depression vs lawn roller weight
calculated for a number of lawns. If the data will always be
used to make predictions for one of those same lawns, a
fixed slopes model is fine.

If you want to use the data to make a prediction for another
lawn from the same "population" (the population from which
this lawn is a random sample, right?), you need to model
the slope as a random effect.
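
A minimal sketch of the two specifications, using lmer() from the
lme4 package, and assuming a data frame 'lawns' with columns
depression, weight, and (as a factor) lawn -- names are mine,
purely for illustration:

  library(lme4)
  ## Fixed slopes, one per lawn: fine if predictions will only
  ## ever be made for these same lawns
  fixed.fit <- lm(depression ~ weight * lawn, data=lawns)
  ## Random intercepts and slopes: what is needed for prediction
  ## for a new lawn from the same population of lawns
  random.fit <- lmer(depression ~ weight + (weight | lawn), data=lawns)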

Now for a more subtle point:

In the prediction-for-another-lawn situation, it is possible that
the slope random effect is zero, and analysts very commonly
make this sort of assumption, maybe without realizing that this
is what they are doing.  You can test whether the slope random
effect is zero but, especially if you have data from only a few
lawns, failure to reject the null (a zero random effect) is not
a secure basis for inferences that assume that the random effect
is indeed zero. The "test for zero random effect, then infer"
strategy is open to Box's pithy objection that
"... to make preliminary tests on variances is rather like putting to
sea in a rowing boat to find out whether conditions are sufficiently
calm for an ocean liner to leave port".
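
For concreteness, here is a sketch of the test in question, again
with the hypothetical 'lawns' data frame from above.  The fits use
ML (REML=FALSE) so that the likelihood ratio comparison is legitimate:

  null.fit <- lmer(depression ~ weight + (1 | lawn), data=lawns,
                   REML=FALSE)
  alt.fit <- lmer(depression ~ weight + (weight | lawn), data=lawns,
                  REML=FALSE)
  anova(null.fit, alt.fit)
  ## NB: the null puts the slope variance on the boundary of the
  ## parameter space, so the nominal p-value is conservative; and,
  ## per the point above, failure to reject is not evidence that
  ## the random effect really is zero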


John Maindonald email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.


On 8 Nov 2007, at 1:55 AM, Irene Mantzouni wrote:

> Is there a formal way to prove the need for a mixed model, apart from
> e.g. comparing the intervals estimated by an lmList fit?
> For example, should I compare (with AIC ML?) a model with separately
> (unpooled) estimated fixed slopes (i.e. using an index for each
> group) with a model that treats this parameter as a random effect
> (both models treat the remaining parameters as random)?
>
> Thank you!



Re: [R] postscript(), used from a pre R-2.6.0 workspace

2007-10-30 Thread John Maindonald
I find that if I start R-2.6.0 in a workspace with no .RData file,
load one of my workspaces from R-2.5.1 or earlier into R-2.6.0,
and then before doing anything else type postscript(file="z.ps"),
I get:

 > ls()
character(0)
 > load(url("http://www.maths.anu.edu.au/~johnm/r/test5.RData"))
 > postscript(file="z.ps")
Error in postscript(file = "z.ps") : Invalid font type
In addition: Warning messages:
1: closing unused connection 3 (gzcon(http://www.maths.anu.edu.au/~johnm/r/test5.RData))
2: In postscript(file = "z.ps") :
   font family not found in PostScript font database
3: In postscript(file = "z.ps") :
   font family not found in PostScript font database

The same error occurs if R is started in a working directory that
contains such a .RData file, or any .RData file from R-2.5.1 or earlier.

This behavior persists even if I take such an image file, load it,
clear the workspace, and save it.  An instance of such a .RData
file has the url:
http://www.maths.anu.edu.au/~johnm//r/test5.RData

It makes no difference whether I load the file into R-2.6.0
on a MacOS X or Windows system, or whether I use the current
R-patched (2007-10-27 r43288).  I do not know whether the
source of the original .RData file matters, or how it may depend
on what I have done in that earlier workspace.

For moving my .RData files across without messing up use of
postscript() [or pdf()],  a workaround is to dump the contents of the
earlier workspace, then start R in a working directory whose
.RData file, if any, has the proper R-2.6.0 pedigree, and source
the dumped file.  I do not want to have to directly run the function
that creates the database each time I start a session.
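
In outline, and assuming all objects in the workspace can be
dump()ed (no environments or suchlike), the workaround is as
follows (file names are illustrative only):

  ## In the R version that created the workspace, e.g. R-2.5.1:
  load("old.RData")
  dump(ls(all.names=TRUE), file="old-objects.R")
  ## Then in a fresh R-2.6.0 session, with no inherited .RData:
  source("old-objects.R")
  postscript(file="z.ps")   # now opens without the font error
  dev.off()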

Has anyone else hit this?

John Maindonald email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax : +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
