Re: [Rd] Environment setting _R_CHECK_DEPENDS_ONLY_='true'
A footnote, following an off-list exchange with Prof Ripley, is that I needed to have Suggested or other additional packages installed somewhere other than .Library. "The following variables control checks for undeclared/unconditional use of other packages. They work by setting up a temporary library directory and setting .libPaths() to just that and .Library, so are only effective if additional packages are installed somewhere other than .Library." [I am not sure of the source of this quote.]

If vignettes make extensive use of Suggested packages, then exiting early from vignettes at the point where a Suggested package would otherwise be required [under knitr, one can use knitr::knit_exit()] can be an alternative to leaving out checking of vignettes altogether, in order to speed up initial testing.

On MacOS Mojave with a bash shell,

  env _R_CHECK_DEPENDS_ONLY_=true R CMD check qra_0.2.4.tar.gz

works like a charm. Shells in the csh family use a different syntax (setenv) rather than the VAR=value prefix.

John Maindonald email: john.maindon...@anu.edu.au

On 21/10/2021, at 02:31, Dirk Eddelbuettel <e...@debian.org> wrote:

On 20 October 2021 at 09:31, Sebastian Meyer wrote:
| If you set the environment variable inside a running R process, it will
| only affect that process and child processes, but not an independent R
| process launched from a shell like you seem to be doing here:

Yes. That is somewhat common, if obscure, knowledge among those bitten before. Maybe a line or two could be / should be added to the docs to that effect?

| How to set environment variables is system-specific. On a Unix-like
| system, you could use the command
|
| _R_CHECK_DEPENDS_ONLY_=true R CMD check qra_0.2.4.tar.gz
|
| to set the environment variable for this R process.
| See, e.g., https://en.wikipedia.org/wiki/Environment_variable.

R does have hooks for this; I have had these for a few years now:

  ~/.R/check.Renviron
  ~/.R/check.Renviron-Rdevel

Again, might be worthwhile documenting it in the Inst+Admin manual (if it isn't already, I don't recall right now).

Dirk

--
https://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
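Dirk's per-check hook files are plain Renviron-format files (NAME=value lines) read by R CMD check. The entries below are illustrative guesses at typical contents, not Dirk's actual files; both variable names are documented among the `_R_CHECK_*` variables in the 'Tools' chapter of the R Internals manual:

```
## ~/.R/check.Renviron -- picked up by R CMD check (example entries only)
_R_CHECK_DEPENDS_ONLY_=true
_R_CHECK_FORCE_SUGGESTS_=false
```

With such a file in place, a plain R CMD check picks the settings up without any per-invocation environment fiddling.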
[Rd] Environment setting _R_CHECK_DEPENDS_ONLY_='true'
Setting Sys.setenv('_R_CHECK_DEPENDS_ONLY_' = 'true') or Sys.setenv('_R_CHECK_DEPENDS_ONLY_' = TRUE) (either appears to be acceptable) appears to have no effect when I then do, e.g.

  $ R CMD check qra_0.2.4.tar.gz
  * using log directory '/Users/johnm1/pkgs/qra.Rcheck'
  * using R version 4.1.1 (2021-08-10)
  * using platform: x86_64-apple-darwin17.0 (64-bit)
  * using session charset: UTF-8
  . . .

(This should have failed.) I'd have expected that the "On most systems . . ." mentioned in the Writing R Extensions manual (1.1.3.1 Suggested packages) would include my setup. Any insight on what I am missing will be welcome.

John Maindonald email: john.maindon...@anu.edu.au

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
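The scoping behaviour behind this can be seen with any two shell processes: a variable set in one process reaches only that process's children, never an independently launched sibling, which is why Sys.setenv() inside an interactive R session cannot affect a separately launched R CMD check. A small demonstration (DEMO_VAR is an illustrative throwaway name, not anything R-specific):

```shell
# One-shot form: the VAR=value prefix sets the variable only in the
# environment passed to that single command (bash/zsh/dash syntax):
_R_CHECK_DEPENDS_ONLY_=true printenv _R_CHECK_DEPENDS_ONLY_
# prints: true

# A variable exported in one shell reaches its children, but never an
# independently launched sibling process:
sh -c 'DEMO_VAR=set; export DEMO_VAR; sh -c "echo child sees: $DEMO_VAR"'
echo "sibling sees: ${DEMO_VAR:-nothing}"
# prints: child sees: set
#         sibling sees: nothing
```

An R session that calls Sys.setenv() and a shell that runs R CMD check are siblings in exactly this sense.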
Re: [R-pkg-devel] Namespace is imported from in YAML header, but attracts Note that it is not imported from
That is very helpful --- thank you
John

On Fri, 24 Sept 2021 at 17:42, Maëlle SALMON wrote:
> Hello,
>
> It's better to get rid of this NOTE, by listing bookdown in
> VignetteBuilder and Suggests, not Imports; see
> https://blog.r-hub.io/2020/06/03/vignettes/#infrastructure--dependencies-for-vignettes
> That's actually what you did in another package:
> https://github.com/cran/gamclass/blob/master/DESCRIPTION (it's a
> coincidence I found a package of yours via code search in the CRAN GitHub
> mirror :-) ).
>
> Maëlle.
>
> Den fredag 24 september 2021 03:00:57 CEST, John H Maindonald <jhmaindon...@gmail.com> skrev:
>
> On the Atlas and Linux builds of my package `qra` that has just been posted
> on CRAN, I am getting the message:
>
> Namespace in Imports field not imported from: 'bookdown'
> All declared Imports should be used.
>
> This, in spite of the fact that the YAML header in two of the Rmd files for
> the vignettes has:
>
> output:
>   bookdown::html_document2:
>     theme: cayman
>
> Do I need to worry about this?
>
> John Maindonald
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>
> [[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
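Following Maëlle's advice, the relevant DESCRIPTION fields would look something like the sketch below (field values are my illustration based on her message and the linked blog post, not qra's actual file):

```
Suggests: knitr, bookdown
VignetteBuilder: knitr, bookdown
```

With bookdown moved out of Imports and into Suggests, the "Namespace in Imports field not imported from" NOTE goes away, while the vignette build can still find the bookdown output format.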
[Rd] R CMD check message: The following files should probably not be installed
Sorry. This, and the description in the 'Writing R Extensions' manual, leaves me completely mystified. Is it that I have to remove the PDFs that are created when I run 'R CMD build', and somehow ensure that they are rebuilt when the package is installed? Do I need a Makefile?

John Maindonald email: john.maindon...@anu.edu.au
phone: +61 2 (6125)3473  fax: +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

On 26 Jan 2015, at 22:00, r-devel-requ...@r-project.org wrote:

From: Prof Brian Ripley <rip...@stats.ox.ac.uk>
Subject: Re: [Rd] R CMD check message: The following files should probably not be installed
Date: 26 January 2015 19:52:12 AEDT
To: r-devel@r-project.org

On 25/01/2015 23:25, John Maindonald wrote:

I am doing [R version 3.1.2 (2014-10-31) -- "Pumpkin Helmet"; Platform: x86_64-apple-darwin10.8.0 (64-bit)]

R CMD build DAAGviz
R CMD check DAAGviz_1.0.3.tar.gz

Without a .Rinstignore file, I get:

The following files should probably not be installed:
  'figs10.pdf', 'figs11.pdf', 'figs12.pdf', 'figs13.pdf',
  'figs14.pdf', 'figs5.pdf', 'figs6.pdf', 'figs9.pdf'
Consider the use of a .Rinstignore file: see 'Writing R Extensions',
or move the vignette sources from 'inst/doc' to 'vignettes'.

The vignette sources were in 'vignettes' when DAAGviz_1.0.3.tar.gz was created. There was nothing in the 'inst/doc' directory. If I have in my .Rinstignore file

  inst/doc/.*[.]pdf

That filters out more than the files warned about. I guess you meant

  inst/doc/figs.*[.]pdf

But the question has to be: how did the files get copied into inst/doc? Maybe

'When R CMD build builds the vignettes, it copies these and the vignette sources from directory vignettes to inst/doc. To install any other files from the vignettes directory, include a file vignettes/.install_extras which specifies these as Perl-like regular expressions on one or more lines. (See the description of the .Rinstignore file for full details.)'

suggests how?

then I get:

* checking package vignettes in 'inst/doc' ... WARNING
Package vignettes without corresponding PDF/HTML:
. . .

What am I missing? Can I ignore the "The following files should probably not be installed" message?

Not if you want to submit the package to CRAN.

John Maindonald email: john.maindon...@anu.edu.au

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R-devel Digest, Vol 143, Issue 25
OK, I see now that I was supposed to twig that the reference was to putting the '.Rnw' files back into the vignettes directory from the inst/doc directory where they'd been placed in the course of creating the tar.gz file. I am still trying to work out what I need to put into '.Rinstignore' so that '.install_extras' is not installed.

John Maindonald email: john.maindon...@anu.edu.au
phone: +61 2 (6125)3473  fax: +61 2 (6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

On 26 Jan 2015, at 22:00, r-devel-requ...@r-project.org wrote:

From: Prof Brian Ripley <rip...@stats.ox.ac.uk>
Subject: Re: [Rd] R CMD check message: The following files should probably not be installed
Date: 26 January 2015 19:52:12 AEDT
To: r-devel@r-project.org

On 25/01/2015 23:25, John Maindonald wrote:

I am doing [R version 3.1.2 (2014-10-31) -- "Pumpkin Helmet"; Platform: x86_64-apple-darwin10.8.0 (64-bit)]

R CMD build DAAGviz
R CMD check DAAGviz_1.0.3.tar.gz

Without a .Rinstignore file, I get:

The following files should probably not be installed:
  'figs10.pdf', 'figs11.pdf', 'figs12.pdf', 'figs13.pdf',
  'figs14.pdf', 'figs5.pdf', 'figs6.pdf', 'figs9.pdf'
Consider the use of a .Rinstignore file: see 'Writing R Extensions',
or move the vignette sources from 'inst/doc' to 'vignettes'.

The vignette sources were in 'vignettes' when DAAGviz_1.0.3.tar.gz was created. There was nothing in the 'inst/doc' directory. If I have in my .Rinstignore file

  inst/doc/.*[.]pdf

That filters out more than the files warned about. I guess you meant

  inst/doc/figs.*[.]pdf

But the question has to be: how did the files get copied into inst/doc? Maybe

'When R CMD build builds the vignettes, it copies these and the vignette sources from directory vignettes to inst/doc. To install any other files from the vignettes directory, include a file vignettes/.install_extras which specifies these as Perl-like regular expressions on one or more lines. (See the description of the .Rinstignore file for full details.)'

suggests how?

then I get:

* checking package vignettes in 'inst/doc' ... WARNING
Package vignettes without corresponding PDF/HTML:
. . .

What am I missing? Can I ignore the "The following files should probably not be installed" message?

Not if you want to submit the package to CRAN.

John Maindonald email: john.maindon...@anu.edu.au

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford
1 South Parks Road, Oxford OX1 3TG, UK

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] How do I prevent '.install_extras' from being installed?
So now I have:

vignettes/.install_extras:
  inst/doc/figs.*[.]Rnw$

.Rinstignore:
  [.]DS_Store
  inst/doc/.*[.]pdf$
  inst/doc/Sweavel.sty$
  inst/doc/[.]install_extras$

Everything is fine except that 'R CMD check ...' generates the note:

Found the following hidden files and directories:
  inst/doc/.install_extras
These were most likely included in error. See section 'Package structure'
in the 'Writing R Extensions' manual.

John Maindonald email: john.maindon...@anu.edu.au

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] R CMD check message: The following files should probably not be installed
I am doing [R version 3.1.2 (2014-10-31) -- "Pumpkin Helmet"; Platform: x86_64-apple-darwin10.8.0 (64-bit)]

R CMD build DAAGviz
R CMD check DAAGviz_1.0.3.tar.gz

Without a .Rinstignore file, I get:

The following files should probably not be installed:
  'figs10.pdf', 'figs11.pdf', 'figs12.pdf', 'figs13.pdf',
  'figs14.pdf', 'figs5.pdf', 'figs6.pdf', 'figs9.pdf'
Consider the use of a .Rinstignore file: see 'Writing R Extensions',
or move the vignette sources from 'inst/doc' to 'vignettes'.

The vignette sources were in 'vignettes' when DAAGviz_1.0.3.tar.gz was created. There was nothing in the 'inst/doc' directory. If I have in my .Rinstignore file

  inst/doc/.*[.]pdf

then I get:

* checking package vignettes in 'inst/doc' ... WARNING
Package vignettes without corresponding PDF/HTML:
. . .

What am I missing? Can I ignore the "The following files should probably not be installed" message?

John Maindonald email: john.maindon...@anu.edu.au

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Correction in help(factanal)
The help page has

  "Thus factor analysis is in essence a model for the correlation matrix of x," Σ = Λ'Λ + Ψ

This should surely be

  Σ = ΛΛ' + Ψ

Also line 3 under "Details" says "for a p-element row-vector x, ...". x is here surely a column vector, albeit the transpose of a row vector from the data matrix. Cf. page 322 of "Modern Applied Statistics with S", 4th edn.

John Maindonald email: john.maindon...@anu.edu.au

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
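Written out in full, the corrected relation follows from the standard orthogonal factor model (notation as in MASS, p. 322; the derivation is added here for clarity):

```latex
\[
x = \Lambda f + u, \qquad \operatorname{Var}(f) = I_k, \quad
\operatorname{Var}(u) = \Psi \ \text{(diagonal)},
\]
\[
\Sigma = \operatorname{Var}(x)
       = \Lambda \operatorname{Var}(f) \Lambda' + \Psi
       = \Lambda\Lambda' + \Psi .
\]
```

Dimensions make the point directly: with x a p-element column vector and Λ the p × k matrix of loadings, ΛΛ' is p × p as Σ requires, whereas Λ'Λ (as on the help page) would be k × k.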
Re: [Rd] no visible binding for global variable for data sets in a package
Re solution 2, the following is in the function tabFarsDead() in the latest (0.55) version of gamclass:

  data('FARS', package='gamclass', envir=environment())
  FARS <- get('FARS', envir=environment())

The second statement is, strictly, redundant, but it makes the syntax checker happy. Another possibility might be:

  FARS <- NULL
  data('FARS', package='gamclass', envir=environment())

I do not know whether this passes. An FAQ that offers preferred solutions to such chestnuts, or a web page, or a blog, would seem to me useful.

John Maindonald email: john.maindon...@anu.edu.au

On 27 Aug 2014, at 20:00, r-devel-requ...@r-project.org wrote:

From: Martin Maechler <maech...@stat.math.ethz.ch>
Subject: Re: [Rd] no visible binding for global variable for data sets in a package
Date: 27 August 2014 19:24:36 AEST
To: Michael Friendly <frien...@yorku.ca>
Cc: r-devel <r-devel@r-project.org>

Michael Friendly <frien...@yorku.ca> on Tue, 26 Aug 2014 17:58:34 -0400 writes:

I'm updating the Lahman package of baseball statistics to the 2013 release. In addition to the main data sets, the package also contains several convenience functions that make use of these data sets. These now trigger the notes below from R CMD check run with Win builder, R-devel. How can I avoid these?

* using R Under development (unstable) (2014-08-25 r66471)
* using platform: x86_64-w64-mingw32 (64-bit)
...
* checking R code for possible problems ... NOTE
Label: no visible binding for global variable 'battingLabels'
Label: no visible binding for global variable 'pitchingLabels'
Label: no visible binding for global variable 'fieldingLabels'
battingStats: no visible binding for global variable 'Batting'
battingStats: no visible global function definition for 'mutate'
playerInfo: no visible binding for global variable 'Master'
teamInfo: no visible binding for global variable 'Teams'

One such function:

## function for accessing variable labels
Label <- function(var, labels=rbind(battingLabels, pitchingLabels, fieldingLabels)) {
  wanted <- which(labels[,1]==var)
  if (length(wanted)) labels[wanted[1],2] else var
}

and you are using the data sets you mentioned before (and the checking has been changed recently here). This is a bit subtle: your data sets are part of your package (thanks to the default LazyData), but *not* part of the namespace of your package.

Now, the reasoning goes as follows: if someone uses a function from your package, say Label() above, by Lahman::Label(..) and your package has not been attached to the search path, your user will get an error, as the datasets are not found by Label(). If you consider something like Lahman::Label(..) for a bit, and the emphasis we put on R functions being the primary entities, you can understand the current, i.e. new, R CMD check warnings.

I see the following two options for you:

1) export all these data sets from your NAMESPACE. For this (I think), you must define them in Lahman/R/, possibly via a Lahman/R/sysdata.rda

2) rewrite your functions to ensure the data sets are loaded when they are used.

2) actually works by adding

  stopifnot(require(Lahman, quietly=TRUE))

as the first line in Label() and other such functions. It works in the sense that Lahman::Label(yearID) will work even when Lahman is not in the search path, but R-devel CMD check will still give the same NOTE, though you can argue that that note is actually a false positive.

Not sure about another elegant way to make 2) work, apart from using data() on each of the datasets inside the function. As I haven't tried it, that may *still* give a (false) NOTE.

This is a somewhat interesting problem, and I wonder if everyone else has solved it with '1)' rather than a version of '2)'.

Martin

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R-devel Digest, Vol 137, Issue 25
Finding, and not unnecessarily duplicating, existing functionality is important also from a user perspective. Negative binomial regression provides a somewhat extreme example of existing overlap between packages, with the scope that this creates for confusing users, especially as the notation is not consistent between these different implementations. In addition to MASS::glm.nb(), note msme::negbinomial(), aod::negbin() and gamlss::gamlss(). The gamlss function fits two different types of NB model: either family = NBI (quadratic; var = mu(1 + sigma * mu)), as I think for all the functions above, or family = NBII (linear; var = mu(1 + sigma)). Also note the somewhat special-purpose function glmnb.fit() in the statmod package, which requires preliminary setup steps.

John Maindonald email: john.maindon...@anu.edu.au

On 28/07/2014, at 8:00 pm, Darren Norris wrote:

From: Darren Norris <doo...@hotmail.com>
Subject: Re: [Rd] [Wishlist] a 'PackageDevelopment' Task View
Date: 28 July 2014 6:19:08 am AEST
To: r-devel@r-project.org

Hi Luca,

Based on previous comments it seems that 1) there should be a multi-functional/general category to cover packages like devtools, and 2) I think finding existing function code (e.g. in CRAN packages / GitHub) is necessary and saves many hours in package development (no one wants to develop a package and then discover they have just reinvented the wheel). So including packages like sos seems justified and helpful.

Best,
Darren

--
View this message in context: http://r.789695.n4.nabble.com/Wishlist-a-PackageDevelopment-Task-View-tp4694537p4694625.html
Sent from the R devel mailing list archive at Nabble.com.
[[alternative HTML version deleted]] __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
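The two gamlss parameterisations mentioned above can be set side by side as variance functions (sigma is the dispersion parameter, as in the message):

```latex
\[
\text{NBI (quadratic):}\quad \operatorname{Var}(y) = \mu(1 + \sigma\mu),
\qquad
\text{NBII (linear):}\quad \operatorname{Var}(y) = \mu(1 + \sigma).
\]
```

The NBI form is the parameterisation that MASS::glm.nb() and, per the message, the other functions listed all fit; NBII grows only linearly in the mean.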
[Rd] Error in Writing R Extensions
In Section 1.4.2 of Writing R Extensions,

  %\VignetteEngine{knitr::knitr}

should be

  %\VignetteEngine{knitr::knit}

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

Is this sort of thing best reported here, or is a bug report in order?

John Maindonald email: john.maindon...@anu.edu.au

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Failure to get compactPDF to compact a pdf file
I am failing to get compactPDF to make any change to a pdf file that, a/c to the message from the CRAN upload site, can be very substantially compacted. Any ideas what may be wrong? I have also tried recreating the pdf file.

I also tried

  R CMD build --resave-data --compact-vignettes DAAG

The data files compact alright (but I get the 'significantly better compression' warning message that might suggest that this is not happening), but the pdf file appears to go into the package unmodified.

> tools::compactPDF('/Users/johnm/packages/DAAG/inst/doc/', gs_quality = "ebook")
> dir('/Users/johnm/packages/DAAG/inst/doc/')
[1] "rockArt.pdf"
> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] tools_2.14.1

From the Unix command line:

jhm:doc johnm$ ls -lt /Users/johnm/packages/DAAG/inst/doc
total 1368
-rw-r--r--@ 1 johnm staff 696762 2 Aug 12:35 rockArt.pdf

Message from the CRAN upload site:

* checking sizes of PDF files under 'inst/doc' ... NOTE
'gs' made some significant size reductions:
  compacted 'rockArt.pdf' from 680Kb to 58Kb
consider running tools::compactPDF(gs_quality = "ebook") on these files

John Maindonald email: john.maindon...@anu.edu.au
http://www.maths.anu.edu.au/~johnm

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Failure to get compactPDF to compact a pdf file
Quoting from the R-2.14.1 help page for compactPDF:

"This by default makes use of 'qpdf', available from http://qpdf.sourceforge.net/ (including as a Windows binary) and included with the CRAN Mac OS X distribution of R. If 'gs_cmd' is non-empty, GhostScript will be used instead."

The defaults are:

  compactPDF(paths,
             qpdf = Sys.getenv("R_QPDF", "qpdf"),
             gs_cmd = Sys.getenv("R_GSCMD", ""),
             gs_quality = c("printer", "ebook", "screen"),
             gs_extras = character())

> Sys.getenv("R_QPDF", "qpdf")
[1] "/Library/Frameworks/R.framework/Resources/bin/qpdf"
> Sys.getenv("R_GSCMD", "")
[1] ""

Thus, as far as I can see, compactPDF is set up (on my system) to use qpdf to compress. I take it then that the Writing R Extensions manual [2.14.1 (2011-12-22)] is anticipating what is in R-devel:

"The --compact-vignettes option will run tools::compactPDF over the PDF files in inst/doc (and its subdirectories) to losslessly compress them. This is not enabled by default (it can be selected by environment variable _R_BUILD_COMPACT_VIGNETTES_) and needs qpdf (http://qpdf.sourceforge.net/) to be available."

John Maindonald email: john.maindon...@anu.edu.au
http://www.maths.anu.edu.au/~johnm

On 24/01/2012, at 11:22 PM, Prof Brian Ripley wrote:

On 24/01/2012 08:30, John Maindonald wrote:

I am failing to get compactPDF to make any change to a pdf file that, a/c to the message from the CRAN upload site, can be very substantially compacted. Any ideas what may be wrong?

AFAICS you are quoting a message from R-devel, which tries to find 'gs' for you. In R 2.14.1 you need to tell compactPDF where it is (assuming you do have it installed): see its help.

I have also tried recreating the pdf file. I also tried

  R CMD build --resave-data --compact-vignettes DAAG

Again, in R-devel you can do R CMD build --compact-vignettes=gs (assuming that is in your path or R_GSCMD is set), but not in R 2.14.1. But I have already told you directly (and been ignored) that the problem is the excessive resolution of the embedded bitmap image, which needs to be down-sampled.

The data files compact alright (but I get the 'significantly better compression' warning message that might suggest that this is not happening), but the pdf file appears to go into the package unmodified.

> tools::compactPDF('/Users/johnm/packages/DAAG/inst/doc/', gs_quality = "ebook")
> dir('/Users/johnm/packages/DAAG/inst/doc/')
[1] "rockArt.pdf"
> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] tools_2.14.1

From the Unix command line:

jhm:doc johnm$ ls -lt /Users/johnm/packages/DAAG/inst/doc
total 1368
-rw-r--r--@ 1 johnm staff 696762 2 Aug 12:35 rockArt.pdf

Message from the CRAN upload site:

* checking sizes of PDF files under 'inst/doc' ... NOTE
'gs' made some significant size reductions:
  compacted 'rockArt.pdf' from 680Kb to 58Kb
consider running tools::compactPDF(gs_quality = "ebook") on these files

John Maindonald email: john.maindon...@anu.edu.au
http://www.maths.anu.edu.au/~johnm

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] R-devel Digest, Vol 100, Issue 28
I get the same style of path as Hadley. This is on Windows 7 Home Premium with SP1. I start R by clicking on the R-2.13.0 icon. I'd assumed that it was a change that came with R-2.13.0! (On 32-bit Windows XP, which I have just checked, I do indeed get the 8.3 paths.)

> R.home()
[1] "C:/Programs/R/R-2.13.0"
> sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_Australia.1252
[2] LC_CTYPE=English_Australia.1252
[3] LC_MONETARY=English_Australia.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_Australia.1252

attached base packages:
[1] stats graphics grDevices utils datasets
[6] methods base

John Maindonald email: john.maindon...@anu.edu.au
http://www.maths.anu.edu.au/~johnm

From: Duncan Murdoch <murdoch.dun...@gmail.com>
Date: 29 June 2011 10:17:46 AM AEST
To: Hadley Wickham <had...@rice.edu>
Cc: Simon Urbanek <simon.urba...@r-project.org>, r-devel@r-project.org
Subject: Re: [Rd] Small bug in install.packages?

On 28/06/2011 5:42 PM, Hadley Wickham wrote:

Isn't R.home() an 8.3 path anyway?

I don't think so:

> R.home("bin")
[1] "C:/Program Files/R/R-2.13.0/bin/i386"

Weird. Like others, I see 8.3 pathnames. R gets those from a Windows call; what version of Windows are you using?

Duncan Murdoch

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] anova.lm fails with test=Cp
For unknown sigma^2, a version that is a modification of AIC may be preferred, i.e.

  n log(RSS/n) + 2p - n

I notice that this is what is given in Maindonald and Braun (2010) "Data Analysis and Graphics Using R", 3rd edition. Cf: Venables and Ripley, MASS, 4th edn, p. 174. VR do however stop short of actually saying that Cp should be modified in the same way as AIC when sigma^2 has to be estimated.

Better still, perhaps, give the AIC statistic. This would make the output consistent with dropterm(), drop1() and add1(). Or, if Cp is to stay, allow AIC as a further test.

John Maindonald email: john.maindon...@anu.edu.au
http://www.maths.anu.edu.au/~johnm

On 08/05/2011, at 6:15 PM, peter dalgaard wrote:

On May 8, 2011, at 09:25, John Maindonald wrote:

Here is an example, modified from the help page to use test="Cp":

fit0 <- lm(sr ~ 1, data = LifeCycleSavings)
fit1 <- update(fit0, . ~ . + pop15)
fit2 <- update(fit1, . ~ . + pop75)
anova(fit0, fit1, fit2, test="Cp")
Error in `[.data.frame`(table, , "Resid. Dev") :
  undefined columns selected

Yes, the "Resid. Dev" column is only there in analysis of deviance tables. For the lm() case, it looks like you should have "RSS". This has probably been there forever. Just goes to show how often people use these things...

Also, now that I'm looking at it, are we calculating it correctly in any case? We have

  cbind(table, Cp = table[, "Resid. Dev"] + 2 * scale * (n - table[, "Resid. Df"]))

whereas all the references I can find have Cp=RSS/MS-N+2P, so the above would actually be scale*Cp+N.

--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
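For reference, the statistics discussed in this exchange, in the form given in the messages themselves (RSS_p is the residual sum of squares of a submodel with p coefficients, \hat\sigma^2 the mean-square estimate from the largest model, n the number of observations):

```latex
\[
C_p = \frac{RSS_p}{\hat\sigma^2} - n + 2p,
\qquad
n\log(RSS_p/n) + 2p - n \quad \text{(the AIC-type modification proposed above)}.
\]
```

The first is the Cp = RSS/MS - N + 2P form that Dalgaard quotes from the references; comparing it with the code he shows makes clear that the computed quantity differs from Cp by the scale factor and an additive n.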
[Rd] anova.lm fails with test=Cp
Here is an example, modified from the help page to use test="Cp":

fit0 <- lm(sr ~ 1, data = LifeCycleSavings)
fit1 <- update(fit0, . ~ . + pop15)
fit2 <- update(fit1, . ~ . + pop75)
anova(fit0, fit1, fit2, test="Cp")
Error in `[.data.frame`(table, , "Resid. Dev") :
  undefined columns selected

> sessionInfo()
R version 2.13.0 Patched (2011-04-28 r55678)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] tools_2.13.0

The help page says for test: "a character string specifying the test statistic to be used. Can be one of 'F', 'Chisq' or 'Cp', with partial matching allowed, or NULL for no test." Is test="Cp", following the help page, intended to work? Setting the scale parameter does not help.

John Maindonald email: john.maindon...@anu.edu.au
http://www.maths.anu.edu.au/~johnm

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Compression of largish expression array files in the DAAGbio/inst/doc directory?
Thanks. That seems to work.

John Maindonald

On 09/04/2011, at 4:58 PM, Prof Brian Ripley wrote:

As far as I can see read.maimages is built on top of R's own file-reading facilities, and they all read compressed (but not zipped) files as from R 2.10.0. So simply use

gzip -9 coral55?.spot

and rename the files back to *.spot. If you need more compression, use xz -9e. (You can also do this in R: readLines() on the file, writeLines() using gzfile or xzfile.) You will need to make the package 'Depends: R (>= 2.10)'.

On Sat, 9 Apr 2011, John Maindonald wrote:

The inst/doc directory of the DAAG package has 6 files coral551.spot, ... that are around 0.85 MB each. It would be useful to be able to zip them, but that as matters stand interferes with the use of the Sweave file that uses them to demonstrate input of expression array data that is in the spot format. They do not automatically get unzipped when required. I have checked that read.maimages (in limma) does not, unless I have missed something, have an option for reading zipped files. Is there any way to get around this without substantially complicating the exposition in marray-notes.pdf (also in the inst/doc subdirectory)?

John Maindonald

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road,       +44 1865 272866 (PA)
Oxford OX1 3TG, UK   Fax: +44 1865 272595

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
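[Ripley's readLines()/writeLines() alternative to command-line gzip can be sketched as below. This is my own illustration of the suggestion, not code from the thread; the file name is one of the package's .spot files, and keeping the .spot name relies on R (>= 2.10.0) reading gzip-compressed files transparently.]

```r
## Recompress a text data file with gzip from within R, per the
## suggestion above.  readLines() pulls the whole file into memory
## first, so writing back to the same name is safe.
infile  <- "coral551.spot"
lines   <- readLines(infile)
con <- gzfile(infile, "w", compression = 9)  # keep the .spot name
writeLines(lines, con)
close(con)
## From R >= 2.10.0, readLines(), read.delim() etc. read the
## compressed file without any change to the calling code.
```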
[Rd] Compression of largish expression array files in the DAAGbio/inst/doc directory?
The inst/doc directory of the DAAG package has 6 files coral551.spot, ... that are around 0.85 MB each. It would be useful to be able to zip them, but that as matters stand interferes with the use of the Sweave file that uses them to demonstrate input of expression array data that is in the spot format. They do not automatically get unzipped when required. I have checked that read.maimages (in limma) does not, unless I have missed something, have an option for reading zipped files. Is there any way to get around this without substantially complicating the exposition in marray-notes.pdf (also in the inst/doc subdirectory)?

John Maindonald
Re: [Rd] Standardized Pearson residuals
One can easily test for the binary case and not give the statistic in that case. A general point is that if one gave no output that was not open to abuse, there'd be nothing given at all! One would not be giving any output at all from poisson or binomial models, given that data that really calls for quasi links (or a glmm with observation-level random effects) is in my experience the rule rather than the exception! At the very least, why not a function dispersion() or pearsonchisquare() that gives this information. Apologies that I misattributed this.

John Maindonald

On 16/03/2011, at 12:41 AM, peter dalgaard wrote:

On Mar 15, 2011, at 13:42, John Maindonald wrote:

Peter Dalgaard: "It would also be nice for teaching purposes if glm or summary.glm had a pearsonchisq component and a corresponding extractor function, but I can imagine that there might be arguments against it that haven't occurred to me. Plus, I doubt that anyone wants to touch glm unless it's to repair a bug. If I'm wrong about all that though, ..."

Umm, that was Brett, actually.

This would remedy what I have long judged a deficiency in summary.glm(). The information is important for diagnostic purposes. One should not have to fit a model with a quasi error, or suss out how to calculate the Pearson chi-square from the glm model object, to discover that the information in the model object is inconsistent with simple binomial or poisson assumptions.

It could be somewhere between useless and misleading in cases like binary logistic regression though. (Same thing goes for the test against the saturated model: sometimes it makes sense and sometimes not.)
John Maindonald

On 15/03/2011, at 10:00 PM, r-devel-requ...@r-project.org wrote:

From: Brett Presnell presn...@stat.ufl.edu
Date: 15 March 2011 2:40:29 PM AEDT
To: peter dalgaard pda...@gmail.com
Cc: r-devel@r-project.org
Subject: Re: [Rd] Standardized Pearson residuals

Thanks Peter. I have just a couple of minor comments, and another possible feature request, although it's one that I don't think will be implemented.

peter dalgaard pda...@gmail.com writes:

On Mar 14, 2011, at 22:25, Brett Presnell wrote:

Is there any reason that rstandard.glm doesn't have a "pearson" option? And if not, can it be added?

Probably... I have been wondering about that too. I'm even puzzled why it isn't the default. Deviance residuals don't have quite the properties that one might expect; e.g., in this situation the absolute residuals sum pairwise to zero, so you'd expect the standardized residuals to be identical in absolute value:

y <- 1:4
r <- c(0,0,1,1)
c <- c(0,1,0,1)
rstandard(glm(y ~ r + c, poisson))
         1          2          3          4
-0.2901432  0.2767287  0.2784603 -0.2839995

in comparison,

i <- influence(glm(y ~ r + c, poisson))
i$pear.res/sqrt(1 - i$hat)
         1          2          3          4
-0.2817181  0.2817181  0.2817181 -0.2817181

The only thing is that I'm always wary of tampering with this stuff, for fear of finding out the hard way why things are the way they are.

I'm sure that's wise, but it would be nice to get it in as an option, even if it's not the default.

Background: I'm currently teaching an undergrad/grad-service course from Agresti's Introduction to Categorical Data Analysis (2nd edn) and deviance residuals are not used in the text. For now I'll just provide the students with a simple function to use, but I prefer to use R's native capabilities whenever possible.
Incidentally, chisq.test will have a stdres component in 2.13.0 for much the same reason.

Thank you. That's one more thing I won't have to provide code for anymore. Coincidentally, Agresti mentioned this to me a week or two ago as something that he felt was missing, so that's at least two people who will be happy to see this added.

It would also be nice for teaching purposes if glm or summary.glm had a pearsonchisq component and a corresponding extractor function, but I can imagine that there might be arguments against it that haven't occurred to me. Plus, I doubt that anyone wants to touch glm unless it's to repair a bug. If I'm wrong about all that though, ...

BTW, as I go along I'm trying to collect a lot of the datasets from the examples and exercises in the text into an R package (icda). It's far from complete
Re: [Rd] Standardized Pearson residuals
Peter Dalgaard: "It would also be nice for teaching purposes if glm or summary.glm had a pearsonchisq component and a corresponding extractor function, but I can imagine that there might be arguments against it that haven't occurred to me. Plus, I doubt that anyone wants to touch glm unless it's to repair a bug. If I'm wrong about all that though, ..."

This would remedy what I have long judged a deficiency in summary.glm(). The information is important for diagnostic purposes. One should not have to fit a model with a quasi error, or suss out how to calculate the Pearson chi-square from the glm model object, to discover that the information in the model object is inconsistent with simple binomial or poisson assumptions.

John Maindonald

On 15/03/2011, at 10:00 PM, r-devel-requ...@r-project.org wrote:

From: Brett Presnell presn...@stat.ufl.edu
Date: 15 March 2011 2:40:29 PM AEDT
To: peter dalgaard pda...@gmail.com
Cc: r-devel@r-project.org
Subject: Re: [Rd] Standardized Pearson residuals

Thanks Peter. I have just a couple of minor comments, and another possible feature request, although it's one that I don't think will be implemented.

peter dalgaard pda...@gmail.com writes:

On Mar 14, 2011, at 22:25, Brett Presnell wrote:

Is there any reason that rstandard.glm doesn't have a "pearson" option? And if not, can it be added?

Probably... I have been wondering about that too. I'm even puzzled why it isn't the default. Deviance residuals don't have quite the properties that one might expect; e.g., in this situation the absolute residuals sum pairwise to zero, so you'd expect the standardized residuals to be identical in absolute value:

y <- 1:4
r <- c(0,0,1,1)
c <- c(0,1,0,1)
rstandard(glm(y ~ r + c, poisson))
         1          2          3          4
-0.2901432  0.2767287  0.2784603 -0.2839995

in comparison,

i <- influence(glm(y ~ r + c, poisson))
i$pear.res/sqrt(1 - i$hat)
         1          2          3          4
-0.2817181  0.2817181  0.2817181 -0.2817181

The only thing is that I'm always wary of tampering with this stuff, for fear of finding out the hard way why things are the way they are.

I'm sure that's wise, but it would be nice to get it in as an option, even if it's not the default.

Background: I'm currently teaching an undergrad/grad-service course from Agresti's Introduction to Categorical Data Analysis (2nd edn) and deviance residuals are not used in the text. For now I'll just provide the students with a simple function to use, but I prefer to use R's native capabilities whenever possible.

Incidentally, chisq.test will have a stdres component in 2.13.0 for much the same reason.

Thank you. That's one more thing I won't have to provide code for anymore. Coincidentally, Agresti mentioned this to me a week or two ago as something that he felt was missing, so that's at least two people who will be happy to see this added.

It would also be nice for teaching purposes if glm or summary.glm had a pearsonchisq component and a corresponding extractor function, but I can imagine that there might be arguments against it that haven't occurred to me. Plus, I doubt that anyone wants to touch glm unless it's to repair a bug. If I'm wrong about all that though, ...

BTW, as I go along I'm trying to collect a lot of the datasets from the examples and exercises in the text into an R package (icda). It's far from complete and what is there needs tidying up, but I hope eventually to round it into shape and put it on CRAN, assuming that Agresti approves and that there are no copyright issues.
I think something along the following lines should do it:

rstandard.glm <- function(model,
                          infl = influence(model, do.coef = FALSE),
                          type = c("deviance", "pearson"), ...)
{
    type <- match.arg(type)
    res <- switch(type, pearson = infl$pear.res, infl$dev.res)
    res <- res/sqrt(1 - infl$hat)
    res[is.infinite(res)] <- NaN
    res
}
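[In the meantime, the Pearson chi-square dispersion check that the thread asks summary.glm() to report can be computed directly from a fitted model. This helper is my own sketch, not part of R; the name pearson_dispersion is hypothetical.]

```r
## Sketch of the Pearson chi-square dispersion check discussed above;
## not part of summary.glm().  Values far from 1 suggest the simple
## poisson/binomial variance assumption is inconsistent with the data.
pearson_dispersion <- function(fit) {
    X2 <- sum(residuals(fit, type = "pearson")^2)  # Pearson chi-square
    X2 / df.residual(fit)                          # dispersion estimate
}

y   <- rpois(50, lambda = 4)
fit <- glm(y ~ 1, family = poisson)
pearson_dispersion(fit)
```

As noted above, the statistic is not meaningful for binary (ungrouped) logistic regression, so a real implementation would want to test for and skip that case.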
[Rd] keep.source when semicolons separate statements on the one line
The following is 'semicolon.Rnw':

\SweaveOpts{engine=R, keep.source=TRUE}
<<xycig-A, eval=f, echo=f>>=
library(SMIR); data(bronchit); library(KernSmooth)
@ %
Code for panel A is
<<code-xycig-A, eval=f, echo=t>>=
<<xycig-A>>
@ %

Sweave("semicolon") yields the following 'semicolon.tex':

Code for panel A is
\begin{Schunk}
\begin{Sinput}
library(SMIR); data(bronchit); library(KernSmooth)
library(SMIR); data(bronchit); library(KernSmooth)
library(SMIR); data(bronchit); library(KernSmooth)
\end{Sinput}
\end{Schunk}

(I have omitted three blank lines at the start.) With keep.source=FALSE, the commands are split onto separate lines, and there is no repetition.

John Maindonald
[Rd] keep.source when semicolons separate statements on the one line; PS
I forgot to add the sessionInfo() information:

> sessionInfo()
R version 2.12.1 Patched (2011-01-22 r54081)
Platform: x86_64-pc-mingw32/x64 (64-bit)
. . .

The following is 'semicolon.Rnw':

\SweaveOpts{engine=R, keep.source=TRUE}
<<xycig-A, eval=f, echo=f>>=
library(SMIR); data(bronchit); library(KernSmooth)
@ %
Code for panel A is
<<code-xycig-A, eval=f, echo=t>>=
<<xycig-A>>
@ %

Sweave("semicolon") yields the following 'semicolon.tex':

Code for panel A is
\begin{Schunk}
\begin{Sinput}
library(SMIR); data(bronchit); library(KernSmooth)
library(SMIR); data(bronchit); library(KernSmooth)
library(SMIR); data(bronchit); library(KernSmooth)
\end{Sinput}
\end{Schunk}

(I have omitted three blank lines at the start.) With keep.source=FALSE, the commands are split onto separate lines, and there is no repetition.

John Maindonald
[Rd] Bug report 14459 -- procedure for handling follow-up issues
Although the specific behaviour that was reported has been fixed, bugs remain in Sweave's processing of comment lines when keep.source=TRUE. This is in some senses a follow-up from earlier bugs. Hence the query -- what is the preferred procedure to submit a new bug report? (Another option might be to add a comment to the web page for bug 14459.) Is there now a preference to submit via the web page, rather than send a message to r-b...@r-project.org? If so, the relevant paragraph in the FAQ surely requires updating:

"On Unix-like systems a bug report can be generated using the function bug.report(). This automatically includes the version information and sends the bug to the correct address. Alternatively the bug report can be emailed to r-b...@r-project.org or submitted to the Web page at http://bugs.R-project.org/. Please try including results of sessionInfo() in your bug report."

I have posted files test10.Rnw, test11.Rnw, and test12.Rnw that demonstrate the bugs at http://www.maths.anu.edu.au/~johnm/r/issues/. The output files test10.tex, test11.tex and test12.tex are from r53870 on x86_64-apple-darwin9.8.0/x86_64 (64-bit).

test10.Rnw has a code chunk that begins and ends with a comment. An NA appears following the final comment. This disappears if I remove the initial comment line.

test11.Rnw follows a comment line with a named code chunk. The comment line does not appear in the output.

test12.Rnw places a line of code between the comment line and the named code chunk. The comment line does now appear in the output.

John Maindonald
Re: [Rd] Bug report 14459 -- procedure for handling follow-up issues
Thanks. It is useful to have a list of items that are outstanding. I will experiment a bit more, but may revert to using R-2.11.1 for running Sweave(). Did any of these issues arise for R-2.11.1?

John Maindonald

On 22/12/2010, at 3:14 AM, Duncan Murdoch wrote:

On 21/12/2010 3:23 AM, John Maindonald wrote:

Although the specific behaviour that was reported has been fixed, bugs remain in Sweave's processing of comment lines when keep.source=TRUE. This is in some senses a follow-up from earlier bugs. Hence the query -- what is the preferred procedure to submit a new bug report? (Another option might be to add a comment to the web page for bug 14459.) Is there now a preference to submit via the web page, rather than send a message to r-b...@r-project.org? If so, the relevant paragraph in the FAQ surely requires updating:

"On Unix-like systems a bug report can be generated using the function bug.report(). This automatically includes the version information and sends the bug to the correct address. Alternatively the bug report can be emailed to r-b...@r-project.org or submitted to the Web page at http://bugs.R-project.org/. Please try including results of sessionInfo() in your bug report."

I have posted files test10.Rnw, test11.Rnw, and test12.Rnw that demonstrate the bugs at http://www.maths.anu.edu.au/~johnm/r/issues/. The output files test10.tex, test11.tex and test12.tex are from r53870 on x86_64-apple-darwin9.8.0/x86_64 (64-bit).

test10.Rnw has a code chunk that begins and ends with a comment. An NA appears following the final comment. This disappears if I remove the initial comment line.

This is now fixed. It was a different bug than 14459.

test11.Rnw follows a comment line with a named code chunk. The comment line does not appear in the output. test12.Rnw places a line of code between the comment line and the named code chunk. The comment line does now appear in the output.

These look like a different issue, and are still unfixed, and are unlikely to be fixed soon. The problem is that the handling of source references in Sweave is messy, and needs a major cleanup, which takes time. Between now and at least mid-February I won't have the time it would take, and I don't know anyone else who would do it. So I would not bet on these fixes getting done before 2.13.0. The problems I know about are these:

- if you use a named chunk <<chunkname>> in another, you won't get leading and trailing comments on the named chunk.
- if you mix named chunks and \SweaveInput, you won't get the original source at all in the expanded chunks.

Your examples look like the first of these. I had thought the comments had to be in the chunk to get lost, but apparently not. Just to make priorities clear: in the short term I will fix bugs where NAs show up inappropriately. I will not fix bugs involving dropping leading or trailing comments when there are simple workarounds. (The workaround in your case is not to use the named chunk.)

Duncan Murdoch
[Rd] Wishlist for plot.lm() (PR#13560)
Full_Name: John Maindonald
Version: R-2.8.1
OS: MacOS X 10.5.6
Submission from: (NULL) (203.173.3.75)

The following code demonstrates an annoyance with plot.lm():

library(DAAGxtras)
x11(width=3.75, height=4)
nihills.lm <- lm(log(time) ~ log(dist) + log(climb), data = nihills)
plot(nihills.lm, which=5)

OR try the following

xy <- data.frame(x=c(3,1:5), y=c(-2, 1:5))
plot(lm(y ~ x, data=xy), which=5)

The "Cook's distance" text overplots the label for the point with the smallest residual. This is an issue when the size of the plot is much less than the default, and the pointsize is not reduced proportionately. I suggest the following:

xx <- hii
xx[xx >= 1] <- NA
## Insert new code
fracht <- (1.25*par()$cin[2])/par()$pin[2]
ylim[1] <- ylim[1] - diff(ylim)*max(0, fracht-0.04)
## End insert new code
plot(xx, rsp, xlim = c(0, max(xx, na.rm = TRUE)), ylim = ylim,
     main = main, xlab = "Leverage", ylab = ylab5, type = "n", ...)

Then, about 15 lines further down, replace

legend("bottomleft", legend = "Cook's distance", lty = 2, col = 2, bty = "n")

by

legend("bottomleft", legend = "Cook's distance", lty = 2, col = 2,
       text.col = 2, bty = "n", y.intersp = 0.5)
# This changes the legend colour to agree with the line colour

Another possibility, suggested by John Fox, is to replace the caption by "Cook's distance contours", and omit the legend entirely. Both John Fox and myself are comfortable with either of these fixes. Test the changes with:

x11()
nihills.lm <- lm(log(time) ~ log(dist) + log(climb), data = nihills)
plot(nihills.lm, which=5)
xy <- data.frame(x=c(3,1:5), y=c(-2, 1:5))
plot(lm(y ~ x, data=xy), which=5)
x11(width=3.75, height=4)
plot(nihills.lm, which=5)
plot(lm(y ~ x, data=xy), which=5)
Re: [Rd] plot.lm: Cook's distance label can overplot point labels
Actually, the contours and the smooth are currently printed with col=2. This prints satisfactorily in grayscale. Colours (orange and darkred as well as col=2) are also used in termplot. Does the stricture against colour extend to grayscale? Does it apply to lines as well as text?

John Maindonald

On 19/02/2009, at 5:58 PM, Prof Brian Ripley wrote:

On Wed, 18 Feb 2009, John Fox wrote:

Dear John,

-----Original Message-----
From: John Maindonald [mailto:john.maindon...@anu.edu.au]
Sent: February-18-09 4:57 PM
To: John Fox
Cc: 'Martin Maechler'; r-devel@r-project.org
Subject: Re: [Rd] plot.lm: Cook's distance label can overplot point labels

Dear John - The title above the graph is also redundant for the first of the plots; do we want to be totally consistent? I am not sure.

Why not? A foolish consistency is the hobgoblin of little minds, but maybe this isn't a foolish consistency.

It occurs to me that the text "Cook's distance", as well as the contours, might be in red. That would provide a nice visual cue (for those who aren't colour blind).

Or using a black-and-white device. We have not hitherto assumed a colour device in 'stats' graphics, and given how often they are printed I don't think we want to start. As so often, it seems that what looks good is in the eye of the beholder. If the two of you can agree on something that you both see is a definite improvement, please provide a patch and examples to try to persuade everyone else. (As a Wishlist item on R-bugs, so it gets recorded.)

Best,
John

Regards
John.

John Maindonald

On 18/02/2009, at 12:27 PM, John Fox wrote:

Dear John,

It occurs to me that the title above the graph, "Residuals vs. Leverage", is entirely redundant since the x-axis is labelled "Leverage" and the y-axis "Studentized residuals". Why not use the title above the graph for Cook's distance contours?

Regards,
John

-----Original Message-----
From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of John Maindonald
Sent: February-17-09 5:54 PM
To: r-devel@r-project.org
Cc: Martin Maechler
Subject: [Rd] plot.lm: Cook's distance label can overplot point labels

The following code demonstrates an annoyance with plot.lm():

library(DAAGxtras)
x11(width=3.75, height=4)
nihills.lm <- lm(log(time) ~ log(dist) + log(climb), data = nihills)
plot(nihills.lm, which=5)

OR try the following

xy <- data.frame(x=c(3,1:5), y=c(-2, 1:5))
plot(lm(y ~ x, data=xy), which=5)

The "Cook's distance" text overplots the label for the point with the smallest residual. This is an issue when the size of the plot is much less than the default, and the pointsize is not reduced proportionately. I suggest the following:

xx <- hii
xx[xx >= 1] <- NA
## Insert new code
fracht <- (1.25*par()$cin[2])/par()$pin[2]
ylim[1] <- ylim[1] - diff(ylim)*max(0, fracht-0.04)
## End insert new code
plot(xx, rsp, xlim = c(0, max(xx, na.rm = TRUE)), ylim = ylim,
     main = main, xlab = "Leverage", ylab = ylab5, type = "n", ...)

Then, about 15 lines further down, replace

legend("bottomleft", legend = "Cook's distance", lty = 2, col = 2, bty = "n")

by

legend("bottomleft", legend = "Cook's distance", lty = 2, col = 2,
       bty = "n", y.intersp = 0.5)

If this second change is not made, then one wants fracht <- (1.5*par()$cin[2])/par()$pin[2]. I prefer the "Cook's distance" text to be a bit closer to the x-axis, as it separates it more clearly from any point labels.

__ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] plot.lm: Cook's distance label can overplot point labels
Dear John - The title above the graph is also redundant for the first of the plots; do we want to be totally consistent? I am not sure. It occurs to me that the text "Cook's distance", as well as the contours, might be in red.

Regards
John.

John Maindonald

On 18/02/2009, at 12:27 PM, John Fox wrote:

Dear John,

It occurs to me that the title above the graph, "Residuals vs. Leverage", is entirely redundant since the x-axis is labelled "Leverage" and the y-axis "Studentized residuals". Why not use the title above the graph for Cook's distance contours?

Regards,
John

-----Original Message-----
From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of John Maindonald
Sent: February-17-09 5:54 PM
To: r-devel@r-project.org
Cc: Martin Maechler
Subject: [Rd] plot.lm: Cook's distance label can overplot point labels

The following code demonstrates an annoyance with plot.lm():

library(DAAGxtras)
x11(width=3.75, height=4)
nihills.lm <- lm(log(time) ~ log(dist) + log(climb), data = nihills)
plot(nihills.lm, which=5)

OR try the following

xy <- data.frame(x=c(3,1:5), y=c(-2, 1:5))
plot(lm(y ~ x, data=xy), which=5)

The "Cook's distance" text overplots the label for the point with the smallest residual. This is an issue when the size of the plot is much less than the default, and the pointsize is not reduced proportionately. I suggest the following:

xx <- hii
xx[xx >= 1] <- NA
## Insert new code
fracht <- (1.25*par()$cin[2])/par()$pin[2]
ylim[1] <- ylim[1] - diff(ylim)*max(0, fracht-0.04)
## End insert new code
plot(xx, rsp, xlim = c(0, max(xx, na.rm = TRUE)), ylim = ylim,
     main = main, xlab = "Leverage", ylab = ylab5, type = "n", ...)

Then, about 15 lines further down, replace

legend("bottomleft", legend = "Cook's distance", lty = 2, col = 2, bty = "n")

by

legend("bottomleft", legend = "Cook's distance", lty = 2, col = 2,
       bty = "n", y.intersp = 0.5)

If this second change is not made, then one wants fracht <- (1.5*par()$cin[2])/par()$pin[2]. I prefer the "Cook's distance" text to be a bit closer to the x-axis, as it separates it more clearly from any point labels.

John Maindonald
[Rd] plot.lm: Cook's distance label can overplot point labels
The following code demonstrates an annoyance with plot.lm():

library(DAAGxtras)
x11(width=3.75, height=4)
nihills.lm <- lm(log(time) ~ log(dist) + log(climb), data = nihills)
plot(nihills.lm, which=5)

OR try the following

xy <- data.frame(x=c(3,1:5), y=c(-2, 1:5))
plot(lm(y ~ x, data=xy), which=5)

The "Cook's distance" text overplots the label for the point with the smallest residual. This is an issue when the size of the plot is much less than the default, and the pointsize is not reduced proportionately. I suggest the following:

xx <- hii
xx[xx >= 1] <- NA
## Insert new code
fracht <- (1.25*par()$cin[2])/par()$pin[2]
ylim[1] <- ylim[1] - diff(ylim)*max(0, fracht-0.04)
## End insert new code
plot(xx, rsp, xlim = c(0, max(xx, na.rm = TRUE)), ylim = ylim,
     main = main, xlab = "Leverage", ylab = ylab5, type = "n", ...)

Then, about 15 lines further down, replace

legend("bottomleft", legend = "Cook's distance", lty = 2, col = 2, bty = "n")

by

legend("bottomleft", legend = "Cook's distance", lty = 2, col = 2,
       bty = "n", y.intersp = 0.5)

If this second change is not made, then one wants fracht <- (1.5*par()$cin[2])/par()$pin[2]. I prefer the "Cook's distance" text to be a bit closer to the x-axis, as it separates it more clearly from any point labels.

John Maindonald
Re: [Rd] R-devel Digest, Vol 62, Issue 24
The columns of the model matrix are all orthogonal. So the problem lies with poly(), not with lm().

x <- rep(1:5, 3)
y <- rnorm(15)
z <- model.matrix(lm(y ~ poly(x, 12)))
round(crossprod(z), 15)

[The printed 13 x 13 cross-product matrix is diagonal: the (Intercept, Intercept) entry is 15, each poly(x, 12)k column has unit length, and all off-diagonal entries are 0.]

John Maindonald

On 24 Apr 2008, at 8:00 PM, [EMAIL PROTECTED] wrote:

From: [EMAIL PROTECTED]
Date: 24 April 2008 3:05:28 AM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: [Rd] poly() can exceed degree k - 1 for k distinct points (PR#11251)

The poly() function can create more variables than can be fitted when there are replicated values. In the example below, 'x' has only 5 distinct values, but I can apparently fit a 12th-degree polynomial with no error messages or even nonzero coefficients:

R> x = rep(1:5,3)
R> y = rnorm(15)
R> lm(y ~ poly(x, 12))

Call:
lm(formula = y ~ poly(x, 12))

Coefficients:
  (Intercept)   poly(x, 12)1   poly(x, 12)2   poly(x, 12)3
     -0.27442        0.35822       -0.26412        2.11780
 poly(x, 12)4   poly(x, 12)5   poly(x, 12)6   poly(x, 12)7
      1.83117       -0.09260       -0.48572        1.94030
 poly(x, 12)8   poly(x, 12)9  poly(x, 12)10  poly(x, 12)11
     -0.88297       -1.04556        0.74289       -0.01422
poly(x, 12)12
     -0.46548

If I try the same with raw=TRUE, only a 4th-degree polynomial is obtained:

R> lm(y ~ poly(x, 12, raw=TRUE))

Call:
lm(formula = y ~ poly(x, 12, raw = TRUE))

Coefficients:
              (Intercept)  poly(x, 12, raw = TRUE)1
                   9.7527                  -22.0971
 poly(x, 12, raw = TRUE)2  poly(x, 12, raw = TRUE)3
                  15.3293                   -4.1005
poly(x
Re: [Rd] R-devel Digest, Vol 62, Issue 24
Actually, this may be a useful feature! It allows calculation of a basis for the orthogonal complement of the space spanned by model.matrix(lm(y ~ poly(x, 12))). However, the default ought surely to be to disallow df > k-1 in poly(x, df), where k = length(unique(x)).

On 24 Apr 2008, at 8:00 PM, [EMAIL PROTECTED] wrote:

From: [EMAIL PROTECTED]
Date: 24 April 2008 3:05:28 AM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: [Rd] poly() can exceed degree k - 1 for k distinct points (PR#11251)

The poly() function can create more variables than can be fitted when there are replicated values. In the example below, 'x' has only 5 distinct values, but I can apparently fit a 12th-degree polynomial with no error messages or even nonzero coefficients: [...]

snip snip

[I thought I submitted this via the website yesterday, but I can find no trace of it. I apologize if this is a duplicate, but I don't think it is.]

--
Russell V. Lenth, Professor
Department of Statistics & Actuarial Science
(319)335-0814  FAX (319)335-3017
The University of Iowa
[EMAIL PROTECTED]
Iowa City, IA 52242 USA
http://www.stat.uiowa.edu/~rlenth/
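The proposed guard can be sketched as a small wrapper. safe_poly() below is a hypothetical function, not part of R; it simply enforces the df > k-1 rule suggested above before delegating to poly().

```r
## Hypothetical wrapper enforcing the suggested check: refuse a degree of
## k or more, where k is the number of distinct x values.
safe_poly <- function(x, degree, ...) {
  k <- length(unique(x))
  if (degree > k - 1)
    stop("'degree' must be less than the number of unique points")
  poly(x, degree, ...)
}

x <- rep(1:5, 3)
z <- safe_poly(x, 4)   # fine: 5 distinct values allow degree up to 4
## safe_poly(x, 12) now fails instead of silently over-fitting
```

(Later versions of R added essentially this check to poly() itself, with a similar error message.)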
[Rd] html help fails for named vector objects (PR#9927)
help(letters, htmlhelp=TRUE) fails. Under the Mac OS X GUI, the message is

  Help for the topic a was not found.

Under the version documented below, and under Windows, the message is

  No documentation for 'a' in specified packages and libraries:

repeated for all the elements of letters, then followed by

  you could try 'help.search("a")'

again repeated for all elements of letters. The outcome seems similar for any character vector (including matrix) object, e.g. the matrix 'primateDNA' in the DAAGbio package. The following have the expected result:

  help("letters", htmlhelp=TRUE)
  help(letters, htmlhelp=FALSE)

The same result is obtained with R-2.5.1.

--please do not edit the information below--
Version:
 platform = i386-apple-darwin8.10.1
 arch = i386
 os = darwin8.10.1
 system = i386, darwin8.10.1
 status = beta
 major = 2
 minor = 6.0
 year = 2007
 month = 09
 day = 22
 svn rev = 42941
 language = R
 version.string = R version 2.6.0 beta (2007-09-22 r42941)

Locale: C

Search Path: .GlobalEnv, package:testpkg, package:stats, package:graphics, package:grDevices, package:utils, package:datasets, package:methods, Autoloads, package:base
Re: [Rd] substitute and expression (Peter Dalgaard)
In this connection, note the following:

  > a4 <- 4
  > plotThis <- bquote(alpha == .(a), list(a=a4))
  > do.call(plot, list(1:10, main=do.call(expression, c(plotThis))))
  > do.call(plot, list(1:10, main=do.call(expression, plotThis)))
  Error in do.call(expression, plotThis) : second argument must be a list

  ## Whereas plotThis has class "call", c(plotThis) has class "list"
  > class(plotThis)
  [1] "call"
  > class(c(plotThis))
  [1] "list"

  ## Thus, the following is possible:
  > do.call(plot, list(1:10, main=do.call(expression, list(plotThis))))

Marc Schwartz pointed out to me, some considerable time ago, that one could use bquote() and .() to create the elements of a list object whose elements can be plotted in parallel as required, e.g., for axis labels, thus:

  plot(1:2, 1:2, xaxt="n")
  arg1 <- bquote(.(x), list(x=1.5))
  arg2 <- bquote("" <= .(x), list(x=1.5))
  axis(1, at=1:2, labels=do.call(expression, list(arg1, arg2)))

For a unified approach to use of do.call(expression, ...), maybe one should use bquote() and .()?

On 18 Jul 2007, at 8:00 PM, [EMAIL PROTECTED] wrote:

From: Peter Dalgaard [EMAIL PROTECTED]
Date: 18 July 2007 1:39:50 AM
To: Deepayan Sarkar [EMAIL PROTECTED]
Cc: R Development Mailing List [EMAIL PROTECTED]
Subject: Re: [Rd] substitute and expression

Deepayan Sarkar wrote:
On 7/16/07, Peter Dalgaard [EMAIL PROTECTED] wrote:
Deepayan Sarkar wrote:

Hi, I'm trying to understand whether the use of substitute() is appropriate/documented for plotmath annotation. The following two calls give the same results:

  plot(1:10, main = expression(alpha == 1))
  do.call(plot, list(1:10, main = expression(alpha == 1)))

But not these two:

  plot(1:10, main = substitute(alpha == a, list(a = 2)))
  do.call(plot, list(1:10, main = substitute(alpha == a, list(a = 2))))
  Error in as.graphicsAnnot(main) : object "alpha" not found

(as a consequence, xyplot(..., main = substitute(alpha)) doesn't currently work.) On the other hand, this works:

  foo <- function(x) plot(1, main = x)
  foo(substitute(alpha))

I'm not sure how to interpret ?plotmath; it says

  If the 'text' argument to one of the text-drawing functions ('text',
  'mtext', 'axis', 'legend') in R is an expression, the argument is
  interpreted as a mathematical expression...

and uses substitute() in its examples, but

  is.expression(substitute(alpha == a, list(a = 1)))
  [1] FALSE

I think you need to take plotmath out of the equation and study the difference between objects of mode "call" and those of mode "expression". Consider this:

  > f <- function(...) match.call()
  > do.call(f, list(1:10, main = substitute(alpha == a, list(a = 2))))
  function(...) match.call()(1:10, main = alpha == 2)
  > do.call(list, list(1:10, main = substitute(alpha == a, list(a = 2))))
  Error in do.call(list, list(1:10, main = substitute(alpha == a, list(a = 2)))) :
    object "alpha" not found

The issue is that the function ends up with an argument alpha == 2 which it proceeds to evaluate (lazily), where a direct call sees substitute(.). It is a general problem with the do.call mechanism that it effectively pre-evaluates the argument list, which can confuse functions that rely on accessing the original argument expression. Try, e.g.,

  do.call(plot, list(airquality$Wind, airquality$Ozone))

and watch the axis labels.

Right. Lazy evaluation was the piece I was missing.

Does it work if you use something like main = substitute(quote(alpha == a), list(a = 2))?

Not for xyplot, though I haven't figured out why. Turns out this also doesn't work:

  plot(y ~ x, data = list(x = 1:10, y = 1:10), main = substitute(alpha))
  Error in as.graphicsAnnot(main) : object "alpha" not found

I'll take this to mean that the fact that substitute() works sometimes (for plotmath) is an undocumented side effect of the implementation that should not be relied upon.

Probably the correct solution is to use expression objects. More or less the entire reason for their existence is this sort of surprise.

  plot(y ~ x, data = list(x = 1:10, y = 1:10),
       main = as.expression(substitute(alpha == a, list(a = 2))))

I'm not going to investigate why this is necessary in connection with plot(), but the core issue is probably

  > e <- quote(f(x)); e[[2]] <- quote(2+2)
  > e
  f(2 + 2)
  > f <- quote(f(2+2))
  > identical(e, f)
  [1] TRUE

Notice that since the two calls are identical, there is no way for e to detect that it was called with x replaced by an object of mode "call". Or put differently, objects of mode "call" tend to lose their personality in connection with computing on the language.

--
   O__  Peter Dalgaard             Øster Farimagsgade 5, Entr.B c
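The call/expression distinction discussed in this thread can be checked in miniature (the variable names below are mine, for illustration): a substitute() or bquote() result is a call, not an expression; as.expression() gives the object that plotmath evaluates safely, and c() on a call yields a plain list suitable for do.call(expression, ...).

```r
## A substitute() result has mode "call", not "expression"
main_call <- substitute(alpha == a, list(a = 2))
main_expr <- as.expression(main_call)

is.call(main_call)        # TRUE
is.expression(main_call)  # FALSE
is.expression(main_expr)  # TRUE
is.list(c(main_call))     # TRUE: c() coerces the call to a list

## plot(1:10, main = main_expr) then annotates with a Greek alpha,
## even when routed through do.call()
```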
Re: [Rd] termplot - changes in defaults
While termplot is under discussion, here's another proposal. I'd like to change the default for partial.resid to TRUE, and for smooth to panel.smooth. I'd be surprised if those changes were to break existing code.

On Mon, 2 Jul 2007, [EMAIL PROTECTED] wrote:

> Precisely. Thanks Brian. I did do something like this but not nearly so
> elegantly. I suggest this become the standard version in the next release.

Yes, that was the intention (to go into R-devel). (It was also my intention to attach as plain text, but my Windows mailer seems to have defeated that.)

> I can't see that it can break any existing code. It's a pity now we can't
> make ylim = "common" the default.

I suspect we could if I allow a way to get the previous behaviour (ylim="free", I think).

Brian

> Regards,
> Bill V.
>
> Bill Venables
> CSIRO Laboratories
> PO Box 120, Cleveland, 4163
> AUSTRALIA
> Office Phone (email preferred): +61 7 3826 7251
> Fax (if absolutely necessary): +61 7 3826 7304
> Mobile: +61 4 8819 4402
> Home Phone: +61 7 3286 7700
> mailto:[EMAIL PROTECTED]
> http://www.cmis.csiro.au/bill.venables/
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> Sent: Monday, 2 July 2007 7:55 PM
> To: Venables, Bill (CMIS, Cleveland)
> Cc: [EMAIL PROTECTED]
> Subject: Re: [Rd] termplot with uniform y-limits
>
> Is the attached the sort of thing you are looking for? It allows ylim to
> be specified, including as "common".
>
> On Mon, 2 Jul 2007, [EMAIL PROTECTED] wrote:
>
>> Does anyone have, or has anyone ever considered making, a version of
>> 'termplot' that allows the user to specify that all plots should have
>> the same y-limits? This seems a natural thing to ask for, as the plots
>> share a y-scale. If you don't have the same y-axes you can easily
>> misread the comparative contributions of the different components.
>>
>> Notes: the current version of termplot does not allow the user to
>> specify ylim. I checked. The plot tools that come with mgcv do this by
>> default. Thanks Simon.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel: +44 1865 272861 (self)
1 South Parks Road,                    +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax: +44 1865 272595
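What a ylim = "common" option amounts to can be sketched outside termplot(): take the range, over all terms, of term contribution plus partial residuals, and reuse it for every panel. trees is a stock dataset; the computation below is illustrative, not termplot()'s actual internals.

```r
## Common y-limits across term plots, computed by hand
fit <- lm(Volume ~ Girth + Height, data = trees)
tms <- predict(fit, type = "terms")     # one column per term
ylim_common <- range(tms + resid(fit))  # partial residuals, all terms pooled

## termplot(fit, partial.resid = TRUE) could then draw each panel with
## ylim = ylim_common, making the term contributions directly comparable
```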
Re: [Rd] Levels attribute in integer columns created by model.frame()
I get

  > worms.glm <- glm(cbind(deaths, (20-deaths)) ~ sex + doselin,
  +                  data=worms, family=binomial)
  > attr(worms.glm, "dataClasses")
  NULL

But maybe the result from somewhere within predict.lm() or model.frame() is different. Surely the "levels" attribute has no relevance to glm's computations with the doselin term. It has treated it as numeric. In my view, either predict() should maintain the stance (pretence?) that it is numeric, or else the call to .checkMFClasses() that follows on the use of glm() should report at least a warning.

On 1 May 2007, at 4:33 PM, Prof Brian Ripley wrote:

Stripping attributes from a column in model.frame would be highly undesirable. The mistake was using 'unclass' when the intention was to remove the levels (I presume). The new variable given is correctly reported as not matching that used during fitting. Use of traceback() would have shown that the error is not reported from model.frame (as claimed) but from

  4: .checkMFClasses(cl, m)
  3: predict.lm(object, newdata, se.fit, scale = 1,
       type = ifelse(type == "link", "response", type),
       terms = terms, na.action = na.action)
  2: predict.glm(worms.glm, new = data.frame(sex = 1, doselin = 6))
  1: predict(worms.glm, new = data.frame(sex = 1, doselin = 6))

The reason the class is reported as "other" is clear from attr(worms.glm, "dataClasses"). This comes from .MFclass.

On Tue, 1 May 2007, John Maindonald wrote:

The following is evidence of what is surely an undesirable feature. The issue is the handling, in calls to model.frame(), of an explanatory variable that has been derived as an unclassed factor. (Ross Darnell drew this to my attention.) He has already filed a bug report on it, without saying what he thinks the bug is.

## Data are slightly modified from p.191 of MASS
> worms <- data.frame(sex=gl(2,6), Dose=factor(rep(2^(0:5),2)),
+                     deaths=c(1,4,9,13,18,20,0,2,6,10,12,16))
> worms$doselin <- unclass(worms$Dose)
> class(worms$doselin)
[1] "integer"
> attributes(worms$doselin)
$levels
[1] "1"  "2"  "4"  "8"  "16" "32"

> worms.glm <- glm(cbind(deaths, (20-deaths)) ~ sex + doselin,
+                  data=worms, family=binomial)
> predict(worms.glm, new=data.frame(sex=1, doselin=6))
Error: variable 'doselin' was fitted with class "other" but class "numeric" was supplied
In addition: Warning message:
variable 'doselin' is not a factor in: model.frame.default(Terms, newdata,
  na.action = na.action, xlev = object$xlevels)

The error is reported in the call to model.frame() from predict.lm() which is called by predict.glm(). It is not clear to me why this call to model.frame identifies the class that should be expected as "other". The problem might be fixed by stripping the levels attribute from any column created by model.frame() that is integer or numeric.

###
## Note the following
> mframe <- model.frame(cbind(deaths, (20-deaths)) ~ sex + doselin,
+                       data=worms)
> class(mframe$doselin)
[1] "integer"
> attributes(mframe$doselin)
$levels
[1] "1"  "2"  "4"  "8"  "16" "32"
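The root of the trouble can be isolated in a few lines: unclass() keeps the "levels" attribute, so the column is the integer factor codes rather than a plain numeric dose. Converting via as.character() gives both the actual dose values and a clean attribute-free numeric column (a common workaround, not a fix to model.frame()).

```r
## unclass() keeps the "levels" attribute; as.numeric(as.character(.))
## yields the true numeric doses with no attributes
Dose <- factor(rep(2^(0:5), 2))
doselin_bad  <- unclass(Dose)                   # integer codes + levels attr
doselin_good <- as.numeric(as.character(Dose))  # plain numeric: 1 2 4 8 16 32 ...

attr(doselin_bad, "levels")   # "1" "2" "4" "8" "16" "32"
attributes(doselin_good)      # NULL
```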
[Rd] read.table() errors with tab as separator (PR#9061)
(1) read.table(), with sep="\t", identifies 13 out of 1400 records, in a file with 1400 records of 3 fields each, as having only 2 fields. This happens under version 2.3.1 for Windows as well as with R 2.3.1 for Mac OS X, and with R-devel under Mac OS X. [R version 2.4.0 Under development (unstable) (2006-07-03 r38478)]

(2) Using read.table() with sep="\t", the first 1569 records only of a 1821 record file are input. The file has exactly two fields in each record, and the minimum length of the second field is 1 character. If however I extract lines 1561 to 1650 from the file (the file short.txt below), all 90 lines are input.

> webtwo <- "http://www.maths.anu.edu.au/~johnm/testfiles/twotabs.txt"
> xy <- read.table(url(webtwo), sep="\t")
Warning message:
number of items read is not a multiple of the number of columns
> z <- count.fields(url(webtwo), sep="\t")
> table(z)
z
   2    3
  13 1387
> table(sapply(strsplit(readLines(url(webtwo)), split="\t"), length))
   3
1400
> readLines(url(webtwo))[z==2][9:13]   # last 5 as a sample (shorter lines)
[1] "865\tlinear model (lm)! Cook's distance\t152"
[2] "1019\tlinear model (lm)! Cook's distance\t177"
[3] "1048\tlinear model (lm)! Cook's distance\t183"
[4] "1082\tlinear model (lm)! Cook's distance\t187"
[5] "1220\tlinear model (lm)! Cook's distance\t214"

> weblong <- "http://www.maths.anu.edu.au/~johnm/testfiles/long.txt"
> webshort <- "http://www.maths.anu.edu.au/~johnm/testfiles/short.txt"
> xyLong <- read.table(url(weblong), sep="\t")
> dim(xyLong)    # Should be 1821 x 2
[1] 1569    2
> xyShort <- read.table(url(webshort), sep="\t")
> dim(xyShort)   # Should be, and will be, 90 x 2
[1] 90  2
> long <- readLines(url(weblong))
> short <- readLines(url(webshort))
> length(long)
[1] 1821
> length(short)
[1] 90
> all(long[1561:1650]==short)   # short is lines 1561:1650 of long
[1] TRUE

## Moreover strsplit() can pick up the \t's correctly
> lsplit <- strsplit(long, "\t")
> table(sapply(lsplit, length))
   2
1821
# Try also table(sapply(lsplit, function(x)x[2]))

--please do not edit the information below--
Version:
 platform = powerpc-apple-darwin8.6.0
 arch = powerpc
 os = darwin8.6.0
 system = powerpc, darwin8.6.0
 status =
 major = 2
 minor = 3.1
 year = 2006
 month = 06
 day = 01
 svn rev = 38247
 language = R
 version.string = Version 2.3.1 (2006-06-01)

Locale: C

Search Path: .GlobalEnv, package:lattice, package:methods, package:stats, package:graphics, package:grDevices, package:utils, package:datasets, Autoloads, package:base
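A plausible culprit, from my reading rather than from the original report: every short-counted sample line contains "Cook's distance", and the apostrophe is a quote character by default in read.table() and count.fields(), so it can swallow a following separator. Disabling quote processing is the standard workaround for such files.

```r
## Reproduce the field-count discrepancy on a local two-line file
## (the sample lines are copied from the report above)
tf <- tempfile()
writeLines(c("865\tlinear model (lm)! Cook's distance\t152",
             "1019\tlinear model (lm)! Cook's distance\t177"), tf)

n_default <- count.fields(tf, sep = "\t")             # apostrophes act as quotes
n_noquote <- count.fields(tf, sep = "\t", quote = "") # quoting disabled

n_noquote   # 3 3: with quote = "", both records parse as 3 fields
unlink(tf)
```

If this diagnosis is right, read.table(..., sep="\t", quote="") would read all 1400 and all 1821 records correctly.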
Re: [Rd] Citation of R packages
On 5 Feb 2006, at 2:27 AM, [EMAIL PROTECTED] wrote:

> On Mon, 30 Jan 2006 10:06:52 +1100 (EST), John Maindonald (JM) wrote:
>> The bibtex citations provided by citation() do not work all that well
>> in cases where there is no printed document to reference:
> That's why there is a warning at the end that they will need manual
> editing ... IMHO they at least save you some typing effort in many cases.

They are certainly a useful start.

>> (1) A version field is needed, as the note field is required for other
>> purposes, currently trying to sort out nuances that cannot be sorted
>> out in the author list (author, compiler, implementor of R version,
>> contributor, ...) and maybe giving a cross-reference to a book or paper
>> that is somehow relevant.
> Why should a reference cross-reference another reference? Could you give
> an example?

Where there is a published paper or a book (such as MASS), or a manual for which a url can be given, my decision was to include that in the main list of references, but not to include references there that were references to the package itself, which as you suggest below can be a reference to the concatenated help pages. It seemed anyway useful to have a separate list of packages. For consistency, these were always references to the package, with a cross-reference to any relevant document in the references to papers.

>> (2) Maybe the author field should be more nuanced, or maybe ...
> author fields of bibtex entries have a strict format (names separated by
> "and"), what do you mean by more nuanced?

Those named in the list of authors may be any combination of: the authors of an R package, the authors of an original S version, the person or persons responsible for an R port, the authors of the Fortran code, compiler(s), and contributors of ideas. For John Fox's car, citation() gives the following:

  author = {John Fox. I am grateful to Douglas Bates and David Firth and
    Michael Friendly and Gregor Gorjanc and Georges Monette and Henric
    Nilsson and Brian Ripley and Sanford Weisberg and and Achim Zeleis
    for various suggestions and contributions.},

For Rcmdr:

  author = {John Fox and with contributions from Michael Ash and Philippe
    Grosjean and Martin Maechler and Dan Putler and and Peter Wolf.},

For car, maybe John Fox should be identified as author. For Rcmdr, maybe the other persons that are named should be added? For leaps:

  author = {Thomas Lumley using Fortran code by Alan Miller},

It seems reasonable to cite Lumley and Miller as authors. Should there be a note that identifies Miller as the contributor of the Fortran code? Should the name(s) of porters (usually from S) be included as author(s)? Or should their contribution be acknowledged in the note field? Or ... Possibilities are to cite all those individuals as author, or to cite John Fox only, with any combination of no additional information in the note field, or using the note field to explain who did what. The citation() function leaves it unclear who are to be acknowledged as authors, and in fact

>> (3) In compiling a list of packages, name order seems preferable, and
>> one wants the title first (achieved by relocating the format.title
>> field in the manual FUNCTION in the .bst file)
>> (4) "manual" seems not an ideal name for the class, if there is no
>> manual.
> A package always has a reference manual, the concatenated help pages
> certainly qualify as such and can be downloaded in PDF format from CRAN.
> The ISBN rules even allow to assign an ISBN number to the online help of
> a software package which also can serve as the ISBN number of the
> *software itself* (which we did for base R).

I'd prefer some consistency in the way that R packages are referenced. Thus, if reference for one package is to the concatenated help pages, do it that way for all of them.

>> Maybe what is needed is a "package" or suchlike class, and several
>> alternative .bst files that handle the needed listings. I know at least
>> one other person who is wrestling with this, and others on this list
>> must be wrestling with it.
> I am certainly open for discussions and any suggestions for improvements,
> but it must be within the standard bibtex entry types, we cannot write
> our own entry types and .bst files. Many journals require the usage of
> their own (or standard) bibtex styles, and the entries we produce must
> work with those. If R creates nonstandard bibtex entries even more
> manual work will be necessary in many cases. I have no definitive bibtex
> reference at hand, but the natbib style files (a very popular collection
> of bibtex styles, at least I definitely want to be compatible with those)
> define
>
>   article, book, booklet, conference (= alias for inproceedings), inbook,
>   incollection, inproceedings, manual, mastersthesis, misc, phdthesis,
>   proceedings, techreport, unpublished
>
> which coincide with the choices the emacs bibtex mode offers. Out of
> these only manual, misc
Re: [Rd] Citation of R packages
Even if a CITATION file is included, there is an issue of what to put in it. Authorship of a book or paper is not always the simple matter that might appear. With an R package, it can be a far from simple matter. We are trying to adapt a tool, surely, that was designed for different purposes.

1. I'd like to see the definition of a new BibTeX entry type that has fields for additional author details and version number. There is surely some mechanism for getting agreement on a new entry type.

2. In any case, there's a message for maintainers of packages to include CITATION files that reflect what they want to appear in any citation, with citation("lattice") as maybe a suitable model?

John.

John Maindonald             email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax : +61 2 (6125)5549
Mathematical Sciences Institute, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

On 11 Feb 2006, at 5:36 AM, [EMAIL PROTECTED] wrote:

> On Fri, 10 Feb 2006 21:01:44 +1100, John Maindonald (JM) wrote:
> [...]
>> Where there is a published paper or a book (such as MASS), or a manual
>> for which a url can be given, my decision was to include that in the
>> main list of references, but not to include references there that were
>> references to the package itself, which as you suggest below can be a
>> reference to the concatenated help pages.
> The CITATION file of a package may contain as many entries as the author
> wants, including both a reference to the help pages and to the book (or
> whatever).
>> It seemed anyway useful to have a separate list of packages. For
>> consistency, these were always references to the package, with a
>> cross-reference to any relevant document in the references to papers.
>>> (2) Maybe the author field should be more nuanced, or maybe ...
>>> author fields of bibtex entries have a strict format (names separated
>>> by "and"), what do you mean by more nuanced?
>> [... examples for car, Rcmdr and leaps snipped; see the earlier
>> message ...] The citation() function leaves it unclear who are to be
>> acknowledged as authors, and in fact
> Umm, the problem there is not the citation() function, but that the
> authors of all those packages obviously have not included a CITATION
> file in their package which overrides the default (extracted from the
> DESCRIPTION file). E.g., package flexclust has DESCRIPTION
>
>   Package: flexclust
>   Version: 0.8-1
>   Date: 2006-01-11
>   Author: Friedrich Leisch, parts based on code by Evgenia Dimitriadou
>
> but
>
>   R> citation("flexclust")
>
>   To cite package flexclust in publications use:
>
>     Friedrich Leisch. A Toolbox for K-Centroids Cluster Analysis.
>     Computational Statistics and Data Analysis, 2006. Accepted for
>     publication.
>
>   A BibTeX entry for LaTeX users is
>
>     @Article{,
>       author  = {Friedrich Leisch},
>       title   = {A Toolbox for K-Centroids Cluster Analysis},
>       journal = {Computational Statistics and Data Analysis},
>       year    = {2006},
>       note    = {Accepted for publication},
>     }
>
> because the CITATION file overrides the DESCRIPTION file. Writing a
> CITATION file is of course also intended for those cases where a proper
> reference cannot be auto-generated from the DESCRIPTION file.
>>> (3) In compiling a list of packages, name order seems preferable, and
>>> one wants the title first (achieved by relocating the format.title
>>> field in the manual FUNCTION in the .bst file)
>>> (4) "manual" seems not an ideal name for the class, if there is no
>>> manual.
> A package always has a reference manual, the concatenated help pages
> certainly qualify
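For maintainers wondering what such a CITATION entry looks like in code, here is a sketch using the current utils::bibentry() API (R's successor to the older citEntry() interface); every field value below is invented for illustration.

```r
## Hypothetical CITATION-style entry; package name, author and version
## are made up
be <- bibentry(
  bibtype = "Manual",
  title   = "examplePkg: An Invented Package Title",
  author  = person("A.", "Author"),
  year    = "2006",
  note    = "R package version 1.0-0"
)
toBibtex(be)   # renders as an @Manual{ ... } BibTeX entry
```

Placing such a call in inst/CITATION lets citation() report exactly what the maintainer wants cited, overriding the DESCRIPTION-derived default discussed above.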
[Rd] Citation of R packages
The BibTeX citations provided by citation() do not work all that well
in cases where there is no printed document to reference:

(1) A version field is needed, as the note field is required for other
purposes: sorting out nuances of contribution that cannot be sorted
out in the author list (author, compiler, implementor of an R version,
contributor, ...), and maybe giving a cross-reference to a book or
paper that is somehow relevant.

(2) Maybe the author field should be more nuanced, or maybe ...

(3) In compiling a list of packages, name order seems preferable, and
one wants the title first (achieved by relocating the format.title
field in the manual FUNCTION in the .bst file).

(4) manual seems not an ideal name for the class, if there is no
manual.  Maybe what is needed is a package or suchlike class, and
several alternative .bst files that handle the needed listings.

I know at least one other person who is wrestling with this, and
others on this list must be wrestling with it too.

John Maindonald            email: [EMAIL PROTECTED]
phone: +61 2 (6125)3473    fax: +61 2 (6125)5549
Mathematical Sciences Institute, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
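[Editorial aside, not part of the original message: for concreteness,
the kind of auto-generated entry being complained about is a @Manual
entry in which the package version lands in the note field.  The
sketch below uses the leaps package mentioned above; the version
number is illustrative, not taken from the original.]

```
@Manual{,
  title  = {leaps: regression subset selection},
  author = {Thomas Lumley using Fortran code by Alan Miller},
  year   = {2006},
  note   = {R package version x.y},
}
```

Because the version can only go in note, any other remark (who
contributed what, a related paper) has to compete for the same field,
which is point (1) above.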
[Rd] Bugs/issues with model.tables() (PR#8275)
unique(predict.lm(bal1.aov, type="terms", se=TRUE)$se)
        trt
1 0.3054198

II

(d) In the interests of brevity (sic!), I will limit attention to means:

bdes <- structure(list(
    trt = structure(as.integer(c(1, 2, 1, 3, 1, 4, 2, 3, 2, 4, 3, 4)),
                    .Label = c("a", "b", "c", "d"), class = "factor"),
    blk = structure(as.integer(c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6)),
                    .Label = c("A", "B", "C", "D", "E", "F"),
                    class = "factor"),
    y = c(0.8, -1.1, 4.5, 3.3, 4.3, 4.9, 0.6, 3.9, 4.6, 9.4, 3.7, 5.7)),
    .Names = c("trt", "blk", "y"),
    row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10",
                  "11", "12"),
    class = "data.frame")

# Crude block means; these are meaningless
tapply(bdes$y, bdes$blk, mean)
    A     B     C     D     E     F
-0.15  3.90  4.60  2.25  7.00  4.70

# Crude treatment means
tapply(bdes$y, bdes$trt, mean)
   a    b    c    d
3.20 1.37 3.63 6.67

## aov fits
bdes.aov <- aov(y ~ blk + trt, data = bdes)           # Blocks first
bdes_trtFirst.aov <- aov(y ~ trt + blk, data = bdes)  # trt first

model.tables(bdes.aov, type="means")[["tables"]][["blk"]]
blk
    A     B     C     D     E     F
-0.15  3.90  4.60  2.25  7.00  4.70
# Observe that these agree with the above crude block means

model.tables(bdes_trtFirst.aov, type="means")[["tables"]][["trt"]]
trt
   a    b    c    d
3.20 1.37 3.63 6.67

## Treatment means, when blocks are taken first
model.tables(bdes.aov, type="means")[["tables"]][["trt"]]
trt
   a    b    c    d
4.13 2.05 3.73 4.95

## Also, note their differences
diff(model.tables(bdes.aov, type="means")[["tables"]][["trt"]])
trt
       b        c        d
   -2.08     1.68 1.216667

## Treatment means, from the usual least squares analysis
dummy.coef(bdes.aov)[["(Intercept)"]]
(Intercept)
     1.4125

dummy.coef(bdes.aov)[["(Intercept)"]] + mean(dummy.coef(bdes.aov)$blk) +
    dummy.coef(bdes.aov)$trt
       a        b        c        d
4.341667 1.216667 3.741667 5.566667

diff(dummy.coef(bdes.aov)$trt)
     b      c      d
-3.125  2.525  1.825

diff(model.tables(bdes.aov, type="means")[["tables"]][["trt"]]) /
    diff(dummy.coef(bdes.aov)$trt)
trt
    b     c     d
0.667 0.667 0.667

# This factor is some simple function of the BIB design parameters,
# which I am too lazy or too busy to work out.
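[Editorial aside, not part of the original report: the constant 0.667
is consistent with the classical efficiency factor of a balanced
incomplete block design, E = lambda * t / (r * k).  A minimal sketch
for this design, which has t = 4 treatments in b = 6 blocks of size
k = 2, each treatment replicated r = 3 times and each pair of
treatments co-occurring lambda = 1 time:]

```r
## Efficiency factor E = lambda * t / (r * k) for the BIB design above.
t <- 4        # treatments
k <- 2        # block size
r <- 3        # replications per treatment
lambda <- 1   # co-occurrences of each treatment pair
lambda * t / (r * k)   # 2/3, matching the 0.667 ratios reported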
--please do not edit the information below--

Version:
 platform = powerpc-apple-darwin7.9.0
 arch = powerpc
 os = darwin7.9.0
 system = powerpc, darwin7.9.0
 status =
 major = 2
 minor = 2.0
 year = 2005
 month = 10
 day = 06
 svn rev = 35749
 language = R

Locale: C

Search Path:
 .GlobalEnv, cuckoohosts, file:~/r/ch2/.RData, file:../.RData,
 package:methods, package:stats, package:graphics, package:grDevices,
 package:utils, package:datasets, Autoloads, package:base

John Maindonald            email: [EMAIL PROTECTED]
phone: +61 2 (6125)3473    fax: +61 2 (6125)5549
Centre for Bioinformation Science, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] plot(lm): new behavior in R-2.2.0 alpha
Martin - Thanks for your efforts in initiating and managing this
discussion.  As for the issue of deprecating the plot.lm() pictures in
the published books, surely this will have great benefits for the
authors.  It will help them to sell the new editions of their books,
which will in due course appear replete with the new plots!

For 2.2.0, I have nothing more to add to the comments others have
made.  I hope we can in due course agree, as a minimum, to put some
version of John Fox's vif(), and something akin to Werner Stahel's
smooths for up to 20 simulated data sets, into 2.3.0.

John Maindonald.

On 18 Sep 2005, at 1:29 AM, Martin Maechler wrote:

>>>>> "Wst" == Werner Stahel <[EMAIL PROTECTED]>
>>>>>     on Fri, 16 Sep 2005 09:37:02 +0200 writes:

 Wst> Dear Martin, dear Johns
 Wst> Thanks for including me into your discussion.
 Wst> I am a strong supporter of "Residuals vs. Hii"

One remaining problem I'd like to address is the balanced AOV
situation, ...

 Wst> In order to keep the plots consistent, I suggest to draw a
 Wst> histogram.  Other alternatives will or can be interesting in the
 Wst> general case and therefore are not a suitable substitute for
 Wst> this plot.

hmm, but all the other 3 default plots have (standardized / sqrt)
residuals on the y-axis.  I'd very much like to keep that for any
fourth plot.  So would we want a horizontal histogram?  And do we
really want a histogram when we've already got the QQ plot?

We need a decent proposal for a 4th plot {instead of R_i vs h_ii, when
the h_ii are constant} REAL SOON NOW, since it's feature freeze on
Monday.  Of course the current state can be declared a bug and still
be fixed, but that was not the intention...

Also, there are now at least 2 book authors among R-core (and more
book authors more generally!) in whose books there are pictures with
the old default 4th plot.  So I'd like to have convincing reasons for
``deprecating'' all the plot.lm() pictures in the published books.
At the moment, I'd still go for R_i vs i or sqrt|R_i| vs i, possibly
with type = 'h', which could be used to check an important kind of
temporal auto-correlation; the latter, because in a 2 x 2 plot
arrangement, it gives the same y-axis as default plot 3.

 Wst> Back to currently available methods:
 Wst> John Maindonald discusses different contours.  I like the
 Wst> implementation I get currently in R-devel: contours of Cook's
 Wst> distances, since they are popular and we can then argue that the
 Wst> plot of D_i vs. i is no longer needed.

What about John's proposal of contour levels different from
c(0.5, 1)?  Note that these *have* been added as arguments to
plot.lm() that a user can modify.

 Wst> For most plots, I like to see a smoother along with the points.
 Wst> I suggest to add the option to include smoothers, not only as an
 Wst> argument to plot.lm, but even as an option().  I have heard of
 Wst> the intense discussions about options().  With Martin, we
 Wst> arrived at the conclusion that options() should never influence
 Wst> calculations and results, but is suitable to adjust outputs
 Wst> (numerical: digits=, graphical: smooth=) to the user's taste.

{and John Fox agreed, `in general'}

That could be a possibility, for 2.2.0 only applied to plot.lm() in
any case, where plot.lm() would get a new argument

  add.smooth = getOption("plot.add.smooth")

What do people think about the name?  It would ``stick with us'', so
we had better choose it well...

(4) Are there other diagnostics that ought to be included in stats?
(Perhaps in a function other than plot.lm(), which risks being
overloaded.)  One strong claimant is vif() (variance inflation
factor), ...

...

 Wst> As we focus on plots, my plot method includes the option
 Wst> (default) to add smooths for 20 simulated datasets (according to
 Wst> the fitted model).

This and others are really nice.  However, not for R 2.2.x in any
case.
I agree that one should rather provide `single-plot' functions and
have plot.lm() just call a few of them, instead of having everything
be part of plot.lm().  There's the slight advantage that you can
guarantee some consistency (e.g. in the definition of standardized
residuals) and save some computations when you have everything in one
function, but consistency should be possible otherwise as well...
Anyway, this is for 2.3.0 or later.

Martin

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
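[Editorial footnote for readers of this archive, not part of the
original thread: the option discussed above appears to have been
adopted in later versions of R under the name "add.smooth" rather
than "plot.add.smooth".  A minimal sketch of how it is used, assuming
a version of R with this option:]

```r
## Sketch: the smoother option to plot.lm(), assuming a version of R
## in which it is named "add.smooth" (not "plot.add.smooth").
fit <- lm(dist ~ speed, data = cars)
getOption("add.smooth")          # the global default for the option
par(mfrow = c(2, 2))
plot(fit)                        # diagnostic panels, with smooths
plot(fit, add.smooth = FALSE)    # suppress the smoother per call
```

Setting options(add.smooth = FALSE) changes the default for all
subsequent plot.lm() calls, which is the options()-based adjustment
of output (not of calculations) advocated in the quoted discussion.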