Re: [Rd] Documentation for sd (stats) + suggestion
> As far as I can tell, the manual help page for ``sd``
>
> ?sd
>
> does not explicitly mention that the formula for the standard deviation is
> the so-called "Bessel-corrected" formula (divide by n-1 rather than n).

See the Details section, where it says: "Like 'var' this uses denominator n - 1."

*** This email and any attachments are confidential. Any use...{{dropped:8}}
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
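For anyone who wants to check the denominator directly, here is a minimal sketch (the data vector is made up for illustration) comparing sd() with the two hand-computed forms:

```r
# Illustrative data only; any numeric vector would do.
x <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(x)

ss <- sum((x - mean(x))^2)      # sum of squared deviations from the mean
sd_n1 <- sqrt(ss / (n - 1))     # Bessel-corrected form (denominator n - 1)
sd_n  <- sqrt(ss / n)           # uncorrected form (denominator n)

# sd() should agree with the n - 1 form, not the n form
all.equal(sd(x), sd_n1)
c(sd = sd(x), n_minus_1 = sd_n1, n = sd_n)
```

If the uncorrected value is wanted, it can be obtained as sd(x) * sqrt((n - 1) / n).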
Re: [Rd] Documentation examples for lm and glm
> From: Thomas Yee [mailto:t@auckland.ac.nz]
>
> Thanks for the discussion. I do feel quite strongly that
> the variables should always be a part of a data frame.

This seems pretty much a decision for R core, and I think it's useful to have raised the issue.

But I, er, feel strongly that strong feelings and 'always' are unsafe in a best practice argument.

First, other folk with different use-cases or work practice may see 'best practice' quite differently. So I would pretty much always expect exceptions.

Second, for examples of capability, there are too many exceptions in this instance. For example:
glm() can take a two-column matrix as a single response variable.
lm() can take a matrix as a response variable.
lm() can take a complete data frame as a predictor (see ?stackloss)

None of these work naturally if everything is in a data frame, and some won’t work at all.

Steve E

*** This email and any attachments are confidential. Any use, copying or disclosure other than by the intended recipient is unauthorised. If you have received this message in error, please notify the sender immediately via +44(0)20 8943 7000 or notify postmas...@lgcgroup.com and delete this message and any copies from your computer and network. LGC Limited. Registered in England 2991879. Registered office: Queens Road, Teddington, Middlesex, TW11 0LY, UK
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
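A rough sketch of the three cases listed above, using made-up numbers for the first two and the built-in stack.loss / stack.x objects for the third. Note that in the datasets package stack.x is stored as a matrix holding the three predictors of the stackloss data frame, so the third call passes a whole matrix as a single term; all other object names here are illustrative.

```r
# 1. glm() with a two-column (successes, failures) matrix as the response
dose <- 1:4
succ <- c(2, 5, 9, 14)
fail <- c(18, 15, 11, 6)
fit_glm <- glm(cbind(succ, fail) ~ dose, family = binomial)

# 2. lm() with a matrix response: one regression per column of Y
Y <- cbind(y1 = c(1.1, 2.0, 2.9, 4.2), y2 = c(0.9, 1.8, 3.1, 3.9))
fit_mlm <- lm(Y ~ dose)
class(fit_mlm)

# 3. lm() with a block of predictors supplied as one object (see ?stackloss)
fit_sl <- lm(stack.loss ~ stack.x)
names(coef(fit_sl))
```

None of these calls uses a data argument; the variables are found in the workspace or the datasets package.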
Re: [Rd] Documentation examples for lm and glm
FWIW, before all the examples are changed to data frame variants, I think there's fairly good reason to have at least _one_ example that does _not_ place variables in a data frame.

The data argument in lm() is optional. And there is more than one way to manage data in a project. I personally don't much like lots of stray variables lurking about, but if those are the only variables out there and we can be sure they aren't affected by other code, it's hardly essential to create a data frame to hold something you already have.

Also, attach() is still part of R, for those folk who have a data frame but want to reference the contents across a wider range of functions without using with() a lot. lm() can reasonably omit the data argument there, too.

So while there are good reasons to use data frames, there are also good reasons to provide examples that don't.

Steve Ellison

> -----Original Message-----
> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Ben Bolker
> Sent: 13 December 2018 20:36
> To: r-devel@r-project.org
> Subject: Re: [Rd] Documentation examples for lm and glm
>
> Agree. Or just create the data frame with those variables in it
> directly ...
>
> On 2018-12-13 3:26 p.m., Thomas Yee wrote:
> > Hello,
> >
> > something that has been on my mind for a decade or two has
> > been the examples for lm() and glm(). They encourage poor style
> > because of mismanagement of data frames. Also, having the
> > variables in a data frame means that predict()
> > is more likely to work properly.
> >
> > For lm(), the variables should be put into a data frame.
> > As 2 vectors are assigned first in the general workspace they
> > should be deleted afterwards.
> >
> > For the glm(), the data frame d.AD is constructed but not used. Also,
> > its 3 components were assigned first in the general workspace, so they
> > float around dangerously afterwards like in the lm() example.
> >
> > Rather than attached improved .Rd files here, they are put at
> > www.stat.auckland.ac.nz/~yee/Rdfiles
> > You are welcome to use them!
> >
> > Best,
> >
> > Thomas
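To make the two styles in this exchange concrete, here is a small sketch with simulated data (all names and values are illustrative): the same model fitted with and without the data argument, plus the predict() call that the data frame form makes straightforward.

```r
set.seed(1)
d <- data.frame(x = 1:10)
d$y <- 2 + 3 * d$x + rnorm(10)

# Data frame style: variables are looked up in 'd'
fit_df <- lm(y ~ x, data = d)
pred <- predict(fit_df, newdata = data.frame(x = 11:12))

# Workspace style: 'data' omitted, variables found in the calling environment
x <- d$x
y <- d$y
fit_ws <- lm(y ~ x)

# with() gives the same fit without leaving stray copies in the workspace
fit_with <- with(d, lm(y ~ x))

all.equal(coef(fit_df), coef(fit_ws))
```

The fitted coefficients agree either way; the difference lies in where the variables live and in how easily new data can be supplied later.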
Re: [Rd] Unexpected argument-matching when some are missing
> Yes, I think all of that is correct. But y _is_ missing in this sense:
>
> > plot(1:10, y=)
> > ...
> Browse[2]> missing(y)

Although I said what I meant by 'missing' vs 'not present', it wasn't exactly what missing() means. My bad.

missing() returns TRUE if an argument is not specified in the call _whether or not_ it has a default, hence the behaviour of missing(y) in debug(plot).

But we can easily find out whether a default has been assigned:

plot(1:10, y=, type=)
Browse[2]> y
NULL
Browse[2]> type
"p"

... which is consistent with silent omission of 'y=' and 'type='.

Still waiting for a guru...

Steve E
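The point about missing() and defaults can be checked without the debugger. A minimal sketch with a toy function (report_args is a made-up name, loosely mimicking the y = NULL and type = "p" defaults of plot.default):

```r
# Toy function: reports whether 'y' was supplied and what value it ends up with
report_args <- function(x, y = NULL, type = "p") {
  list(y_missing = missing(y), y_value = y, type_value = type)
}

omitted  <- report_args(1:10)          # y not given in the call
supplied <- report_args(1:10, y = 2)   # y given explicitly

omitted$y_missing    # missing() is TRUE even though y has a default
omitted$y_value      # ... and the default (NULL) is what the body sees
supplied$y_missing
```

So missing(y) being TRUE and y evaluating to its default are not in conflict.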
Re: [Rd] Unexpected argument-matching when some are missing
> > plot(x=1:10, y=)
> > plot(x=1:10, y=, 10:1)
> >
> > In both cases, 'y=' is ignored. In the first, the plot is for y=NULL (so not
> > 'missing' y)
> > In the second case, 10:1 is positionally matched to y despite the
> > intervening 'missing' 'y='
> >
> > So it isn't just 'missing'; it's 'not there at all'
>
> What exactly is the difference between "missing" and "not there at all"?

A "missing argument" in R means that an argument with no default value was omitted from the call, and that is what I meant by "missing". But that is not what is happening here. I was talking about "y=" apparently being treated as not present in the call, rather than the argument y being treated as a missing argument.

In these examples, plot.default has a default value for y (NULL) so y can never be "missing" in the sense of the 'missing argument' error (compare what happens with plot(y=1:10), which reports x as 'missing'). In the first example, y was (from the plot behaviour) taken as NULL - the default - so was not considered a missing argument. In the second, it was taken as 10:1 - again, non-missing, despite 10:1 being in the normal position for the (character) argument "type".

But neither call did anything at all with "y=". Instead, the behaviour is consistent with what would have happened if 'y=' were "not present at all" when counting position or named argument list, rather than if 'y' were an absent required argument. It _looks_ as if the initial call parsing silently ignored the malformed expression "y=" before any argument matching - positional or by name - takes place.

But I'm thinking that it'll take an R-core guru to explain what's going on here, so I was going to wait and see.

Steve Ellison
Re: [Rd] Unexpected argument-matching when some are missing
> When trying out some variations with `[.data.frame` I noticed some (to me)
> odd behaviour,

Not just in 'myfun' ...

plot(x=1:10, y=)
plot(x=1:10, y=, 10:1)

In both cases, 'y=' is ignored. In the first, the plot is for y=NULL (so not 'missing' y).
In the second case, 10:1 is positionally matched to y despite the intervening 'missing' 'y='.

So it isn't just 'missing'; it's 'not there at all'

Steve E

> -----Original Message-----
> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Emil Bode
> Sent: 29 November 2018 10:09
> To: r-devel@r-project.org
> Subject: [Rd] Unexpected argument-matching when some are missing
>
> When trying out some variations with `[.data.frame` I noticed some (to me)
> odd behaviour, which I found out has nothing to do with `[.data.frame`, but
> rather with the way arguments are matched, when mixing named/unnamed
> and missing/non-missing arguments. Consider the following example:
>
> myfun <- function(x,y,z) {
>   print(match.call())
>   cat('x=',if(missing(x)) 'missing' else x, '\n')
>   cat('y=',if(missing(y)) 'missing' else y, '\n')
>   cat('z=',if(missing(z)) 'missing' else z, '\n')
> }
> myfun(x=, y=, "z's value")
>
> gives:
>
> # myfun(x = "z's value")
> # x= z's value
> # y= missing
> # z= missing
>
> This seems very counterintuitive to me, I expect the arguments x and y to be
> missing, and z to get “z’s value”.
>
> When I call myfun(,y=,"z's value"), x is missing, and y gets “z’s value”.
>
> Are my expectations wrong or is this a bug? And if my expectations are
> wrong, where can I find more information on argument-matching?
>
> My gut-feeling says to call this a bug, but then I’m surprised no-one else has
> encountered it before.
>
> And I don’t have multiple installations to work from, so could somebody else
> confirm this (if it’s not my expectations that are wrong) for R-devel/other
> R-versions/other platforms?
>
> My setup: R 3.5.1, MacOS 10.13.6, both Rstudio 1.1.453 and R --vanilla from
> Bash
>
> Best regards,
>
> Emil Bode
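As background to the question of where argument matching is described: the R Language Definition's section on argument matching sets out three passes (exact name matching, partial name matching, then positional matching of what is left). The sketch below illustrates only those ordinary rules with fully supplied arguments; it deliberately does not try to reproduce the empty 'x=' case, and the function names are made up.

```r
# match.call() shows how supplied arguments were matched to formals
show_match <- function(x, y, z) match.call()

# 'z' is matched by name first; 2 and 3 then fill x and y by position
m1 <- show_match(z = 1, 2, 3)

show_partial <- function(alpha, beta) match.call()

# 'be' partially matches 'beta'; the unnamed 2 goes to 'alpha'
m2 <- show_partial(be = 1, 2)

deparse(m1)
deparse(m2)
```

Comparing match.call() output like this against a call containing empty named arguments is one way to see how a given R version treats them.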
Re: [Rd] invisible functions
> 2. change cfun[[1]] <- quote(cord.work) to cfun[[1]] <-
> quote(survival:::cord.work). You say this will mess up your test bed.
> That suggests that your test bed is broken. This is a perfectly legal
> and valid solution.

Valid in a package, but it forces code to call the loaded library version of a function rather than (say) a 'source'd user-space version that is under development. Being non-specific (i.e. omitting foo:::) means that test code would pick up the development version in the current user environment by default. That's handy for small mods.

S Ellison
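A small sketch of the difference, using made-up names rather than the survival internals: replacing the head of a call with a bare symbol lets evaluation pick up whatever definition is visible in the workspace, whereas a namespace-qualified head pins the call to one package's version.

```r
# 'work' stands in for a development copy that has been source()d
work <- function(x) "development version from the workspace"

cl <- quote(placeholder(1:3))   # a call whose head will be swapped

# Unqualified symbol: resolved by normal lookup, so the workspace copy is found
cl[[1]] <- quote(work)
dev_result <- eval(cl)

# Qualified name: always the named package's function, here base::sum
cl[[1]] <- quote(base::sum)
pinned_result <- eval(cl)

dev_result
pinned_result
```

The same trade-off applies with ':::' for unexported functions: it is robust inside a package, but it bypasses any development copy in the user's environment.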
Re: [Rd] RFC: make as.difftime more consistent or convenient
> Thank you for your comments! But, what you wrote is known. What do you
> want to express with regard to my questions?
>
> I wrote:
> > … there is no appropriate format ...,
> > although "weeks" is a legitimate unit of 'difftime':
> >
> > > as.difftime("12 w", "%...")
> >
> > > as.difftime("12 weeks")
> > Time difference of 12 weeks
> >
> > 1. What do you think about making the behavior of 'as.difftime' more
> > consistent by accepting also formats for "days" and "weeks"?

as.difftime calls strptime to apply the format argument. If I wanted to extend the range of formats as.difftime accepts, I'd leave as.difftime alone and look at how strptime could be extended to cover the formats you envisage.

But... I wouldn’t do that either. R's strptime is essentially a call to an .Internal function and very likely reliant on established C code for the already very flexible standard C function strptime, which it clearly mirrors intentionally. That usually makes things dangerous to tinker with in the short term and hard to maintain in the long term.

So if you want to do something that will readily convert all combinations of things like '12 w', '12W', '12wks', '3m 2d', '1wk 2d', '18d' etc, write that as a stand-alone routine that converts those 'simple' formats directly to difftime objects and call it something like 'strpdifftime', which would allow it to be added (if it's wanted a lot) with minimal impact to existing code.

S Ellison
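For illustration only, a stand-alone routine of that kind might look something like the sketch below. The name strpdifftime is the hypothetical one suggested above, not an existing function, and this version is an assumption-laden toy: it handles only day and week units (keyed on the first letter of the unit) and returns NA for anything else, including the ambiguous 'm'.

```r
# Hypothetical helper, not part of base R: converts strings such as
# "12 w", "12W", "1wk 2d" or "18d" into a difftime measured in days.
strpdifftime <- function(x) {
  days_per_unit <- c(d = 1, w = 7)   # only days and weeks are handled here

  to_days <- function(s) {
    # pull out every "<number><optional space><letters>" chunk
    chunks <- regmatches(s, gregexpr("[0-9.]+[[:space:]]*[[:alpha:]]+", s))[[1]]
    if (length(chunks) == 0L) return(NA_real_)
    value <- as.numeric(sub("[[:space:]]*[[:alpha:]]+$", "", chunks))
    unit  <- tolower(substr(sub("^[0-9.]+[[:space:]]*", "", chunks), 1, 1))
    sum(value * days_per_unit[unit])   # unknown units give NA
  }

  as.difftime(vapply(x, to_days, numeric(1), USE.NAMES = FALSE), units = "days")
}

strpdifftime(c("12 w", "12W", "1wk 2d", "18d"))
```

Because it builds the difftime directly with as.difftime(<numeric>, units = "days"), nothing in strptime or as.difftime itself needs to change.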
[Rd] Wayback and related questions (was: RE: I have corrected a dead link ...)
Appreciated that this is something of a 'private discussion in the open', but the issues here seem to be relevant to almost any website cited as a reference. As such, package authors may find themselves falling foul of some policy we haven't heard of. So ...

> There may be one small problem: IIUC, the wayback machine is a
> +- private endeavor and really great and phantastic but it does
> need (US? tax deductible) donations, https://archive.org/donate/, to
> continue thriving.
> This makes me hesitate a bit to link to it within the "base R"
> documentation.

Why, exactly? The donors have paid for the site to be available with minimal restrictions precisely so that people can use it. Were there terms of use that prevent you?

Also, on GitHub's GNU ethical repository rating... I _can_ see it as reasonable for sites aspiring to be GNU projects to subscribe to the principles Stallman aspires to; but I cannot see it as sensible for them to refuse to reference sites that do not wish to make the same claims. If the R project cannot use or reference any site that uses non-open code, including minified javascript - which appears to be the principal issue for GitHub - I suspect that you will be obliged to discontinue links to almost every journal, university, charity, government and research establishment site currently in existence as soon as GNU get round to assessing them. I personally have great difficulty seeing that as sensible. But that's a personal opinion.

If these really are serious issues, somebody needs to work up a consistent policy for R projects; otherwise we'll all be walking on eggshells.

S Ellison
Re: [Rd] side-effect of calling functions via `::`
> -----Original Message-----
> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Martin Maechler
> ...
> >>>>> Lionel Henry <lio...@rstudio.com>
> > A package should probably never register a S3 method unless it owns
> > either the generic or the class.
>
> I agree... (and typically it does "own" the class)

If that is true and a good general guide, is it worth adding something to that effect to 1.5.2 of "Writing R extensions"?
At present, nothing in 1.5.2 requires or recommends that a package using S3method owns either class or generic.

S Ellison
Re: [Rd] ifelse() woes ... can we agree on a ifelse2() ?
> Just stating, in 'ifelse', 'test' is not recycled. As I said in "R-intro:
> length of 'ifelse' result"
> (https://stat.ethz.ch/pipermail/r-devel/2016-September/073136.html),
> ifelse(condition, a, b)
> returns a vector of the length of 'condition', even if 'a' or 'b' is longer.

That is indeed (almost) the documented behaviour. The documented behaviour is slightly more complex; '... returns a value _of the same shape_ as 'test''. In principle, test can be a matrix, for example.

> A concrete version of 'ifelse2' that starts the result from 'yes':
> .. still a bit disappointed that nobody has taken a look ...

I took a look. The idea leaves (at least) me very uneasy. If you are recycling 'test' as well as arbitrary-length yes and no, results will become frighteningly hard to predict except in very simple cases where you have well-defined and consistent regularities in the data. And where you do, surely passing ifelse a vector of the right length, generated by rep() applied to a short 'test' vector, will do what you want without messing around with new functions that hide what you're doing.

Do you really have a case where 'test' is neither a single logical (that could be used with 'if') nor a vector that can be readily replicated to the desired length with 'rep'?

If not, I'd drop the attempt to generate new ifelse-like functions.

S Ellison
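A short sketch of the behaviour described above, with made-up vectors: the result takes its length (and shape) from 'test', and rep() on a short 'test' gives the recycled result explicitly.

```r
test <- c(TRUE, FALSE)
yes  <- 1:4
no   <- 11:14

# Result has the length of 'test'; the extra elements of yes/no are unused
short <- ifelse(test, yes, no)

# Making the recycling of 'test' explicit with rep()
full <- ifelse(rep(test, length.out = length(yes)), yes, no)

# 'test' may be a matrix; the result keeps its shape
m <- ifelse(matrix(c(TRUE, FALSE, FALSE, TRUE), nrow = 2), "a", "b")

short
full
dim(m)
```

The rep() call states the intended recycling in plain sight, which is the point made above about not hiding it inside a new function.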
Re: [R-pkg-devel] Mis-spelled word with winbuilder
> I have googled 'piecewise' and it seems to be
> widely used with that spelling.

... because that is the only correct spelling of 'piecewise' in English...

But I see Duncan Murdoch has explained why it was flagged; not all English words appear in all English dictionaries.

In fact, not all English words appear in _any_ English dictionary. Even the OED does not claim to have much more than two thirds (175k out of at least 250k) (https://en.oxforddictionaries.com/explore/how-many-words-are-there-in-the-english-language ) and that is not even a fifth of the Global Language Monitor estimate (http://www.languagemonitor.com/number-of-words/number-of-words-in-the-english-language-1008879/).

Steve E

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
Re: [Rd] row names of 'rowsum()'
> 'rowsum()' seems to add row names to the resulting matrix, corresponding to
> the respective 'group' values. This is very handy, but it is not documented.
> Should the documentation mention it so it could be relied upon as part of API?

If you're referring to base::rowSums, the 'value' section of the help page says
"A numeric or complex array of suitable size, or a vector if the result is one-dimensional. For the first four functions the 'dimnames' (or 'names' for a vector result) are taken from the original array."

S Ellison
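Since rowsum() and rowSums() are easy to confuse, here is a small sketch of both with made-up data. The row names on the rowsum() result are the observed behaviour the question refers to; whether a given version of the help page documents it should be checked in ?rowsum.

```r
# rowsum(): sums within groups; rows of the result are labelled by group
rs <- rowsum(c(1, 2, 3), group = c("a", "b", "a"))
rownames(rs)
rs

# rowSums(): sums across columns; names come from the input's dimnames
m <- matrix(1:4, nrow = 2, dimnames = list(c("r1", "r2"), NULL))
rsums <- rowSums(m)
rsums
```

The quoted 'value' text above describes the second case (rowSums and friends), not the grouped sums from rowsum().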
[R-pkg-devel] Is S3 class registration essential for CRAN?
A short question: How necessary is explicit S3 class registration for CRAN submission?

R-forge's check is giving me a note on this for a handful of methods in a package, and I'm unclear whether it is something that _needs_ to be fixed for CRAN submission by registering all S3 methods, needs to be fixed for _some_, or doesn't need to be changed (the present version on CRAN is running without apparent issues, but was submitted before the checks expanded to pick this up).

S Ellison
Re: [Rd] authorship and citation
> > I read the CRAN policies twice, and there
> > is no official guideline on how to compile the citation.

The policies are about copyright and IP, not credited authorship. There's overlap but they are not the same thing.

You can see whether someone is a copyright holder by referring to the license you had and whether there is any of their content remaining. But that might not mean they are package 'authors'. If you reuse code verbatim from another package's function, you _must_ note the copyright - but that does not necessarily make the original author of the code a co-author of your package (though I would expect to see at least an acknowledgement in the particular function's help page). And not all 'authors' need necessarily provide code - they could, for example, have developed the core maths the code implements. Of itself, that does not confer copyright in any part of the package code or help text, but it's very likely they'd deserve credit as co-authors.

Common sense would suggest to me that if you are in doubt about whether someone should be on your author list (as opposed to copyright owner list) in a package's citation, you should probably ask them. And if you are considering removing an author, you should very definitely be in doubt because there was a reason they were there.

The answers you get from different contributors might be different so it would not be surprising if packages differed in the extent to which they cited contributions or added acknowledgements. In essence, though, if everybody feels fairly treated by the citations within a package, there's no reason for anyone else to complain about it, and if someone feels they have not been properly credited they can - and should - contact the maintainer and say so.

So ask before removing someone from your citation. If they say 'no', don’t remove them.

S Ellison
Re: [Rd] authorship and citation
> > Is there a point at which the original developer should not stay on the
> > author list?
>
> Authorship is not just about code. For example, there are functions in R
> which have been completely recoded, but the design and documentation remain.
> Copyright can apply to designs and there is shading between
> inspiration and infringement. And many of us believe that inspiration
> should be credited as a moral even if not a legal obligation.

"Once an author on CRAN, always an author on CRAN"* doesn't sound a bad maxim to work to.

*This statement should be cited** as
S L R Ellison, C L Dodgson, 'RE: [Rd] authorship and citation'. r-devel mailing list, 2015
with Acknowledgements: B Ripley, R-Core, S Urbanek for helpful remarks

**following its own precepts :)
Re: [Rd] authorship and citation
> The former co-author contributed, so he is still author and probably copyright
> holder and has to be listed among the authors, otherwise it would be a CRAN
> policy violation ...

It's a bit of a philosophical question right now, but at some point in a developing package's life - particularly one that starts small but is subsequently refactored in growth - there may be no code left that was contributed by the original developer.

Is there a point at which the original developer should not stay on the author list?

S Ellison
Re: [Rd] 'vapply' not returning list element names when returned element is a length-1 list
-----Original Message-----
From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Dean Attali

If I have a function that returns a named list with 2 (or more) elements, then using 'vapply' retains the names of the elements:
But if the function only returns one element, then the name foo is lost

vapply _always simplifies_ according to the documentation.

In the first case (function return value contains more than one element), vapply simplifies to a matrix of two lists (!). The names foo and hello have been added to the dimnames so you can tell which is which.

In the second case the function return value is a single list and not a matrix of lists (a simple list is simpler than a matrix of lists). The name of the list ('foo') has nowhere to go; instead, you would be assigning the list to a named variable and you don't need the name 'foo'.

Whether that is inconsistent is something of a matter of perspective. Simplification applied as far as possible will always depend on what simplification is possible for the particular return values, so different return values provide different behaviour.

S Ellison
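The original example code did not survive in this archive, so the following is a reconstruction under assumptions (made-up functions and FUN.VALUE templates) of the two cases being contrasted:

```r
# Two-element list per call: the result is a 2 x 3 list-matrix whose row names
# come from the names of the returned list
two <- vapply(1:3, function(i) list(foo = i, hello = i * 2),
              FUN.VALUE = list(0, 0))
dim(two)
rownames(two)

# One-element list per call: the result is a plain list of length 3,
# and there is no dimension left to carry the name 'foo'
one <- vapply(1:3, function(i) list(foo = i), FUN.VALUE = list(0))
names(one)
length(one)
```

With length(FUN.VALUE) == 1 the only names vapply can attach are those derived from X, which here has none.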
Re: [Rd] No error when assigning values to an empty vector/matrix/array
Also note that these warnings or errors are complaining that the number of items to replace (left length) is not a multiple of replacement length (right length). This suggests that when the left length is a multiple of the right length, everything is fine. And this is actually the case when the left length is 0. Because 0 is a multiple of anything. So in that case, the right value is truncated to length 0 and no warning is issued. Makes sense to me.

Thanks Hervé, you gave the perfect explanation/rationale for this being consistent.

This explains why a check for exact multiple of replacement length does not trigger a warning, but surely that is not sensible in the length 0 case. In all other cases, this check warns when there will be truncation of the replacement, and that seems to me the sensible intent of the check. A silent truncation to nothing is surely not the intended behaviour.

I can't help feeling that the 'check for multiple of length' was a neat portmanteau check for several possible problems when recycling is allowed, but that the possibility of assigning to a length 0 object was not considered.

I'd suggest logging it as an issue for R-core to at least look at and either to fix or to at least warn of in documentation.

S Ellison
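A minimal sketch of the behaviour under discussion, assuming it is unchanged in the R version at hand (this is the behaviour reported in the thread, so it is worth re-checking on a current release):

```r
# Length-0 target: 0 items to replace, so the 'multiple of length' check passes
x <- numeric(0)
x[] <- 1:3
len_after <- length(x)     # the replacement values are silently dropped

# Non-empty target with a non-multiple replacement length: a warning is raised
y <- 1:3
msg <- tryCatch({ y[] <- 1:2; "no warning" },
                warning = function(w) conditionMessage(w))

0 %% 3 == 0                # 0 counts as a multiple of any length
len_after
msg
```

The tryCatch() call is only there to capture the warning text for comparison with the silent length-0 case.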
Re: [Rd] Advice on package design for handling of dots in a formula
This seems to be a common approach in other packages. However, one of my testers noted that if he put formula=y~. then w, ID, and site showed up in the model where they weren't supposed to be.

This is the documented behaviour for '.' in a formula - it means 'everything else in the data object'.

Without changing your current code, though, your user could have said something like y~.-w-ID-site if they wanted to specify 'everything _except_ the subtracted terms', so it's not as bad as having no shortcuts at all.

If you want to do the work for them, one (probably crude) way of doing it could use drop.terms() in combination with some work with the term labels:

#A function that drops the terms in two later arguments from the terms in the
#first and returns the resulting trimmed terms object.
f <- function(form, dropthis, dropthattoo, data) {
    #needs data to expand '.'
    everything <- attr(terms(form, data=data), "term.labels")
    #could probably do without 'data' here
    drops <- c(attr(terms(dropthis, data=data), "term.labels"),
               attr(terms(dropthattoo, data=data), "term.labels"))
    excludes <- which(everything %in% drops)
    terms(form, data=data)[-excludes]
}

d <- data.frame(a=1:10, b=10:1, g=gl(5,2), g2=gl(2,5), y=rnorm(10))
f(y~., ~g, ~b, data=d)

#This returns a terms object, but there's a formula in that if you want it
formula(f(y~., ~g, ~b, data=d))

You'll need to be careful about evaluating that though; don't forget to give any relevant model or model matrix functions the environment (data frame) to go with it or you'll get nonsense.

S
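For comparison, the user-side route mentioned above (subtracting the unwanted variables from '.') can be checked directly. A small sketch using an illustrative data frame, with g and b standing in for the variables to be excluded:

```r
d <- data.frame(a = 1:10, b = 10:1, g = gl(5, 2), g2 = gl(2, 5), y = rnorm(10))

# '.' expands to every column of 'd' other than the response;
# '- g - b' then removes those two terms again
labs <- attr(terms(y ~ . - g - b, data = d), "term.labels")
labs

# The same formula can go straight to a modelling function
fit <- lm(y ~ . - g - b, data = d)
names(coef(fit))
```

This leaves the choice with the user; the drop-terms helper above is the version that makes the exclusion automatic.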
Re: [Rd] [RFC] A case for freezing CRAN
If we could all agree on a particular set of cran packages to be used with a certain release of R, then it doesn't matter how the 'snapshotting' gets implemented.

This is pretty much the sticking point, though. I see no practical way of reaching that agreement without the kind of decision authority (and effort) that Linux distro maintainers put in to the internal consistency of each distribution.

CRAN doesn't try to do that; it's just a place to access packages offered by maintainers.

As a package maintainer, I think support for critical version dependencies in the imports or dependency lists is a good idea that individual package maintainers could relatively easily manage, but I think freezing CRAN as a whole or adopting single release cycles for CRAN would be thoroughly impractical.

S Ellison
[Rd] Maintainer NOTE in R CMD Check
Using R 3.0.3 and Rtools 31, I now see a Note (reproduced on my R-forge checks) of the form

* checking CRAN incoming feasibility ... NOTE
Maintainer: 'firstname A B lastname '

where A and B are middle initials. I can change to a 'firstname lastname' form or 'INITS lastname' form and that removes the above Note*, but I then get a Note warning of maintainer change.

Is either Note going to get in the way of CRAN submission? (And if one of them will, which one?)

S Ellison

*A minor aside: I couldn't find any documented reason for that, or indeed any restriction on the format of a maintainer's name other than 'name first, email second in <>'; perhaps I missed something there?
[Rd] Possible tweak to R intro - was RE: [R] Subseting a data.frame -
Transferred from R-help: From: S Ellison Subsetting using subset() is perhaps the most natural way of subsetting data frames; perhaps a line or two and an example could usefully be included in the 'Working with data frames' section of the R Intro? From: Bert Gunter [mailto:gunter.ber...@gene.com] The R Intro Manual was largely or entirely the work of Bill Venables some years ago. So it is not really a part of R's maintained document system and has thus not been kept up to date with changes like the convenience function, subset(), which is basically a wrapper for [] . This is not to say that your suggestion is not worthwhile, only to explain why it probably won't be acted upon. It's trivial enough that I could offer a 3-line patch if someone has time and inclination to add it... S Ellison *** This email and any attachments are confidential. Any use, copying or disclosure other than by the intended recipient is unauthorised. If you have received this message in error, please notify the sender immediately via +44(0)20 8943 7000 or notify postmas...@lgcgroup.com and delete this message and any copies from your computer and network. LGC Limited. Registered in England 2991879. Registered office: Queens Road, Teddington, Middlesex, TW11 0LY, UK __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Q-Q plot scaling in plot.lm(); bug or thinko?
I've been looking fairly carefully at the Q-Q plots produced by plot.lm() and am having difficulty understanding why plot.lm() is doing what it's doing, specifically scaling the standardized residuals by the prior weights. Can anyone explain this to me ... ? Because with ideal choice of prior weights the scaled residuals are expected to be IID Normal (under the normality assumption for a linear model) and without scaling they aren't IID, so a Q-Q plot would be meaningless without scaling. S Ellison *** This email and any attachments are confidential. Any use...{{dropped:8}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
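The effect of scaling by the prior weights can be sketched with simulated data. This is an illustration under assumed conditions (error standard deviation proportional to x, so weights of 1/x^2 are the ideal choice), not a reproduction of what plot.lm() does internally, which also standardizes for leverage.

```r
set.seed(42)
n <- 200
x <- runif(n, 1, 10)
# Illustrative model: error SD proportional to x, so the ideal prior weights are 1/x^2
y <- 1 + 2 * x + rnorm(n, sd = x)
w <- 1 / x^2

fit <- lm(y ~ x, weights = w)

raw    <- residuals(fit)            # unweighted: spread grows with x
scaled <- weighted.residuals(fit)   # sqrt(w) * raw: roughly constant spread

# The scaling is exactly a multiplication by sqrt(weights)
same <- isTRUE(all.equal(unname(scaled), unname(sqrt(w) * raw)))
print(same)

# Compare the spread in the lower and upper halves of x
lower <- x < median(x)
print(c(raw_ratio    = sd(raw[!lower]) / sd(raw[lower]),
        scaled_ratio = sd(scaled[!lower]) / sd(scaled[lower])))
```

A Q-Q plot of `raw` would mix residuals with different variances, whereas `scaled` is at least approximately identically distributed under these assumptions.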
Re: [Rd] Is there an automatic method for updating an existing CRAN package from R-forge? - resolved
> | Is there a mechanism on R-forge for updating an existing CRAN package, analogous to the 'submit to cran' link on the R-forge package page,
>
> I prefer to do this 'by hand' but my understanding is that 'submit to cran' is equivalent to 'push this current version to CRAN' --- and not to 'make one initial upload' as you seem to read it. So I say go for it.

I had; it was when it failed that I posted the question. However, you've encouraged me to try again and that helped; the second attempt indicates that I had misunderstood the error message. It wasn't indicating a clash with an existing package name but a clash with an existing version number.

Probable explanation: I have failed to update my DESCRIPTION file.

Apologies for wasting folks' time. Again.

S Ellison
[Rd] Is there an automatic method for updating an existing CRAN package from R-forge?
I have a package on CRAN and now have a modest update that's passing build checks on R-forge. Is there a mechanism on R-forge for updating an existing CRAN package, analogous to the 'submit to cran' link on the R-forge package page, or should I just follow the instructions at http://cran.r-project.org/web/packages/policies.html for FTP upload? S Ellison *** This email and any attachments are confidential. Any use...{{dropped:8}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Milestone: 4000 packages on CRAN
-Original Message- | There is a bug somewhere if another count gives 4001. | | Hint: available.packages() has filters, so you are not seeing the | Windows-only packages. . 'OS_type' exclude packages whose OS requirement is incompatible with this version of R: that is exclude Windows-only packages on a Unix-alike platform and _vice versa_. | Note too that the number of packages on CRAN is not monotone: | packages get added, packages get withdrawn/archived, sometimes 10s at a time. ... from which we deduce* that even the combination of the world's best statisticians and a system entirely under their control does not guarantee an unambiguous count. Anyone out there still think statistics are easy? Even so, 4000 plus or minus a few says a great deal for the R project's impact S Ellison *with limited accuracy but some entertainment *** This email and any attachments are confidential. Any use...{{dropped:8}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Specifying a function as not being and S3 Class function
Is it possible to keep from triggering the following warning when I check the package? summary: function(object, ...) summary.agriculture: function(x, analyte.names, results.col, analyte.col, by, det.col, [clip] Part of the solution is to add ... to the legacy function; that is required by the generic and is missing in your own function. Adding ... will not break existing code. The name of the initial argument will still cause problems. But I've kludged round a similar issue (an intentional difference in required parameters, in my case) by replacing something like obj.summary(x, y, z, ...) with something like obj.summary(object, y, z, x=object, ...) This preserves legacy argument order, is consistent with summary(object, ...) and retains the named argument x to avoid code changes. But it is clearly a kludge. It also runs the risk of accidental overwriting of x if someone specifies too many unnamed parameters. That should not happen in working legacy code, of course, as that would have broken if you included a surplus parameter in a function call with no ... . If it _is_ a problem you could try obj.summary(object, y, z, ..., x=object), which would avoid the accidental assignment by requiring exact match naming, but I cannot recall offhand if that construct would be considered consistent with the generic using the current CMD check. Steve Ellison -Original Message- From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of Matt Pocernich Sent: 24 April 2012 16:34 To: r-devel@r-project.org Subject: [Rd] Specifying a function as not being and S3 Class function I am compiling a library with legacy code which has functions named with periods in the names - but are not S3 class functions.For example for example, summary.agriculture is not an extension of the summary function for and 'agriculture. class object - it is just poorly named. Is it possible to keep from triggering the following warning when I check the package? 
* checking S3 generic/method consistency ... WARNING summary: function(object, ...) summary.agriculture: function(x, analyte.names, results.col, analyte.col, by, det.col, iQuantiles, iDetStats, iSW, iUCL, iLand, conf.level, iUTL, tol.level, utl.conf.level, iND, sig.figs) I know that the best answer would be to rename with a better naming convention, but that would cause issues with legacy applications. Thanks, Matt __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel *** This email and any attachments are confidential. Any use...{{dropped:8}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
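The argument-renaming kludge described in the reply can be sketched as follows. The class name `widget` and the function body are hypothetical; only the signature pattern (first argument `object`, a trailing `...`, and `x = object` kept for legacy callers) comes from the post.

```r
# Hypothetical legacy function, originally defined as summary.widget(x, digits)
summary.widget <- function(object, digits = 2, x = object, ...) {
    # The body keeps using 'x', so existing code inside the function is unchanged
    round(mean(unclass(x)), digits)
}

w <- structure(c(1.2, 2.3, 3.5), class = "widget")

via_generic  <- summary(w)               # dispatch supplies 'object'; 'x' defaults to it
legacy_named <- summary.widget(x = w)    # an old call naming x still works
legacy_pos   <- summary.widget(w, 0)     # an old positional call still works

print(c(via_generic, legacy_named, legacy_pos))
```

As the post warns, a call with one unnamed argument too many, such as `summary.widget(w, 2, other)`, would silently bind the surplus value to `x`.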
Re: [Rd] Unexpected email address change - again...
Further apologies to the list, but emails are still not getting to folk. Duncan, you should have had a diff from me yesterday - if not, they've fouled it up again... -- View this message in context: http://r.789695.n4.nabble.com/Unexpected-email-address-change-and-maybe-a-missing-manual-patch-tp3621766p3638311.html Sent from the R devel mailing list archive at Nabble.com. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] median and data frames
This seems trivially fixable using something like

median.data.frame <- function(x, na.rm=FALSE) {
    sapply(x, function(y, na.rm=FALSE) if(is.factor(y)) NA else median(y, na.rm=na.rm), na.rm=na.rm)
}

Paul Johnson pauljoh...@gmail.com 28/04/2011 06:20
On Wed, Apr 27, 2011 at 12:44 PM, Patrick Burns pbu...@pburns.seanet.com wrote:
> Here are some data frames:
>
> df3.2 <- data.frame(1:3, 7:9)
> df4.2 <- data.frame(1:4, 7:10)
> df3.3 <- data.frame(1:3, 7:9, 10:12)
> df4.3 <- data.frame(1:4, 7:10, 10:13)
> df3.4 <- data.frame(1:3, 7:9, 10:12, 15:17)
> df4.4 <- data.frame(1:4, 7:10, 10:13, 15:18)
>
> Now here are some commands and their answers:
>
> median(df4.4)
> [1] 8.5 11.5
> median(df3.2[c(1,2,3),])
> [1] 2 8
> median(df3.2[c(1,3,2),])
> [1] 2 NA
> Warning message:
> In mean.default(X[[2L]], ...) : argument is not numeric or logical: returning NA
>
> The sessionInfo is below, but it looks to me like the present behavior started in 2.10.0.
>
> Sometimes it gets the right answer. I'd be grateful to hear how it does that -- I can't figure it out.

Hello, Pat. Nice poetry there! I think I have an actual answer, as opposed to the usual crap I spew.

I would agree if you said median.data.frame ought to be written to work columnwise, similar to mean.data.frame.

apply and sapply always give the correct answer

apply(df3.3, 2, median)
  X1.3   X7.9 X10.12
     2      8     11
apply(df3.2, 2, median)
X1.3 X7.9
   2    8
apply(df3.2[c(1,3,2),], 2, median)
X1.3 X7.9
   2    8

mean.data.frame is now implemented as

mean.data.frame <- function(x, ...) sapply(x, mean, ...)

I think we would suggest this for medians:

median.data.frame <- function(x,...) sapply(x, median, ...)

It works, see:

median.data.frame(df3.2[c(1,3,2),])
X1.3 X7.9
   2    8

Would our next step be to enter that somewhere in R bugzilla? (I'm not joking--I'm that naive).

I think I can explain why the current median works intermittently in those cases you mention. Give it a small set of pre-sorted data, all is well.
median.default uses a sort function, and it is confused when it is given a data.frame object rather than just a vector. I put a browser() at the top of median.default

median(df3.2[c(1,3,2),])
Called from: median.default(df3.2[c(1, 3, 2), ])
Browse[1]> n
debug at tmp#4: if (is.factor(x)) stop("need numeric data")
Browse[2]> n
debug at tmp#4: NULL
Browse[2]> n
debug at tmp#6: if (length(names(x))) names(x) <- NULL
Browse[2]> n
debug at tmp#6: names(x) <- NULL
Browse[2]> n
debug at tmp#8: if (na.rm) x <- x[!is.na(x)] else if (any(is.na(x))) return(x[FALSE][NA])
Browse[2]> n
debug at tmp#8: if (any(is.na(x))) return(x[FALSE][NA])
Browse[2]> n
debug at tmp#8: NULL
Browse[2]> n
debug at tmp#12: n <- length(x)
Browse[2]> n
debug at tmp#13: if (n == 0L) return(x[FALSE][NA])
Browse[2]> n
debug at tmp#13: NULL
Browse[2]> n
debug at tmp#15: half <- (n + 1L)%/%2L
Browse[2]> n
debug at tmp#16: if (n%%2L == 1L) sort(x, partial = half)[half] else mean(sort(x, partial = half + 0L:1L)[half + 0L:1L])
Browse[2]> n
debug at tmp#16: mean(sort(x, partial = half + 0L:1L)[half + 0L:1L])
Browse[2]> n
[1] 2 NA
Warning message:
In mean.default(X[[2L]], ...) : argument is not numeric or logical: returning NA

Note the sort there in step 16. I think that's what is killing us. If you are lucky and give it a small data frame that is in order, like df3.2, the sort doesn't produce gibberish. When I get to that point, I will show you the sort's effect.

First, the case that works. I moved the browser() down, because I got tired of looking at the same old not-yet-erroneous output.
median(df3.2) Called from: median.default(df3.2) Browse[1] n debug at tmp#15: half - (n + 1L)%/%2L Browse[2] n debug at tmp#16: if (n%%2L == 1L) sort(x, partial = half)[half] else mean(sort(x, partial = half + 0L:1L)[half + 0L:1L]) Browse[2] n debug at tmp#16: mean(sort(x, partial = half + 0L:1L)[half + 0L:1L]) Interactively, type Browse[2] sort(x, partial = half + 0L:1L) NA NA NA NA NA NA 1 1 7 NULL NULL NULL NULL 2 2 8 NA NA NA NA 3 3 9 NA NA NA NA Warning message: In format.data.frame(x, digits = digits, na.encode = FALSE) : corrupt data frame: columns will be truncated or padded with NAs But it still gives you a right answer: Browse[2] n [1] 2 8 But if you give it data out of order, the second column turns to NA, and that causes doom. median(df3.2[c(1,3,2),]) Called from: median.default(df3.2[c(1, 3, 2), ]) Browse[1] n debug at tmp#15: half - (n + 1L)%/%2L Browse[2] n debug at tmp#16: if (n%%2L == 1L) sort(x, partial = half)[half] else mean(sort(x, partial = half + 0L:1L)[half + 0L:1L]) Browse[2] n debug at tmp#16: mean(sort(x, partial = half + 0L:1L)[half + 0L:1L]) Interactively: Browse[2] sort(x, partial = half + 0L:1L) NA NA NA NA NA NA 1 1 NULL 7 NULL NULL NULL 3 3 NA 9 NA NA NA 2 2 NA 8 NA NA NA Warning message: In format.data.frame(x, digits = digits, na.encode = FALSE) : corrupt data frame:
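The column-wise method proposed at the top of this thread can be tried on the data frames from the original report. This is a sketch along the lines of the suggestion in the thread, not the method as it may exist in any R release; defining it in the workspace is enough for `median()` to dispatch to it.

```r
# Sketch of a column-wise method for data frames; factor columns yield NA
median.data.frame <- function(x, na.rm = FALSE, ...) {
    sapply(x, function(col) {
        if (is.factor(col)) NA else median(col, na.rm = na.rm)
    })
}

df3.2 <- data.frame(1:3, 7:9)

in_order <- median(df3.2)
shuffled <- median(df3.2[c(1, 3, 2), ])   # row order no longer matters
print(in_order)
print(shuffled)

# A data frame with a missing value and a factor column
mixed <- median(data.frame(v = c(4, 1, NA, 3),
                           f = factor(c("a", "b", "a", "b"))),
                na.rm = TRUE)
print(mixed)
```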
[Rd] Windows build not running on r-forge
Please forgive any mis-post, and do feel free to point me to a more appropriate list if this isn't properly R-dev. I have a package on R-forge that shows correct linux and other *nix builds, but no windows build. The log for the patched version shows the error below, which appears to be due to a lack of /src files, a problem that does not halt the *nix builds. The package contains no compiled code (src is intentionally empty). Log as follows: * installing to library 'R:/lib/R/CRAN/2.12' * installing *source* package 'metRology' ... ** libs *** arch - i386 Error in file.copy(Sys.glob(src/*), ss, recursive = TRUE) : no files to copy from * removing 'R:/lib/R/CRAN/2.12/metRology' Run time: 1.27 seconds. Advice would be welcome on what I can do about it...? Steve Ellison *** This email and any attachments are confidential. Any use...{{dropped:8}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Windows build not running on r-forge
Thanks for the advice; blindingly simple. I was foxed by the fact that the R-forge linux 'builds' ran successfully with warning but not error. Regret that I couldn't provide R version detail, but the problem was the build at the r-forge end and not my own installed version of R; I could only tell you what the R-forge log told me. Steve Ellison Prof Brian Ripley rip...@stats.ox.ac.uk 28/03/2011 05:32 On Mon, 28 Mar 2011, S Ellison wrote: Please forgive any mis-post, and do feel free to point me to a more appropriate list if this isn't properly R-dev. I have a package on R-forge that shows correct linux and other *nix builds, but no windows build. The log for the patched version shows the error below, which appears to be due to a lack of /src files, a problem that does not halt the *nix builds. The package contains no compiled code (src is intentionally empty). Log as follows: * installing to library 'R:/lib/R/CRAN/2.12' * installing *source* package 'metRology' ... ** libs *** arch - i386 Error in file.copy(Sys.glob(src/*), ss, recursive = TRUE) : no files to copy from * removing 'R:/lib/R/CRAN/2.12/metRology' Run time: 1.27 seconds. Advice would be welcome on what I can do about it...? Don't have an empty 'src' directory: that is not a valid source package. And please do tell us the version of R as per the posting guide. I am guessing 2.12.2 here. Steve Ellison *** This email and any attachments are confidential. Any\ ...{{dropped:24}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Why is there no c.factor?
c() should have been put on the deprecated list a couple of decades ago Don't you dare! Back to reality phew! had me worried there. c() is no problem at all for lists, Dates and most simple vector types; why deprecate something solely because it doesn't behave for something it doesn't claim to work on? Steve E *** This email and any attachments are confidential. Any use...{{dropped:8}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Inconsistency in as.data.frame.table for stringsAsFactors
Some of us (including me) have strongly argued on several occasions that global options() settings should *not* have an effect on anything computing ... ... Global options are less of a problem where a function allows them to be overridden by the user or programmer. If something is affected by a global option, a programmer desiring consistent behaviour then has a simple recourse - set it explicitly in the call. In other words, the programmer should be able to enforce that principle of functional programming; an observation which I would offer as a wider imperative than the language having to do so. That is, I believe that 'it should always be possible for a user to set any parameter used by a function' irrespective of the existence or otherwise of global options. (I believe in programmer choice in these things). In a sense, too, a global option expresses a user choice for a set of operations. There are often good reasons for this; for example, factor contrasts. It has drastic effects on the model computed, but there seems good reason for the convenience of allowing a user to set contrasts for a series of related analyses rather than setting them individually on each model call. The (small number of pretty much trivial) defaults that have, over the last few years, given me a temporary headache are not globals or argument defaults; they have been hardwired defaults that couldn't be changed in the function calls. Rewriting the function is almost always possible, but not quite a straightforward method of overriding a default! To an extent, the same can be said of global options that affect a function, are user-settable but can't be overridden in the call itself. The main 'culprits' of this tend to be older graphics calls, which often respect par() options but don't all take all the par() options in '...' . But in general I too concur; there shouldn't be global options without pretty good reason. 
Which isn't to say I don't think that you're right - I would hate for R to head in the direction of PHP PHP is indeed an example to stay away from; it changes the nature of data without allowing a test on the stored data to reveal the fact. By contrast, stringsAsFactors produces a _detectable_ effect on the data; we can tell what form the data has now, irrespective of system settings now or (worse) on original input. Steve Ellison *** This email and any attachments are confidential. Any use...{{dropped:8}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
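The 'set it explicitly in the call' recourse can be sketched with factor contrasts, the example used in this post. The data are made up for illustration; the point is only that an explicit `contrasts` argument wins over the global option.

```r
# Made-up data: one three-level factor and a numeric response
d <- data.frame(g = gl(3, 4, labels = c("a", "b", "c")),
                y = c(1, 2, 3, 4, 3, 4, 5, 6, 7, 8, 9, 10))

# A user sets a global preference for a series of analyses ...
old <- options(contrasts = c("contr.sum", "contr.poly"))

fit_global   <- lm(y ~ g, data = d)    # follows the global option
fit_explicit <- lm(y ~ g, data = d,    # the programmer overrides it in the call
                   contrasts = list(g = "contr.treatment"))

options(old)                           # restore the previous setting

print(names(coef(fit_global)))
print(names(coef(fit_explicit)))
```

The two fits give different coefficients but the same fitted values, so a programmer who needs a particular parameterisation gets it regardless of what the user has set.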
Re: [Rd] Package (PR#13475)
I had the same normalizePath error recently on a new laptop, with a fresh install of R 2.8.1 and an attempt to install lme4. First attempt: package 'Matrix' successfully unpacked and MD5 sums checked Error in normalizePath(path) : path[1]: The system cannot find the file specified Second attempt: package 'Matrix' successfully unpacked and MD5 sums checked package 'mlmRev' successfully unpacked and MD5 sums checked package 'MEMSS' successfully unpacked and MD5 sums checked package 'lme4' successfully unpacked and MD5 sums checked Error in normalizePath(path) : path[1]: The system cannot find the file specified The irreproducibility made me wonder... so I turned off Norton's auto-protect, which has a habit of scanning files on the fly when requested and that often delays file opening. The error disappeared, at least that once and for subsequent installations of NADA and the much larger rggobi install. The main reason for logging this post is to suggest a posible cause and workround. But if it does turn out to be a consistent issue, perhaps it would be worth checking for timeout issues related to normalizePath or related routines in a future update? S Duncan Murdoch-2 wrote: On 1/27/2009 10:15 AM, partho_bhowm...@ml.com wrote: Full_Name: Partho Bhowmick Version: 2.8.1 OS: Windows XP Submission from: (NULL) (199.43.48.131) While trying to install package sn (I have tried multiple mirrors), I get the following message trying URL 'http://www.revolution-computing.com/cran/bin/windows/contrib/2.8/sn_0.4-10.zip' Content type 'application/zip' length 320643 bytes (313 Kb) opened URL downloaded 313 Kb package 'sn' successfully unpacked and MD5 sums checked Error in normalizePath(path) : path[1]: The system cannot find the file specified It works for me. I suspect it's a permission problem or something similar on your system. 
Duncan Murdoch __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel -- View this message in context: http://www.nabble.com/Package-%28PR-13475%29-tp21690164p22987300.html Sent from the R devel mailing list archive at Nabble.com. __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] Is text(..., adj) upside down? (Or am I?)
?text says

'adj' allows _adj_ustment of the text with respect to '(x,y)'. Values of 0, 0.5, and 1 specify left/bottom, middle and right/top, respectively.

But it looks like 0, 1 specify top, bottom respectively in the y direction.

plot(1:4)
text(2,2, "adj=c(0,0)", adj=c(0,0))
text(2,2, "adj=c(0,1)", adj=c(0,1), col=2)
#the red one's below the black one...

#x-adj is OK
text(3,3, "adj=c(0,0)", adj=c(0,0))
text(3,3, "adj=c(1,0)", adj=c(1,0), col=2)

[I am using R 2.7.1 in windows; adj behaviour is consistent in 2.6.0ff and for expressions as well as text]

Perhaps a two-word correction to ?text ?

Steve Ellison
Lab of the Government Chemist
UK
Re: [Rd] Is text(..., adj) upside down? (Or am I?)
Yup; you're all right - it IS consistent (and I'd even checked the x-adj and it did what I expected!!). It's just that ?text is talking about the position of the 'anchor' point in the text region rather than the subsequent location of the centre of the text. Anyway; if anyone is considering a minor tweak to ?text, would it be clearer if it said Values of 0, 0.5, and 1 specify text towards right/top, middle and left/bottom of x,y, respectively. ? (or, of course, Values of 0, 0.5, and 1 specify x,y at left/bottom, middle and right/top of text, respectively.) Steve Ellison Lab of the Government Chemist UK *** This email and any attachments are confidential. Any use...{{dropped:8}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] savePlot() no longer automatically adds an extension to the filename.
Michael Prager [EMAIL PROTECTED] 06/04/08 4:28 AM
> There is much to be said for consistency (across platforms and functions) and stability (across versions) in software.

I could not agree more. But while consistency is an excellent reason for making the patch consistent across platforms, it doesn't help decide what the consistent behaviour should be. It is as strong an argument for retaining the original behaviour at 2.6.x as for keeping a change.

As to filename extensions or their lack; in practice, RGui.exe adds a default Rdata extension on saving a workspace via the GUI, and so does saving history, so we already have overwriteable default extensions for those purposes even though the command line behaviour doesn't do that. And it genuinely does help, which is my main motivation here.

The main advantage of the present patch is that it delivers _most_ of the optional behaviours missing from both 2.6.2 and 2.7.0 without adding an extra parameter. But it isn't the best that could be done. If there is a platform-independent way of keeping the default extension-by-type and allowing the option of removing it entirely (in other words, an extra parameter to savePlot) I'd personally see that as a better way forward than the present patch.

An alternative might be to keep the 2.7.0 savePlot and define a new convenience function (SavePlot, say, or save.plot for extra dottiness) which provides the wrapper. It would have impact on legacy code, but it is a lot easier to change a function name throughout legacy code than to change supplied parameters.

In the meantime, consistency arguments would indicate that the present patch should be applied to all platforms if at all.

Steve E
Re: [Rd] savePlot() no longer automatically adds an extension to the filename.
Apologies; in my last post (below) I should have distinguished between action on the current patch and suggested action for the future.

I bow to Duncan on the fate of the current patch, as he's managing it.

The suggestions of an extra savePlot parameter or a wrapper function with extension default were suggestions for possible future tidying-up in subsequent releases.

Steve E

S Ellison [EMAIL PROTECTED] 06/04/08 12:44 PM
> Michael Prager [EMAIL PROTECTED] 06/04/08 4:28 AM
> > There is much to be said for consistency (across platforms and functions) and stability (across versions) in software.
>
> I could not agree more. But while consistency is an excellent reason for making the patch consistent across platforms, it doesn't help decide what the consistent behaviour should be. It is as strong an argument for retaining the original behaviour at 2.6.x as for keeping a change.
>
> As to filename extensions or their lack; in practice, RGui.exe adds a default Rdata extension on saving a workspace via the GUI, and so does saving history, so we already have overwriteable default extensions for those purposes even though the command line behaviour doesn't do that. And it genuinely does help, which is my main motivation here.
>
> The main advantage of the present patch is that it delivers _most_ of the optional behaviours missing from both 2.6.2 and 2.7.0 without adding an extra parameter. But it isn't the best that could be done. If there is a platform-independent way of keeping the default extension-by-type and allowing the option of removing it entirely (in other words, an extra parameter to savePlot) I'd personally see that as a better way forward than the present patch.
>
> An alternative might be to keep the 2.7.0 savePlot and define a new convenience function (SavePlot, say, or save.plot for extra dottiness) which provides the wrapper. It would have impact on legacy code, but it is a lot easier to change a function name throughout legacy code than to change supplied parameters.
>
> In the meantime, consistency arguments would indicate that the present patch should be applied to all platforms if at all.
>
> Steve E
Re: [Rd] Request: Documentation of formulae
Rather than transport quantities of the Introduction to R (a perfectly sensible title for a very good starting point, IMHO) would it not be simpler and involve less maintenance to include a link or cross-reference in the 'formula' help page to the relevant part of the Introduction? If nothing else, that might get folk to look at the Introduction if they've bypassed it Steve E Martin Maechler [EMAIL PROTECTED] 03/06/2008 07:48:46 MP == Mike Prager [EMAIL PROTECTED] MP Mike Prager [EMAIL PROTECTED] wrote: I was at a loss to understand the use of / until I looked in An Introduction [!] to R, where I found the explanation. My request is that more complete material on model formulae be lifted from Introduction to R (or elsewhere) and put into the main online help files. *** This email and any attachments are confidential. Any use...{{dropped:8}} __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] savePlot() no longer automatically adds an extension to the filename.
Plaintive squeak: Why the change?

Some OS's and desktops use the extension, so forgetting it causes trouble. The new default filename keeps a filetype (as before) but the user now has to type a filetype twice (once as the type, once as extension) to get the same effect for their own filenames. And the extension isn't then checked for consistency with valid file types, so it can be mistyped and saved with no warning. Hard to see the advantage of doing away with it...

Suggestion: Revert to the previous default (extension as type) and include an 'extension' in the parameter list so that folk who don't want it can change it and folk who did want it get it automatically. The code would then look something like

savePlot <- function (filename = "Rplot",
    type = c("wmf", "emf", "png", "jpg", "jpeg", "bmp", "tif", "tiff", "ps", "eps", "pdf"),
    device = dev.cur(), restoreConsole = TRUE,
    extension)   #Added extension
{
    type <- match.arg(type)
    if(missing(extension)) extension <- type   ##added
    devlist <- dev.list()
    devcur <- match(device, devlist, NA)
    if (is.na(devcur)) stop("no such device")
    devname <- names(devlist)[devcur]
    if (devname != "windows") stop("can only copy from 'windows' devices")
    if (filename == "clipboard" && type == "wmf") fullname <- ""
    else fullname <- paste(filename, extension, sep=ifelse(extension=="","",".") )   ##added
    invisible(.External(CsavePlot, device, fullname, type, restoreConsole))   ##Modded
}

Steve E

PS Yes, I took a while to upgrade from 2.6.x. Otherwise I'd have squeaked the day I upgraded - like I just did - 'cos I use savePlot a LOT.
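The file-naming part of that suggestion does not depend on the Windows device, so it can be sketched and tried on any platform. The helper name `plot_file_name` is hypothetical and is not part of grDevices; it only mirrors the 'extension defaults to type, but can be changed or switched off' idea.

```r
# Hypothetical helper: build the output file name from a base name and a plot type.
# 'extension' defaults to the type; pass "" to suppress the extension entirely.
plot_file_name <- function(filename = "Rplot",
                           type = c("wmf", "emf", "png", "jpg", "jpeg", "bmp",
                                    "tif", "tiff", "ps", "eps", "pdf"),
                           extension = type) {
    type <- match.arg(type)   # also rejects mistyped file types
    if (identical(extension, "")) filename else paste(filename, extension, sep = ".")
}

default_name <- plot_file_name()                              # extension follows type
png_name     <- plot_file_name("fig1", "png")
bare_name    <- plot_file_name("fig1", "png", extension = "") # no extension at all
jpg_name     <- plot_file_name("fig1", "jpeg", extension = "jpg")
print(c(default_name, png_name, bare_name, jpg_name))
```

Because the default for `extension` is evaluated lazily, it picks up the value of `type` after `match.arg()` has resolved it.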
Re: [Rd] formula(CO2)
CO2 is apparently a groupedData object; the formula attribute is described by Pinheiro and Bates as a 'display formula'. Perhaps reference to the nlme package's groupedData help would be informative? Gabor Grothendieck [EMAIL PROTECTED] 16/07/2007 16:18:37 Yes. That's what I was referring to. On 7/16/07, Ted Harding [EMAIL PROTECTED] wrote: On 16-Jul-07 14:42:19, Gabor Grothendieck wrote: Note that the formula uptake ~. will do the same thing so its not clear how useful this facility really is. Hmmm... Do you mean somthing like lm(uptake ~ . , data=CO2[,i]) where i is a subset of (1:4) as in my code below? In which case I agree! Ted. On 7/16/07, Ted Harding [EMAIL PROTECTED] wrote: On 16-Jul-07 14:16:10, Gabor Grothendieck wrote: Following up on your comments it seems formula.data.frame just creates a formula whose lhs is the first column name and whose rhs is made up of the remaining column names. It ignores the formula attribute. In fact, CO2 does have a formula attribute but its not extracted by formula.data.frame: [EMAIL PROTECTED] uptake ~ conc | Plant formula(CO2) Plant ~ Type + Treatment + conc + uptake Indeed! And, following up yet again on my own follow-up comment: library(combinat) for(j in (1:4)){ for(i in combn((1:4),j,simplify=FALSE)){ print(formula(CO2[,c(5,i)])) } } uptake ~ Plant uptake ~ Type uptake ~ Treatment uptake ~ conc uptake ~ Plant + Type uptake ~ Plant + Treatment uptake ~ Plant + conc uptake ~ Type + Treatment uptake ~ Type + conc uptake ~ Treatment + conc uptake ~ Plant + Type + Treatment uptake ~ Plant + Type + conc uptake ~ Plant + Treatment + conc uptake ~ Type + Treatment + conc uptake ~ Plant + Type + Treatment + conc opening the door to automated fitting of all possible models (without interactions)! Now if only I could find out how to do the interactions as well, I would never need to think again! best wishes, Ted. 
E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 16-Jul-07 Time: 15:40:36 -- XFMail --
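The same list of formulas can be built as character strings rather than by relying on formula.data.frame, which also gives one possible answer to the interactions question. This is a sketch only: all_subset_formulas() is a hypothetical name, the two_way switch is an assumption made for illustration, and combn() here is the one in base R's utils package rather than the combinat version used in the thread.

```r
# Hypothetical helper: one formula per non-empty subset of the predictors.
all_subset_formulas <- function(response, predictors, two_way = FALSE) {
  # Every subset of size 1, 2, ..., length(predictors)
  subsets <- unlist(
    lapply(seq_along(predictors),
           function(k) combn(predictors, k, simplify = FALSE)),
    recursive = FALSE
  )
  lapply(subsets, function(s) {
    rhs <- paste(s, collapse = " + ")
    # (a + b)^2 expands to the main effects plus all two-way interactions
    if (two_way && length(s) > 1) rhs <- paste0("(", rhs, ")^2")
    as.formula(paste(response, "~", rhs))
  })
}

forms <- all_subset_formulas("uptake", c("Plant", "Type", "Treatment", "conc"))
length(forms)                       # one formula per non-empty subset
fit <- lm(forms[[4]], data = CO2)   # fit one of them: uptake ~ conc
```

Whether fitting every such model is wise is a separate question, as the thread itself hints.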
Re: [Rd] boxplot and bxp do not respect xlim by default (PR#9754)
Brian,

> Note that ?bxp quite carefully says which graphical pars it does and does not accept, and 'xlim' is one it does not accept.

In my version at the time, bxp did not list which plot parameters it does not accept. xlim was simply not mentioned at all. I can't easily see lack of a mention as _good_ documentation of lack of acceptance when other unmentioned parameters ARE accepted and when plot.default is very clear on xlim being an allowed parameter by default. But that's pretty much academic if one adds xlim support in bxp.

> Steve's suggestion is not good if the boxes differ in size or if at = c(0, 10:15) [or if unordered at]

Certainly true (and I was kicking myself only shortly after posting it for exactly the same reason!). But the default for xlim stays much simpler if Brian's next point is accepted:

> I should have added that some code assumes the current default for xlim even when 'at' is specified, including the last example in boxplot.

An important point. That says one should not change the default xlim to adjust for arbitrary at. Fortunately, that makes life easier: keeping the present xlim defaults while allowing user-specified xlim is near-trivial compared to implementing a general at-specific xlim default.

> ...and when there is a log x axis (and there the previous default is also inadequate).

The log issue, ironically, perhaps IS a bug, as log is an allowed parameter on the x-axis (via log="x") and width is chosen without paying it any attention. I haven't looked closely at a required code mod, but it's adjustable with user-specified boxwex. Perhaps documenting the fact of a poor default and suggesting manual boxwex might be sufficient remedy?

If I understand the discussion so far, though, the requirement for this bugfix/wishfix would go something like this:

i) if at is unspecified, at=1:n; if specified, it is respected. (true now)
ii) if xlim is unspecified, xlim=c(0.5, n+0.5); if it is specified, it is respected for add=F.
(only adds respect for specified xlim)
iii) For add=T, xlim should be ignored if specified (silently? with warning?). (Currently silently ignored)
iv) behaviour on log="x" should be noted in the help.

The above are fairly trivial to implement and document, as such things go... I'd be happy to give it a shot.

Steve Ellison

Prof Brian Ripley [EMAIL PROTECTED] 02/07/2007 15:15:16

On Mon, 2 Jul 2007, [EMAIL PROTECTED] wrote:

So this is a wish, not a bug. The easy part is to allow it to accept 'xlim' is specified. The harder part is to find a good default of xlim in general. , for example. It seems to me that 'at' would normally be used with add=T, so I don't think we need to do this well (and the user will always be able to set 'xlim').

I should have added that some code assumes the current default for xlim even when 'at' is specified, including the last example in boxplot. Other cases where Steve's suggestion was wrong are when 'at' is not sorted, and when there is a log x axis (and there the previous default is also inadequate).

I am about to commit an improved version for R-devel.

On Tue, 26 Jun 2007, [EMAIL PROTECTED] wrote:

On 6/26/2007 8:16 AM, [EMAIL PROTECTED] wrote:

Full_Name: Steve Ellison
Version: 2.4.1
OS: Windows, Linux
Submission from: (NULL) (194.73.101.157)

bxp() allows specification of box locations with at=, but neither adjusts xlim= to fit at nor does it respect xlim provided explicitly. This is because bxp() now includes explicit xlim as c(0.5, n+0.5), without checking for explicitly supplied xlim (or ylim if horizontal). This also prevents simple added plots (eg if add=T, with at=(1:n)+0.5, the last box is partly off the plot).
The 'offending' code is in bxp:

    if (!add) {
        plot.new()
        if (horizontal)
            plot.window(ylim = c(0.5, n + 0.5), xlim = ylim, log = log, xaxs = pars$yaxs)
        else
            plot.window(xlim = c(0.5, n + 0.5), ylim = ylim, log = log, yaxs = pars$yaxs)
    }

Suggested fix:

    if (!add) {
        plot.new()
        bxp.limits <- if (!is.null(at)) {
            c(at[1] - (at[2] - at[1])/2, at[n] + (at[n] - at[n-1])/2)
        } else {
            c(0.5, n + 0.5)
        }
        if (horizontal)
            plot.window(ylim = if (is.null(pars$xlim)) bxp.limits else pars$xlim,
                        xlim = ylim, log = log, xaxs = pars$yaxs)
        else
            plot.window(xlim = if (is.null(pars$xlim)) bxp.limits else pars$xlim,
                        ylim = ylim, log = log, yaxs = pars$yaxs)
    }

This retains the current defaults for xlim with at unspecified but allows explicit specification of xlim (which is the grouping level axis whether horizontal or vertical).

But it fails in a few other cases: if the user sets the widths, this doesn't respect that setting; if the user specifies the location of one boxplot (so length(at) == 1) it fails when it tries to access at[2]. This is
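Requirements i) to iii) from the reply at the top of this message can be written down as a small pure function, which avoids the at[2] problem because the default limits never depend on at. This is a sketch under assumptions: bxp_positions() is a hypothetical helper, not the code committed to R-devel, and issuing a warning for case iii) is just one of the two options the message leaves open.

```r
# Hypothetical helper: resolve box positions and group-axis limits.
bxp_positions <- function(n, at = NULL, xlim = NULL, add = FALSE) {
  # i) default positions are 1:n; user-supplied positions are respected
  if (is.null(at)) at <- seq_len(n)
  stopifnot(length(at) == n)
  if (add) {
    # iii) the existing plot region is kept, so a supplied xlim has no effect
    #      (the warning is an assumption; the thread leaves this undecided)
    if (!is.null(xlim)) warning("'xlim' is ignored when add = TRUE")
    xlim <- NULL
  } else if (is.null(xlim)) {
    # ii) unchanged default, independent of 'at'
    xlim <- c(0.5, n + 0.5)
  }
  list(at = at, xlim = xlim)
}

bxp_positions(4)                                       # defaults for both
bxp_positions(3, at = c(0, 10, 15), xlim = c(-1, 16))  # user values respected
```

Because the default never inspects at, unsorted positions and a single box need no special handling.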
Re: [Rd] boxplot and bxp do not respect xlim by default (PR#9756)
What is mystifying is that the issue was not present in previous versions, so appropriate code already existed.

However, I agree that there seem to be a couple of additional issues that I had missed.

I am perfectly happy to look at this again myself, though, and provide extended code; would that help?

S

>> This retains the current defaults for xlim with at unspecified but allows explicit specification of xlim (which is the grouping level axis whether horizontal or vertical).

> But it fails in a few other cases: if the user sets the widths, this doesn't respect that setting; if the user specifies the location of one boxplot (so length(at) == 1) it fails when it tries to access at[2].
>
> This is a somewhat tricky problem, that needs more careful thought than I have time for right now, so I'll leave it for someone else (or for myself in a less busy future, which may exist in some alternate universe).
>
> What I'd suggest you do in the short term is simply to set up the plot axes the way you want before calling bxp, then call it with add=TRUE.
>
> Duncan Murdoch
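The short-term workaround (set up the axes first, then call bxp with add=TRUE) might look like the sketch below. bxp_at() is a hypothetical wrapper, and the padding of one unit either side of the outermost boxes is an arbitrary choice made for illustration.

```r
# Hypothetical wrapper: draw precomputed boxplot statistics at arbitrary
# positions by creating the plot region first and then adding the boxes.
bxp_at <- function(stats, at, pad = 1) {
  xlim <- range(at) + c(-pad, pad)           # room either side of the boxes
  ylim <- range(stats$stats, stats$out, na.rm = TRUE)
  plot.new()
  plot.window(xlim = xlim, ylim = ylim)      # axes set up by hand
  bxp(stats, at = at, add = TRUE)            # boxes added to the existing region
  invisible(xlim)
}
```

For example, bxp_at(boxplot(count ~ spray, data = InsectSprays, plot = FALSE), at = c(1, 2, 4, 7, 8, 10)) places the six boxes on an interval scale.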
[Rd] Boxplot issues (formerly posted to R-help in error)
Boxplot and bxp seem to have changed behaviour a bit of late (R 2.4.1). Or maybe I am mis-remembering.

An annoying feature is that while at=3:6 will work, there is no way of overriding the default xlim of 0.5 to n+0.5. That prevents plotting boxes on, for example, interval scales - a useful thing to do at times. I really can see no good reason for bxp to hard-code the xlim=c(0.5, n+0.5) in the function body; it should be a parameter default conditional on horizontal=, not hard coded.

Also, boxplot does not drop empty groups. I'm sure it used to. I know it is good to be able to see where a factor level is unpopulated, but it's a nuisance with fractional factorials and some nested or survey problems when many are known to be missing and are of no interest. Irrespective of whether my memory is correct, the option would be useful. How hard can it be to add a 'drop.empty=F' default to boxplot to allow it to switch?

Obviously, these are things I can fix locally. But who 'owns' boxplot so I can provide suggested code to them for later releases?

Steve Ellison
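Until something like the suggested switch exists, empty groups can be dropped on the caller's side by removing unused factor levels before plotting. A minimal sketch with made-up data (f and y are invented for this example):

```r
# Made-up data: level "b" is declared but has no observations
f <- factor(c("a", "a", "a", "c", "c", "c"), levels = c("a", "b", "c"))
y <- c(1, 2, 3, 4, 5, 6)

kept    <- boxplot(y ~ f, plot = FALSE)              # empty level retained
dropped <- boxplot(y ~ droplevels(f), plot = FALSE)  # unused level removed

kept$names      # group labels with the empty level
kept$n          # group sizes, including the empty group
dropped$names   # group labels after dropping unused levels
```

droplevels() only changes the factor passed to boxplot, so the original data keep their full set of levels.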
[Rd] One for the wish list - var.default etc
I was working on a permutation-like variant of the bootstrap for smaller samples, and wanted to be able to get summary stats of my estimator conveniently. mean() is OK as it's a generic, so a mean.oddboot function gets used automatically. But var, sd and others are not originally written as generic; they have to be explicitly masked by a package or new declaration.

It would have been nice if stats::var was a generic to make it more easily extensible... one for the wish list?

Steve Ellison
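The contrast with mean() can be seen in a few lines. This is an illustrative sketch only: the oddboot() constructor and the replicates field are assumptions, since the post does not show how the class is laid out.

```r
# Hypothetical class: a list holding the resampled estimates
oddboot <- function(replicates) {
  structure(list(replicates = replicates), class = "oddboot")
}

# mean() is an S3 generic, so defining a method is all that is needed
mean.oddboot <- function(x, ...) mean(x$replicates, ...)

b <- oddboot(c(2.1, 1.9, 2.4, 2.0))
mean(b)   # dispatches to mean.oddboot
```

No comparable one-line method definition works for var or sd, because neither calls UseMethod().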
Re: [Rd] One for the wish list - var.default etc
Brian,

> If we make functions generic, we rely on package writers implementing the documented semantics (and that is not easy to check). That was deemed to be too easy to get wrong for var().

Hard to argue with a considered decision, but the alternative facing increasing numbers of package developers seems to me to be pretty bad too ...

There are two ways a package developer can currently get a function tailored to their own new class.

One is to rely on a generic function to launch their class-specific instance, and write only the class-specific instance. That may indeed be hard to check, though I would be inclined to think that is the package developer's problem, not the core team's. But it has (as far as I know today ...?) no wider impact.

The other option, with no existing generic, is to mask the original function by writing a new generic function that respects the original syntax exactly, and then implement a fun.default that replicates the original non-generic function's behaviour, hopefully by calling it directly. As an example, library(circular) masks stats::var, though I'm fairly sure it's not the only case. This has obvious disadvantages, including potentially system-wide (R-wide at least!) impact and unfavourable interactions between packages masking each other's generics and defaults.

I will use masking if I have to, at least for my own local use where it's only me that suffers if (when?) I get it wrong. But the idea makes me very nervous, especially if I imagine folk who _don't_ get as nervous at the idea. Hence the feeling that wider use of generics for fundamental and common functions might make for a safer world.

Steve E
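The masking route described above comes down to three definitions. The sketch below is illustrative only: the oddboot class and its replicates field are assumptions, and the default method simply forwards to stats::var as the message suggests.

```r
# A new generic that masks stats::var for code that looks it up by name
var <- function(x, ...) UseMethod("var")

# The default method replicates the original behaviour by calling it directly
var.default <- function(x, ...) stats::var(x, ...)

# Class-specific method for a hypothetical bootstrap-like class
var.oddboot <- function(x, ...) stats::var(x$replicates, ...)

b <- structure(list(replicates = c(2.1, 1.9, 2.4, 2.0)), class = "oddboot")
var(b)               # uses var.oddboot
var(c(1, 2, 3, 4))   # falls through to stats::var
```

One limit of the approach is visible straight away: stats::sd calls var from inside the stats namespace, so it never sees this mask and sd(b) still does not dispatch on the class.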