>>>>> Marc Schwartz <marc_schwa...@me.com> >>>>> on Fri, 14 Jul 2017 11:01:03 -0500 writes:
>> On Jul 14, 2017, at 9:50 AM, Martin Maechler >> <maech...@stat.math.ethz.ch> wrote: >> >>>>>>> Martin Maechler <maech...@stat.math.ethz.ch> on Fri, >>>>>>> 14 Jul 2017 16:30:50 +0200 writes: >> >>>>>>> Marc Schwartz <marc_schwa...@me.com> on Fri, 14 Jul >>>>>>> 2017 06:57:26 -0500 writes: >> >>>>> On Jul 13, 2017, at 5:07 PM, Marc Schwartz >>>>> <marc_schwa...@me.com> wrote: >>>>> >>>>> >>>>> On Jul 13, 2017, at 3:37 PM, Marc Schwartz >>>>> <marc_schwa...@me.com> wrote: >>>>> >>>>> >>>>>>> On Jul 13, 2017, at 3:22 PM, Duncan Murdoch >>>>>>> <murdoch.dun...@gmail.com> wrote: >>>>>>> >>>>>>> On 13/07/2017 4:08 PM, Marc Schwartz wrote: >>>>>>>> Hi All, >>>>>>>> >>>>>>>> As per the discussion today on R-Help: >>>>>>>> >>>>>>>> https://stat.ethz.ch/pipermail/r-help/2017-July/448132.html >>>>>>>> >>>>>>>> I am attaching a proposed patch for poly.Rd to >>>>>>>> provide clarifying wording relative to naming the >>>>>>>> 'degree' argument explicitly, in the case where the >>>>>>>> 'x' argument is a matrix, rather than a vector. >>>>>>>> >>>>>>>> This is based upon the svn trunk version of >>>>>>>> poly.Rd. >>>>>>> >>>>>>> I don't think this is the right fix. The use of the >>>>>>> unnamed 2nd arg as degree happens whether the first >>>>>>> arg is a matrix or not. >>>>>>> >>>>>>> I didn't read the whole thread in detail, but it >>>>>>> appears there's a bug somewhere, in the report or in >>>>>>> the poly() code or in the plsr() code. That bug >>>>>>> should be reported on the bug list if it turns out >>>>>>> to be in base R, and to the package maintainer if it >>>>>>> is in plsr(). >>>>>>> >>>>>>> Duncan Murdoch >>>>> >>>>> >>>>> Duncan, >>>>> >>>>> Thanks for your reply. You only really need to read that >>>>>>> last post in the thread linked to above. >>>>> >>>>> I won't deny the possibility of a bug in poly(), relative >>>>>>> to the handling of 'x' as a matrix. The behavior >>>>>>> occurs in the poly() function in a pure stand alone >>>>>>> fashion, without the need for plsr(): >>>>> >>>>> x1 <- runif(20) >>>>> x2 <- runif(20) >>>>> mx <- cbind(x1, x2) >>>>> >>>>> >>>>> <snip> >>>>> >>>>> Duncan, >>>>> >>>>> Tracing through the code for poly() using debug once >>>>> with: >>>>> >>>>> poly(mx, 2) >>>>> >>>>> and then with: >>>>> >>>>> poly(mx, degree = 2) >>>>> >>>>> there is a difference in the transformation of 'mx' >>>>> internally by the use of: >>>>> >>>>> if (is.matrix(x)) { m <- >>>>> unclass(as.data.frame(cbind(x, ...))) >>>>> return(do.call(polym, c(m, degree = degree, raw = raw, >>>>> list(coefs = coefs)))) } >>>>> >>>>> >>>>> In the first case, 'mx' ends up being transformed to: >>>>> >>>>> Browse[2]> m $x1 [1] 0.99056941 0.13953093 0.38965567 >>>>> 0.35353514 0.90838486 0.97552474 [7] 0.01135743 >>>>> 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973 >>>>> [13] 0.31333549 0.97488211 0.54952630 0.71747157 >>>>> 0.31164777 0.81694822 [19] 0.58641410 0.08858699 >>>>> >>>>> $x2 [1] 0.6628658 0.9221436 0.3162418 0.8494452 >>>>> 0.4665010 0.3403719 [7] 0.4040692 0.4916650 0.9091161 >>>>> 0.2956006 0.3454689 0.3331070 [13] 0.8788974 0.5614636 >>>>> 0.7794396 0.2304009 0.6566537 0.6875646 [19] 0.5110733 >>>>> 0.4122336 >>>>> >>>>> $V3 [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 >>>>> >>>>> attr(,"row.names") [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 >>>>> 14 15 16 17 18 19 20 >>>>> >>>>> Thus, when do.call() is used, m$V3 is passed as the >>>>> 'x' argument on the third iteration, essentially >>>>> resulting in: >>>>> >>>>> polym(rep(2, 20), degree = 2) Error in poly(dots[[1L]], >>>>> degree, raw = raw, simple = raw && nd > 1) : 'degree' >>>>> must be less than number of unique points >>>>> >>>>> >>>>> Note also that in this case, 'dots', which is the >>>>> result of using list(...) on the initial call, is: >>>>> >>>>> Browse[2]> dots [[1]] [1] 2 >>>>> >>>>> >>>>> In the second case: >>>>> >>>>> Browse[2]> m $x1 [1] 0.99056941 0.13953093 0.38965567 >>>>> 0.35353514 0.90838486 0.97552474 [7] 0.01135743 >>>>> 0.06537047 0.56207834 0.50554056 0.96653391 0.69533973 >>>>> [13] 0.31333549 0.97488211 0.54952630 0.71747157 >>>>> 0.31164777 0.81694822 [19] 0.58641410 0.08858699 >>>>> >>>>> $x2 [1] 0.6628658 0.9221436 0.3162418 0.8494452 >>>>> 0.4665010 0.3403719 [7] 0.4040692 0.4916650 0.9091161 >>>>> 0.2956006 0.3454689 0.3331070 [13] 0.8788974 0.5614636 >>>>> 0.7794396 0.2304009 0.6566537 0.6875646 [19] 0.5110733 >>>>> 0.4122336 >>>>> >>>>> attr(,"row.names") [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 >>>>> 14 15 16 17 18 19 20 >>>>> >>>>> >>>>> So, there is no m$V3. >>>>> >>>>> Note also that 'dots' ends up being: >>>>> >>>>> Browse[2]> dots list() >>>>> >>>>> >>>>> In both cases, 'degree' is indeed 2, but the result of >>>>> 'list(...)' on the initial function call is quite >>>>> different. >>>>> >>>>> So, I may be hypo-caffeinated, but if there is a bug >>>>> here, it may be due to the way in which cbind() is >>>>> being called in the code above, where the three dots >>>>> are being used? >>>>> >>>>> I can replicate the presumably correct behavior by >>>>> using: >>>>> >>>>> m <- unclass(as.data.frame(cbind(x))) >>>>> >>>>> instead of: >>>>> >>>>> m <- unclass(as.data.frame(cbind(x, ...))) >>>>> >>>>> But I am not sure if removing the three dots in the >>>>> cbind() call may have other unintended consequences. >>>>> >>>>> Regards, >>>>> >>>>> Marc >> >> >>>> Duncan, >> >>>> Some additional information here. Reviewing the source >>>> code for the function in SVN: >> >>>> https://svn.r-project.org/R/trunk/src/library/stats/R/contr.poly.R >> >>>> there is a relevant comment in the code: >> >>>> if(is.matrix(x)) { ## FIXME: fails when combined with >>>> 'unnamed degree' above m <- >>>> unclass(as.data.frame(cbind(x, ...))) >>>> return(do.call(polym, c(m, degree = degree, raw = raw, >>>> list(coefs=coefs)))) } >> >> >>>> A version review would suggest that the above comment >>>> was added to the code back in 2015. >> >>> Yes, by me, possibly here : >> >>> $ svn log -v -c68727 >>> ------------------------------------------------------------------------ >>> r68727 | maechler | 2015-07-23 16:14:59 +0200 (Thu, 23 >>> Jul 2015) | 1 line Changed paths: M /trunk/doc/NEWS.Rd M >>> /trunk/src/library/stats/R/contr.poly.R M >>> /trunk/src/library/stats/man/poly.Rd M >>> /trunk/tests/Examples/stats-Ex.Rout.save M >>> /trunk/tests/reg-tests-1c.R >> >>> poly(), polym() now work better notably for prediction >>> ------------------------------------------------------------------------ >>> $ svn-diffB -c68727 doc/NEWS.Rd Index: doc/NEWS.Rd >>> =================================================================== >>> 126a127,133 >>>> >>>> \item \code{polym()} gains a \code{coefs = NULL} >>>> argument and returns class \code{"poly"} just like >>>> \code{poly()} which gets a new \code{simple=FALSE} >>>> option. They now lead to correct \code{predict()}ions, >>>> e.g., on subsets of the original data. %% see >>>> https://stat.ethz.ch/pipermail/r-devel/2015-July/071532.html >> >> >>>> So it would appear that the behavior being discussed >>>> here is known. >> >>> Indeed! I remember to have spent quite a few hours with >>> the code and its different uses before committing that >>> patch. >> >>>> I am still confused by the need for the '...' in the >>>> call to cbind(), which as far as I can tell, has been >>>> in the code at least back to 2003, when the poly() code >>>> was split from base. >> >>>> I am not sure why one would want to pass on other '...' >>>> arguments to cbind(), but I am presumably missing >>>> something here. >> >>> Yes, I think passing the '...' is important there... >>> OTOH, I'm almost sure that I wrote the 'FIXME' because I >>> thought one should be able to do things better. So, I'm >>> happy to e-talk to you about how to get rid of the FIXME >>> and still remain back-compatible: Notably with the >>> paragraph in ?poly |> Details: >>> |> >>> |> Although formally ‘degree’ should be named (as it >>> follows ‘...’), |> an unnamed second argument of length >>> 1 will be interpreted as the |> degree, such that >>> ‘poly(x, 3)’ can be used in formulas. >> >> As a matter of fact, a patch seems very simple, and I am >> testing it now. >> >> Won't have much more time today, but will return "on this >> channel" later, maybe tomorrow. >> >> Martin > Martin, > Thanks for taking the time to look at this! > Marc Duncan had in the mean time filed a bug report about this, --> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17310 but I had fixed the issue even before seeing the PR. [currently fixed in R-devel only (svn r 72919)] Martin ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel