Re: [Rd] bug (PR#13570)
Many thanks Brian for tracking this down. Was it fixed by

    c     next line is not in current dloess
          goto 7

in ehg136? If this needs to be in the netlib version as well, we should inform Eric Grosse.

While we're at it, there are a few more inconsistencies (not nearly as serious as PR#13570, so I hesitate to call them bugs) regarding the definition of leaf cell membership (certain .lt. should be .le.) in ehg128, ehg137, and ehg138 (not currently used); it seems I neglected to mention these to Eric. If you are interested in these I can submit a patch and will notify Eric as well.

Finally, perhaps now is as good a time as any to point out that in the documentation, the bit about cross-terms in

    \item{drop.square}{for fits with more than one predictor and \code{degree=2}, should the quadratic term (and cross-terms) be dropped for particular predictors?

is incorrect -- cross terms are not dropped in this implementation of loess.

Thanks again,
Ben

Prof Brian Ripley wrote:
> I've found the discrepancy, so the patched code from current dloess is now available in R-patched and R-devel.
>
> On Fri, 6 Mar 2009, Prof Brian Ripley wrote:
>> On Thu, 5 Mar 2009, Benjamin Tyner wrote:
>>> Hi
>>>
>>> Nice to hear from you Ryan. I also do not have the capability to debug on windows; however, there is a chance that the behavior you are seeing is caused by the following bug noted in my thesis (available on ProQuest; email me if you don't have access):
>>>
>>> When lambda = 0 there are no local slopes to aid the blending algorithm, yet the interpolator would still assume they were available, and thus use arbitrary values from memory. This had implications for both fit and tr[L] computation. In the updated code these are set equal to zero which seems the best automatic rule when lambda = 0.
>>>
>>> [lambda refers to degree]
>>>
>>> I submitted a bug fix to Eric Grosse, the maintainer of the netlib routines; the fixed lines of fortran are identified in the comments at (just search for my email address):
>>>
>>> http://www.netlib.org/a/loess
>>>
>>> These fixes would be relatively simple to incorporate into R's version of loessf.f
>>
>> The fixes from dloess even more simply, since R's code is based on dloess.
>>
>> Thank you for the suggestion. Given how tricky this is to reproduce, I went back to my example under valgrind. If I use the latest dloess code, it crashes, but by selectively importing some of the differences I can get it to work. So it looks as if we are on the road to a solution, but something in the current version (not necessarily in these changes) is incompatible with the current R code and I need to dig further (not for a few days).
>>
>>> Alternatively, a quick check would be for someone to compile the source package at https://centauri.stat.purdue.edu:98/loess/loess_0.4-1.tar.gz and test it on windows. Though this package incorporates this and a few other fixes, please be aware that the routines are converted to C and thus there is a slight performance hit compared to the fortran.
>>>
>>> Hope this helps,
>>> Ben
>>
>> [...]
>
> --
> Brian D. Ripley, rip...@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
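The lambda = 0 problem described in the quoted thesis excerpt can be pictured in one dimension. The sketch below is not the ehg128 code: it assumes that blending between two neighbouring vertices behaves like a cubic Hermite interpolant, which combines the vertex values with the vertex slopes. The function name `hermite` and all the numbers are made up for illustration.

```r
# Cubic Hermite interpolant on one cell of width h.
# u in [0, 1] is the relative position inside the cell,
# v0/v1 are the fitted values and s0/s1 the slopes at the two vertices.
hermite <- function(u, v0, v1, s0, s1, h = 1) {
  h00 <- 2 * u^3 - 3 * u^2 + 1   # weight of the left value
  h10 <- u^3 - 2 * u^2 + u       # weight of the left slope
  h01 <- -2 * u^3 + 3 * u^2      # weight of the right value
  h11 <- u^3 - u^2               # weight of the right slope
  h00 * v0 + h10 * h * s0 + h01 * v1 + h11 * h * s1
}

u <- seq(0, 1, by = 0.25)

# Slopes set to zero, as in the updated code for lambda = 0.
ok <- hermite(u, v0 = 1, v1 = 2, s0 = 0, s1 = 0)

# Slopes taken from "arbitrary values from memory" (invented here).
bad <- hermite(u, v0 = 1, v1 = 2, s0 = 1e6, s1 = -1e6)

range(ok)
range(bad)
```

With both slopes at zero the curve stays between the two vertex values. With arbitrary slopes it can leave that range by orders of magnitude, which would be consistent with the stray vertical lines reported for the Windows plot.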
Re: [Rd] bug (PR#13570)
On Tue, 10 Mar 2009, Benjamin Tyner wrote:
> Many thanks Brian for tracking this down. Was it fixed by
>
>     c     next line is not in current dloess
>           goto 7
>
> in ehg136? If this needs to be in the netlib version as well, we should inform Eric Grosse.

The difference was in the argument list of one of the functions (ehg124?). It was 'just' a question of looking at 354 diff sections, not all of which I understood, including that commented above.

> While we're at it, there are a few more inconsistencies (not nearly as serious as PR#13570, so I hesitate to call them bugs) regarding the definition of leaf cell membership (certain .lt. should be .le.) in ehg128, ehg137, and ehg138 (not currently used); it seems I neglected to mention these to Eric. If you are interested in these I can submit a patch and will notify Eric as well.

Please do let me know and I'll merge in.

> Finally, perhaps now is as good a time as any to point out that in the documentation, the bit about cross-terms in
>
>     \item{drop.square}{for fits with more than one predictor and \code{degree=2}, should the quadratic term (and cross-terms) be dropped for particular predictors?
>
> is incorrect -- cross terms are not dropped in this implementation of loess.

Thanks, I will incorporate that.

> Thanks again,
> Ben
>
> [...]

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Re: [Rd] bug (PR#13570)
I've found the discrepancy, so the patched code from current dloess is now available in R-patched and R-devel.

On Fri, 6 Mar 2009, Prof Brian Ripley wrote:
> On Thu, 5 Mar 2009, Benjamin Tyner wrote:
>> [...]
>> These fixes would be relatively simple to incorporate into R's version of loessf.f
>
> The fixes from dloess even more simply, since R's code is based on dloess.
>
> Thank you for the suggestion. Given how tricky this is to reproduce, I went back to my example under valgrind. If I use the latest dloess code, it crashes, but by selectively importing some of the differences I can get it to work. So it looks as if we are on the road to a solution, but something in the current version (not necessarily in these changes) is incompatible with the current R code and I need to dig further (not for a few days).
>
> [...]

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Re: [Rd] bug (PR#13570)
Prof Brian Ripley wrote:
> On Thu, 5 Mar 2009, Benjamin Tyner wrote:
>> [...]
>> I submitted a bug fix to Eric Grosse, the maintainer of the netlib routines; the fixed lines of fortran are identified in the comments at (just search for my email address):
>>
>> http://www.netlib.org/a/loess
>>
>> These fixes would be relatively simple to incorporate into R's version of loessf.f
>
> The fixes from dloess even more simply, since R's code is based on dloess.
>
> Thank you for the suggestion. Given how tricky this is to reproduce, I went back to my example under valgrind. If I use the latest dloess code, it crashes, but by selectively importing some of the differences I can get it to work. So it looks as if we are on the road to a solution, but something in the current version (not necessarily in these changes) is incompatible with the current R code and I need to dig further (not for a few days).

What a nice war story this is! Good that it now seems fixable; even though degree=0 is not of much practical use, it is the sort of thing people like to have available when explaining how the method works.

--
Peter Dalgaard (p.dalga...@biostat.ku.dk)
Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907
Re: [Rd] bug (PR#13570)
Could you explain what you are seeing that is wrong? In R 2.7.2, which is what I have here, it looks ok, and the NEWS file doesn't list any changes since 2.7.1.

    -thomas

On Wed, 4 Mar 2009 rha...@stat.purdue.edu wrote:
> <<insert bug report here>>
>
> This is a CRITICAL bug!!! I have verified it in R 2.8.1 for mac and for windows. The problem is with loess degree=0 smoothing. For example, try the following:
>
>     x <- 1:100
>     y <- rnorm(100)
>     plot(x, y)
>     lines(predict(loess(y ~ x, degree=0, span=0.5)))
>
> This is obviously wrong.
>
> R 2.8
>
> --please do not edit the information below--
>
> Version:
>  platform = i386-apple-darwin8.11.1
>  arch = i386
>  os = darwin8.11.1
>  system = i386, darwin8.11.1
>  status =
>  major = 2
>  minor = 8.1
>  year = 2008
>  month = 12
>  day = 22
>  svn rev = 47281
>  language = R
>  version.string = R version 2.8.1 (2008-12-22)
>
> GUI:
>  R-GUI 1.27 (5301)
>
> Locale:
>  en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> Search Path:
>  .GlobalEnv, tools:RGUI, package:stats, package:graphics, package:grDevices, package:utils, package:datasets, package:Rutils, package:methods, Autoloads, package:base

Thomas Lumley, Assoc. Professor, Biostatistics
tlum...@u.washington.edu, University of Washington, Seattle
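Because the reported example draws y from rnorm(), every run uses different data. A seeded variant with a numeric sanity check, as a sketch only: the seed, the tolerance, and the helper name `in_range` are arbitrary choices, and the check rests on the assumption that a degree=0 fit is a weighted average with non-negative weights and so should not leave the range of y.

```r
set.seed(13570)                      # arbitrary seed, only for repeatability
x <- 1:100
y <- rnorm(100)

# Default surface ("interpolate") and exact evaluation ("direct").
fit_interp <- predict(loess(y ~ x, degree = 0, span = 0.5))
fit_direct <- predict(loess(y ~ x, degree = 0, span = 0.5,
                            control = loess.control(surface = "direct")))

# TRUE if all fitted values are finite and stay inside range(y).
in_range <- function(fit, y, tol = 1e-8) {
  all(is.finite(fit)) &&
    min(fit) >= min(y) - tol &&
    max(fit) <= max(y) + tol
}

in_range(fit_interp, y)
in_range(fit_direct, y)
summary(fit_interp - fit_direct)
```

A FALSE from the first check, or very large differences between the two fits, would point at the interpolation step rather than at the plotting code.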
Re: [Rd] bug (PR#13570)
rha...@stat.purdue.edu wrote:
> <<insert bug report here>>
>
> This is a CRITICAL bug!!! I have verified it in R 2.8.1 for mac and for windows. The problem is with loess degree=0 smoothing. For example, try the following:
>
>     x <- 1:100
>     y <- rnorm(100)
>     plot(x, y)
>     lines(predict(loess(y ~ x, degree=0, span=0.5)))
>
> This is obviously wrong.

Obvious? How? I don't see anything particularly odd (on Linux).

> [...]

--
Peter Dalgaard (p.dalga...@biostat.ku.dk)
Re: [Rd] bug (PR#13570)
G'day Peter,

On Thu, 05 Mar 2009 09:09:27 +0100 Peter Dalgaard <p.dalga...@biostat.ku.dk> wrote:
> rha...@stat.purdue.edu wrote:
>> <<insert bug report here>>
>>
>> This is a CRITICAL bug!!! I have verified it in R 2.8.1 for mac and for windows. The problem is with loess degree=0 smoothing. For example, try the following:
>>
>>     x <- 1:100
>>     y <- rnorm(100)
>>     plot(x, y)
>>     lines(predict(loess(y ~ x, degree=0, span=0.5)))
>>
>> This is obviously wrong.
>
> Obvious? How? I don't see anything particularly odd (on Linux).

Neither did I on linux; but the OP mentioned mac and windows. On windows, on running that code, the lines() command added a lot of vertical lines; most spanning the complete window but some only part. Executing the code a second time (or in steps) gave sensible results.

My guess would be that some memory is not correctly allocated or initialised. Or is it something like an object with storage mode integer being passed to a double? But then, why doesn't it show on linux?

Happy bug hunting. If my guess is correct, then I have no idea how to track down such things under windows.

Cheers,

    Berwin
Re: [Rd] bug (PR#13570)
Berwin A Turlach wrote:
> [...]
>
> Neither did I on linux; but the OP mentioned mac and windows. On windows, on running that code, the lines() command added a lot of vertical lines; most spanning the complete window but some only part. Executing the code a second time (or in steps) gave sensible results.
>
> My guess would be that some memory is not correctly allocated or initialised. Or is it something like an object with storage mode integer being passed to a double? But then, why doesn't it show on linux?
>
> Happy bug hunting. If my guess is correct, then I have no idea how to track down such things under windows.
>
> Cheers,
> Berwin

Please can you folks try under R-devel (to be R-2.9.0 in a couple of weeks) and report if you still see it. I do not under R-devel (but do under R-release), so my guess is that something called by loess() has been fixed in the meantime.

Moreover it is not the plot stuff that was wrong under R-2.8.1 (release) but the loess computations.

Uwe Ligges
Re: [Rd] bug (PR#13570)
Uwe Ligges wrote:
> Berwin A Turlach wrote:
>> [...]
>
> Please can you folks try under R-devel (to be R-2.9.0 in a couple of weeks) and report if you still see it. I do not under R-devel (but do under R-release), so my guess is that something called by loess() has been fixed in the meantime.
>
> Moreover it is not the plot stuff that was wrong under R-2.8.1 (release) but the loess computations.

I still see it in R-patched (haven't tried R-devel yet). So I think it is worth tracking down and fixing; I'll do it later today.

Duncan Murdoch
Re: [Rd] bug (PR#13570)
Unfortunately the example is random, so not really reproducible (and I see nothing wrong on my Mac). However, Linux valgrind on R-devel is showing a problem:

    ==3973== Conditional jump or move depends on uninitialised value(s)
    ==3973==    at 0xD76017B: ehg141_ (loessf.f:532)
    ==3973==    by 0xD761600: lowesa_ (loessf.f:769)
    ==3973==    by 0xD736E47: loess_raw (loessc.c:117)

(The uninitialized value is in someone else's code and I suspect it was either never intended to work or never tested.) No essential change has been made to the loess code for many years.

I would not have read the documentation to say that degree = 0 was a reasonable value. It is not to my mind 'a polynomial surface', and loess() is described as a 'local regression' for degree 1 or 2 in the reference. So unless anyone wants to bury their heads in that code I think a perfectly adequate fix would be to disallow degree = 0. (I vaguely recall debating allowing it in the code ca 10 years ago.)

On Thu, 5 Mar 2009, Uwe Ligges wrote:
> [...]
>
> Please can you folks try under R-devel (to be R-2.9.0 in a couple of weeks) and report if you still see it. I do not under R-devel (but do under R-release), so my guess is that something called by loess() has been fixed in the meantime.
>
> Moreover it is not the plot stuff that was wrong under R-2.8.1 (release) but the loess computations.

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Re: [Rd] bug (PR#13570)
Prof Brian Ripley wrote:
> [...]
>
> I would not have read the documentation to say that degree = 0 was a reasonable value. It is not to my mind 'a polynomial surface', and loess() is described as a 'local regression' for degree 1 or 2 in the reference. So unless anyone wants to bury their heads in that code I think a perfectly adequate fix would be to disallow degree = 0. (I vaguely recall debating allowing it in the code ca 10 years ago.)

The code itself has

    if (!match(degree, 0:2, 0)) stop("'degree' must be 0, 1 or 2")

though. Local fitting of a constant essentially becomes kernel smoothing, right?

> [...]

--
Peter Dalgaard (p.dalga...@biostat.ku.dk)
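The remark that local fitting of a constant amounts to kernel smoothing can be sketched directly. The helper below is hypothetical and is not R's loess code: it assumes tricube weights and a nearest-neighbour bandwidth, and the rule used for the number of neighbours is a simplification, so its output need not agree with loess(degree = 0) to the last digit.

```r
# Hypothetical helper: local-constant smoother with tricube weights.
# Assumes the q-th smallest distance is positive (e.g. distinct x values).
local_constant <- function(x, y, span = 0.5, at = x) {
  q <- max(2, floor(length(x) * span))         # neighbours per local fit
  vapply(at, function(x0) {
    d <- abs(x - x0)                           # distances to the target point
    h <- sort(d)[q]                            # bandwidth: q-th smallest distance
    w <- ifelse(d < h, (1 - (d / h)^3)^3, 0)   # tricube kernel weights
    sum(w * y) / sum(w)                        # weighted mean = degree-0 fit
  }, numeric(1))
}

x <- 1:100
y <- sin(x / 10)
fit <- local_constant(x, y, span = 0.5)
range(fit)
```

Each fitted value is a weighted mean of nearby responses, which is why a correct degree=0 fit should never produce values far outside the data.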
Re: [Rd] bug (PR#13570)
On Thu, 5 Mar 2009, Peter Dalgaard wrote: Prof Brian Ripley wrote: Undortunately the example is random, so not really reproducible (and I see nothing wrong on my Mac). However, Linux valgrind on R-devel is showing a problem: ==3973== Conditional jump or move depends on uninitialised value(s) ==3973==at 0xD76017B: ehg141_ (loessf.f:532) ==3973==by 0xD761600: lowesa_ (loessf.f:769) ==3973==by 0xD736E47: loess_raw (loessc.c:117) (The uninitiialized value is in someone else's code and I suspect it was either never intended to work or never tested.) No essential change has been made to the loess code for many years. I would not have read the documentation to say that degree = 0 was a reasonable value. It is not to my mind 'a polynomial surface', and loess() is described as a 'local regression' for degree 1 or 2 in the reference. So unless anyone wants to bury their heads in that code I think a perfectly adequate fix would be to disallow degree = 0. (I vaguely recall debating allowing in the code ca 10 years ago.) The code itself has if (!match(degree, 0:2, 0)) stop('degree' must be 0, 1 or 2) though. Local fitting of a constant essentially becomes kernel smoothing, right? I do know the R code allows it: the question is whether it is worth the effort of finding the problem(s) in the underlying c/dloess code, whose manual (and our reference) is entirely about 1 or 2. I am concerned that there may be other things lurking in the degree=0 case if it was never tested (in the netlib version: I am sure it was only minmally tested through my R interface). I checked the original documentation on netlib and that says 29 DIM dimension of local regression 1 constant d+1 linear (default) (d+2)(d+1)/2quadratic Modified by ehg127 if cdegtdeg. which seems to confirm that degree = 0 was intended to be allowed, and what I dimly recall from ca 1998 is debating whether the R code should allow that or not. If left to me I would say I did not wish to continue to support degree = 0. 
On Thu, 5 Mar 2009, Uwe Ligges wrote:

Berwin A Turlach wrote:

G'day Peter,

On Thu, 05 Mar 2009 09:09:27 +0100 Peter Dalgaard p.dalga...@biostat.ku.dk wrote:

rha...@stat.purdue.edu wrote:

This is a CRITICAL bug!!! I have verified it in R 2.8.1 for mac and for windows. The problem is with loess degree=0 smoothing. For example, try the following:

    x <- 1:100
    y <- rnorm(100)
    plot(x, y)
    lines(predict(loess(y ~ x, degree=0, span=0.5)))

This is obviously wrong.

Obvious? How? I don't see anything particularly odd (on Linux).

Neither did I on linux; but the OP mentioned mac and windows. On windows, on running that code, the lines() command added a lot of vertical lines; most spanning the complete window but some only part. Executing the code a second time (or in steps) gave sensible results.

My guess would be that some memory is not correctly allocated or initialised. Or is it something like an object with storage mode integer being passed to a double? But then, why doesn't it show on linux?

Happy bug hunting. If my guess is correct, then I have no idea how to track down such things under windows.

Cheers, Berwin

Please can you folks try under R-devel (to be R-2.9.0 in a couple of weeks) and report if you still see it. I do not under R-devel (but do under R-release), so my guess is that something called by loess() has been fixed in the meantime. Moreover it is not the plot stuff that was wrong under R-2.8.1 (release) but the loess computations.

Uwe Ligges

-- Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen (p.dalga...@biostat.ku.dk)

-- Brian D. Ripley, rip...@stats.ox.ac.uk, Professor of Applied Statistics, University of Oxford
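The R-level check quoted in this message, match(degree, 0:2, 0), can be tried on its own to see why degree = 0 passes validation and only fails later in the compiled code. The sketch below is illustrative: check_degree is a hypothetical wrapper name, not a function in R's sources, and only the condition and the error message are taken from the thread.

```r
# Hypothetical wrapper (not part of R): isolates the check quoted in the thread.
check_degree <- function(degree) {
  # match() returns the position of `degree` in 0:2, or the nomatch value 0.
  # !0 is TRUE, so anything outside 0, 1, 2 triggers the error.
  if (!match(degree, 0:2, 0)) stop("'degree' must be 0, 1 or 2")
  invisible(degree)
}

check_degree(0)   # accepted: degree = 0 is allowed at the R level

# A value outside 0:2 is rejected; capture the message instead of aborting.
res <- tryCatch(check_degree(3), error = function(e) conditionMessage(e))
print(res)
```

Disallowing degree = 0, as proposed in the message, would amount to changing 0:2 to 1:2 in such a check.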
Re: [Rd] bug (PR#13570)
On 3/5/2009 7:10 AM, Prof Brian Ripley wrote:

Unfortunately the example is random, so not really reproducible (and I see nothing wrong on my Mac). However, Linux valgrind on R-devel is showing a problem:

I can reproduce it using y <- sin(x) instead of rnorm(100), on R-patched (not R-devel).

==3973== Conditional jump or move depends on uninitialised value(s)
==3973==    at 0xD76017B: ehg141_ (loessf.f:532)
==3973==    by 0xD761600: lowesa_ (loessf.f:769)
==3973==    by 0xD736E47: loess_raw (loessc.c:117)

I don't see why there would be errors at those spots, but I did try tracing into loessf.f, and it's really a maze of code.

In case someone wants to follow up, it looks as though the ehg128 function returns a garbage value on the first call. Working backwards through it, this is because the local variable s is garbage, because g(0,1) (an array, not a function call) is garbage at line 957, which is because it got set as garbage somewhere between being initialized at line 918, and line 957. I think the problem happened at lines 950/951, but I didn't follow up to see why.

(The uninitialized value is in someone else's code and I suspect it was either never intended to work or never tested.) No essential change has been made to the loess code for many years.

I would not have read the documentation to say that degree = 0 was a reasonable value. It is not to my mind 'a polynomial surface', and loess() is described as a 'local regression' for degree 1 or 2 in the reference. So unless anyone wants to bury their heads in that code I think a perfectly adequate fix would be to disallow degree = 0. (I vaguely recall debating allowing it in the code ca 10 years ago.)

I agree that's the best solution.

Duncan Murdoch
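A deterministic variant of the reported example, following the y <- sin(x) substitution mentioned in this message, might look like the sketch below. The range check is an added assumption rather than something from the thread: a local-constant fit is a weighted average with non-negative weights, so fitted values far outside range(y) would point to the kind of garbage values described here. The name in_range is illustrative.

```r
# Deterministic data instead of rnorm(100), so repeated runs are comparable.
x <- 1:100
y <- sin(x)

fit <- loess(y ~ x, degree = 0, span = 0.5)
p <- predict(fit)   # fitted values at the observed x

# Sanity check (an assumption of this sketch): a weighted average of y
# should stay inside the range of y, up to a small tolerance.
in_range <- all(is.finite(p)) &&
  all(p >= min(y) - 1e-8) && all(p <= max(y) + 1e-8)
print(in_range)
print(range(p))
```

On a build affected by the bug, the thread suggests the check would fail only intermittently, since the outcome depends on what the uninitialized memory happens to contain.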
Re: [Rd] bug (PR#13570)
Hi Uwe,

This is not a problem under Vista, using a development version (mine now somewhat outdated).

Regards, Mark.

sessionInfo()
R version 2.9.0 Under development (unstable) (2009-01-22 r47686)
i386-pc-mingw32

locale:
LC_COLLATE=English_South Africa.1252;LC_CTYPE=English_South Africa.1252;LC_MONETARY=English_South Africa.1252;LC_NUMERIC=C;LC_TIME=English_South Africa.1252

attached base packages:
[1] splines stats graphics grDevices utils datasets methods base

other attached packages:
[1] ade4_1.4-10 Design_2.1-2 survival_2.34-1 Hmisc_3.5-2

loaded via a namespace (and not attached):
[1] cluster_1.11.12 gamlss_1.9-4 grid_2.9.0 lattice_0.17-20 latticeExtra_0.5-4
[6] MASS_7.2-45 tools_2.9.0

Uwe Ligges-3 wrote:

[...] Please can you folks try under R-devel (to be R-2.9.0 in a couple of weeks) and report if you still see it. I do not under R-devel (but do under R-release), so my guess is that something called by loess() has been fixed in the meantime. Moreover it is not the plot stuff that was wrong under R-2.8.1 (release) but the loess computations.
Re: [Rd] bug (PR#13570)
On Mar 5, 2009, at 7:59 AM, Prof Brian Ripley wrote:

[...] If left to me I would say I did not wish to continue to support degree = 0.

True. There are plenty of reasons why one wouldn't want to use degree=0 anyway. And I'm sure there are plenty of other simple ways to achieve the same effect.

I ran into the problem because some code I'm planning on distributing as part of a paper submission blends partway down to degree 0 smoothing at the endpoints to reduce the variance. The only bad effect of disallowing degree 0 is for anyone with code depending on it, although there are probably few that use it, and it is better to disallow than to give an incorrect computation. I got around the problem by installing a modified loess by one of Cleveland's former students: https://centauri.stat.purdue.edu:98/loess/ (but I don't want to require others who use my code to do so as well).

What is very strange to me is that it has been working fine in previous R versions (tested on 2.7.1 and 2.6.1) and nothing has changed in the loess source, yet it is having problems on 2.8.1. Would this suggest it not being a problem with the netlib code? Also strange that it reportedly works on Linux but not on Mac or Windows. On the mac, the effect was much smaller. With windows, it was predicting values like 2e215, whereas on the mac you would almost believe the results were legitimate if you didn't think about the fact that a weighted moving average involving half the data shouldn't oscillate so much.

If the consensus is to keep degree=0, I'd be happy to help try to find the problem or provide a test case or something.

Thanks for looking into this.

Ryan
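The observation in this message that a degree = 0 fit is a weighted moving average (kernel smoothing, as noted earlier in the thread) can be sketched directly. The function below is a hypothetical illustration, not loess's implementation: the name local_constant, the neighbour-count rule floor(n * span), and the exact tricube weighting are assumptions, so its output should not be expected to match loess() exactly.

```r
# Hypothetical local-constant smoother: tricube weights, nearest-neighbour bandwidth.
local_constant <- function(x, y, span = 0.5) {
  n <- length(x)
  q <- max(2, floor(n * span))        # points per window (assumed rule)
  vapply(x, function(x0) {
    d <- abs(x - x0)
    h <- sort(d)[q]                   # distance to the q-th nearest point
    if (h == 0) return(mean(y[d == 0]))   # guard against tied x values
    w <- ifelse(d < h, (1 - (d / h)^3)^3, 0)   # tricube weights, zero outside window
    sum(w * y) / sum(w)               # weighted average = degree-0 local fit
  }, numeric(1))
}

set.seed(1)   # arbitrary seed, for reproducibility only
x <- 1:100
y <- rnorm(100)
s <- local_constant(x, y, span = 0.5)
print(range(s))
print(range(y))
```

Because every fitted value is a convex combination of observed y values, the smoothed curve cannot leave range(y), which is why values like 2e215 cannot be a legitimate degree = 0 fit.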
Re: [Rd] bug (PR#13570)
Mark Difford wrote:

Hi Uwe, This is not a problem under Vista, using a development version (mine now somewhat outdated).

Mark, as others have reported and debugged so far, and as you can see on the lists, the problem is more serious than I thought. It is probably also a problem under R-devel, and it is just through some lucky circumstances that we do not observe obvious miscalculations any more.

Best, Uwe
Re: [Rd] bug (PR#13570)
I see the same problem on Windows XP. But if I run loess with surface='direct' then the results are correct. So it looks like the problem comes from the smoothing/interpolating, not the main loess algorithm.

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111
Re: [Rd] bug (PR#13570)
On 05/03/2009 9:42 AM, Ryan Hafen wrote:

[...] What is very strange to me is that it has been working fine in previous R versions (tested on 2.7.1 and 2.6.1) and nothing has changed in the loess source but yet it is having problems on 2.8.1. Would this suggest it not being a problem with the netlib code? Also strange that it reportedly works on Linux but not on Mac or Windows. On the mac, the effect was much smaller. With windows, it was predicting values like 2e215 whereas on the mac, you would almost believe the results were legitimate if you didn't think about the fact that a weighted moving average involving half the data shouldn't oscillate so much.

I think it's pretty clear that it's using an uninitialized value. On other systems (and previous versions) we've just been lucky, and those locations held values like 0.0 that didn't matter.

If the consensus is to keep degree=0, I'd be happy to help try to find the problem or provide a test case or something. Thanks for looking into this.

I'd say right now the consensus among R core members is that nobody wants to support degree=0, but if you're volunteering, the consensus could change.

Duncan Murdoch
Re: [Rd] bug (PR#13570)
That is true - good point.

    lp1 <- predict(loess(y ~ x, degree=0))
    lp2 <- predict(loess(y ~ x, degree=0, control=loess.control(surface="direct")))
    sort(abs(lp1-lp2))

It appears that the interpolating fit is correct at the vertices. I know when degree=1, the interpolation uses the slopes of the local fits to get a better approximation. Perhaps it's still trying to do this with degree=0 but the slopes aren't available. And we have just been lucky in the past with uninitialized values? If this is the problem it would probably be very simple to fix and I'd love to see degree=0 stay. I will see if I can figure it out.

On Mar 5, 2009, at 6:01 PM, Greg Snow wrote:

I see the same problem on Windows XP. But if I run loess with surface='direct' then the results are correct. So it looks like the problem comes from the smoothing/interpolating, not the main loess algorithm.
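The comparison described in this message can be made self-contained as below. The seed and the summary variable max_diff are additions for illustration; the loess() and loess.control() calls follow the ones in the message. On a build without the bug, the two surfaces should differ only by interpolation error; on an affected build the difference may be arbitrarily large.

```r
set.seed(1)   # arbitrary seed so the comparison is repeatable
x <- 1:100
y <- rnorm(100)

# Default surface: fit at kd-tree vertices, then interpolate between them.
lp1 <- predict(loess(y ~ x, degree = 0))

# Direct surface: the local fit is computed exactly at every x.
lp2 <- predict(loess(y ~ x, degree = 0,
                     control = loess.control(surface = "direct")))

max_diff <- max(abs(lp1 - lp2))
print(max_diff)
print(tail(sort(abs(lp1 - lp2))))   # largest discrepancies, as in the message
```

Using surface = "direct" avoids the interpolation step entirely, at the cost of more computation, which is why it serves as the reference here.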
Re: [Rd] bug (PR#13570)
Excellent, Ben! Thanks!!

On Mar 5, 2009, at 8:24 PM, Benjamin Tyner wrote:

[...]
Re: [Rd] bug (PR#13570)
Hi

Nice to hear from you Ryan. I also do not have the capability to debug on windows; however, there is a chance that the behavior you are seeing is caused by the following bug noted in my thesis (available on ProQuest; email me if you don't have access):

When lambda = 0 there are no local slopes to aid the blending algorithm, yet the interpolator would still assume they were available, and thus use arbitrary values from memory. This had implications for both fit and tr[L] computation. In the updated code these are set equal to zero which seems the best automatic rule when lambda = 0.

[lambda refers to degree] I submitted a bug fix to Eric Grosse, the maintainer of the netlib routines; the fixed lines of fortran are identified in the comments at (just search for my email address):

http://www.netlib.org/a/loess

These fixes would be relatively simple to incorporate into R's version of loessf.f

Alternatively, a quick check would be for someone to compile the source package at https://centauri.stat.purdue.edu:98/loess/loess_0.4-1.tar.gz and test it on windows. Though this package incorporates this and a few other fixes, please be aware that the routines are converted to C and thus there is a slight performance hit compared to the fortran.

Hope this helps,
Ben

Ryan Hafen wrote:

That is true - good point.

lp1 <- predict(loess(y ~ x, degree=0))
lp2 <- predict(loess(y ~ x, degree=0, control=loess.control(surface="direct")))
sort(abs(lp1-lp2))

It appears that the interpolating fit is correct at the vertices. I know when degree=1, the interpolation uses the slopes of the local fits to get a better approximation. Perhaps it's still trying to do this with degree=0 but the slopes aren't available. And we have just been lucky in the past with uninitialized values? If this is the problem it would probably be very simple to fix and I'd love to see degree=0 stay. I will see if I can figure it out.

On Mar 5, 2009, at 6:01 PM, Greg Snow wrote:

I see the same problem on Windows XP.
But if I run loess with surface='direct' then the results are correct. So it looks like the problem comes from the smoothing/interpolating, not the main loess algorithm.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of Ryan Hafen
Sent: Thursday, March 05, 2009 7:43 AM
To: Prof Brian Ripley
Cc: Uwe Ligges; Berwin A Turlach; r-de...@stat.math.ethz.ch; Peter Dalgaard
Subject: Re: [Rd] bug (PR#13570)

On Mar 5, 2009, at 7:59 AM, Prof Brian Ripley wrote:

On Thu, 5 Mar 2009, Peter Dalgaard wrote:

Prof Brian Ripley wrote:

Unfortunately the example is random, so not really reproducible (and I see nothing wrong on my Mac). However, Linux valgrind on R-devel is showing a problem:

==3973== Conditional jump or move depends on uninitialised value(s)
==3973==    at 0xD76017B: ehg141_ (loessf.f:532)
==3973==    by 0xD761600: lowesa_ (loessf.f:769)
==3973==    by 0xD736E47: loess_raw (loessc.c:117)

(The uninitialized value is in someone else's code and I suspect it was either never intended to work or never tested.) No essential change has been made to the loess code for many years.

I would not have read the documentation to say that degree = 0 was a reasonable value. It is not to my mind 'a polynomial surface', and loess() is described as a 'local regression' for degree 1 or 2 in the reference. So unless anyone wants to bury their heads in that code I think a perfectly adequate fix would be to disallow degree = 0. (I vaguely recall debating allowing it in the code ca 10 years ago.)

The code itself has

if (!match(degree, 0:2, 0)) stop("'degree' must be 0, 1 or 2")

though. Local fitting of a constant essentially becomes kernel smoothing, right?
I do know the R code allows it: the question is whether it is worth the effort of finding the problem(s) in the underlying c/dloess code, whose manual (and our reference) is entirely about 1 or 2. I am concerned that there may be other things lurking in the degree=0 case if it was never tested (in the netlib version: I am sure it was only minimally tested through my R interface).

I checked the original documentation on netlib and that says

29 DIM dimension of local regression
       1               constant
       d+1             linear (default)
       (d+2)(d+1)/2    quadratic
       Modified by ehg127 if cdeg<tdeg.

which seems to confirm that degree = 0 was intended to be allowed, and what I dimly recall from ca 1998 is debating whether the R code should allow that or not. If left to me I would say I did not wish to continue to support degree = 0.

True. There are plenty of reasons why one wouldn't want to use degree=0 anyway. And I'm sure there are plenty of other simple ways to achieve the same effect. I ran
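Since the original example was described as random and hard to reproduce, here is a seeded variant of the interpolate-versus-direct comparison quoted in this message. The data-generating model and the seed are made up for illustration; only the two loess() calls come from the thread. On a build that includes the fix the two surfaces should differ only by interpolation error, while on an affected build the interpolated fit may be erratic, so no particular output is claimed here.

```r
# Simulated data; the model and seed are arbitrary choices for illustration.
set.seed(1)
x <- sort(runif(100))
y <- sin(2 * pi * x) + rnorm(100, sd = 0.3)

# Default surface = "interpolate": fit at kd-tree vertices, then blend.
fit_interp <- loess(y ~ x, degree = 0)

# surface = "direct": evaluate the local fit exactly at every point.
fit_direct <- loess(y ~ x, degree = 0,
                    control = loess.control(surface = "direct"))

lp1 <- predict(fit_interp)
lp2 <- predict(fit_direct)

# Largest discrepancies between the two surfaces.
print(tail(sort(abs(lp1 - lp2))))
```

Running the same script under valgrind (for example via `R -d valgrind`) is one way to check whether the uninitialised-value report from ehg141 still appears.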
Re: [Rd] bug (PR#13570)
On Thu, 5 Mar 2009, Benjamin Tyner wrote:

Hi

Nice to hear from you Ryan. I also do not have the capability to debug on windows; however, there is a chance that the behavior you are seeing is caused by the following bug noted in my thesis (available on ProQuest; email me if you don't have access):

When lambda = 0 there are no local slopes to aid the blending algorithm, yet the interpolator would still assume they were available, and thus use arbitrary values from memory. This had implications for both fit and tr[L] computation. In the updated code these are set equal to zero which seems the best automatic rule when lambda = 0.

[lambda refers to degree] I submitted a bug fix to Eric Grosse, the maintainer of the netlib routines; the fixed lines of fortran are identified in the comments at (just search for my email address):

http://www.netlib.org/a/loess

These fixes would be relatively simple to incorporate into R's version of loessf.f

The fixes from dloess even more simply, since R's code is based on dloess.

Thank you for the suggestion. Given how tricky this is to reproduce, I went back to my example under valgrind. If I use the latest dloess code, it crashes, but by selectively importing some of the differences I can get it to work. So it looks as if we are on the road to a solution, but something in the current version (not necessarily in these changes) is incompatible with the current R code and I need to dig further (not for a few days).

Alternatively, a quick check would be for someone to compile the source package at https://centauri.stat.purdue.edu:98/loess/loess_0.4-1.tar.gz and test it on windows. Though this package incorporates this and a few other fixes, please be aware that the routines are converted to C and thus there is a slight performance hit compared to the fortran.

Hope this helps,
Ben

[...]

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
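To illustrate why arbitrary slopes matter to the blending step described above, here is a one-dimensional sketch using standard cubic Hermite blending between two vertices. The function `hermite_blend` and the numbers are hypothetical; this is not the code from loessf.f, only a minimal model of an interpolator that combines vertex values with vertex slopes.

```r
# Hypothetical 1-D model of blending on a single cell [x0, x1]:
# v0, v1 are fitted values at the vertices, s0, s1 the local slopes there.
hermite_blend <- function(x, x0, x1, v0, v1, s0 = 0, s1 = 0) {
  h <- x1 - x0
  u <- (x - x0) / h                  # position within the cell, in [0, 1]
  phi0 <- (1 - u)^2 * (1 + 2 * u)    # weight on v0
  phi1 <- u^2 * (3 - 2 * u)          # weight on v1
  psi0 <- u * (1 - u)^2              # weight on s0 (scaled by h)
  psi1 <- -u^2 * (1 - u)             # weight on s1 (scaled by h)
  phi0 * v0 + phi1 * v1 + h * (psi0 * s0 + psi1 * s1)
}

x <- seq(0, 1, length.out = 5)

# Slopes set to zero, as in the fix described for lambda = 0.
zero_slopes <- hermite_blend(x, 0, 1, v0 = 2, v1 = 3)

# Arbitrary slopes standing in for values read from uninitialised memory.
junk_slopes <- hermite_blend(x, 0, 1, v0 = 2, v1 = 3, s0 = 1e6, s1 = -1e6)

print(rbind(x, zero_slopes, junk_slopes))
```

In this model the slope terms vanish at both vertices, so the blend is exact there whatever the slopes are, and only the interior of the cell is affected. That is consistent with the observation in the thread that the interpolating fit looked correct at the vertices.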