Thanks, Martin. This is very helpful.
On Tue, 16 Jul 2024 at 14:52, Martin Maechler <maech...@stat.math.ethz.ch> wrote: > >>>>> Anupam Tyagi > >>>>> on Tue, 9 Jul 2024 16:16:43 +0530 writes: > > > How can I do automatic knot selection while fitting piecewise linear > > splines to two variables x and y? Which package to use to do it > simply? I > > also want to visualize the splines (and the scatter plot) with a > graph. > > > Anupam > > NB: linear splines, i.e. piecewise linear continuous functions. > Given the knots, use approx() or approxfun() however, the > automatic knots selection does not happen in the base R packages. > > I'm sure there are several R packages doing this. > The best such package in my opinion is "earth" which does a > re-implementation (and extensive *generalization*) of the > famous MARS algorithm of Friedman. > ==> https://en.wikipedia.org/wiki/Multivariate_adaptive_regression_splines > > Note that their strengths and power is that they do their work > for multivariate x (MARS := Multivariate Adaptive Regression > Splines), but indeed do work for the simple 1D case. > > In the following example, we always get 11 final knots, > but I'm sure one can tweak the many tuning paramters of earth() > to get more: > > ## Can we do knot-selection for simple (x,y) splines? === Yes, via > earth() {using MARS}! > > x <- (0:800)/8 > > f <- function(x) 7 * sin(pi/8*x) * abs((x-50)/20)^1.25 - (x-40)*(12-x)/64 > curve(f(x), 0, 100, n = 1000, col=2, lwd=2) > > set.seed(11) > y <- f(x) + 10*rnorm(x) > > m.sspl <- smooth.spline(x,y) # base line "standard smoother" > > require(earth) > fm1 <- earth(x, y) # default settings > summary(fm1, style = "pmax") #-- got 10 knots (x = 44 "used twice") below > ## Call: earth(x=x, y=y) > > ## y = > ## 175.9612 > ## - 10.6744 * pmax(0, x - 4.625) > ## + 9.928496 * pmax(0, x - 10.875) > ## - 5.940857 * pmax(0, x - 20.25) > ## + 3.438948 * pmax(0, x - 27.125) > ## - 3.828159 * pmax(0, 44 - x) > ## + 4.207046 * pmax(0, x - 44) > ## + 2.573822 * pmax(0, x - 76.5) > ## - 10.99073 * pmax(0, x - 87.125) > ## + 10.97592 * pmax(0, x - 90.875) > ## + 9.331949 * pmax(0, x - 94) > ## - 8.48575 * pmax(0, x - 96.5) > > ## Selected 12 of 12 terms, and 1 of 1 predictors > ## Termination condition: Reached nk 21 > ## Importance: x > ## Number of terms at each degree of interaction: 1 11 (additive model) > ## GCV 108.6592 RSS 82109.44 GRSq 0.861423 RSq 0.86894 > > > fm2 <- earth(x, y, fast.k = 0) # (more extensive forward pass) > summary(fm2) > all.equal(fm1, fm2)# they are identical (apart from 'call'): > fm3 <- earth(x, y, fast.k = 0, pmethod = "none", trace = 3) # extensive > forward pass; *no* pruning > ## still no change: fm3 "==" fm1 > all.equal(predict(fm1, xx), predict(fm3, xx)) > > ## BTW: The chosen knots and coefficients are > mat <- with(fm1, cbind(dirs, cuts=c(cuts), coef = c(coefficients))) > > ## Plots : fine grid for visualization: instead of xx <- seq(x[1], > x[length(x)], length.out = 1024) > rnx <- extendrange(x) ## to extrapolate a bit > xx <- do.call(seq.int, c(rnx, list(length.out = 1200))) > > cbind(f = f(xx), > sspl = predict(m.sspl, xx)$y, > mars = predict(fm1, xx)) -> fits > > plot(x,y, xlim=rnx, cex = 1/4, col = adjustcolor(1, 1/2)) > cols <- c(adjustcolor(2, 1/3), > adjustcolor(4, 2/3), > adjustcolor("orange4", 2/3)) > lwds <- c(3, 2, 2) > matlines(xx, fits, col = cols, lwd = lwds, lty=1) > legend("topleft", c("true f(x)", "smooth.spline()", "earth()"), > col=cols, lwd=lwds, bty = "n") > title(paste("earth() linear spline vs. smooth.spline(); n =", length(x))) > mtext(substitute(f(x) == FDEF, list(FDEF = body(f)))) > -- Anupam. [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.