Re: [R] non-linear regression and root finding

2023-11-07 Thread Ivan Krylov
On Mon, 6 Nov 2023 14:55:39 -0500
J C Nash  wrote:

> However, I'm wondering if this approach is worth writing up, at least
> as a vignette or blog post. It does need a shorter example and some
> explanation of the "why" and some testing perhaps.

Do you mean using this problem as a basis to illustrate ordering
constraints on parameters? Weird constraints do come up every now and
then in regression problems. I could definitely offer my help with at
least some of the text.

> If there's interest, I'll be happy to join in. And my own posting
> suggests how the ordering is enforced by bounding the "delta"
> parameters from below.

I have just tried nlsr::nlxb for a slightly larger dataset shared by
Troels off-list, and it worked great with the delta parameters as you
suggested, thank you!

It's interesting that nlxb and nlsLM give slightly different answers,
differing in 0.5 pK units for pK1 and (pK2-pK1) but not (pK3-pK2). Then
again, they both agree that the standard error for pK1 and (pK2-pK1) is
very large, so perhaps the problem is just very ill-conditioned.

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] non-linear regression and root finding

2023-11-06 Thread Berwin A Turlach
G'day Troels,

On Tue, 7 Nov 2023 07:14:02 +0100
Troels Ring  wrote:

> Be as it may, I wonder if not your method might work if only we KNOW
> that pK1 is either positive OR negative, in which case we have pK1 =
> -exp(theta1)?

If pK1 can be either negative or positive (or 0 :-) ), and it is just
the ordering that you want to have enforced, then I would try the
parameterisation:

pK1 = pK1
pK2 = pK1 + exp(theta2)
pK3 = pk2 + exp(theta3)

and optimise over pK1, theta2 and theta3.  

As long as you want to know the estimates only.  

Asking for standard errors of the original estimates would open another
can of worms. :-)

Cheers,

Berwin

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] non-linear regression and root finding

2023-11-06 Thread Troels Ring
Thanks a lot, Berwin. Unfortunately, pK1 may well be negative and as I 
understand the literature it may be poorly defined as such, and also 
seems to be at a boundary, since when lower is set to say rep(-4,3) pK1 
is returned as -4 while pK2 and pK3 are undisturbed. Perhaps the point 
is that pK1 is not carrying any information at the pH around 5. Fair 
enough, I guess. Only, I believe I need stick to including all three pK 
values to be in agreement with the molecular information about HEPES - 
although even this is contentious. Be as it may, I wonder if not your 
method might work if only we KNOW that pK1 is either positive OR 
negative, in which case we have pK1 = -exp(theta1)?


Best wishes
Troels

Den 07-11-2023 kl. 05:10 skrev Berwin A Turlach:

G'day Troels,

On Mon, 6 Nov 2023 20:43:10 +0100
Troels Ring  wrote:


Thanks a lot! This was amazing. I'm not sure I see how the conditiion
pK1 < pK2 < pK3 is enforced?

One way of enforcing such constraints (well, in finite computer
arithemtic only "<=" can be enforced) is to rewrite the parameters as:

pK1 = exp(theta1)   ## only if pK1 > 0
pK2 = pK1 + exp(theta2)
pK3 = pk2 + exp(theta3)

And then use your optimiser to optimise over theta1, theta2 and theta3.

There might be better approaches depending on the specific problem.

Cheers,

Berwin


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] non-linear regression and root finding

2023-11-06 Thread Berwin A Turlach
G'day Troels,

On Mon, 6 Nov 2023 20:43:10 +0100
Troels Ring  wrote:

> Thanks a lot! This was amazing. I'm not sure I see how the conditiion
> pK1 < pK2 < pK3 is enforced? 

One way of enforcing such constraints (well, in finite computer
arithemtic only "<=" can be enforced) is to rewrite the parameters as:

pK1 = exp(theta1)   ## only if pK1 > 0
pK2 = pK1 + exp(theta2)
pK3 = pk2 + exp(theta3)

And then use your optimiser to optimise over theta1, theta2 and theta3.

There might be better approaches depending on the specific problem.

Cheers,

Berwin

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] non-linear regression and root finding

2023-11-06 Thread J C Nash

I won't send to list, but just to the two of you, as I don't have
anything to add at this time. However, I'm wondering if this approach
is worth writing up, at least as a vignette or blog post. It does need
a shorter example and some explanation of the "why" and some testing
perhaps.

If there's interest, I'll be happy to join in. And my own posting suggests
how the ordering is enforced by bounding the "delta" parameters from below.

Note that nls() can only handle bounds in the "port" algorithm, and the
man page rather pours cold water on using that algorithm.

Best, JN


On 2023-11-06 14:43, Troels Ring wrote:
Thanks a lot! This was amazing. I'm not sure I see how the conditiion pK1 < pK2 < pK3 is enforced? - it comes from the 
derivation via generalized Henderson-Hasselbalch but perhaps it is not really necessary. Anyway, the use of Vectorize 
did the trick!


Best wishes
Troels

Den 06-11-2023 kl. 19:19 skrev Ivan Krylov:

В Mon, 6 Nov 2023 17:53:49 +0100
Troels Ring  пишет:


Hence I wonder if I could somehow have non linear regression to find
the 3 pK values. Below is HEPESFUNC which delivers charge in the
fluid for known pKs, HEPTOT and SID. Is it possible to have
root-finding in the formula with nls?

Sure. Just reformulate the problem in terms of a function that takes a
vector of predictors (your independent variable SID) and the desired
parameters (pK1, pK2, pK3) as separate arguments and then returns
predicted values of the dependent variable (to compare against pHobs):

kw <- 1e-14 # I'm assuming
pHm <- Vectorize(function(SID, pK1, pK2, pK3)
  -log10(uniroot(HEPESFUNC,c(1e-20,1),tol=1e-20,maxiter=1e4,
 HEPTOT=HEPTOT,SID = SID, pK1=pK1,pK2=pK2,pK3=pK3)$root))

(Yes, Vectorize() doesn't make the function any faster, but I don't see
an easy way to rewrite this function to make it truly vectorised.)

Unfortunately, nls() seems to take a step somewhere where crossprod()
of the Jacobian of the model function cannot be inverted and fails with
the message "Singular gradient". I wish that R could have a more
reliable built-in nonlinear least squares solver. (I could also be
holding it wrong.) Meanwhile, we have excellent CRAN packages nlsr and
minpack.lm:

minpack.lm::nlsLM(
  pHobs ~ pHm(SID, pK1, pK2, pK3),
  data.frame(pHobs = pHobs, SID = SID),
  start = c(pK1 = pK1, pK2 = pK2, pK3 = pK3),
  # the following is also needed to avoid MINPACK failing to fit
  lower = rep(-1, 3), upper = rep(9, 3)
)
# Nonlinear regression model
#   model: pHobs ~ pHm(SID, pK1, pK2, pK3)
#    data: data.frame(pHobs = pHobs, SID = SID)
#    pK1 pK2    pK3
# -1.000   2.966  7.606
#  residual sum-of-squares: 0.001195
#
# Number of iterations to convergence: 15
# Achieved convergence tolerance: 1.49e-08

(Unfortunately, your code seemed to have a lot of non-breakable spaces,
which confuse R's parser and make it harder to copy&paste it.)

I think that your function can also be presented as a degree-5
polynomial in H, so it should also be possible to use polyroot() to
obtain your solutions in a more exact manner.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] non-linear regression and root finding

2023-11-06 Thread Troels Ring
Thanks a lot! This was amazing. I'm not sure I see how the conditiion 
pK1 < pK2 < pK3 is enforced? - it comes from the derivation via 
generalized Henderson-Hasselbalch but perhaps it is not really 
necessary. Anyway, the use of Vectorize did the trick!


Best wishes
Troels

Den 06-11-2023 kl. 19:19 skrev Ivan Krylov:

В Mon, 6 Nov 2023 17:53:49 +0100
Troels Ring  пишет:


Hence I wonder if I could somehow have non linear regression to find
the 3 pK values. Below is HEPESFUNC which delivers charge in the
fluid for known pKs, HEPTOT and SID. Is it possible to have
root-finding in the formula with nls?

Sure. Just reformulate the problem in terms of a function that takes a
vector of predictors (your independent variable SID) and the desired
parameters (pK1, pK2, pK3) as separate arguments and then returns
predicted values of the dependent variable (to compare against pHobs):

kw <- 1e-14 # I'm assuming
pHm <- Vectorize(function(SID, pK1, pK2, pK3)
  -log10(uniroot(HEPESFUNC,c(1e-20,1),tol=1e-20,maxiter=1e4,
 HEPTOT=HEPTOT,SID = SID, pK1=pK1,pK2=pK2,pK3=pK3)$root))

(Yes, Vectorize() doesn't make the function any faster, but I don't see
an easy way to rewrite this function to make it truly vectorised.)

Unfortunately, nls() seems to take a step somewhere where crossprod()
of the Jacobian of the model function cannot be inverted and fails with
the message "Singular gradient". I wish that R could have a more
reliable built-in nonlinear least squares solver. (I could also be
holding it wrong.) Meanwhile, we have excellent CRAN packages nlsr and
minpack.lm:

minpack.lm::nlsLM(
  pHobs ~ pHm(SID, pK1, pK2, pK3),
  data.frame(pHobs = pHobs, SID = SID),
  start = c(pK1 = pK1, pK2 = pK2, pK3 = pK3),
  # the following is also needed to avoid MINPACK failing to fit
  lower = rep(-1, 3), upper = rep(9, 3)
)
# Nonlinear regression model
#   model: pHobs ~ pHm(SID, pK1, pK2, pK3)
#data: data.frame(pHobs = pHobs, SID = SID)
#pK1 pK2pK3
# -1.000   2.966  7.606
#  residual sum-of-squares: 0.001195
#
# Number of iterations to convergence: 15
# Achieved convergence tolerance: 1.49e-08

(Unfortunately, your code seemed to have a lot of non-breakable spaces,
which confuse R's parser and make it harder to copy&paste it.)

I think that your function can also be presented as a degree-5
polynomial in H, so it should also be possible to use polyroot() to
obtain your solutions in a more exact manner.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] non-linear regression and root finding

2023-11-06 Thread J C Nash

Your script is missing something (in particular kw).

I presume you are trying to estimate the pK values. You may have more success
with package nlsr than nls(). nlsr::nlxb() tries to get the Jacobian of the
model specified by a formula and do so by applying symbolic or automatic
differentiation. The multi expression function would probably not work.
(That is one advantage of nls(), but it has an unstabilized solver and unless
carefully called will use a simple forward derivative approximation.)
There is also nlsr::nlfb() that can use functions for the residuals and
jacobian. Your function is messy, but likely the jacobian can be developed
with a bit of work.

Some of the symbolic derivative features of R can help here. Alternatively,
there are several numerical approximations in nlsr and the vignette "Intro
to nlsr" explains how to use these. Note that simple forward and backward
approximations are generally not worth using, but central approximation is
quite good.

The solver in the nlsr package allows of bounds on the parameters. Since you
want them ordered, you may want to transform

   pK1 < pK2 <- pK3

to  pk1, deltapk2, deltapk3 so pK2 = pk1+deltapk2
and pk3 = pk2 + deltapk3 = pk1 + deltapk2 + deltapk3
and all values bounded below by 0 or possibly some small numbers to
keep the paramters apart.

I won't pretend any of this is trivial. It is VERY easy to make small errors
that still allow for output that is disastrously incorrect. If it is possible
to get a single "formula" as one expression even if spread over multiple lines,
then nlxb() might be able to handle it.


J Nash (maintainer of nlsr and optimx)

On 2023-11-06 11:53, Troels Ring wrote:



HEPESFUNC <-
   function(H,SID,HEPTOT,pK1,pK2,pK3) {
     XX <- (H^3/(10^-pK3*10^-pK2*10^-pK1)+H^2/(10^-pK3*10^-pK2)+H/(10^-pK3))
     IV <- HEPTOT/(1+XX)
     I <- IV*H^3/(10^-pK3*10^-pK2*10^-pK1)
     II <- IV*H^2/(10^-pK3*10^-pK2)
     III <- IV*H/10^-pK3
     H - kw/H + SID + I*2 + II - IV
   }


HEPTOT <- 0.050
SID <- c(-seq(10,1)*1e-4,0,seq(1,10)*1e-4)

pHobs  <- c(4.63,4.68,4.72,4.77,4.83,4.9,4.96,5.04,5.12,5.21,5.3,
     5.39,5.48,5.55,5.63,5.69,5.74,5.8,5.85,5.89,5.93)

pK1 <- -1; pK2 <- 3; pK3 <- 7.55 # literature values
pK3 <- 7.6; pK2 <- 2.96 # values eye-balled to be better

pH <- c()
for (i in 1:length(SID)) {
   pH[i] <- -log10(uniroot(HEPESFUNC,c(1e-20,1),tol=1e-20,maxiter=1e4,
    HEPTOT=HEPTOT,SID = SID[i],
    pK1=pK1,pK2=pK2,pK3=pK3)$root)
}

plot(SID,pH)
points(SID,pHobs,col="red")


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] non-linear regression and root finding

2023-11-06 Thread Ivan Krylov
В Mon, 6 Nov 2023 17:53:49 +0100
Troels Ring  пишет:

> Hence I wonder if I could somehow have non linear regression to find
> the 3 pK values. Below is HEPESFUNC which delivers charge in the
> fluid for known pKs, HEPTOT and SID. Is it possible to have
> root-finding in the formula with nls?

Sure. Just reformulate the problem in terms of a function that takes a
vector of predictors (your independent variable SID) and the desired
parameters (pK1, pK2, pK3) as separate arguments and then returns
predicted values of the dependent variable (to compare against pHobs):

kw <- 1e-14 # I'm assuming
pHm <- Vectorize(function(SID, pK1, pK2, pK3)
 -log10(uniroot(HEPESFUNC,c(1e-20,1),tol=1e-20,maxiter=1e4,
HEPTOT=HEPTOT,SID = SID, pK1=pK1,pK2=pK2,pK3=pK3)$root))

(Yes, Vectorize() doesn't make the function any faster, but I don't see
an easy way to rewrite this function to make it truly vectorised.)

Unfortunately, nls() seems to take a step somewhere where crossprod()
of the Jacobian of the model function cannot be inverted and fails with
the message "Singular gradient". I wish that R could have a more
reliable built-in nonlinear least squares solver. (I could also be
holding it wrong.) Meanwhile, we have excellent CRAN packages nlsr and
minpack.lm:

minpack.lm::nlsLM(
 pHobs ~ pHm(SID, pK1, pK2, pK3),
 data.frame(pHobs = pHobs, SID = SID),
 start = c(pK1 = pK1, pK2 = pK2, pK3 = pK3),
 # the following is also needed to avoid MINPACK failing to fit
 lower = rep(-1, 3), upper = rep(9, 3)
)
# Nonlinear regression model
#   model: pHobs ~ pHm(SID, pK1, pK2, pK3)
#data: data.frame(pHobs = pHobs, SID = SID)
#pK1 pK2pK3
# -1.000   2.966  7.606
#  residual sum-of-squares: 0.001195
#
# Number of iterations to convergence: 15 
# Achieved convergence tolerance: 1.49e-08

(Unfortunately, your code seemed to have a lot of non-breakable spaces,
which confuse R's parser and make it harder to copy&paste it.)

I think that your function can also be presented as a degree-5
polynomial in H, so it should also be possible to use polyroot() to
obtain your solutions in a more exact manner.

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.