Re: [R] non-linear regression and root finding

2023-11-06 Thread Berwin A Turlach
G'day Troels,

On Tue, 7 Nov 2023 07:14:02 +0100
Troels Ring  wrote:

> Be as it may, I wonder if not your method might work if only we KNOW
> that pK1 is either positive OR negative, in which case we have pK1 =
> -exp(theta1)?

If pK1 can be either negative or positive (or 0 :-) ), and it is just
the ordering that you want to have enforced, then I would try the
parameterisation:

pK1 = pK1
pK2 = pK1 + exp(theta2)
pK3 = pk2 + exp(theta3)

and optimise over pK1, theta2 and theta3.  

As long as you want to know the estimates only.  

Asking for standard errors of the original estimates would open another
can of worms. :-)

Cheers,

Berwin

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] non-linear regression and root finding

2023-11-06 Thread Troels Ring
Thanks a lot, Berwin. Unfortunately, pK1 may well be negative and as I 
understand the literature it may be poorly defined as such, and also 
seems to be at a boundary, since when lower is set to say rep(-4,3) pK1 
is returned as -4 while pK2 and pK3 are undisturbed. Perhaps the point 
is that pK1 is not carrying any information at the pH around 5. Fair 
enough, I guess. Only, I believe I need stick to including all three pK 
values to be in agreement with the molecular information about HEPES - 
although even this is contentious. Be as it may, I wonder if not your 
method might work if only we KNOW that pK1 is either positive OR 
negative, in which case we have pK1 = -exp(theta1)?


Best wishes
Troels

Den 07-11-2023 kl. 05:10 skrev Berwin A Turlach:

G'day Troels,

On Mon, 6 Nov 2023 20:43:10 +0100
Troels Ring  wrote:


Thanks a lot! This was amazing. I'm not sure I see how the conditiion
pK1 < pK2 < pK3 is enforced?

One way of enforcing such constraints (well, in finite computer
arithemtic only "<=" can be enforced) is to rewrite the parameters as:

pK1 = exp(theta1)   ## only if pK1 > 0
pK2 = pK1 + exp(theta2)
pK3 = pk2 + exp(theta3)

And then use your optimiser to optimise over theta1, theta2 and theta3.

There might be better approaches depending on the specific problem.

Cheers,

Berwin


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] non-linear regression and root finding

2023-11-06 Thread Berwin A Turlach
G'day Troels,

On Mon, 6 Nov 2023 20:43:10 +0100
Troels Ring  wrote:

> Thanks a lot! This was amazing. I'm not sure I see how the conditiion
> pK1 < pK2 < pK3 is enforced? 

One way of enforcing such constraints (well, in finite computer
arithemtic only "<=" can be enforced) is to rewrite the parameters as:

pK1 = exp(theta1)   ## only if pK1 > 0
pK2 = pK1 + exp(theta2)
pK3 = pk2 + exp(theta3)

And then use your optimiser to optimise over theta1, theta2 and theta3.

There might be better approaches depending on the specific problem.

Cheers,

Berwin

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Concordance and Kendall's tau in copula

2023-11-06 Thread Steven Yen

Dear

I estimate a sample selection model using the Clayton copula and Burr 
and Gaussian marginal. I need to derive ther Kendall'sw tau from the 
concordance coefficient by integration. I came across a way to do that 
in R long time ago but cannot find it again. Can somewone tell me what 
to read and what to use? Thank you.


Steven Yen

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] understanding predict.lm

2023-11-06 Thread Spencer Graves

Doh! Thanks very much. sg


On 11/6/23 5:17 PM, John Fox wrote:

Dear Spencer,

You need the t distribution with correct df, not the standard-normal 
distribution:


 > pt(-z.confInt/2, df=13)
     1 2 3 4 5 6 7 8 9    10    11
0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025
    12    13
0.025 0.025

 > pt(-z.predInt/2, df=13)
     1 2 3 4 5 6 7 8 9    10    11
0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025 0.025
    12    13
0.025 0.025

I hope this helps,
  John


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] understanding predict.lm

2023-11-06 Thread Spencer Graves

Hello, All:


	  I am unable to manually replicate predict.lm, specifically comparing 
se.fit with (fit[,3]-fit[,2]): I think their ratio should be 
2*qnorm((1-level)/2), and that's not what I'm getting.



	  Consider the following slight modification of the first example in 
help('predict.lm'):



set.seed(1)
x <- rnorm(15)
y <- x + rnorm(15)
predict(lm(y ~ x))
new <- data.frame(x = seq(-3, 3, 0.5))
predict(lm(y ~ x), new, se.fit = TRUE)
pred.w.plim <- predict(lm(y ~ x), new, interval = "prediction",
   se.fit = TRUE)
pred.w.clim <- predict(lm(y ~ x), new, interval = "confidence",
   se.fit = TRUE)

(z.confInt <- with(pred.w.clim, (fit[,3]-fit[,2])/se.fit))
pnorm(-z.confInt/2)

s.pred <- sqrt(with(pred.w.plim,
se.fit^2+residual.scale^2))
(z.predInt <- with(pred.w.plim, (fit[,3]-fit[,2])/s.pred))
pnorm(-z.predInt/2)


	  ** This gives me 0.01537207. I do not understand why it's not 0.025 
with level = 0.95.



  Can someone help me understand this?
  Thanks,
  Spencer Graves

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] non-linear regression and root finding

2023-11-06 Thread J C Nash

I won't send to list, but just to the two of you, as I don't have
anything to add at this time. However, I'm wondering if this approach
is worth writing up, at least as a vignette or blog post. It does need
a shorter example and some explanation of the "why" and some testing
perhaps.

If there's interest, I'll be happy to join in. And my own posting suggests
how the ordering is enforced by bounding the "delta" parameters from below.

Note that nls() can only handle bounds in the "port" algorithm, and the
man page rather pours cold water on using that algorithm.

Best, JN


On 2023-11-06 14:43, Troels Ring wrote:
Thanks a lot! This was amazing. I'm not sure I see how the conditiion pK1 < pK2 < pK3 is enforced? - it comes from the 
derivation via generalized Henderson-Hasselbalch but perhaps it is not really necessary. Anyway, the use of Vectorize 
did the trick!


Best wishes
Troels

Den 06-11-2023 kl. 19:19 skrev Ivan Krylov:

В Mon, 6 Nov 2023 17:53:49 +0100
Troels Ring  пишет:


Hence I wonder if I could somehow have non linear regression to find
the 3 pK values. Below is HEPESFUNC which delivers charge in the
fluid for known pKs, HEPTOT and SID. Is it possible to have
root-finding in the formula with nls?

Sure. Just reformulate the problem in terms of a function that takes a
vector of predictors (your independent variable SID) and the desired
parameters (pK1, pK2, pK3) as separate arguments and then returns
predicted values of the dependent variable (to compare against pHobs):

kw <- 1e-14 # I'm assuming
pHm <- Vectorize(function(SID, pK1, pK2, pK3)
  -log10(uniroot(HEPESFUNC,c(1e-20,1),tol=1e-20,maxiter=1e4,
 HEPTOT=HEPTOT,SID = SID, pK1=pK1,pK2=pK2,pK3=pK3)$root))

(Yes, Vectorize() doesn't make the function any faster, but I don't see
an easy way to rewrite this function to make it truly vectorised.)

Unfortunately, nls() seems to take a step somewhere where crossprod()
of the Jacobian of the model function cannot be inverted and fails with
the message "Singular gradient". I wish that R could have a more
reliable built-in nonlinear least squares solver. (I could also be
holding it wrong.) Meanwhile, we have excellent CRAN packages nlsr and
minpack.lm:

minpack.lm::nlsLM(
  pHobs ~ pHm(SID, pK1, pK2, pK3),
  data.frame(pHobs = pHobs, SID = SID),
  start = c(pK1 = pK1, pK2 = pK2, pK3 = pK3),
  # the following is also needed to avoid MINPACK failing to fit
  lower = rep(-1, 3), upper = rep(9, 3)
)
# Nonlinear regression model
#   model: pHobs ~ pHm(SID, pK1, pK2, pK3)
#    data: data.frame(pHobs = pHobs, SID = SID)
#    pK1 pK2    pK3
# -1.000   2.966  7.606
#  residual sum-of-squares: 0.001195
#
# Number of iterations to convergence: 15
# Achieved convergence tolerance: 1.49e-08

(Unfortunately, your code seemed to have a lot of non-breakable spaces,
which confuse R's parser and make it harder to copy it.)

I think that your function can also be presented as a degree-5
polynomial in H, so it should also be possible to use polyroot() to
obtain your solutions in a more exact manner.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] non-linear regression and root finding

2023-11-06 Thread Troels Ring
Thanks a lot! This was amazing. I'm not sure I see how the conditiion 
pK1 < pK2 < pK3 is enforced? - it comes from the derivation via 
generalized Henderson-Hasselbalch but perhaps it is not really 
necessary. Anyway, the use of Vectorize did the trick!


Best wishes
Troels

Den 06-11-2023 kl. 19:19 skrev Ivan Krylov:

В Mon, 6 Nov 2023 17:53:49 +0100
Troels Ring  пишет:


Hence I wonder if I could somehow have non linear regression to find
the 3 pK values. Below is HEPESFUNC which delivers charge in the
fluid for known pKs, HEPTOT and SID. Is it possible to have
root-finding in the formula with nls?

Sure. Just reformulate the problem in terms of a function that takes a
vector of predictors (your independent variable SID) and the desired
parameters (pK1, pK2, pK3) as separate arguments and then returns
predicted values of the dependent variable (to compare against pHobs):

kw <- 1e-14 # I'm assuming
pHm <- Vectorize(function(SID, pK1, pK2, pK3)
  -log10(uniroot(HEPESFUNC,c(1e-20,1),tol=1e-20,maxiter=1e4,
 HEPTOT=HEPTOT,SID = SID, pK1=pK1,pK2=pK2,pK3=pK3)$root))

(Yes, Vectorize() doesn't make the function any faster, but I don't see
an easy way to rewrite this function to make it truly vectorised.)

Unfortunately, nls() seems to take a step somewhere where crossprod()
of the Jacobian of the model function cannot be inverted and fails with
the message "Singular gradient". I wish that R could have a more
reliable built-in nonlinear least squares solver. (I could also be
holding it wrong.) Meanwhile, we have excellent CRAN packages nlsr and
minpack.lm:

minpack.lm::nlsLM(
  pHobs ~ pHm(SID, pK1, pK2, pK3),
  data.frame(pHobs = pHobs, SID = SID),
  start = c(pK1 = pK1, pK2 = pK2, pK3 = pK3),
  # the following is also needed to avoid MINPACK failing to fit
  lower = rep(-1, 3), upper = rep(9, 3)
)
# Nonlinear regression model
#   model: pHobs ~ pHm(SID, pK1, pK2, pK3)
#data: data.frame(pHobs = pHobs, SID = SID)
#pK1 pK2pK3
# -1.000   2.966  7.606
#  residual sum-of-squares: 0.001195
#
# Number of iterations to convergence: 15
# Achieved convergence tolerance: 1.49e-08

(Unfortunately, your code seemed to have a lot of non-breakable spaces,
which confuse R's parser and make it harder to copy it.)

I think that your function can also be presented as a degree-5
polynomial in H, so it should also be possible to use polyroot() to
obtain your solutions in a more exact manner.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] non-linear regression and root finding

2023-11-06 Thread J C Nash

Your script is missing something (in particular kw).

I presume you are trying to estimate the pK values. You may have more success
with package nlsr than nls(). nlsr::nlxb() tries to get the Jacobian of the
model specified by a formula and do so by applying symbolic or automatic
differentiation. The multi expression function would probably not work.
(That is one advantage of nls(), but it has an unstabilized solver and unless
carefully called will use a simple forward derivative approximation.)
There is also nlsr::nlfb() that can use functions for the residuals and
jacobian. Your function is messy, but likely the jacobian can be developed
with a bit of work.

Some of the symbolic derivative features of R can help here. Alternatively,
there are several numerical approximations in nlsr and the vignette "Intro
to nlsr" explains how to use these. Note that simple forward and backward
approximations are generally not worth using, but central approximation is
quite good.

The solver in the nlsr package allows of bounds on the parameters. Since you
want them ordered, you may want to transform

   pK1 < pK2 <- pK3

to  pk1, deltapk2, deltapk3 so pK2 = pk1+deltapk2
and pk3 = pk2 + deltapk3 = pk1 + deltapk2 + deltapk3
and all values bounded below by 0 or possibly some small numbers to
keep the paramters apart.

I won't pretend any of this is trivial. It is VERY easy to make small errors
that still allow for output that is disastrously incorrect. If it is possible
to get a single "formula" as one expression even if spread over multiple lines,
then nlxb() might be able to handle it.


J Nash (maintainer of nlsr and optimx)

On 2023-11-06 11:53, Troels Ring wrote:



HEPESFUNC <-
   function(H,SID,HEPTOT,pK1,pK2,pK3) {
     XX <- (H^3/(10^-pK3*10^-pK2*10^-pK1)+H^2/(10^-pK3*10^-pK2)+H/(10^-pK3))
     IV <- HEPTOT/(1+XX)
     I <- IV*H^3/(10^-pK3*10^-pK2*10^-pK1)
     II <- IV*H^2/(10^-pK3*10^-pK2)
     III <- IV*H/10^-pK3
     H - kw/H + SID + I*2 + II - IV
   }


HEPTOT <- 0.050
SID <- c(-seq(10,1)*1e-4,0,seq(1,10)*1e-4)

pHobs  <- c(4.63,4.68,4.72,4.77,4.83,4.9,4.96,5.04,5.12,5.21,5.3,
     5.39,5.48,5.55,5.63,5.69,5.74,5.8,5.85,5.89,5.93)

pK1 <- -1; pK2 <- 3; pK3 <- 7.55 # literature values
pK3 <- 7.6; pK2 <- 2.96 # values eye-balled to be better

pH <- c()
for (i in 1:length(SID)) {
   pH[i] <- -log10(uniroot(HEPESFUNC,c(1e-20,1),tol=1e-20,maxiter=1e4,
    HEPTOT=HEPTOT,SID = SID[i],
    pK1=pK1,pK2=pK2,pK3=pK3)$root)
}

plot(SID,pH)
points(SID,pHobs,col="red")


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] non-linear regression and root finding

2023-11-06 Thread Ivan Krylov
В Mon, 6 Nov 2023 17:53:49 +0100
Troels Ring  пишет:

> Hence I wonder if I could somehow have non linear regression to find
> the 3 pK values. Below is HEPESFUNC which delivers charge in the
> fluid for known pKs, HEPTOT and SID. Is it possible to have
> root-finding in the formula with nls?

Sure. Just reformulate the problem in terms of a function that takes a
vector of predictors (your independent variable SID) and the desired
parameters (pK1, pK2, pK3) as separate arguments and then returns
predicted values of the dependent variable (to compare against pHobs):

kw <- 1e-14 # I'm assuming
pHm <- Vectorize(function(SID, pK1, pK2, pK3)
 -log10(uniroot(HEPESFUNC,c(1e-20,1),tol=1e-20,maxiter=1e4,
HEPTOT=HEPTOT,SID = SID, pK1=pK1,pK2=pK2,pK3=pK3)$root))

(Yes, Vectorize() doesn't make the function any faster, but I don't see
an easy way to rewrite this function to make it truly vectorised.)

Unfortunately, nls() seems to take a step somewhere where crossprod()
of the Jacobian of the model function cannot be inverted and fails with
the message "Singular gradient". I wish that R could have a more
reliable built-in nonlinear least squares solver. (I could also be
holding it wrong.) Meanwhile, we have excellent CRAN packages nlsr and
minpack.lm:

minpack.lm::nlsLM(
 pHobs ~ pHm(SID, pK1, pK2, pK3),
 data.frame(pHobs = pHobs, SID = SID),
 start = c(pK1 = pK1, pK2 = pK2, pK3 = pK3),
 # the following is also needed to avoid MINPACK failing to fit
 lower = rep(-1, 3), upper = rep(9, 3)
)
# Nonlinear regression model
#   model: pHobs ~ pHm(SID, pK1, pK2, pK3)
#data: data.frame(pHobs = pHobs, SID = SID)
#pK1 pK2pK3
# -1.000   2.966  7.606
#  residual sum-of-squares: 0.001195
#
# Number of iterations to convergence: 15 
# Achieved convergence tolerance: 1.49e-08

(Unfortunately, your code seemed to have a lot of non-breakable spaces,
which confuse R's parser and make it harder to copy it.)

I think that your function can also be presented as a degree-5
polynomial in H, so it should also be possible to use polyroot() to
obtain your solutions in a more exact manner.

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] strptime with +03:00 zone designator

2023-11-06 Thread Joshua Ulrich
On Mon, Nov 6, 2023 at 3:02 AM Martin Maechler
 wrote:
>
> > Richard O'Keefe
> > on Mon, 6 Nov 2023 18:37:34 +1300 writes:
>
> > Thanks to all who replied.  On Mon, 6 Nov 2023 at 18:37,
> > Richard O'Keefe  wrote:
>
> >> OK, so the consensus is (1) One cannot make strptime
> >> accept ISO8601-compliant zone designators (2) The
> >> lubridate package can (3) Or one can hack away with
> >> regex.  Lubridate it is, then.
> >>
> >> But I do regard strptime's inability to process
> >> ISO8601-compliant zone designators as a bug.
>
> Did you try to submit it to R's bugzilla?
>
> It's the first time I hear of this "Feature" of the ISO
> standard, but then I'm not at all a timezone, and even less an
> ISO standard expert.
>
FWIW, the timezone offset format (%z) is handled in
src/main/Rstrptime.h around line 987. There's a comment that only the
RFC 822 form is recognized (+/-HHMM). The fix may be as simple as
ignoring the ':' in the while() loop.

> Best,
> Martin

>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] non-linear regression and root finding

2023-11-06 Thread Troels Ring
Dear friends - I have a function for the charge in a fluid (water) 
buffered with HEPES and otherwise only containing Na and Cl so that [Na] 
- [Cl] = SID (strong ion difference) goes from -1 mM to 1 mM. With known 
SID and total HEPES concentration I can calculate accurately the pH if I 
know 3 pK values for HEPES by finding the single root with uniroot


Now, the problem is that there is some disagreement in the literature 
what is the correct value for the 3 pKs. I know the 3 pK values have the 
relationship pK1 < pK2 <- pK3 and for the most common formulation of 
HEPES I know the charge on fully protonated species is 2. Hence I can 
generate a huge number of pK values from uniform distribution by taking 
sort(runif(3,-1,9) and make sure there are no ties and then run all 
those triplets of pK values to find pH values and thereby find the 
lowest deviation vis a vis measured values. That works but requires many 
triplets and is not satisfying. Hence I wonder if I could somehow have 
non linear regression to find the 3 pK values. Below is HEPESFUNC which 
delivers charge in the fluid for known pKs, HEPTOT and SID. Is it 
possible to have root-finding in the formula with nls? (I know the 
precision asked for is extreme but it has worked well in really many 
applications).


All best wishes

Troels Ring, MD
Aalborg, Denmark


HEPESFUNC <-
  function(H,SID,HEPTOT,pK1,pK2,pK3) {
    XX <- (H^3/(10^-pK3*10^-pK2*10^-pK1)+H^2/(10^-pK3*10^-pK2)+H/(10^-pK3))
    IV <- HEPTOT/(1+XX)
    I <- IV*H^3/(10^-pK3*10^-pK2*10^-pK1)
    II <- IV*H^2/(10^-pK3*10^-pK2)
    III <- IV*H/10^-pK3
    H - kw/H + SID + I*2 + II - IV
  }


HEPTOT <- 0.050
SID <- c(-seq(10,1)*1e-4,0,seq(1,10)*1e-4)

pHobs  <- c(4.63,4.68,4.72,4.77,4.83,4.9,4.96,5.04,5.12,5.21,5.3,
    5.39,5.48,5.55,5.63,5.69,5.74,5.8,5.85,5.89,5.93)

pK1 <- -1; pK2 <- 3; pK3 <- 7.55 # literature values
pK3 <- 7.6; pK2 <- 2.96 # values eye-balled to be better

pH <- c()
for (i in 1:length(SID)) {
  pH[i] <- -log10(uniroot(HEPESFUNC,c(1e-20,1),tol=1e-20,maxiter=1e4,
   HEPTOT=HEPTOT,SID = SID[i],
   pK1=pK1,pK2=pK2,pK3=pK3)$root)
}

plot(SID,pH)
points(SID,pHobs,col="red")

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] I need to create new variables based on two numeric variables and one dichotomize conditional category

2023-11-06 Thread Jorgen Harmse via R-help
Avi: Thank you for checking. I think the optimization is limited. If test is 
all TRUE or all FALSE then at most one vector is evaluated. Anything beyond 
that would be very complicated. (Inspect the two expressions and verify that 
both specify elementwise computations. Then use indexing to shrink the input 
properly. Take into account all recycling rules for binary operations.)


> ifelse(0:1, log(-1:0), 1:2)

Warning in log(-1:0) : NaNs produced

[1]1 -Inf

> ifelse(c(FALSE,FALSE), log(-1:0), 1:2)

[1] 1 2

I agree that nested ifelse is cumbersome. I wrote a function to address that:


#' Nested conditional element selection

#'

#' \code{ifelses(test1,yes1,test2,yes2,,no)} is shorthand for

#' \code{ifelse(test1,yes1,ifelse(test2,yes2,,no))}. The inputs should

#' not be named.

#'

#' @param test1 usually \code{test} for the outer call to \code{\link{ifelse}}

#' @param yes1 \code{yes} for the outer call to \code{ifelse}

#' @param ... usually the \code{(test,yes)} for nested calls followed by 
\code{no}

#' for the innermost call to \code{ifelse}

#'

#' @note There must be an odd number of inputs. If there is exactly one input 
then it is

#' returned (unless it is named \code{yes1}): this supports the recursive 
implementation.

#'

#' @return a vector with entries from \code{yes1} where \code{test1} is 
\code{TRUE}, else from

#' \code{yes2} where \code{test2} is \code{TRUE}, ..., and from \code{no} where 
none of

#' the conditions holds

#'

#' @export



ifelses <- function(test1,yes1,...)

{ if (missing(test1))

  { if (!missing(yes1) || length(L <- list(...)) != 1L)

  stop("Wrong number of arguments or confusing argument names.")

return(L[[1L]])

  }

  if (missing(yes1))

  { if (length(L <- list(...)) != 0L)

  stop("Wrong number of arguments or confusing argument names.")

return(test1)

  }

  return( ifelse(test1, yes1, ifelses(...)) )

}

Regards,
Jorgen Harmse (not Jordan).

--

Message: 10
Date: Sat, 4 Nov 2023 01:08:03 -0400
From: 
To: "'Jorgen Harmse'" 
Cc: 
Subject: Re: [R] [EXTERNAL] RE: I need to create new variables based
on two numeric variables and one dichotomize conditional category
variables.
Message-ID: <019a01da0edc$e41c39e0$ac54ada0$@gmail.com>
Content-Type: text/plain; charset="utf-8"

To be fair, Jordan, I think R has some optimizations so that the arguments
in some cases are NOT evaluated until needed. So only one or the other
choice ever gets evaluated for each row. My suggestion merely has
typographic implications and some aspects of clarity and minor amounts of
less memory and parsing needed.

But ifelse() is currently implemented somewhat too complexly for my taste.
Just type "ifelse" at the prompt and you will see many lines of code that
handle various scenarios.

�

If you later want to add categories such as �transgender� with a value of 61 or 
have other numbers for groups like �Hispanic male�, you can amend the 
instructions as long as you put your conditions in an order so that they are 
tried until one of them matches, or it takes the default. Yes, in a sense the 
above is doable using a deeply nested ifelse() but easier for me to read and 
write and evaluate. It may not be more efficient or may be as some of dplyr is 
compiled code.






[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [EXTERNAL] Re: I need to create new variables based on two numeric variables and one dichotomize conditional category variables.

2023-11-06 Thread Jorgen Harmse via R-help
That�s ingenious, but I would hesitate to rely on a specific mapping between 
strings and integers. (I usually read data frames with stringsAsFactors=FALSE 
or coerce to character later: I don�t think it takes more memory.) Maybe create 
another column with the coefficients. What if gender is part of another formula?

Regards,
Jorgen Harmse.

From: CALUM POLWART 
Date: Saturday, November 4, 2023 at 18:23
To: avi.e.gr...@gmail.com 
Cc: Jorgen Harmse , r-help@r-project.org 
, mkzama...@gmail.com 
Subject: [EXTERNAL] Re: [R] I need to create new variables based on two numeric 
variables and one dichotomize conditional category variables.
I might have factored the gender.

I'm not sure it would in any way be quicker.  But might be to some extent 
easier to develop variations of. And is sort of what factors should be doing...

# make dummy data
gender <- c("Male", "Female", "Male", "Female")
WC <- c(70,60,75,65)
TG <- c(0.9, 1.1, 1.2, 1.0)
myDf <- data.frame( gender, WC, TG )

# label a factor
myDf$GF <- factor(myDf$gender, labels= c("Male"=65, "Female"=58))

# do the maths
myDf$LAP <- (myDf$WC - as.numeric(myDf$GF))* myDf$TG

#show results
head(myDf)

gender WC  TG GF  LAP
1   Male 70 0.9 58 61.2
2 Female 60 1.1 65 64.9
3   Male 75 1.2 58 87.6
4 Female 65 1.0 65 64.0


(Reality: I'd have probably used case_when in tidy to create a new numeric 
column)




The equation to
calculate LAP is different for male and females. I am giving both equations
below.

LAP for male = (WC-65)*TG
LAP for female = (WC-58)*TG

My question is 'how can I calculate the LAP and create a single new column?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cryptic error for mscmt function

2023-11-06 Thread Ivan Krylov
В Sun, 5 Nov 2023 13:35:39 +
"Leu  Thierry"  пишет:

> However, when trying to run it I get a very cryptic error message
> saying  "Error in lst[[nam]][intersect(tim, rownames(lst[[nam]])),
> cols, drop = FALSE]: subscript out of bounds".

Without a way to reproduce the error, I can offer a few bits of generic
advice:

1. Use traceback() to find out where the error happens. You can then
type the name of the function at the R prompt to read its source code
(although most likely without the comments).

2. Even better, set options(error = recover) before running your code
and have the debugger launched at the point where the error happens.
Use the debugger (see help(browser) to look at every variable and find
out why indeed lst[[nam]] doesn't seem to contain rows with names
intersect(tim, rownames(lst[[nam]])) and/or columns with names `cols`.

3. See the free book The R Inferno
 for more
advice on debugging R code.

-- 
Best regards,
Ivan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cryptic error for mscmt function

2023-11-06 Thread Rui Barradas

Às 13:35 de 05/11/2023, Leu Thierry escreveu:

Hi everyone,


I am trying to conduct a synthetic control analysis using the MSCMT package. However, when 
trying to run it I get a very cryptic error message saying  "Error in 
lst[[nam]][intersect(tim, rownames(lst[[nam]])), cols, drop = FALSE]: subscript out of 
bounds". Does anyone know what this means and why I receive this error? I attached the 
code & dataset used in the attachment. Thanks a lot!


Best regards

Thierry

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Hello,

No attachment came through the filters, can you resend in plain text or 
if it was a .R file, rename it .txt?


See [1], section General Instructions for more on this

[1] https://www.r-project.org/mail.html#instructions

Hope this helps,

Rui Barradas

--
Este e-mail foi analisado pelo software antivírus AVG para verificar a presença 
de vírus.
www.avg.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Cryptic error for mscmt function

2023-11-06 Thread Leu Thierry
Hi everyone,


I am trying to conduct a synthetic control analysis using the MSCMT package. 
However, when trying to run it I get a very cryptic error message saying  
"Error in lst[[nam]][intersect(tim, rownames(lst[[nam]])), cols, drop = FALSE]: 
subscript out of bounds". Does anyone know what this means and why I receive 
this error? I attached the code & dataset used in the attachment. Thanks a lot!


Best regards

Thierry

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] strptime with +03:00 zone designator

2023-11-06 Thread Martin Maechler
> Richard O'Keefe 
> on Mon, 6 Nov 2023 18:37:34 +1300 writes:

> Thanks to all who replied.  On Mon, 6 Nov 2023 at 18:37,
> Richard O'Keefe  wrote:

>> OK, so the consensus is (1) One cannot make strptime
>> accept ISO8601-compliant zone designators (2) The
>> lubridate package can (3) Or one can hack away with
>> regex.  Lubridate it is, then.
>> 
>> But I do regard strptime's inability to process
>> ISO8601-compliant zone designators as a bug.

Did you try to submit it to R's bugzilla?

It's the first time I hear of this "Feature" of the ISO
standard, but then I'm not at all a timezone, and even less an
ISO standard expert.

Best,
Martin


>> On Mon, 6 Nov 2023 at 13:18, jim holtman
>>  wrote:
>> 
>>> try using 'lubridate'
>>> 
>>> > library(lubridate)Attaching package: ‘lubridate’
>>> 
>>> The following objects are masked from ‘package:base’:
>>> 
>>> date, intersect, setdiff, union > x <-
>>> "2017-02-28T13:35:00+03:00"> ymd_hms(x)[1] "2017-02-28
>>> 10:35:00 UTC"
>>> 
>>> >
>>> 
>>> 
>>> 
>>> Thanks
>>> 
>>> Jim Holtman *Data Munger Guru*
>>> 
>>> 
>>> *What is the problem that you are trying to solve?Tell
>>> me what you want to do, not how you want to do it.*
>>> 
>>> 
>>> On Sun, Nov 5, 2023 at 3:45 PM Richard O'Keefe
>>>  wrote:
>>> 
 I have some data that includes timestamps like this:
 2017-02-28T13:35:00+03:00 The documentation for
 strptime says that %z expects an offset like 0300.  I
 don't see any way in the documentation to get it to
 accept +hh:mm with a colon separator, and everything I
 tried gave me NA as the answer.
 
 Section 4.2.5.1 of ISO 8601:2004(E) allows both the
 absence of colons in +hh[mm] (basic format) and the
 presence of colons in +hh:mm (extended format).  Again
 in section 4.2.5.2 where a zone offset is combined with
 a time of day: if you have hh:mm:ss you are using
 extended format and the offset MUST have a colon; if
 you have hhmmss you are using basic format and the
 offset MUST NOT have a colon.  And again in section
 4.3.2 (complete representations of date and time of
 day).  If you use hyphens and colons in the date and
 time part you MUST have a colon in the zone designator.
 
 So I am dealing with timestamps in strict ISO 8601
 complete extended representation, and it is rather
 frustrating that strptime doesn't deal with it simply.
 
 The simplest thing would be for R's own version of
 strptime to allow an optional colon between the hour
 digits and the minute digits of a zone designator.
 
 I'm about to clone the data source and edit it to
 remove the colons, but is there something obvious I am
 missing?
 
 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and
 more, see https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html and provide
 commented, minimal, self-contained, reproducible code.
 
>>> 

>   [[alternative HTML version deleted]]

> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and
> more, see https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html and provide
> commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.