Re: [R] test2r.mengz1(X)

2024-05-13 Thread Bert Gunter
... but maybe cocor() in the cocor package will do what you want.
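
Something along these lines might do it -- a sketch only, assuming the cocor
interface for dependent correlations with one overlapping variable and the
poster's correlation objects and sample size:

library(cocor)
## r.jk and r.jh share the overlapping variable; r.kh is the correlation
## between the two non-overlapping variables; n is the sample size
cocor.dep.groups.overlap(r.jk = corAVOBMI, r.jh = corAVVBMI,
                         r.kh = corAVOAVV, n = 39,
                         test = "meng1975")   # Meng, Rosenthal & Rubin z1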

-- Bert


On Mon, May 13, 2024 at 1:57 PM Duncan Murdoch  wrote:
>
> Google says that function is in the bcdstats package, which isn't on
> CRAN.  It appears to be a private package for a course, kept on Github.
>
> Duncan Murdoch
>
> On 2024-05-13 12:24 p.m., Alligand, Justine wrote:
> > Dear participants and subscribers of the R-help mailing list,
> >
> > I would like to compare two dependent correlations with one overlapping 
> > variable. The Meng Z1 method seemed suitable for this purpose. I found out 
> > that the "cocor" package is required for this instrument. I checked several 
> > times whether the package was installed and activated and both were the 
> > case. I have also tried the "psych" package, but I get the same error in 
> > both cases:
> >> library(cocor)
> >> library(psych)
> >> test2r.mengz1(corAVOBMI, corAVVBMI, corAVOAVV, 39)
> > Error in test2r.mengz1(corAVOBMI, corAVVBMI, corAVOAVV, 39) :
> >could not find function "test2r.mengz1" (Translated from German)
> > Do I not have the right package? Or can you recognize another error?
> > Many thanks in advance.
> >
> > Yours faithfully
> > Justine Alligand
> >
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] [R-sig-ME] lmer error: number of observations <= number of random effects

2024-05-07 Thread Bert Gunter
I think you should seek out a local statistician with whom to consult if at
all possible, as the details of your research goals and the nature of the
data you have to meet those goals matter and cannot be effectively
discussed in a remote forum like this. That is, to be blunt, you seem to be at
risk of producing junk. Just my opinion, which you are of course free to
ignore. Certainly, no response needed, and I will not say anything further.

Cheers,
Bert

On Tue, May 7, 2024 at 9:12 AM Srinidhi Jayakumar via R-help <
r-help@r-project.org> wrote:

> Thank you very  much for your responses!
>
> What if I reduce the model to
>  modelLSI3 <- lmer(SA ~ Index1* LSI+ (1+LSI |ID),data = LSIDATA, control =
> lmerControl(optimizer ="bobyqa"), REML=TRUE).
> This would allow me to see the random effects of LSI and I can drop the
> random effect of age (Index1) since I can see that in the unconditional
> model [model0 <- lmer(SA ~ Index1+ (1+Index1|ID),data = LSIDATA, control =
> lmerControl(optimizer ="bobyqa"), REML=TRUE)]. Would the modelLSI3 also
> have a type 1 error?
>
> Thank you,
> Srinidhi
>
>
>
>
> On Mon, 6 May 2024, 03:11 TT FF,  wrote:
>
> > See if this paper helps with reducing the model when you have
> > few observations; the (1|ID)-only structure may increase the type 1 error.
> > https://journals.sagepub.com/doi/10.1177/25152459231214454
> >
> > Best
> >
> > On 6 May 2024, at 07:45, Thierry Onkelinx via R-sig-mixed-models <
> > r-sig-mixed-mod...@r-project.org> wrote:
> >
> > Dear Srinidhi,
> >
> > You are trying to fit 1 random intercept and 2 random slopes per
> > individual, while you have at most 3 observations per individual. You
> > simply don't have enough data to fit the random slopes. Reduce the random
> > part to (1|ID).
> >
> > Best regards,
> >
> > Thierry
> >
> > ir. Thierry Onkelinx
> > Statisticus / Statistician
> >
> > Vlaamse Overheid / Government of Flanders
> > INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
> AND
> > FOREST
> > Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
> > thierry.onkel...@inbo.be
> > Havenlaan 88 bus 73, 1000 Brussel
> > *Postal address:* Koning Albert II-laan 15 bus 186, 1210 Brussel
> > *Mail sent to this address is scanned and delivered to the addressee
> > digitally, so that the Government of Flanders can handle its files fully
> > digitally. Mail marked ‘vertrouwelijk’ (confidential) is not scanned but
> > delivered to the addressee unopened.*
> > www.inbo.be
> >
> >
> >
> ///
> > To call in the statistician after the experiment is done may be no more
> > than asking him to perform a post-mortem examination: he may be able to
> say
> > what the experiment died of. ~ Sir Ronald Aylmer Fisher
> > The plural of anecdote is not data. ~ Roger Brinner
> > The combination of some data and an aching desire for an answer does not
> > ensure that a reasonable answer can be extracted from a given body of
> data.
> > ~ John Tukey
> >
> >
> ///
> >
> > 
> >
> >
> > Op ma 6 mei 2024 om 01:59 schreef Srinidhi Jayakumar via
> R-sig-mixed-models
> > :
> >
> > I am running a multilevel growth curve model to examine predictors of
> > social anhedonia (SA) trajectory through ages 12, 15 and 18. SA is a
> > continuous numeric variable. The age variable (Index1) has been coded as
> 0
> > for age 12, 1 for age 15 and 2 for age 18. I am currently using a time
> > varying predictor, stress (LSI), which was measured at ages 12, 15 and
> 18,
> > to examine whether trajectory/variation in LSI predicts difference in SA
> > trajectory. LSI is a continuous numeric variable and was grand-mean
> > centered before using in the models. The data has been converted to long
> > format with SA in 1 column, LSI in the other, ID in another, and age in
> > another column. I used the code below to run my model using lmer.
> However,
> > I get the following error. Please let me know how I can solve this error.
> > Please note that I have 50% missing data in SA at age 12.
> > modelLSI_maineff_RE <- lmer(SA ~ Index1* LSI+ (1 + Index1+LSI |ID),
> >   data = LSIDATA, control = lmerControl(optimizer ="bobyqa"), REML=TRUE)
> > summary(modelLSI_maineff_RE)
> > Error: number of observations (=1080) <= number of random effects (=1479)
> > for term (1 + Index1 + LSI | ID); the random-effects parameters and the
> > residual variance (or scale parameter) are probably unidentifiable
> >
> > I did test the within-person variance for the LSI variable and the
> > within-person variance is significant from the Greenhouse-Geisser,
> > Huynh-Feldt tests.
> >
> > I also tried control = lmerControl(check.nobs.vs.nRE = "ignore") which
> gave
> > me the following output. modelLSI_maineff_RE <- lmer(SA ~ Index1* LSI+
> (1 +
> > Index1+LSI |ID), data 
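
A minimal sketch of the reduction Thierry suggests above (random intercept
only), assuming the lme4 package and the poster's LSIDATA:

library(lme4)
modelLSI_RI <- lmer(SA ~ Index1 * LSI + (1 | ID), data = LSIDATA,
                    control = lmerControl(optimizer = "bobyqa"), REML = TRUE)
summary(modelLSI_RI)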

Re: [R] x[0]: Can '0' be made an allowed index in R?

2024-04-23 Thread Bert Gunter
"This works with any single-index value, and lets all the existing
operations for such values continue to work."

As Peter Dalgaard already pointed out, that is false.

> x <- 1:4
> x[-1]
[1] 2 3 4
> elt(x,-0)
[1] 1
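
(The underlying reason: R has no distinct "negative zero", so elt(x, -0) is
just x[0 + 1], i.e. x[1], and the drop-the-first-element meaning of x[-1]
cannot be expressed through the i + 1 shift.)

> -0 == 0
[1] TRUE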

Cheers,
Bert

On Tue, Apr 23, 2024 at 4:55 PM Richard O'Keefe  wrote:
>
> Does it have to use square bracket syntax?
> elt <- function (x, i) x[i+1]
> "elt<-" <- function (x, i, value) { x[i+1] <- value; x }
>
> > u <- c("A","B","C")
> > elt(u,0)
> [1] "A"
> > elt(u,length(u)-1)
> [1] "C"
> > elt(u,0) <- "Z"
> > u
> [1] "Z" "B" "C"
>
> This works with any single-index value, and lets all the existing
> operations for such values continue to work.  It seems to me to be the
> simplest and cleanest way to do things, and has the advantage of
> highlighting to a human reader that this is NOT normal R indexing.
>
> On Sun, 21 Apr 2024 at 19:56, Hans W  wrote:
> >
> > As we all know, in R indices for vectors start with 1, i.e., x[0] is not a
> > correct expression. Some algorithms, e.g. in graph theory or combinatorics,
> > are much easier to formulate and code if 0 is an allowed index pointing to
> > the first element of the vector.
> >
> > Some programming languages, for instance Julia (where the index for normal
> > vectors also starts with 1), provide libraries/packages that allow the user
> > to define an index range for its vectors, say 0:9 or 10:20 or even negative
> > indices.
> >
> > Of course, this notation would only be feasible for certain specially
> > defined vectors. Is there a library that provides this functionality?
> > Or is there a simple trick to do this in R? The expression 'x[0]' must
> > be possible, does this mean the syntax of R has to be twisted somehow?
> >
> > Thanks, Hans W.
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] System GMM yields identical results for any weighting matrix

2024-04-23 Thread Bert Gunter
Sorry, my advice is incorrect (and I should have known better!). Task
views are to search for relevant packages, not for posting questions.
R "special interest groups" (SIGs) are for the latter, which in your
case might be this one:
https://stat.ethz.ch/mailman/listinfo/r-sig-finance .

Cheers,
Bert


On Tue, Apr 23, 2024 at 11:45 AM Bert Gunter  wrote:
>
> Generally speaking, this sort of detailed statistical question about a
> special package in R does not get a reply on this general R
> programming help list. Instead, I suggest you either email the
> maintainer (found by ?maintainer) or ask a question on a relevant R
> task view, such as
> https://cran.r-project.org/web/views/Econometrics.html . (or any other
> that you judge to be more appropriate).
>
> Cheers,
> Bert
>
> On Tue, Apr 23, 2024 at 9:41 AM Richard Hardy  wrote:
> >
> > A copy of this question can be found on Cross Validated:
> > https://stats.stackexchange.com/questions/645362
> >
> > I am estimating a system of seemingly unrelated regressions (SUR) in R.
> > Each of the equations has one unique regressor and one common regressor. I
> > am using `gmm::sysGmm` and am experimenting with different weighting
> > matrices. I get the same results (point estimates, standard errors and
> > anything else that I can see (**except** for the value of the $J$-test)
> > regardless of the weighting matrix. I do not think this is correct.
> > The phenomenon persists regardless of what type of covariance matrix
> > estimator I use: `MDS`, `CondHom` or `HAC`.
> > It also persists regardless of whether I use unrestricted estimation or
> > restrict the coefficients on one of the variables (the common regressor) to
> > be equal across equations.
> >
> > **Question:** Why does system GMM via `gmm::sysGmm` yield identical results
> > for any weighting matrix? How can I make it yield proper results that vary
> > with the weighting matrix (if that makes sense and I am not mistaken, of
> > course)?
> >
> > -- R code for a reproducible example
> >
> > library(gmm)
> > library(systemfit)
> >
> > # Generate and prepare the data
> > n <- 1000 # sample size
> > m <- 100  # length of the "second part" of the sample
> > N <- 3    # number of equations
> > set.seed(321); x <- matrix(rnorm(n*N),ncol=N); colnames(x) <-
> > paste0("x",1:N) # generate regressors
> > dummy <- c( rep(0,n-m), rep(1,m) ) # generate a common regressor
> > x <- cbind(x,dummy)    # include the common regressor with the rest of the regressors
> > set.seed(123); y <- matrix(rnorm(n*N),ncol=N); colnames(y) <-
> > paste0("y",1:N) # a placeholder for dependent variables
> > for(i in 1:N){
> >  y[,i] <- i + sqrt(i)*x[,i] - i*dummy + y[,i]*15*sqrt(i)
> >  # y[,i] is a linear function of x[,i] and dummy,
> >  # plus an error term with equation-specific variance
> > }
> > data1 <- as.data.frame(cbind(y,x)) # create a data frame of all data (y and
> > x)
> >
> > # Create the model equations and moment conditions
> > ES_g = ES_h <- list() # ES ~ equation system
> > for(i in 1:N){
> >  ES_g[[i]] <- as.formula(assign(paste0("eq",i), value=paste0("y",i," ~
> > x",i," + dummy"))) # define linear equations of SUR
> >  ES_h[[i]] <- as.formula(assign(paste0("eq",i), value=paste0(   "~
> > x",i," + dummy"))) # define the moment conditions for GMM
> > }
> >
> > # Estimate a WLS-type weighting matrix to use as a user-specified weighting
> > matrix in GMM
> > m0 <- systemfit(formula=ES_g, method="OLS", data=data1)
> > OLSmat <- diag(diag(m0$residCov)); Wmat <- solve(OLSmat)
> >
> > # Choose the type of covariance matrix in GMM
> > vc1 <- "MDS"
> > vc1 <- "CondHom"
> > vc1 <- "HAC"
> > #vc1 <- "TrueFixed"
> >
> > # Choose between restricted and unrestricted estimation
> > cec1=NULL # unrestricted
> > cec1=3    # restrict the coefficient on the dummy to be equal across
> > equations
> >
> > # Estimate the model with `sysGmm` using different weighting matrices:
> > identity, "optimal" and manually specified
> > m1a <- sysGmm(g=ES_g, h=ES_h, wmatrix="ident"  , weightsMatrix=NULL,
> > vcov=vc1, crossEquConst=cec1, data=data1); summary(m1a)
> > m1b <- sysGmm(g=ES_g, h=ES_h, wmatrix="optimal", weightsMatrix=NULL,
> > vcov=vc1, crossEqu

Re: [R] System GMM yields identical results for any weighting matrix

2024-04-23 Thread Bert Gunter
Generally speaking, this sort of detailed statistical question about a
special package in R does not get a reply on this general R
programming help list. Instead, I suggest you either email the
maintainer (found by ?maintainer) or ask a question on a relevant R
task view, such as
https://cran.r-project.org/web/views/Econometrics.html . (or any other
that you judge to be more appropriate).

Cheers,
Bert

On Tue, Apr 23, 2024 at 9:41 AM Richard Hardy  wrote:
>
> A copy of this question can be found on Cross Validated:
> https://stats.stackexchange.com/questions/645362
>
> I am estimating a system of seemingly unrelated regressions (SUR) in R.
> Each of the equations has one unique regressor and one common regressor. I
> am using `gmm::sysGmm` and am experimenting with different weighting
> matrices. I get the same results (point estimates, standard errors and
> anything else that I can see (**except** for the value of the $J$-test)
> regardless of the weighting matrix. I do not think this is correct.
> The phenomenon persists regardless of what type of covariance matrix
> estimator I use: `MDS`, `CondHom` or `HAC`.
> It also persists regardless of whether I use unrestricted estimation or
> restrict the coefficients on one of the variables (the common regressor) to
> be equal across equations.
>
> **Question:** Why does system GMM via `gmm::sysGmm` yield identical results
> for any weighting matrix? How can I make it yield proper results that vary
> with the weighting matrix (if that makes sense and I am not mistaken, of
> course)?
>
> -- R code for a reproducible example
>
> library(gmm)
> library(systemfit)
>
> # Generate and prepare the data
> n <- 1000 # sample size
> m <- 100  # length of the "second part" of the sample
> N <- 3    # number of equations
> set.seed(321); x <- matrix(rnorm(n*N),ncol=N); colnames(x) <-
> paste0("x",1:N) # generate regressors
> dummy <- c( rep(0,n-m), rep(1,m) ) # generate a common regressor
> x <- cbind(x,dummy)    # include the common regressor with the rest of the regressors
> set.seed(123); y <- matrix(rnorm(n*N),ncol=N); colnames(y) <-
> paste0("y",1:N) # a placeholder for dependent variables
> for(i in 1:N){
>  y[,i] <- i + sqrt(i)*x[,i] - i*dummy + y[,i]*15*sqrt(i)
>  # y[,i] is a linear function of x[,i] and dummy,
>  # plus an error term with equation-specific variance
> }
> data1 <- as.data.frame(cbind(y,x)) # create a data frame of all data (y and
> x)
>
> # Create the model equations and moment conditions
> ES_g = ES_h <- list() # ES ~ equation system
> for(i in 1:N){
>  ES_g[[i]] <- as.formula(assign(paste0("eq",i), value=paste0("y",i," ~
> x",i," + dummy"))) # define linear equations of SUR
>  ES_h[[i]] <- as.formula(assign(paste0("eq",i), value=paste0(   "~
> x",i," + dummy"))) # define the moment conditions for GMM
> }
>
> # Estimate a WLS-type weighting matrix to use as a user-specified weighting
> matrix in GMM
> m0 <- systemfit(formula=ES_g, method="OLS", data=data1)
> OLSmat <- diag(diag(m0$residCov)); Wmat <- solve(OLSmat)
>
> # Choose the type of covariance matrix in GMM
> vc1 <- "MDS"
> vc1 <- "CondHom"
> vc1 <- "HAC"
> #vc1 <- "TrueFixed"
>
> # Choose between restricted and unrestricted estimation
> cec1=NULL # unrestricted
> cec1=3    # restrict the coefficient on the dummy to be equal across
> equations
>
> # Estimate the model with `sysGmm` using different weighting matrices:
> identity, "optimal" and manually specified
> m1a <- sysGmm(g=ES_g, h=ES_h, wmatrix="ident"  , weightsMatrix=NULL,
> vcov=vc1, crossEquConst=cec1, data=data1); summary(m1a)
> m1b <- sysGmm(g=ES_g, h=ES_h, wmatrix="optimal", weightsMatrix=NULL,
> vcov=vc1, crossEquConst=cec1, data=data1); summary(m1b)
> m1c <- sysGmm(g=ES_g, h=ES_h,weightsMatrix=Wmat,
> vcov=vc1, crossEquConst=cec1, data=data1); summary(m1c)
>
> -- R session info:
>
> R version 4.3.3 (2024-02-29 ucrt)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 10 x64 (build 19045)
>
> Matrix products: default
>
> locale:
> [1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United
> States.utf8
> [3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C
>
> [5] LC_TIME=English_United States.utf8
>
> time zone: Europe/Berlin
> tzcode source: internal
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> other attached packages:
> [1] systemfit_1.1-30 lmtest_0.9-40zoo_1.8-12   car_3.1-2
>  carData_3.0-5Matrix_1.6-1
> [7] gmm_1.8  sandwich_3.0-2
>
> loaded via a namespace (and not attached):
> [1] MASS_7.3-60.0.1   compiler_4.3.3tools_4.3.3   abind_1.4-5
> rstudioapi_0.15.0 grid_4.3.3
> [7] lattice_0.22-5
>
> --
>
> Kind regards,
> Richard
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, 

[R] Fwd: passing a modified argument to an S3 method

2024-04-20 Thread Bert Gunter
## Neglected to cc to list

-- Forwarded message -
From: Bert Gunter 
Date: Sat, Apr 20, 2024 at 1:26 PM
Subject: Re: [R] passing a modified argument to an S3 method
To: CRAN.r 


Well, my interpretation of your explanation is as follows:

You have a long list of "control" values, named "control" in my
example below with all fixed default values.  You also have an S3
generic, named e.g. "test", with many methods. Your methods all use
the control list and possibly other arguments that will vary from
method to method. Also, some methods may want to change some of the
fixed values of control but leave most of them unaltered. The altered
list can change from method to method. Also, some of a method's
altered values may have a default value that you do not wish to
specify each time for the method.

In my example below, I have shown how this can be done. I hope it
approximates what you want to do. However, I second Jeff's
recommendation that you do some reading on your own and **NOT** rely
on my (probably flawed) interpretation, which just is meant to suggest
what could be done (e.g. via ?modifyList) rather than point you to an
exact or best solution. There are many good tutorials out there in
addition to Jeff's suggestion that you can search for.
--

control <- list (a=2, b=3, c= "hi there")
## list of all control parameters with defaults

## The generic
test <- function(x,...){
   UseMethod("test")
}
## So test will dispatch methods on the 'x' object
##
test.default <- function(x,
  control = control,
  ... ## any additional non control arguments
  )
{
   ## default code here such as:
   cat(control$c, "\n default method result returned\n")
   ..1 ## first argument in ... list
}

## An example:
> test(x =3, control, y = 'abcd')
hi there
 default method result returned
[1] "abcd"

## Now a 'foo' method
test.foo <- function(x, control = control, ## fixed list of defaults again
 change, ## list of changed default values
 ## with no defaults for this method
 special = list(c = "So long"),
 ## list of changes with defaults for this method
 ##  If the method does not use its default change it
 ##  must be given explicitly here.
 y = 0 ## additional parameter for this method
 ## with a default
)
   {
   print(control$c)
   cat("Using x = ", x, "\n")
   before <- with(control, sum(a, b, y)) ## before modifying values
   change <- c(change, special) ## get all the changes to the control list
   control <- modifyList(control, val = change)
   after <- with(control, sum(a, b, y))
   cat("Results are: before = ",before, "  after = ", after, "\n\n")
   print(control$c)
   invisible(list(before, after)) ## return but don't print
}

## construct object of class 'foo'
> x <- structure("method foo", class = "foo")

## Now call the foo method on it.
## Note that the 'special' argument can be omitted, since the default
change to the
## 'c' control parameter will be used
## Note also that the 'a' parameter in the control list is used and so does not
## have to be specified in the change list.

> test(x, control, change = list( b = 100), y = 11)
[1] "hi there"
Using x =  method foo
Results are: before =  16   after =  113

[1] "So long"

## Specify a different 'special' value
> test(x, control, special = list( c="Au revoir"),  change = list( b = 100), y = 11)
[1] "hi there"
Using x =  method foo
Results are: before =  16   after =  113

[1] "Au revoir"


Cheers,
Bert


On Sat, Apr 20, 2024 at 9:03 AM CRAN.r  wrote:
>
> I've got a complicated default value for an argument, basically a "control" 
> argument that's a list with a lot of defaults, and I want to figure out its 
> value before dispatching to a method so I don't need to have the same code 
> repeated at the beginning of every method. None of the default values should 
> be method-dependent, so I don't need or want each method to figure out the 
> value separately. I hope that helps!
>
> Jay
>
>
> On Saturday, April 20th, 2024 at 10:51 AM, Bert Gunter 
>  wrote:
>
> > I do not understand what your goal is here (more context may be helpful, 
> > perhaps to others rather than me). So I doubt this is what you want, but 
> > here is a guess -- no need to respond if it is unhelpful:
> >
> > ## test.default returns NULL if object "y" not found in **calling 
> > environment and enclosures**;
> > ## otherwise y.
> >
> > test <- function(x){
> > UseMethod("te

Re: [R] passing a modified argument to an S3 method

2024-04-20 Thread Bert Gunter
I do not understand what your goal is here (more context may be helpful,
perhaps to others rather than me). So I doubt this is what you want, but
here is a guess -- no need to respond if it is unhelpful:

## test.default returns NULL if object "y" not found in **calling
environment and enclosures**;
## otherwise y.

test <- function(x){
   UseMethod("test")
}
test.default <- function(x){
   tryCatch(y, error = function(e)NULL)
}

## y not found
test(x=3)
NULL

## y found
> y <- 'abcd'
> test(x = 3)
[1] "abcd"
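
If the goal is to compute the control defaults once and have every method see
the already-resolved value, one pattern worth trying is to resolve the
argument in the visible function and dispatch from a small internal helper.
A sketch only -- dispatch_test and the control names are invented here, not
taken from the thread:

test <- function(x, control = list()) {
   ## fill in the defaults once, before any method runs
   control <- modifyList(list(a = 2, b = 3, c = "hi there"), control)
   dispatch_test(x, control)
}
dispatch_test <- function(x, control) UseMethod("test")

test.default <- function(x, control) control$c
test.foo     <- function(x, control) paste("foo method, a =", control$a)

test(3)                              # "hi there"
test(structure(1, class = "foo"))    # "foo method, a = 2"
test(3, control = list(c = "bye"))   # "bye"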

Cheers,
Bert




On Sat, Apr 20, 2024 at 4:23 AM CRAN.r via R-help 
wrote:

> Is there a way to pass a modified argument from an S3 generic to a
> method?  Here's a non-working example that I want to return "abcd".
>
>   test <- function(x, y = NULL){
> y <- "abcd"
> UseMethod("test")
>   }
>   test.default <- function(x, y = NULL) y
>   test(x = 3)
>
> Is that possible? I've looked around a lot, but can't find any examples or
> discussion.
>
> Jay
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Synthetic Control Method

2024-04-16 Thread Bert Gunter
Note that your unit.variable and unit.names.variable are identical. Is
this what you intended?

(I have no idea how the Synth package works).
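
A quick, package-agnostic way to check what dataprep() will actually see (a
sketch, assuming the poster's INVESTMENTVOLUME data frame):

sapply(INVESTMENTVOLUME, class)   # class of every column at a glance
str(INVESTMENTVOLUME$BFS)         # unit.variable must be numeric
str(INVESTMENTVOLUME$DATE)        # POSIXct; worth checking what dataprep() expects here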

Bert

On Tue, Apr 16, 2024 at 12:58 AM  wrote:

> Good Morning
>
>
>
> I want to perform a synthetic control method with R. For this purpose, I
> created the following code:
>
>
>
> # Re-load packages
>
> library(Synth)
>
> library(readxl)
>
>
>
> # Path setting for the Excel sheet
>
> excel_file_path <-
> ("C:\\Users\\x\\Desktop\\DATA_INVESTMENTVOLUMEN_FOR_R_WITHOUT_NA.xlsx")
>
>
>
> # Load the Excel file
>
> INVESTMENTVOLUME <- read_excel(excel_file_path)
>
>
>
> # Display the entire data frame
>
> print(INVESTMENTVOLUME)
>
>
>
> # Make sure BFS is numeric right before dataprep
>
> INVESTMENTVOLUME$BFS <- as.numeric(INVESTMENTVOLUME$BFS)
>
>
>
> # running dataprep
>
> dataprep_out <- dataprep(
>
>   foo = INVESTMENTVOLUME,
>
>   predictors = c("Predictor 1", " Predictor 2", " Predictor 3", " Predictor 4",
> " Predictor 5", " Predictor 6"),
>
>   special.predictors = list(list("Special Predictor 1", seq(1, 12, by =
> 1))),
>
>   dependent = "INVESTMENTVOLUME_12_MONTH_AVERAGE",
>
>   unit.variable = "BFS",
>
>   time.variable = "DATE",
>
>   treatment.identifier = ,
>
>   controls.identifier =
> unique(INVESTMENTVOLUME$BFS[-which(INVESTMENTVOLUME$BFS == )]),
>
>   time.predictors.prior = as.Date("2010-01-01"):as.Date("2017-10-01"),
>
>   time.optimize.ssr = as.Date("2010-01-01"):as.Date("2017-10-01"),
>
>   time.plot = as.Date("2010-01-01"):as.Date("2024-03-01"),
>
>   unit.names.variable = "BFS"
>
> )
>
>
>
> synth_out <- synth(
>
>   data.prep.obj = dataprep_out
>
> )
>
>
>
> I keep getting the same error message. Unfortunately, ChatGPT and solutions
> from various forums do not help. My unit variables are all numeric except
> one, which is a date and has POSIXct as type.
>
> Error in dataprep(foo = INVESTMENTVOLUME, predictors = c("Predictor 1",
> :
>
>
>
>  unit.variable not found as numeric variable in foo.
>
>
>
> I would be very grateful if you could help me with my problem.
>
>
>
> Thank you in advance for your efforts.
>
>
>
> Kind regards
>
> Nadja Delliehausen
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Exceptional slowness with read.csv

2024-04-08 Thread Bert Gunter
No idea, but have you tried using ?scan to read those next 5 rows? It might
give you a better idea of the pathologies that are causing problems. For
example, an unmatched quote might result in some huge number of characters
trying to be read into a single element of a character variable. As your
previous respondent said, resolving such problems can be a challenge.
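
For instance, something along these lines (base R only; file_name is the
poster's object):

## read the 5 suspect lines as raw text with quoting disabled, so a runaway
## quote shows up as one enormous string instead of stalling read.csv()
suspect <- scan(file_name, what = character(), sep = "\n",
                skip = 2459465, nlines = 5, quote = "", quiet = TRUE)
nchar(suspect)   # an unexpectedly huge count points at the offending record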

Cheers,
Bert



On Mon, Apr 8, 2024 at 8:06 AM Dave Dixon  wrote:

> Greetings,
>
> I have a csv file of 76 fields and about 4 million records. I know that
> some of the records have errors - unmatched quotes, specifically.
> Reading the file with readLines and parsing the lines with read.csv(text
> = ...) is really slow. I know that the first 2459465 records are good.
> So I try this:
>
>  > startTime <- Sys.time()
>  > first_records <- read.csv(file_name, nrows = 2459465)
>  > endTime <- Sys.time()
>  > cat("elapsed time = ", endTime - startTime, "\n")
>
> elapsed time =   24.12598
>
>  > startTime <- Sys.time()
>  > second_records <- read.csv(file_name, skip = 2459465, nrows = 5)
>  > endTime <- Sys.time()
>  > cat("elapsed time = ", endTime - startTime, "\n")
>
> This appears to never finish. I have been waiting over 20 minutes.
>
> So why would (skip = 2459465, nrows = 5) take orders of magnitude longer
> than (nrows = 2459465) ?
>
> Thanks!
>
> -dave
>
> PS: readLines(n=2459470) takes 10.42731 seconds.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] split a factor into single elements

2024-04-02 Thread Bert Gunter
Note:
> levels(factor(c(0,0,1)))  ## just gives you the levels attribute
[1] "0" "1"
> as.character(factor(c(0,0,1))) ## gives you the level of each value in
the vector
[1] "0" "0" "1"
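
And if numeric values are wanted rather than character, the usual idiom is to
go through as.character() first, since as.numeric() on a factor returns the
internal level codes:

> as.numeric(as.character(factor(c(0,0,1))))
[1] 0 0 1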

Does that answer your question, or have I misunderstood?

Cheers,
Bert



On Tue, Apr 2, 2024 at 12:00 AM Kimmo Elo  wrote:

> Hi,
>
> why would this simple procedure not work?
>
> --- snip ---
> mydf <- data.frame(id_station = 1234, string_data = c(2024, 12, 1, 0, 0),
> rainfall_value= 55)
>
> mydf$string_data <- as.factor(mydf$string_data)
>
> values<-as.integer(levels(mydf$string_data))
>
> for (i in 1:length(values)) {
> assign(paste("VAR_", i, sep=""), values[i])
> }
>
> --- snip ---
>
> Best,
>
> Kimmo
>
> On Thu, 2024-03-28 at 14:17 +, Ebert,Timothy Aaron wrote:
> > Here are some pieces of working code. I assume you want the second one or
> > the third one that is functionally the same but all in one statement. I
> > do not understand why it is a factor, but I will assume that there is a
> > current and future reason for that. This means I cannot alter the
> > string_data variable, or you can simplify by not making the variable a
> > factor only to turn it back into character.
> >
> > mydf <- data.frame(id_station = 1234, string_data = c(2024, 12, 1, 0, 0),
> > rainfall_value= 55)
> > mydf$string_data <- as.factor(mydf$string_data)
> >
> > mydf <- data.frame(id_station = 1234, string_data = "2024, 12, 1, 0, 0",
> > rainfall_value= 55)
> > mydf$string_data <- as.factor(mydf$string_data)
> >
> > mydf <- data.frame(id_station = 1234, string_data = as.factor("2024, 12,
> > 1, 0, 0"), rainfall_value= 55)
> >
> > mydf <- data.frame(id_station = 1234, string_data = as.factor("2024, 12,
> > 1, 0, 0"), rainfall_value= 55)
> > mydf$string_data2 <- as.character(mydf$string_data)
> >
> > #I assume there are many records in the data frame and your example is
> > for demonstration only.
> > #I cannot assume that all records are the same, though you may be able to
> > simplify if that is true.
> > #Split the string based on commas.
> > split_values <- strsplit(mydf$string_data2, ",")
> >
> > # find the maximum string length
> > max_length <- max(lengths(split_values))
> >
> > # Add new variables to the data frame
> > for (i in 1:max_length) {
> >   new_var_name <- paste0("VAR_", i)
> >   mydf[[new_var_name]] <- sapply(split_values, function(x)
> > ifelse(length(x) >= i, x[i], NA))
> > }
> >
> > # Convert to numeric
> >  for (i in 1:max_length) {
> >new_var_name <- paste0("VAR_", i)
> >mydf[[new_var_name]] <- as.numeric(mydf[[new_var_name]])
> >  }
> > # remove trash
> > mydf <- mydf[,-4]
> > # Provide more useful names
> > colnames(mydf) <- c("id_station", "string_data", "rainfall_mm", "Year",
> > "Month", "Day", "hour", "minute")
> >
> > Regards,
> > Tim
> >
> > -Original Message-
> > From: R-help  On Behalf Of Stefano Sofia
> > Sent: Thursday, March 28, 2024 7:48 AM
> > To: Fabio D'Agostino ; r-help@R-project.org
> > Subject: Re: [R] split a factor into single elements
> >
> > [External Email]
> >
> > Sorry for my hurry.
> >
> > The correct reproducible code is different from the initial one. The
> > correct example is
> >
> >
> > mydf <- data.frame(id_station = 1234, string_data = as.factor(2024, 12,
> > 1, 0, 0), rainfall_value= 55)
> >
> >
> > In this case mydf$string_data is a factor, but of length 1 (and not 5
> > like in the initial example).
> >
> > Therefore the suggestion offered by Fabio does not work.
> >
> >
> > Any suggestion?
> >
> > Sorry again for my mistake
> >
> > Stefano
> >
> >
> >
> >  (oo)
> > --oOO--( )--OOo--
> > Stefano Sofia PhD
> > Civil Protection - Marche Region - Italy Meteo Section Snow Section Via
> > del Colle Ameno 5
> > 60126 Torrette di Ancona, Ancona (AN)
> > Uff: +39 071 806 7743
> > E-mail: stefano.so...@regione.marche.it
> > ---Oo-oO
> >
> >
> > 
> > From: Fabio D'Agostino 
> > Sent: Thursday, 28 March 2024 12:20
> > To: Stefano Sofia; r-help@R-project.org
> > Subject: Re: [R] split a factor into single elements
> >
> >
> >
> > Hi Stefano,
> > maybe something like this can help you?
> >
> > myfactor <- as.factor(c(2024, 2, 1, 0, 0))
> >
> > # Convert factor values to integers
> > first_element <- as.integer(as.character(myfactor)[1])
> > second_element <- as.integer(as.character(myfactor)[2])
> > third_element <- as.integer(as.character(myfactor)[3])
> >
> > # Print the results
> > first_element
> > [1] 2024
> > second_element
> > [1] 2
> > third_element
> > [1] 1
> >
> > # Check the type of the object
> > typeof(first_element)
> > [1] "integer"
> >
> > Fabio
> >
> > On Thu, 28 Mar 2024 at 11:29, Stefano Sofia
> > mailto:stefano.so...@regione.marche.it
> >>
> 

Re: [R] Double buffering plots on Windows

2024-03-23 Thread Bert Gunter
A search on "make animated plots in R" brought up many hits and the
gganimate package (and maybe others, as I didn't scroll through).
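
For base graphics specifically, holding the device until each frame is fully
drawn may also help on devices that support it -- a sketch using
grDevices::dev.hold()/dev.flush() (untested on Windows):

f <- function(n) {
  for (i in 1:n) {
    dev.hold()                       # postpone rendering of this frame
    plot(1:100, sin(i * (1:100)), type = "l")
    title(paste("n =", i))
    segments(0, 0, 100, 0, col = 2)
    dev.flush()                      # display the completed frame
    Sys.sleep(0.2)
  }
}
f(50)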

Bert

On Fri, Mar 22, 2024, 18:45 Bickis, Mikelis  wrote:

> Hello:
>
> I want to present a sequence of plots as an animation.   As a toy example
> consider the code
>
> function(n){for (i in 1:n){
> plot(1:100,sin(i*(1:100)),type="l")
> title(paste("n=",i))
> segments(0,0,100,0,col=2)
> }}
>
> This sort-of works on a MacOS platform, but the rendering of the plots is
> a bit choppy.  Inserting a sleep function allows the plots to evolve
> smoothly.
>
> function(n){for (i in 1:n){
> plot(1:100,sin(i*(1:100)),type="l")
> title(paste("n=",i))
> segments(0,0,100,0,col=2)
> Sys.sleep(.2)
> }}
>
> However, on a Windows platform, only the last plot is rendered without the
> Sys.sleep, so the dynamic element is lost.   Inserting the Sys.sleep does
> allow all the plots to be rendered, but they seem to be erased before they
> are drawn again, so there is substantial flicker in the appearance.
>
> Is there some kind of double-buffering available within R, so that plots
> are rendered only after they are fully drawn, leaving the previous plot
> visible until it is replaced?   I just used the default graphics driver on
> Windows — is there perhaps a different driver that will the graphics
> smoother?
>
> Mik Bickis
> Professor Emeritus
> Department of Mathematics and Statistics
> University of Saskatchewan
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Problem when trying to install packages

2024-03-16 Thread Bert Gunter
Though Navigator may mess up any Rtools stuff because it handles the
directory trees where packages and dependencies are located, does it not?
If so, maybe just reinstall RStudio directly from its website to proceed.
Just a guess obviously.

Bert

On Sat, Mar 16, 2024, 05:09 javad bayat  wrote:

>  Dear Rui;
> Many thanks for your reply. I have installed Rtools (rtools43-5958-5975) on
> my PC and I have R version 4.3.3 and 4.3.2 to install. Also I have
> installed Rstudio through Anaconda Navigator.
> But I do not know how to use Rtools for installing the R packages. I would
> be more than happy if you help me.
> Sincerely yours
>
>
>
> > Dear Rui;
> > I hope this email finds you well. I have a problem installing packages in
> > Rstudio and R software. When I try to install a package, the software
> tries
> > to download but after downloading, it gives some errors and does not
> work.
> > I would be more than happy if you please help me to solve this issue.
> > Warm regards.
> >
> >
> >> install.packages("openair", type = "source")
> > Installing package into ‘C:/R_Libs’
> > (as ‘lib’ is unspecified)
> > Warning in install.packages :
> >   dependencies ‘lattice’, ‘MASS’ are not available
> > also installing the dependencies ‘deldir’, ‘RcppEigen’, ‘cli’, ‘glue’, ‘lifecycle’,
> > ‘pillar’, ‘rlang’, ‘tibble’, ‘tidyselect’, ‘vctrs’, ‘png’, ‘jpeg’,
> > ‘interp’, ‘timechange’, ‘maps’, ‘nlme’, ‘Matrix’, ‘cluster’, ‘dplyr’,
> > ‘hexbin’, ‘latticeExtra’, ‘lubridate’, ‘mapproj’, ‘mgcv’, ‘purrr’
> > trying URL '
> https://cran.rstudio.com/src/contrib/deldir_2.0-4.tar.gz'Content
> > type 'application/x-gzip' length 103621 bytes (101 KB)downloaded 101
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/RcppEigen_0.3.4.0.0.tar.gz'Content
> > type 'application/x-gzip' length 1765714 bytes (1.7 MB)downloaded 1.7
> > MB
> > trying URL '
> https://cran.rstudio.com/src/contrib/cli_3.6.2.tar.gz'Content
> > type 'application/x-gzip' length 569771 bytes (556 KB)downloaded 556
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/glue_1.7.0.tar.gz'Content
> > type 'application/x-gzip' length 105420 bytes (102 KB)downloaded 102
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/lifecycle_1.0.4.tar.gz'Content
> > type 'application/x-gzip' length 107656 bytes (105 KB)downloaded 105
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/pillar_1.9.0.tar.gz'Content
> > type 'application/x-gzip' length 444528 bytes (434 KB)downloaded 434
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/rlang_1.1.3.tar.gz'Content
> > type 'application/x-gzip' length 763765 bytes (745 KB)downloaded 745
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/tibble_3.2.1.tar.gz'Content
> > type 'application/x-gzip' length 565982 bytes (552 KB)downloaded 552
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/tidyselect_1.2.1.tar.gz'Content
> > type 'application/x-gzip' length 103591 bytes (101 KB)downloaded 101
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/vctrs_0.6.5.tar.gz'Content
> > type 'application/x-gzip' length 969066 bytes (946 KB)downloaded 946
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/png_0.1-8.tar.gz'Content
> > type 'application/x-gzip' length 24880 bytes (24 KB)downloaded 24 KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/jpeg_0.1-10.tar.gz'Content
> > type 'application/x-gzip' length 18667 bytes (18 KB)downloaded 18 KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/interp_1.1-6.tar.gz'Content
> > type 'application/x-gzip' length 1112116 bytes (1.1 MB)downloaded 1.1
> > MB
> > trying URL '
> https://cran.rstudio.com/src/contrib/timechange_0.3.0.tar.gz'Content
> > type 'application/x-gzip' length 103439 bytes (101 KB)downloaded 101
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/maps_3.4.2.tar.gz'Content
> > type 'application/x-gzip' length 2278051 bytes (2.2 MB)downloaded 2.2
> > MB
> > trying URL '
> https://cran.rstudio.com/src/contrib/nlme_3.1-164.tar.gz'Content
> > type 'application/x-gzip' length 836832 bytes (817 KB)downloaded 817
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/Matrix_1.6-5.tar.gz'Content
> > type 'application/x-gzip' length 2883851 bytes (2.8 MB)downloaded 2.8
> > MB
> > trying URL '
> https://cran.rstudio.com/src/contrib/cluster_2.1.6.tar.gz'Content
> > type 'application/x-gzip' length 369050 bytes (360 KB)downloaded 360
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/dplyr_1.1.4.tar.gz'Content
> > type 'application/x-gzip' length 1207521 bytes (1.2 MB)downloaded 1.2
> > MB
> > trying URL '
> https://cran.rstudio.com/src/contrib/hexbin_1.28.3.tar.gz'Content
> > type 'application/x-gzip' length 1199967 bytes (1.1 MB)downloaded 1.1
> > MB
> > trying URL '
> https://cran.rstudio.com/src/contrib/latticeExtra_0.6-30.tar.gz'Content
> > type 'application/x-gzip' length 1292936 bytes (1.2 MB)downloaded 1.2
> > MB
> > trying URL '
> 

Re: [R] Problem when trying to install packages

2024-03-16 Thread Bert Gunter
? Google it!  "How to install packages using Rtools"

Bert

On Sat, Mar 16, 2024, 05:09 javad bayat  wrote:

>  Dear Rui;
> Many thanks for your reply. I have installed Rtools (rtools43-5958-5975) on
> my PC and I have R version 4.3.3 and 4.3.2 to install. Also I have
> installed Rstudio through Anaconda Navigator.
> But I do not know how to use Rtools for installing the R packages. I would
> be more than happy if you help me.
> Sincerely yours
>
>
>
> > Dear Rui;
> > I hope this email finds you well. I have a problem installing packages in
> > Rstudio and R software. When I try to install a package, the software
> tries
> > to download but after downloading, it gives some errors and does not
> work.
> > I would be more than happy if you please help me to solve this issue.
> > Warm regards.
> >
> >
> >> install.packages("openair", type = "source")
> > Installing package into ‘C:/R_Libs’
> > (as ‘lib’ is unspecified)
> > Warning in install.packages :
> >   dependencies ‘lattice’, ‘MASS’ are not available
> > also installing the dependencies ‘deldir’, ‘RcppEigen’, ‘cli’, ‘glue’, ‘lifecycle’,
> > ‘pillar’, ‘rlang’, ‘tibble’, ‘tidyselect’, ‘vctrs’, ‘png’, ‘jpeg’,
> > ‘interp’, ‘timechange’, ‘maps’, ‘nlme’, ‘Matrix’, ‘cluster’, ‘dplyr’,
> > ‘hexbin’, ‘latticeExtra’, ‘lubridate’, ‘mapproj’, ‘mgcv’, ‘purrr’
> > trying URL '
> https://cran.rstudio.com/src/contrib/deldir_2.0-4.tar.gz'Content
> > type 'application/x-gzip' length 103621 bytes (101 KB)downloaded 101
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/RcppEigen_0.3.4.0.0.tar.gz'Content
> > type 'application/x-gzip' length 1765714 bytes (1.7 MB)downloaded 1.7
> > MB
> > trying URL '
> https://cran.rstudio.com/src/contrib/cli_3.6.2.tar.gz'Content
> > type 'application/x-gzip' length 569771 bytes (556 KB)downloaded 556
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/glue_1.7.0.tar.gz'Content
> > type 'application/x-gzip' length 105420 bytes (102 KB)downloaded 102
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/lifecycle_1.0.4.tar.gz'Content
> > type 'application/x-gzip' length 107656 bytes (105 KB)downloaded 105
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/pillar_1.9.0.tar.gz'Content
> > type 'application/x-gzip' length 444528 bytes (434 KB)downloaded 434
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/rlang_1.1.3.tar.gz'Content
> > type 'application/x-gzip' length 763765 bytes (745 KB)downloaded 745
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/tibble_3.2.1.tar.gz'Content
> > type 'application/x-gzip' length 565982 bytes (552 KB)downloaded 552
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/tidyselect_1.2.1.tar.gz'Content
> > type 'application/x-gzip' length 103591 bytes (101 KB)downloaded 101
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/vctrs_0.6.5.tar.gz'Content
> > type 'application/x-gzip' length 969066 bytes (946 KB)downloaded 946
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/png_0.1-8.tar.gz'Content
> > type 'application/x-gzip' length 24880 bytes (24 KB)downloaded 24 KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/jpeg_0.1-10.tar.gz'Content
> > type 'application/x-gzip' length 18667 bytes (18 KB)downloaded 18 KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/interp_1.1-6.tar.gz'Content
> > type 'application/x-gzip' length 1112116 bytes (1.1 MB)downloaded 1.1
> > MB
> > trying URL '
> https://cran.rstudio.com/src/contrib/timechange_0.3.0.tar.gz'Content
> > type 'application/x-gzip' length 103439 bytes (101 KB)downloaded 101
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/maps_3.4.2.tar.gz'Content
> > type 'application/x-gzip' length 2278051 bytes (2.2 MB)downloaded 2.2
> > MB
> > trying URL '
> https://cran.rstudio.com/src/contrib/nlme_3.1-164.tar.gz'Content
> > type 'application/x-gzip' length 836832 bytes (817 KB)downloaded 817
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/Matrix_1.6-5.tar.gz'Content
> > type 'application/x-gzip' length 2883851 bytes (2.8 MB)downloaded 2.8
> > MB
> > trying URL '
> https://cran.rstudio.com/src/contrib/cluster_2.1.6.tar.gz'Content
> > type 'application/x-gzip' length 369050 bytes (360 KB)downloaded 360
> > KB
> > trying URL '
> https://cran.rstudio.com/src/contrib/dplyr_1.1.4.tar.gz'Content
> > type 'application/x-gzip' length 1207521 bytes (1.2 MB)downloaded 1.2
> > MB
> > trying URL '
> https://cran.rstudio.com/src/contrib/hexbin_1.28.3.tar.gz'Content
> > type 'application/x-gzip' length 1199967 bytes (1.1 MB)downloaded 1.1
> > MB
> > trying URL '
> https://cran.rstudio.com/src/contrib/latticeExtra_0.6-30.tar.gz'Content
> > type 'application/x-gzip' length 1292936 bytes (1.2 MB)downloaded 1.2
> > MB
> > trying URL '
> https://cran.rstudio.com/src/contrib/lubridate_1.9.3.tar.gz'Content
> > type 'application/x-gzip' length 428043 bytes (418 KB)downloaded 418
> > KB
> > trying URL '
> 

Re: [R] Initializing vector and matrices

2024-03-02 Thread Bert Gunter
"It would be really really helpful to have a clearer idea of what you
are trying to do."

Amen!

But in R, "constructing" objects by extending them piece by piece is
generally very inefficient (e.g.
https://r-craft.org/growing-objects-and-loop-memory-pre-allocation/),
although sometimes?/often? unavoidable (hence the relevance of your
comment above). R generally prefers to take a "whole object" point of
view ( see R books by Chambers, et. al.) and provides code for basic
operations like vector/matrix, etc. construction to do so. When this
is not possible, I suspect "optimal" efficient strategies for
allocating space to build objects gets you into the weeds of how R
works.
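
Two small sketches of that "whole object" style for the accumulation problem
in this thread (the pieces list is invented for illustration):

## if the pieces can be generated up front, sum them in one step
pieces <- lapply(1:3, function(i) matrix(17:32, nrow = 4, ncol = 4))
y <- Reduce(`+`, pieces)   # no need to know the dimension in advance

## if the dimension truly isn't known until the first iteration,
## initialize from the first piece instead of from 0
x <- NULL
for (i in 1:3) {
  x1 <- 1:4
  x <- if (is.null(x)) x1 else x + x1
}
x   # 3 6 9 12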

Cheers,
Bert




On Sat, Mar 2, 2024 at 1:02 AM Richard O'Keefe  wrote:
>
> The matrix equivalent of
>   x <- ...
>   v <- ...
>   x[length(x)+1] <- v
> is
>   m <- ...
>   r <- ...
>   m <- rbind(m, r)
> or
>   m <- ...
>   k <- ...
>   m <- cbind(m, c)
>
> A vector or matrix so constructed never has "holes" in it.
> It's better to think of CONSTRUCTING vectors and matrices rather than
> INITIALISING them,
> because always being fully defined is important.
>
> It would be really really helpful to have a clearer idea of what you
> are trying to do.
>
> On Fri, 1 Mar 2024 at 03:31, Ebert,Timothy Aaron  wrote:
> >
> > You could declare a matrix much larger than you intend to use. This works 
> > with a few megabytes of data. It is not very efficient, so scaling up may 
> > become a problem.
> > m22 <- matrix(NA, 1:60, ncol=6)
> >
> > It does not work to add a new column to the matrix, as in you get an error 
> > if you try m22[ , 7] but convert to data frame and add a column
> >
> > m23 <- data.frame(m22)
> > m23$x7 <- 12
> >
> > The only penalty that I know of to having unused space in a matrix is the 
> > amount of memory it takes. One side effect is that your program may have a 
> > mistake that you would normally catch with a subscript out of bounds error 
> > but with the extra space it now runs without errors.
> >
> > Tim
> >
> >
> >
> > -Original Message-
> > From: R-help  On Behalf Of Richard O'Keefe
> > Sent: Thursday, February 29, 2024 5:29 AM
> > To: Steven Yen 
> > Cc: R-help Mailing List 
> > Subject: Re: [R] Initializing vector and matrices
> >
> > [External Email]
> >
> > x <- numeric(0)
> > for (...) {
> > x[length(x)+1] <- ...
> > }
> > works.
> > You can build a matrix by building a vector one element at a time this way, 
> > and then reshaping it at the end.  That only works if you don't need it to 
> > be a matrix at all times.
> > Another approach is to build a list of rows.  It's not a matrix, but a list 
> > of rows can be a *ragged* matrix with rows of varying length.
> >
> > On Wed, 28 Feb 2024 at 21:57, Steven Yen  wrote:
> > >
> > > Is there as way to initialize a vector (matrix) with an unknown length
> > > (dimension)? NULL does not seem to work. The lines below work with a
> > > vector of length 4 and a matrix of 4 x 4. What if I do not know
> > > initially the length/dimension of the vector/matrix?
> > >
> > > All I want is to add up (accumulate)  the vector and matrix as I go
> > > through the loop.
> > >
> > > Or, are there other ways to accumulate such vectors and matrices?
> > >
> > >  > x<-rep(0,4)  # this works but I like to leave the length open  >
> > > for (i in 1:3){
> > > +  x1<-1:4
> > > +  x<-x+x1
> > > + }
> > >  > x
> > > [1]  3  6  9 12
> > >
> > >  > y = 0*matrix(1:16, nrow = 4, ncol = 4); # this works but I like to
> > > leave the dimension open
> > >   [,1] [,2] [,3] [,4]
> > > [1,]    0    0    0    0
> > > [2,]    0    0    0    0
> > > [3,]    0    0    0    0
> > > [4,]    0    0    0    0
> > >  > for (i in 1:3){
> > > +   y1<-matrix(17:32, nrow = 4, ncol = 4)
> > > +   y<-y+y1
> > > + }
> > >  > y
> > >   [,1] [,2] [,3] [,4]
> > > [1,]   51   63   75   87
> > > [2,]   54   66   78   90
> > > [3,]   57   69   81   93
> > > [4,]   60   72   84   96
> > >  >
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> > 

Re: [R] gsub issue with consecutive pattern finds

2024-03-01 Thread Bert Gunter
Oh, wait a second. I misread your original post. Please ignore my
truly incorrect suggestion.

-- Bert

On Fri, Mar 1, 2024 at 7:57 AM Bert Gunter  wrote:
>
> Here's another *incorrect* way to do it -- incorrect because it will
> not always work, unlike Iris's correct solution. But it does not
> require PERL type matching. The idea: separate the two vowels in the
> regex by a character that you know cannot appear (if there is such)
> and match it optionally, e.g. with '*" repetition specifier. I used
> "?" for the optional character below (which must be escaped).
>
> >gsub("([aeiouAEIOU])\\?*([aeiouAEIOU])", "\\1_\\2", "aerioue")
> [1] "a_eri_ou_e"
>
> Cheers,
> Bert
>
>
> On Fri, Mar 1, 2024 at 3:59 AM Iago Giné Vázquez  wrote:
> >
> > Hi Iris,
> >
> > Thank you. Further, very nice solution.
> >
> > Best,
> >
> > Iago
> >
> > On 01/03/2024 12:49, Iris Simmons wrote:
> > > Hi Iago,
> > >
> > >
> > > This is not a bug. It is expected. Patterns may not overlap. However, 
> > > there
> > > is a way to get the result you want using perl:
> > >
> > > ```R
> > > gsub("([aeiouAEIOU])(?=[aeiouAEIOU])", "\\1_", "aerioue", perl = TRUE)
> > > ```
> > >
> > > The specific change I made is called a positive lookahead, you can read
> > > more about it here:
> > >
> > > https://www.regular-expressions.info/lookaround.html
> > >
> > > It's a way to check for a piece of text without consuming it in the match.
> > >
> > > Also, since you don't care about character case, it might be more legible
> > > to add ignore.case = TRUE and remove the upper case characters:
> > >
> > > ```R
> > > gsub("([aeiou])(?=[aeiou])", "\\1_", "aerioue", perl = TRUE, ignore.case =
> > > TRUE)
> > >
> > > ## or
> > >
> > > gsub("(?i)([aeiou])(?=[aeiou])", "\\1_", "aerioue", perl = TRUE)
> > > ```
> > >
> > > I hope this helps!
> > >
> > >
> > > On Fri, Mar 1, 2024, 06:37 Iago Giné Vázquez  wrote:
> > >
> > >> Hi all,
> > >>
> > >> I tested next command:
> > >>
> > >> gsub("([aeiouAEIOU])([aeiouAEIOU])", "\\1_\\2", "aerioue")
> > >>
> > >> with the following output:
> > >>
> > >> [1] "a_eri_ou_e"
> > >>
> > >> So, there are two consecutive vowels where an underscore is not added.
> > >>
> > >> May it be a bug? Is it expected (bug or not)? Is there any chance to get
> > >> what I want (an underscore between each pair of consecutive vowels)?
> > >>
> > >>
> > >> Thank you!
> > >>
> > >> Best regards,
> > >>
> > >> Iago
> > >>
> > >>  [[alternative HTML version deleted]]
> > >>
> > >> __
> > >> R-help@r-project.org  mailing list -- To UNSUBSCRIBE and more, see
> > >> https://stat.ethz.ch/mailman/listinfo/r-help
> > >> PLEASE do read the posting guide
> > >> http://www.R-project.org/posting-guide.html
> > >> and provide commented, minimal, self-contained, reproducible code.
> > >>
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.



Re: [R] gsub issue with consecutive pattern finds

2024-03-01 Thread Bert Gunter
Here's another *incorrect* way to do it -- incorrect because it will
not always work, unlike Iris's correct solution. But it does not
require PERL type matching. The idea: separate the two vowels in the
regex by a character that you know cannot appear (if there is such)
and match it optionally, e.g. with '*" repetition specifier. I used
"?" for the optional character below (which must be escaped).

>gsub("([aeiouAEIOU])\\?*([aeiouAEIOU])", "\\1_\\2", "aerioue")
[1] "a_eri_ou_e"

Cheers,
Bert


On Fri, Mar 1, 2024 at 3:59 AM Iago Giné Vázquez  wrote:
>
> Hi Iris,
>
> Thank you. Further, very nice solution.
>
> Best,
>
> Iago
>
> On 01/03/2024 12:49, Iris Simmons wrote:
> > Hi Iago,
> >
> >
> > This is not a bug. It is expected. Patterns may not overlap. However, there
> > is a way to get the result you want using perl:
> >
> > ```R
> > gsub("([aeiouAEIOU])(?=[aeiouAEIOU])", "\\1_", "aerioue", perl = TRUE)
> > ```
> >
> > The specific change I made is called a positive lookahead, you can read
> > more about it here:
> >
> > https://www.regular-expressions.info/lookaround.html
> >
> > It's a way to check for a piece of text without consuming it in the match.
> >
> > Also, since you don't care about character case, it might be more legible
> > to add ignore.case = TRUE and remove the upper case characters:
> >
> > ```R
> > gsub("([aeiou])(?=[aeiou])", "\\1_", "aerioue", perl = TRUE, ignore.case =
> > TRUE)
> >
> > ## or
> >
> > gsub("(?i)([aeiou])(?=[aeiou])", "\\1_", "aerioue", perl = TRUE)
> > ```
> >
> > I hope this helps!
> >
> >
> > On Fri, Mar 1, 2024, 06:37 Iago Giné Vázquez  wrote:
> >
> >> Hi all,
> >>
> >> I tested next command:
> >>
> >> gsub("([aeiouAEIOU])([aeiouAEIOU])", "\\1_\\2", "aerioue")
> >>
> >> with the following output:
> >>
> >> [1] "a_eri_ou_e"
> >>
> >> So, there are two consecutive vowels where an underscore is not added.
> >>
> >> May it be a bug? Is it expected (bug or not)? Is there any chance to get
> >> what I want (an underscore between each pair of consecutive vowels)?
> >>
> >>
> >> Thank you!
> >>
> >> Best regards,
> >>
> >> Iago
> >>
> >>  [[alternative HTML version deleted]]
> >>
> >> __
> >> R-help@r-project.org  mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] converting MATLAB -> R | element-wise operation

2024-02-27 Thread Bert Gunter
... and here is a more or less direct translation of the Matlab code that
should now be obvious given your previous responses:

> m <- matrix(1:6, nr=2, byrow = TRUE) ## Matlab order
> m
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
> sweep(m, 2, 2:4, "/")
 [,1]  [,2] [,3]
[1,]  0.5 0.667 0.75
[2,]  2.0 1.667 1.50
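
sweep() is simply the most general tool for this; for this particular case a
couple of equivalent one-liners (checked only on this toy example) are

> t(t(m) / 2:4)        ## transpose so the vector recycles down the columns
> m %*% diag(1 / 2:4)  ## or rescale the columns with a diagonal matrix

both of which reproduce the Matlab result above.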

Cheers,
Bert

On Tue, Feb 27, 2024 at 1:03 PM Evan Cooch  wrote:

> So, trying to convert a very long, somewhat technical bit of lin alg
> MATLAB code to R. Most of it is working, but I ran into a stumbling block that
> is probably simple enough for someone to explain.
>
> Basically, trying to 'line up' MATLAB results from an element-wise
> division of a matrix by a vector with R output.
>
> Here is a simplified version of the MATLAB code I'm translating:
>
> NN = [1, 2, 3; 4, 5, 6];  % Example matrix
> lambda = [2, 3, 4];  % Example vector
> result_matlab = NN ./ lambda;
>
> which yields
>
>   0.5000    0.6667    0.7500
>   2.0000    1.6667    1.5000
>
>
> So, the only way I have stumbled onto in R to generate the same results
> is to use 'sweep'. The following 'works', but I'm hoping someone can
> explain why I need something as convoluted as this seems (to me, at least).
>
> NN <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE)  # Example matrix
> lambda <- c(2, 3, 4)  # Example vector
> sweep(NN, 2, lambda, "/")
>
>
>   [,1]  [,2] [,3]
> [1,]  0.5 0.667 0.75
> [2,]  2.0 1.667 1.50
>
> First tried the more 'obvious' NN/lambda, but that yields 'the wrong
> answer' (based solely on what I'm trying to accomplish):
>
>
> [,1] [,2] [,3]
> [1,] 0.50  0.5  1.0
> [2,] 1.33  2.5  1.5
>
> So, why 'sweep'?
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Interactions in regression

2024-02-25 Thread Bert Gunter
It is trivial in R to add whatever decorations to a plot that you would
like, but that requires that you go beyond point and click production of
graphics and write actual code. If you are unwilling or unable to do this,
you are stuck with whatever various packaged graphics functionality
provides.So you might want to search on "interaction plots for linear
models in R" or similar at rseek.org or in your favorite web search engine
if you haven't already done so. My minimal efforts brought up lots of hits,
though none may be useful for your concerns, especially, as has already
been pointed out, as your query doesn't seem to make much sense
statistically.

Cheers,
Bert


On Sun, Feb 25, 2024 at 7:46 AM Jacek Kownacki 
wrote:

> Hi All,
> I stumbled upon some topics regarding interactions in anova and regression
> and packages for tabulating and visualizations the results of them.
> Here we are:
>
> https://stackoverflow.com/questions/77933272/how-to-add-a-reference-level-for-interaction-in-gtsummary-and-sjplot/77935742#77935742
> ,
>
> https://stackoverflow.com/questions/78016795/how-to-add-reference-levels-for-interaction-in-r?noredirect=1=1
> .
> Because I usually use GUI software and these questions did not get answers,
> I was wondering how, from a technical point of view, one could do it using
> these packages (sjPlot, gtsummary) or other ways to make such tables,
> inserting the reference levels of the mentioned interactions.
> This is not likely to be used in publications (including three base
> levels), but from the point of view of solving the topics this questions
> have interested me.
> I tried myself to make it happen, but so far without success.
> I recall this reprex based on SO:
>
> set.seed(1000)
> my_data <- rbind(
>   data.frame(time = "Pre", treatment = "Control", response =
> rnorm(100, mean=1)),
>   data.frame(time = "Pre", treatment = "Treatment", response =
> rnorm(100, mean=2)),
>   data.frame(time = "Post", treatment = "Control", response =
> rnorm(100, mean=1)),
>   data.frame(time = "Post", treatment = "Treatment", response =
> rnorm(100, mean=2))
> ) %>% mutate(time = factor(time, levels = c("Pre", "Post")))
> %>%mutate(treatment = factor(treatment, levels = c("Control",
> "Treatment")))
> model3 <- lm(response ~ time * treatment, data = my_data)
>
> Thanks,
> Jacek
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Including an external set of coded

2024-02-20 Thread Bert Gunter
I believe you will have to explain what you want more fully, as what you
requested appears to be exactly what source() does, to me anyway. Please
reread its help file more carefully perhaps?
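
For example -- file name invented purely for illustration, untested -- put the
lines in a file, say "transforms.R", containing

mydata <- transform(mydata,
                    a = b + c,
                    d = e + f)

and then in the main program run

source("transforms.R")  ## evaluates those lines in the current workspace
## source("transforms.R", local = TRUE) evaluates them in the calling frame

which should have the effect you describe.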

-- Bert

On Tue, Feb 20, 2024 at 7:36 AM Steven Yen  wrote:

> How can I call and include an external set of R codes, not necessarily a
> complete procedure (which can be include with a “source” command). Example:
>
> #I like to include and run the following lines residing in a file outside
> the main program:
>
> mydata<-transform(mydata,
> a<-b+c
> d<-e+f
> }
>
>
> Steven from iPhone
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is simplify2array working for dimension > 2?

2024-02-08 Thread Bert Gunter
Jean-Claude:

Well, here's my "explanation". Caveat emptor!

Note that:
"simplify2array() is the utility called from sapply() when simplify is
not false"

and

> sapply(a, I, simplify = "array")
 [,1]   [,2]
[1,] list,2 list,2
[2,] list,2 list,2

So it seems that simplify2array() is not intended to operate in the
way that you expected, i.e. that recursive simplification is done.
And, indeed, if you check the code for the function, you will see that
that is the case. Perhaps the key phrase in the docs is in the
sapply() part that says:

"sapply is a user-friendly version and wrapper of lapply by default
returning a vector, matrix or, if simplify = "array", an array ***if
appropriate***, by applying simplify2array(). "   In other words,
recursive simplification is considered not "appropriate".

FWIW I also find this somewhat confusing and think that explicitly
saying that recursive simplification is not done might make it less
so. But writing docs that  address all our possible misconceptions is
pretty difficult (or impossible!), and maybe adding that explicit
caveat would confuse others even more... :-(
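
If the 2 x 2 x 2 numeric array really is the goal, one way that does work --
a sketch checked only on this toy example -- is to let vapply() build the
higher-dimensional result, since its FUN.VALUE template may itself be a
matrix:

> vapply(a, function(el) vapply(el, unlist, numeric(2)), matrix(0, 2, 2))
, , 1

     [,1] [,2]
[1,]    1    3
[2,]    2    4

, , 2

     [,1] [,2]
[1,]    5    7
[2,]    6    8

which is the same result as your array(unlist(a), dim = c(2, 2, 2)) workaround.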

Cheers,
Bert









On Thu, Feb 8, 2024 at 12:12 AM Jean-Claude Arbaut  wrote:
>
> Reading the doc for ?simplify2array, I got the impression that with the
> 'higher = T' argument the function returns an array of dimension greater
> than 2 when it makes sense (the doc says "when appropriate", which is
> rather vague). I would expect
>
> a <- list(
>   list(list(1, 2), list(3, 4)),
>   list(list(5, 6), list(7, 8))
> )
> simplify2array(a, higher = T)
>
> to return the same (possibly up to a dimension permutation) as
> array(1:8, dim = c(2, 2, 2))
>
> However, in this case simplify2array returns a matrix (i.e. 2 dimensional
> array), whose elements are lists.
> It's the same as
> structure(list(list(1, 2), list(3, 4), list(5, 6), list(7, 8)), dim = c(2,
> 2))
>
> I get the same behavior with
> a <- list(
>   list(c(1, 2), c(3, 4)),
>   list(c(5, 6), c(7, 8))
> )
> but then the matrix elements are numeric vectors instead of lists.
>
> Did I miss something to get the result I expected with this function? Or is
> it a bug? Or maybe the function is not supposed to return a higher
> dimensional array, and I didn't understand the documentation correctly?
>
> There is a workaround, one can do for instance
> array(unlist(a), dim = c(2, 2, 2))
> and there may be better options (checking dimensions?).
>
> In case it's important: running R 4.3.2 on Debian 12.4.
>
> Best regards,
>
> Jean-Claude Arbaut
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] gathering denominator under frac

2024-02-02 Thread Bert Gunter
BTW, for your amusement,

ylab = ~ frac(additive ~ HCO[3]^"-",
   true ~ HCO[3]^"-" ))

also should work. The reason is (from ?plotmath):

"In most cases other language objects (names and calls, including
formulas) are coerced to expressions and so can also be used."

Cheers,
Bert


On Fri, Feb 2, 2024 at 8:33 AM Bert Gunter  wrote:
>
> ... or if I understand correctly, simply
>
> expression(frac(additive ~ HCO[3]^"-",
>true ~ HCO[3]^"-" )))
>
> Cheers,
> Bert
>
> On Fri, Feb 2, 2024 at 3:06 AM Rui Barradas  wrote:
>>
>> Às 10:01 de 02/02/2024, Troels Ring escreveu:
>> > Hi friends - I'm plotting a ratio of bicarbonates i ggplot2 and
>> >
>> > ylab(expression(paste(frac("additive BIC","true BIC" worked OK - but
>> > now I have been asked to put the chemistry instead - so I wrote
>> >
>> >   ylab(expression(paste(frac("additive",HCO[3]^"-","true",HCO[3]^"-"
>> > - and frac saw that as additive = numerator and HCO3- = denominator and
>> > the rest was ignored-
>> >
>> > So how do I make frac ignore the first ","  and print the fraction as I
>> > want?
>> >
>> >
>> > All best wishes
>> > Troels
>> >
>> > __
>> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> Hello,
>>
>> This seems to work. Instead of separating the two numerator strings with
>> a comma, separate them with a tilde. The same goes for the denominator.
>> And there is no need for double quotes around "additive" and "true".
>>
>>
>> library(ggplot2)
>>
>> g <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
>>geom_point()
>>
>> g + ylab(expression(paste(frac(
>>additive~HCO[3]^"-",
>>true~HCO[3]^"-"
>> ))))
>>
>>
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>>
>> --
>> Este e-mail foi analisado pelo software antivírus AVG para verificar a 
>> presença de vírus.
>> www.avg.com
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] gathering denominator under frac

2024-02-02 Thread Bert Gunter
... or if I understand correctly, simply

expression(frac(additive ~ HCO[3]^"-",
   true ~ HCO[3]^"-" )))

Cheers,
Bert

On Fri, Feb 2, 2024 at 3:06 AM Rui Barradas  wrote:

> Às 10:01 de 02/02/2024, Troels Ring escreveu:
> > Hi friends - I'm plotting a ratio of bicarbonates i ggplot2 and
> >
> > ylab(expression(paste(frac("additive BIC","true BIC" worked OK - but
> > now I have been asked to put the chemistry instead - so I wrote
> >
> >   ylab(expression(paste(frac("additive",HCO[3]^"-","true",HCO[3]^"-"
> > - and frac saw that as additive = numerator and HCO3- = denominator and
> > the rest was ignored-
> >
> > So how do I make frac ignore the first ","  and print the fraction as I
> > want?
> >
> >
> > All best wishes
> > Troels
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> Hello,
>
> This seems to work. Instead of separating the two numerator strings with
> a comma, separate them with a tilde. The same goes for the denominator.
> And there is no need for double quotes around "additive" and "true".
>
>
> library(ggplot2)
>
> g <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
>geom_point()
>
> g + ylab(expression(paste(frac(
>additive~HCO[3]^"-",
>true~HCO[3]^"-"
> ))))
>
>
>
> Hope this helps,
>
> Rui Barradas
>
>
> --
> Este e-mail foi analisado pelo software antivírus AVG para verificar a
> presença de vírus.
> www.avg.com
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] List of Words in BioWordVec

2024-02-01 Thread Bert Gunter
I *think* this might be better posted here:
https://bioconductor.org/help/support/

Cheers,
Bert

On Thu, Feb 1, 2024 at 4:37 PM TJUN KIAT TEO  wrote:

> Is there a way to extract list of words in  BioWordVec  in R
>
> Thank you
>
> Tjun Kiat
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R interpreting numeric field as a boolean field

2024-01-30 Thread Bert Gunter
Incidentally, "didn't work" is not very useful information. Please tell us
exactly what error message or apparently aberrant result you received.
Also, what do you get from:

sapply(your_dataframe, "class")
nrow(your_dataframe)

(as I suspect what you think it is, isn't).
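
And if the cause is what Duncan suggested and you confirm below (many leading
NAs, so the column type is guessed as logical), one workaround -- sketched
with a placeholder file name and untested against your workbook -- is to loop
over the sheets yourself with readxl, forcing a large guess_max, and then
row-bind:

library(readxl)
path <- "yourfile.xlsx"   ## your full path here
sheets <- excel_sheets(path)
figidat <- do.call(rbind,
                   lapply(sheets, function(s)
                     read_excel(path, sheet = s, guess_max = 100000)))
sapply(figidat, class)    ## check that StoreCharges is now numeric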

Cheers,
Bert

On Tue, Jan 30, 2024 at 11:01 AM Bert Gunter  wrote:

> "I cannot change the data type from
> boolean to numeric. I tried doing dataset$my_field =
> as.numeric(dataset$my_field), I also tried to do dataset <-
> dataset[complete.cases(dataset), ], didn't work either. "
>
> Sorry, but all I can say is: huh?
>
> > dt <- data.frame(a = c(NA,NA, FALSE, TRUE), b = 1:4)
> > dt
>   a b
> 1NA 1
> 2NA 2
> 3 FALSE 3
> 4  TRUE 4
> > sapply(dt, class)
> a b
> "logical" "integer"
> > dt$a <- as.numeric(dt$a)
> > dt
>a b
> 1 NA 1
> 2 NA 2
> 3  0 3
> 4  1 4
> > sapply(dt, class)
> a b
> "numeric" "integer"
>
> So either I'm missing something or you are. Happy to be corrected and
> chastised if the former.
>
> Cheers,
> Bert
>
>
> On Tue, Jan 30, 2024 at 10:41 AM Paul Bernal 
> wrote:
>
>> Dear friend Duncan,
>>
>> Thank you so much for your kind reply. Yes, that is exactly what is
>> happening, there are a lot of NA values at the start, so R assumes that
>> the
>> field is of type boolean. The challenge that I am facing is that I want to
>> read into R an Excel file that has many sheets (46 in this case) but I
>> wanted to combine all 46 sheets into a single dataframe (since the columns
>> are exactly the same for all 46 sheets). The rio package does this nicely,
>> the problem is that, once I have the full dataframe (which amounts to
>> roughly 2.98 million rows total), I cannot change the data type from
>> boolean to numeric. I tried doing dataset$my_field =
>> as.numeric(dataset$my_field), I also tried to do dataset <-
>> dataset[complete.cases(dataset), ], didn't work either.
>>
>> The only thing that worked for me was to take a single sheed and through
>> the read_excel function use the guess_max parameter and set it to a
>> sufficiently large number (a number >= to the total amount of the full
>> merged dataset). I want to automate the merging of the N number of Excel
>> sheets so that I don't have to be manually doing it. Unless there is a way
>> to accomplish something similar to what rio's package function import_list
>> does, that is able to keep the field's numeric data type nature.
>>
>> Cheers,
>> Paul
>>
>> El mar, 30 ene 2024 a las 12:23, Duncan Murdoch (<
>> murdoch.dun...@gmail.com>)
>> escribió:
>>
>> > On 30/01/2024 11:10 a.m., Paul Bernal wrote:
>> > > Dear friends,
>> > >
>> > > Hope you are doing well. I am currently using R version 4.3.2, and I
>> > have a
>> > > .xlsx file that has 46 sheets on it. I basically combined  all 46
>> sheets
>> > > and read them as a single dataframe in R using package rio.
>> > >
>> > > I read a solution using package readlx, as suggested in a
>> StackOverflow
>> > > discussion as follows:
>> > > df <- read_excel(path = filepath, sheet = sheet_name, guess_max =
>> > 10).
>> > > Now, when you have so many sheets (46 in my case) in an Excel file,
>> the
>> > rio
>> > > methodology is more practical.
>> > >
>> > > This is what I did:
>> > > path =
>> > >
>> >
>> "C:/Users/myuser/Documents/DataScienceF/Forecast_and_Econometric_Analysis_FIGI
>> > > (4).xlsx"
>> > > figidat = import_list(path, rbind = TRUE) #here figidat refers to my
>> > dataset
>> > >
>> > > Now, it successfully imports and merges all records, however, some
>> fields
>> > > (despite being numeric), R interprets as a boolean field.
>> > >
>> > > Here is the structure of the field that is causing me problems (I
>> > apologize
>> > > for the length):
>> > > structure(list(StoreCharges = c(NA, NA, NA, NA, NA, NA, NA, NA,
>> > > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
>> > > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
>> > > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
>> > > NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
>> > > NA, NA, NA, NA, NA, NA, NA

Re: [R] Use of geometric mean .. in good data analysis

2024-01-22 Thread Bert Gunter
In the spirit of Martin's comments, it is perhaps worthwhile to note one of
John Tukey's (who I actually knew) pertinent quotes:
"The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
<https://www.azquotes.com/quote/603406>"

"Sunset Salvo" by John Tukey in The American Statistician, Volume 40, No. 1
(pp. 72-76), www.jstor.org. February 1986.

Cheers,
Bert

<https://www.azquotes.com/author/14847-John_Tukey>

On Mon, Jan 22, 2024 at 12:23 PM Bert Gunter  wrote:

>
> Ah LOD's, typically LLOD's ("lower limits of detection").
>
> Disclaimer: I am *NOT* in any sense an expert on such matters. What
> follows are just some comments based on my personal experience. Please
> filter accordingly. Also, while I kept it on list as Martin suggested it
> might be useful to do so, most folks probably can safely ignore the rant
> that follows as off topic and not of interest. So you've been warned!!
>
> The rant:
> My experience is: data that contain a "bunch" of values that are, e.g.
> below a LLOD, are frequently reported and/or analyzed by various ad hoc,
> and imho, uniformly bad methods. e.g.:
>
> 1) The censored values are recorded and analyzed as at the LLOD;
> 2) The censored values are recorded and analyzed at some arbitrary value
> below the LLOD, like LLOD/2;
> 3) The censored values are "imputed" by ad hoc methods, e.g. uniform
> random values between 0 and the LLOD for left censoring.
>
> To repeat, *IMO*, all of this is junk and will produce misleading
> statistical results. Whether they mislead enough to substantively affect
> the science or regulatory decisions depend on the specifics of the
> circumstances. I accept no general claim as to their innocuousness.
>
> Further:
>
> a) When you have a "lot" of values -- 50%? 75%?, 25%? -- face facts: you
> have (practically) no useful information from the values that you do have
> to infer what the distribution of values that you don't have looks like.
> All one can sensibly do is say that x% of the values are below a LOD and
> here's the distribution of what lies above. Presumably, if you have such
> data conditional on covariates with the obvious intent to determine the
> relationship to those covariates, you could analyze the percentages of
> LLOD's and known values separately. There are undoubtedly more
> sophisticated methods out there, so this is where you need to go to the
> literature to see what might suit; though I think it will still have to
> come down to looking at these separately (e.g. with extra parameters to
> account for unmeasurable values). Another way of saying this is: any
> analysis which treats all the data as arising from a single distribution
> will depend more on the assumptions you make than on the data. So good luck
> with that!
>
> b) If you have a "modest" amount of (known) censoring -- 5%?, 20%? 10%? --
> methods for the analysis of censored data should be useful. My
> understanding is that MI (multiple imputation) is regarded as a generally
> useful approach, and there are many R packages that can do various flavors
> of this. Again, you should consult the literature: there are very likely
> nontechnical reviews of this topic, too, as well as online discussions and
> tutorials.
>
> So if you are serious about dealing with this and have a lot of data with
> these issues, my advice would be to stop looking for ad hoc advice and dig
> into the literature: it's one of the many areas of "data science" where
> seemingly simple but pervasive questions require complex answers.
>
> And, again, heed my personal caveats.
>
> Thus endeth my rant.
>
> Cheers to all,
> Bert
>
>
>
> On Mon, Jan 22, 2024 at 9:29 AM Rich Shepard 
> wrote:
>
>> On Mon, 22 Jan 2024, Martin Maechler wrote:
>>
>> > I think it is a good question, not really only about geo-chemistry, but
>> > about statistics in applied sciences (and engineering for that matter).
>>
>> > John W Tukey (and several other of the grands of the time) had the log
>> > transform among the "First aid transformations":
>> >
>> > If the data for a continuous variable must all be positive it is also
>> > typically the case that the distribution is considerably skewed to the
>> > right. In such a case behave as a good human who sees another human in
>> > health distress: apply First Aid -- do the things you learned to do
>> > quickly without too much thought, because things must happen fast ---to
>> > hopefully save the other's life.
>>
>> Martin,

Re: [R] Use of geometric mean .. in good data analysis

2024-01-22 Thread Bert Gunter
Ah LOD's, typically LLOD's ("lower limits of detection").

Disclaimer: I am *NOT* in any sense an expert on such matters. What follows
are just some comments based on my personal experience. Please filter
accordingly. Also, while I kept it on list as Martin suggested it might be
useful to do so, most folks probably can safely ignore the rant that
follows as off topic and not of interest. So you've been warned!!

The rant:
My experience is: data that contain a "bunch" of values that are, e.g.
below a LLOD, are frequently reported and/or analyzed by various ad hoc,
and imho, uniformly bad methods. e.g.:

1) The censored values are recorded and analyzed as at the LLOD;
2) The censored values are recorded and analyzed at some arbitrary value
below the LLOD, like LLOD/2;
3) The censored values are "imputed" by ad hoc methods, e.g. uniform
random values between 0 and the LLOD for left censoring.

To repeat, *IMO*, all of this is junk and will produce misleading
statistical results. Whether they mislead enough to substantively affect
the science or regulatory decisions depend on the specifics of the
circumstances. I accept no general claim as to their innocuousness.

Further:

a) When you have a "lot" of values -- 50%? 75%?, 25%? -- face facts: you
have (practically) no useful information from the values that you do have
to infer what the distribution of values that you don't have looks like.
All one can sensibly do is say that x% of the values are below a LOD and
here's the distribution of what lies above. Presumably, if you have such
data conditional on covariates with the obvious intent to determine the
relationship to those covariates, you could analyze the percentages of
LLOD's and known values separately. There are undoubtedly more
sophisticated methods out there, so this is where you need to go to the
literature to see what might suit; though I think it will still have to
come down to looking at these separately (e.g. with extra parameters to
account for unmeasurable values). Another way of saying this is: any
analysis which treats all the data as arising from a single distribution
will depend more on the assumptions you make than on the data. So good luck
with that!

b) If you have a "modest" amount of (known) censoring -- 5%?, 20%? 10%? --
methods for the analysis of censored data should be useful. My
understanding is that MI (multiple imputation) is regarded as a generally
useful approach, and there are many R packages that can do various flavors
of this. Again, you should consult the literature: there are very likely
nontechnical reviews of this topic, too, as well as online discussions and
tutorials.
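
Purely as an illustration of the kind of tool that exists -- synthetic data,
an invented detection limit, and emphatically not a recommendation for any
particular analysis -- a left-censored lognormal fit can be done with
survival::survreg():

library(survival)
set.seed(1)
true <- rlnorm(50)                    ## "true" concentrations
llod <- 0.5                           ## assumed detection limit
obs  <- as.numeric(true >= llod)      ## 1 = measured, 0 = "< LLOD"
conc <- ifelse(obs == 1, true, llod)  ## nondetects recorded at the LLOD
fit  <- survreg(Surv(conc, obs, type = "left") ~ 1, dist = "lognormal")
summary(fit)                          ## meanlog and Log(scale), adjusted for censoring

There are also packages aimed specifically at nondetects; the literature
search suggested above will turn them up.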

So if you are serious about dealing with this and have a lot of data with
these issues, my advice would be to stop looking for ad hoc advice and dig
into the literature: it's one of the many areas of "data science" where
seemingly simple but pervasive questions require complex answers.

And, again, heed my personal caveats.

Thus endeth my rant.

Cheers to all,
Bert



On Mon, Jan 22, 2024 at 9:29 AM Rich Shepard 
wrote:

> On Mon, 22 Jan 2024, Martin Maechler wrote:
>
> > I think it is a good question, not really only about geo-chemistry, but
> > about statistics in applied sciences (and engineering for that matter).
>
> > John W Tukey (and several other of the grands of the time) had the log
> > transform among the "First aid transformations":
> >
> > If the data for a continuous variable must all be positive it is also
> > typically the case that the distribution is considerably skewed to the
> > right. In such a case behave as a good human who sees another human in
> > health distress: apply First Aid -- do the things you learned to do
> > quickly without too much thought, because things must happen fast ---to
> > hopefully save the other's life.
>
> Martin,
>
> Thanks very much. I will look further into this because toxic metals and
> organic compounds in geochemical collections almost always have censored
> lab
> results (below method dection limits) that range from about 15% to 80% or
> more, and there almost always are very high extreme values.
>
> I'll learn to understand what benefits log transforms have over
> compositional data analyses.
>
> Best regards,
>
> Rich
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Use of geometric mean for geochemical concentrations

2024-01-22 Thread Bert Gunter
better posted on r-sig-ecology? -- or maybe even stack exchange?

Cheers,
Bert

On Mon, Jan 22, 2024 at 7:45 AM Rich Shepard 
wrote:

> A statistical question, not specific to R.
>
> I'm asking for a pointer for a source of definitive descriptions of what
> types of data are best summarized by the arithmetic, geometric, and
> harmonic
> means.
>
> As an aquatic ecologist I see regulators apply the geometric mean to
> geochemical concentrations rather than using the arithmetic mean. I want to
> know whether the geometric mean of a set of chemical concentrations (e.g.,
> in mg/L) is an appropriate representation of the expected value. If not, I
> want to explain this to non-technical decision-makers; if so, I want to
> understand why my assumption is wrong.
>
> TIA,
>
> Rich
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Strange results : bootrstrp CIs

2024-01-13 Thread Bert Gunter
Well, this would seem to work:

e <- data.frame(Score = Score
 , Country = factor(Country)
 , Time = Time)

ncountry <- nlevels(e$Country)
func= function(dat,idx) {
   if(length(unique(dat[idx,'Country'])) < ncountry) NA
   else coef(lm(Score~ Time + Country,data = dat[idx,]))
}
B <-  boot(e, func, R=1000)

boot.ci(B, index=2, type="perc")

Caveats:
1) boot.ci handles the NA's by omitting them, which of course gives a
smaller resample and longer CI's than the value of R specified in the call
to boot().

2) I do not know if the *nice* statistical properties of the nonparametric
bootstrap, e.g. asymptotic correctness, hold when bootstrap samples are
produced in this way.  I leave that to wiser heads than me.

Cheers,
Bert

On Sat, Jan 13, 2024 at 2:51 PM Ben Bolker  wrote:

>It took me a little while to figure this out, but: the problem is
> that if your resampling leaves out any countries (which is very likely),
> your model applied to the bootstrapped data will have fewer coefficients
> than your original model.
>
> I tried this:
>
> cc <- unique(e$Country)
> func <- function(data, idx) {
> coef(lm(Score~ Time + factor(Country, levels =cc),data=data[idx,]))
> }
>
> but lm() automatically drops coefficients for missing countries (I
> didn't think about it too hard, but I thought they might get returned as
> NA and that boot() might be able to handle that ...)
>
>If you want to do this I think you'll have to find a way to do a
> *stratified* bootstrap, restricting the bootstrap samples so that they
> always contain at least one sample from each country ... (I would have
> expected "strata = as.numeric(e$Country)" to do this, but it doesn't
> work the way I thought ... it tries to compute the statistics for *each*
> stratum ...)
>
>
>
> 
>
>   Debugging attempts:
>
> set.seed(101)
> options(error=recover)
> B= boot(e, func, R=1000)
>
>
> Error in t.star[r, ] <- res[[r]] :
>number of items to replace is not a multiple of replacement length
>
> Enter a frame number, or 0 to exit
>
> 1: boot(e, func, R = 1000)
>
> 
>
> Selection: 1
> Called from: top level
> Browse[1]> print(r)
> [1] 2
> Browse[1]> t.star[r,]
> [1] NA NA NA NA NA NA NA NA NA
>
> i[2,]
>   [1] 14 15 22 22 21 14  8  2 12 22 10 15  9  7  9 13 12 23  1 20 15  7
> 5 10
>
>
>
>
> On 2024-01-13 5:22 p.m., varin sacha via R-help wrote:
> > Dear Duncan,
> > Dear Ivan,
> >
> > I really thank you a lot for your response.
> > So, if I correctly understand your answers the problem is coming from
> this line:
> >
> > coef(lm(Score~ Time + factor(Country)),data=data[idx,])
> >
> > This line should be:
> > coef(lm(Score~ Time + factor(Country),data=data[idx,]))
> >
> > If yes, now I get an error message (code here below)! So, it still does
> not work.
> >
> > Error in t.star[r, ] <- res[[r]] :
> >number of items to replace is not a multiple of replacement length
> >
> >
> > ##
> >
> Score=c(345,564,467,675,432,346,476,512,567,543,234,435,654,411,356,658,432,345,432,345,
> 345,456,543,501)
> >
> > Country=c("Italy", "Italy", "Italy", "Turkey", "Turkey", "Turkey",
> "USA", "USA", "USA", "Korea", "Korea", "Korea", "Portugal", "Portugal",
> "Portugal", "UK", "UK", "UK", "Poland", "Poland", "Poland", "Austria",
> "Austria", "Austria")
> >
> > Time=c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3)
> >
> > e=data.frame(Score, Country, Time)
> >
> >
> > library(boot)
> > func= function(data, idx) {
> > coef(lm(Score~ Time + factor(Country),data=data[idx,]))
> > }
> > B= boot(e, func, R=1000)
> >
> > boot.ci(B, index=2, type="perc")
> > #
> >
> >
> >
> >
> >
> >
> >
> >
> > Le samedi 13 janvier 2024 à 21:56:58 UTC+1, Ivan Krylov <
> ikry...@disroot.org> a écrit :
> >
> >
> >
> >
> >
> > В Sat, 13 Jan 2024 20:33:47 + (UTC)
> >
> > varin sacha via R-help  пишет:
> >
> >> coef(lm(Score~ Time + factor(Country)),data=data[idx,])
> >
> >
> > Wrong place for the data=... argument. You meant to give it to lm(...),
> > but in the end it went to coef(...). Without the data=... argument, the
> > formula passed to lm() picks up the global variables inherited by the
> > func() closure.
> >
> > Unfortunately, S3 methods really do have to ignore extra arguments they
> > don't understand if the class is to be extended, so coef.lm isn't
> > allowed to complain to you about it.
> >
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Exponential Autoregressive time series model EAR(1)

2023-12-17 Thread Bert Gunter
If you have not already done so, I suggest you look here:
https://cran.r-project.org/web/views/TimeSeries.html

(R task views are an excellent place to look for such queries)

Or a web search here:
https://rseek.org/
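
If nothing ready-made turns up, the model is in any case only a few lines to
simulate by hand. Here is a sketch of one common EAR(1) definition (Gaver &
Lewis 1980: X[t] = rho*X[t-1] + I[t]*E[t], with I[t] ~ Bernoulli(1 - rho) and
E[t] exponential) -- check it against whichever definition you are using:

rEAR1 <- function(n, rho, rate = 1, burnin = 100) {
  stopifnot(rho >= 0, rho < 1)
  ntot <- n + burnin
  x <- numeric(ntot)
  x[1] <- rexp(1, rate)                 ## exponential marginal to start
  for (t in 2:ntot)
    x[t] <- rho * x[t - 1] + rbinom(1, 1, 1 - rho) * rexp(1, rate)
  tail(x, n)                            ## drop the burn-in
}
ts.plot(rEAR1(200, rho = 0.7))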

-- Bert

On Sat, Dec 16, 2023 at 11:31 PM Mohamed Ezzat <
mohamed.ezzat.abdela...@gmail.com> wrote:

> Dears,
>
> I hope that email finds you well.
>
> I'm sending you this email to ask if there is any built in R code for
> simulating the Exponential Autoregressive time series model of order 1
> EAR(1).
>
> So, please provide me with the code if the code already exists
>
>
> Thanks in advance.
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Advice on starting to analyze smokestack emissions?

2023-12-12 Thread Bert Gunter
My point was only that there might be functionality there that might be
relevant to his concerns ... with help on how to use it.

Bert

On Tue, Dec 12, 2023, 19:37 Ebert,Timothy Aaron  wrote:

> That depends on how exactly everything must match your primary question.
> The ecology group might be helpful for how biodiversity changes with
> proximity to a smokestack. They might have a better idea if the smokestack
> was from a coal fired powerplant or oil refinery. The modeling process
> would be similar, though the abundance of individual contaminants would be
> quite different. Just my thought for what it is worth.
> Tim
>
> -Original Message-
> From: R-help  On Behalf Of Bert Gunter
> Sent: Tuesday, December 12, 2023 10:53 AM
> To: Kevin Zembower 
> Cc: R-help email list 
> Subject: Re: [R] Advice on starting to analyze smokestack emissions?
>
> [External Email]
>
> You might also try the R-Sig-ecology list, though I would agree that it's
> not clearly related. Still, air pollution effects...?
>
> -- Bert
>
> On Tue, Dec 12, 2023 at 3:15 AM Kevin Zembower via R-help <
> r-help@r-project.org> wrote:
>
> > Hello, all,
> >
> > [Originally sent to r-sig-geo list, with no response. Cross-posting
> > here, in the hope of a wider audience. Anyone with any experience in
> > this topic? Thanks.]
> >
> > I'm trying to get started analyzing the concentrations of smokestack
> > emissions. I don't have any professional background or training for
> > this; I'm just an old, retired guy who thinks playing with numbers is
> > fun.
> >
> > A local funeral home in my neighborhood (less than 1200 ft from my
> > home) is proposing to construct a crematorium for human remains. I
> > have some experience with the tidycensus package and thought it might
> > be interesting to construct a model for the changes in concentrations
> > of the pollutants from the smokestack and, using recorded wind speeds
> > and directions, see which US Census blocks would be affected.
> >
> > I have the US Government EPA SCREEN3 output on how concentration
> > varies with distance from the smokestack.
> > See
> > https://www.epa.gov/scram/air-quality-dispersion-modeling-screening-models#screen3
> > if curious. As a first task, I'd like
> > to see if I can calculate similar results in R. I'm aware of the
> > 'plume' steady-state Gaussian dispersion package
> > (https://rdrr.io/github/holstius/plume/f/inst/doc/plume-intro.pdf), but
> > am a little concerned that this package was last updated 11 years ago.
> >
> > Do you have any recommendations for me on how to get started analyzing
> > this problem? Is 'plume' still the way to go? I'm aware that there are
> > many atmospheric dispersion models from the US EPA, but I was hoping
> > to keep my work within R, which I'm really enjoying using and learning
> > about. Are SCREEN3 and 'plume' comparable? Is this the best R list to
> > ask questions about this topic?
> >
> > Thanks for any advice or guidance you have for me.
> >
> > -Kevin
> >
> >
> >
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.

Re: [R] Advice on starting to analyze smokestack emissions?

2023-12-12 Thread Bert Gunter
You might also try the R-Sig-ecology list, though I would agree that it's
not clearly related. Still, air pollution effects...?

-- Bert

On Tue, Dec 12, 2023 at 3:15 AM Kevin Zembower via R-help <
r-help@r-project.org> wrote:

> Hello, all,
>
> [Originally sent to r-sig-geo list, with no response. Cross-posting
> here, in the hope of a wider audience. Anyone with any experience in
> this topic? Thanks.]
>
> I'm trying to get started analyzing the concentrations of smokestack
> emissions. I don't have any professional background or training for
> this; I'm just an old, retired guy who thinks playing with numbers is
> fun.
>
> A local funeral home in my neighborhood (less than 1200 ft from my
> home) is proposing to construct a crematorium for human remains. I have
> some experience with the tidycensus package and thought it might be
> interesting to construct a model for the changes in concentrations of
> the pollutants from the smokestack and, using recorded wind speeds and
> directions, see which US Census blocks would be affected.
>
> I have the US Government EPA SCREEN3 output on how concentration varies
> with distance from the smokestack.
> See
> https://www.epa.gov/scram/air-quality-dispersion-modeling-screening-models#screen3
> if curious. As a first task, I'd like to see if I can calculate similar
> results in R. I'm aware of the 'plume' steady-state Gaussian dispersion
> package
> (https://rdrr.io/github/holstius/plume/f/inst/doc/plume-intro.pdf), but
> am a little concerned that this package was last updated 11 years ago.
>
> Do you have any recommendations for me on how to get started analyzing
> this problem? Is 'plume' still the way to go? I'm aware that there are
> many atmospheric dispersion models from the US EPA, but I was hoping to
> keep my work within R, which I'm really enjoying using and learning
> about. Are SCREEN3 and 'plume' comparable? Is this the best R list to
> ask questions about this topic?
>
> Thanks for any advice or guidance you have for me.
>
> -Kevin
>
>
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ggplot2: Get the regression line with 95% confidence bands

2023-12-10 Thread Bert Gunter
This can easily be done using predict.lm to get the intervals (confidence
or prediction).
?predict.lm contains a plotting example using ?matplot from the graphics
package. Here's a somewhat verbose version for your example (first
converting Year to numeric, of course):

df=data.frame(year= c(2012,2015,2018,2022), score=c(495,493, 495, 474))

fitted <- lm(score ~ year, data = df)
with(df,
   matplot(x = year, y = cbind(score,predict(fitted,interval = 'conf',
level = .95))
,type = c('p', rep('l',3))
,pch = 16
,lty = c('blank','solid', 'dashed','dashed') ## or use numeric
values of 0,1,2,2
,lwd = c(0,2,1,1)
,col = c('black','darkblue', 'red','red')
,xlab = 'Year'
,ylab = 'Data with Fitted Line and Conf Intervals'

   )
)

Cheers,
Bert



On Sun, Dec 10, 2023 at 2:51 PM Rui Barradas  wrote:

> Às 22:35 de 10/12/2023, varin sacha via R-help escreveu:
> >
> > Dear R-experts,
> >
> > Here below my R code, as my X-axis is "year", I must be missing one or
> more steps! I am trying to get the regression line with the 95% confidence
> bands around the regression line. Any help would be appreciated.
> >
> > Best,
> > S.
> >
> >
> > #
> > library(ggplot2)
> >
> > df=data.frame(year=factor(c("2012","2015","2018","2022")),
> score=c(495,493, 495, 474))
> >
> > ggplot(df, aes(x=year, y=score)) + geom_point( ) +
> geom_smooth(method="lm", formula = score ~ factor(year), data = df) +
> labs(title="Standard linear regression for France", y="PISA score in
> mathematics") + ylim(470, 500)
> > #
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> Hello,
>
> I don't see a reason why year should be a factor and the formula in
> geom_smooth is wrong, it should be y ~ x, the aesthetics envolved.
> It still doesn't plot the CI's though. There's a warning and I am not
> understanding where it comes from. But the regression line is plotted.
>
>
>
> ggplot(df, aes(x = as.numeric(year), y = score)) +
>geom_point() +
>geom_smooth(method = "lm", formula = y ~ x) +
>labs(
>  title = "Standard linear regression for France",
>  x = "Year",
>  y = "PISA score in mathematics"
>) +
>ylim(470, 500)
> #> Warning message:
> #> In max(ids, na.rm = TRUE) : no non-missing arguments to max;
> returning -Inf
>
>
>
> Hope this helps,
>
> Rui Barradas
>
>
>
> --
> Este e-mail foi analisado pelo software antivírus AVG para verificar a
> presença de vírus.
> www.avg.com
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reshape() not dropping varaibles

2023-12-10 Thread Bert Gunter
Posting a few rows, say 5, of your data using dput() along with the result
that you would like to get for those rows would help get you a quicker and
more accurate response, I believe. This is as suggested by the posting
guide, linked below, which you should read if you have not already.
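
That said, a guess at what is going on (untested, as I have not downloaded the
file): reshape() keeps every column not named in idvar/timevar/v.names and
carries it along as an id-level variable, so either subset to the three
relevant columns first or use its 'drop' argument:

keep <- c("SURVEYDATE", "COMMON_NAME", "TOTAL_CATCH")
Data.wide <- reshape(Data[keep], direction = "wide",
                     idvar = "SURVEYDATE", timevar = "COMMON_NAME",
                     v.names = "TOTAL_CATCH")
## or: reshape(Data, ..., drop = setdiff(names(Data), keep))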

-- Bert



On Sun, Dec 10, 2023 at 2:38 AM Bob O'Hara  wrote:

> Hi all!
>
> I'm trying to re-format some data from long to wide format with reshape().
> Specifically, the data has SURVEYDATE, which I want to be in the rows, and
> COMMON_NAME which should be the columns. The entries should be TOTAL_CATCH.
> The data has a bunch of other variables, which can be ignored.
>
> When I run reshape(), it includes all of the variables, not just
> TOTAL_CATCH:
>
> Data <- read.csv("
>
> https://conservancy.umn.edu/bitstream/handle/11299/227105/fish_data_raw.csv?sequence=6=y
> ")
> Data.wide <- reshape(Data, direction = "wide",
> idvar = "SURVEYDATE", timevar = "COMMON_NAME",
> v.names = "TOTAL_CATCH")
> names(Data.wide)
>
> I tried with the example on the help page, which works fine:
>
> # this works
> Indometh$thing <- 1:nrow(Indometh)
> wide <- reshape(Indometh, direction = "wide", idvar = "Subject",
> timevar = "time", v.names = "conc", sep= "_")
> names(wide)
>
> There are some obvious work-arounds and alternatives, but it would be nice
> to have this sorted. Can anyone help?
>
> Bob
>
> Bob
>
> --
> Bob O'Hara
> Institutt for matematiske fag
> NTNU
> 7491 Trondheim
> Norway
>
> Mobile: +47 915 54 416
> Journal of Negative Results - EEB: www.jnr-eeb.org
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Linear model and approx function

2023-12-09 Thread Bert Gunter
1. You should regress Elevation on Volume, no?

2. You are calling lm incorrectly for prediction. Please read ?lm and
related links carefully and/or consult a tutorial. R-Help is really not the
first place you should look for this sort of detailed info.

3. I think this is what you want:

lm1 <- lm(Elevation ~ Vol, data = x6)  ## assuming (1) above is correct
d <- data.frame(Vol = 3000)  ## assuming (1) above is correct
predict(lm1, newdata = d)
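
4. A side note on the approx() attempt, based only on the head/tail you show:
approx() returns NA (its default rule = 1) whenever xout lies outside the
range of the supplied x values, and 3,000,000,000 appears to exceed
max(x6$V_sum), so there is nothing to interpolate there. For a volume inside
the observed range something like

approx(x6$V_sum, x6$Elevation, xout = 1.5e9)$y

should work; beyond that range you need a model (such as the lm() above) to
extrapolate, with the usual caveats.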

Cheers,
Bert


On Sat, Dec 9, 2023 at 10:50 AM javad bayat  wrote:

> Dear all;
>
> I have a dataframe with several columns. The columns are the elevation,
> volume  and the area of the cells (which were placed inside a polygon). I
> have extracted them from DEM raster to calculate the volume under polygon
> and the elevation for a specific volume of the reservoir.
>
> > head(x6,2)
>   Elevation   Vol  Area V_sum  A_sum
> 1 2145  13990.38  85.83053  13990.38   85.83053
> 2 2147  43129.18 267.88312  57119.56  353.71365
>
> > tail(x6,2)
>  Elevation  Vol  Area  V_sumA_sum
> 158  2307 233.0276 233.02756 1771806968 15172603
> 159  2308   0.  71.65642 1771806968 15172674
>
> I used a linear model to estimate the elevation for a specific volume, but
> the codes do not work properly.
>
> lm1 = lm(x6[,1]~x6[,4])
> new_volume <- 3,000,000,000
> pred_elev <- predict(lm1, newdata = data.frame(volume = new_volume))
> pred_elev
>
> The results just estimated for the 159 rows of the dataframe, not the new
> volume.
>
> > tail(pred_elev)
>  154  155  156  157  158  159
> 2254.296 2254.296 2254.296 2254.296 2254.296 2254.296
>
> Also I have used the approx function, but it does not work for the new
> volume, too.
>
> > a = x6[,1]
> > b = x6[,4]
> > estimate <- 3,000,000,000
> > appro <- approx(b,a, xout = estimate)
> > appro
> $x
> [1] 3e+09
>
> $y
> [1] NA
>
> I do not know why it has happened.
>
> Is there any way to do this?
> Or maybe there is another way to do that.
> I would be more than happy if anyone help me.
>
> Sincerely
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Convert two-dimensional array into a three-dimensional array.

2023-12-08 Thread Bert Gunter
OK. I'm not getting what you want, so feel free to ignore this if you think
I've missed the point completely and don't want to waste your time. Won't
be my first time clueless.

A 3-D array can be  thought of as as a "pile" of 2-D flats, so a 10 x 2 x
10 array consists of 10 2-d flats, each 10 x 2. So tell me what you want
the first 10x2 flat to contain, then the second, etc.  Here is a print
representation of a 2 x 4 x 3 array that might help:

> array(1:24, dim = c(2,4,3))
, , 1

 [,1] [,2] [,3] [,4]
[1,]1357
[2,]2468

, , 2

 [,1] [,2] [,3] [,4]
[1,]9   11   13   15
[2,]   10   12   14   16

, , 3

 [,1] [,2] [,3] [,4]
[1,]   17   19   21   23
[2,]   18   20   22   24

FWIW, it sounds to me like you just do something like:

> dval <- matrix(1:8, nrow = 2)
> dval
 [,1] [,2] [,3] [,4]
[1,]1357
[2,]2468
> ar <- array(dval, dim = c(2,4,3))
> ar
, , 1

 [,1] [,2] [,3] [,4]
[1,]1357
[2,]2468

, , 2

 [,1] [,2] [,3] [,4]
[1,]1357
[2,]2468

, , 3

 [,1] [,2] [,3] [,4]
[1,]1357
[2,]2468

since the 3rd array index itself provides the values you refer to. But this
doesn't make sense to me, so I've probably misinterpreted.

Cheers,
Bert


On Fri, Dec 8, 2023 at 2:58 PM Sorkin, John 
wrote:

> Colleagues
>
> I want to convert a 10x2 array:
> # create a 10x2 matrix.
> datavals <- matrix(nrow=10,ncol=2)
> datavals[,] <- rep(c(1,2),10)+c(rnorm(10),rnorm(10))
> datavals
>
> into a three-dimensional array, ThreeDArray, with dim(10,2,10).
>
> The values stored in ThreeDArray's first dimension will be the data
> stored in datavals.
> ThreeDArray[i,,] <- datavals[i,]
>
> The values stored in ThreeDArray's second dimension will be the data
> stored in datavals.
> ThreeDArray[,j,] <- datavals[,j]
>
> The data stored in ThreeDArray[,,1] will be 1,
> The data stored in ThreeDArray[,,2] will be 2.
>  . . .
> The data stored in ThreeDArray[,,10] will be 10.
>
> I have no idea how to code the conversion of the 10x2 matrix into a 10,2,10
> array.
> I may be able to accomplish my mission by coding each line of the plan
> described above,
> but there has to be a more efficient and elegant way to accomplish my goal.
>
> Many thanks for your help!
> John
>
>
>
>
> John David Sorkin M.D., Ph.D.
> Professor of Medicine, University of Maryland School of Medicine;
>
> Associate Director for Biostatistics and Informatics, Baltimore VA Medical
> Center Geriatrics Research, Education, and Clinical Center;
>
> PI Biostatistics and Informatics Core, University of Maryland School of
> Medicine Claude D. Pepper Older Americans Independence Center;
>
> Senior Statistician University of Maryland Center for Vascular Research;
>
> Division of Gerontology and Paliative Care,
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> Cell phone 443-418-5382
>
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Volume of polygon

2023-12-05 Thread Bert Gunter
The volume of a polygon  = 0.  Polyhedra  have volumes.

This may be irrelevant, but if the lake is cylindrical == constant cross
sectional area at all depths, then height doubles when the volume does and
vice versa.  Otherwise you have to know how area varies with height or use
more sensible approximations thereto.
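
If it is the lake/DEM case, the DEM itself tells you how flooded area varies
with elevation, so volume can be written as a function of water level and the
"doubling" level found numerically. A minimal sketch, assuming `elev` is a
vector of DEM cell elevations inside the polygon and `cell_area` the area of
one cell (both hypothetical names, not taken from the code below):

vol_at_level <- function(level, elev, cell_area) {
  # water depth in each cell, zero where the cell sits above the water level
  sum(pmax(level - elev, 0)) * cell_area
}
# target <- 2 * vol_at_level(current_level, elev, cell_area)
# uniroot(function(h) vol_at_level(h, elev, cell_area) - target,
#         interval = c(current_level, max(elev)))$root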

Cheers,
Bert

On Tue, Dec 5, 2023, 20:13 javad bayat  wrote:

>  Dear all;
> I am trying to calculate the volume of a polygon shapefile according to a
> DEM raster. I have provided some code at the end of this email. I don't know
> if the code is correct or not. Following this, I have another question
> too.
> I want to know: if the volume of the reservoir rises or doubles, what would
> the elevation be?
> I would be more than happy if anyone could help me.
> Sincerely
>
> "
> library(raster)
> library(terra)
> library(exactextractr)
> library(dplyr)
> library(sf)
> r <- raster("Base.tif")
> p <- shapefile("p.shp")
> r <- crop(r, p)
> r <- mask(r, p)
> x <- exact_extract(r, p, coverage_area = TRUE)
>
> x1 = as.data.frame(x[1])
> head(x1)
> x1 = na.omit(x1)
>
> x1$Height = max(x1[,1]) - x1[,1]
>
> x1$Vol = x1[,2] * x1[,3]
>
> sum(x1$Vol)
>
> "
>
> --
> Best Regards
> Javad Bayat
> M.Sc. Environment Engineering
> Alternative Mail: bayat...@yahoo.com
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] back tick names with predict function

2023-11-30 Thread Bert Gunter
"Thank you Rui.  I didn't know about the check.names = FALSE argument.
> Another good reminder to always read help, but I'm not sure I understood
> what help to read in this case"

?data.frame , of course, which says:

"check.names

logical. If TRUE then the names of the variables in the data frame are
checked to ensure that they are syntactically valid variable names and
are not duplicated. If necessary they are adjusted (by make.names) so
that they are. "

-- Bert
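
A minimal sketch of the difference, with a made-up column name (note that
check.names = FALSE is also needed for the newdata passed to predict()):

d1 <- data.frame("my var" = 1:3, y = c(2, 4, 6))                       # name becomes my.var
d2 <- data.frame("my var" = 1:3, y = c(2, 4, 6), check.names = FALSE)  # name stays "my var"
names(d1); names(d2)
fit <- lm(y ~ `my var`, data = d2)
predict(fit, newdata = data.frame("my var" = 4, check.names = FALSE))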

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Code editor for writing R code

2023-11-29 Thread Bert Gunter
This might be of use to you:

https://everyday.codes/tutorials/how-to-use-latex-in-rmarkdown/

-- Bert

On Wed, Nov 29, 2023 at 8:21 AM Bert Gunter  wrote:
>
> I believe RMarkdown can use and render LaTeX comments. RStudio/Posit
> provides IDE extensions (e.g. R Notebooks) that seem to do what you
> want, but I have no experience with them. As Ben suggested, there are
> likely others, depending on exactly what you want to do.
>
> -- Bert
>
> On Wed, Nov 29, 2023 at 7:58 AM Christofer Bogaso
>  wrote:
> >
> > Hi,
> >
> > Currently I use VS-Code to write code in R. While it is very good, it
> > does not allow me to write LaTeX expressions in comments, which I would
> > like to have so that I can write the corresponding mathematical
> > expressions as comments in my code files.
> >
> > Does there exist any code editor for R that allows me to write LaTeX
> > in comments?
> >
> > Any information will be appreciated.
> >
> > Thanks,
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Code editor for writing R code

2023-11-29 Thread Bert Gunter
I believe RMarkdown can use and render LaTeX comments. RStudio/Posit
provides IDE extensions (e.g. R Notebooks) that seem to do what you
want, but I have no experience with them. As Ben suggested, there are
likely others, depending on exactly what you want to do.

-- Bert

On Wed, Nov 29, 2023 at 7:58 AM Christofer Bogaso
 wrote:
>
> Hi,
>
> Currently I use VS-Code to write code in R. While it is very good, it
> does not allow me to write LaTeX expressions in comments, which I would
> like to have so that I can write the corresponding mathematical
> expressions as comments in my code files.
>
> Does there exist any code editor for R that allows me to write LaTeX
> in comments?
>
> Any information will be appreciated.
>
> Thanks,
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] **OFF TOPIC**: Fabrication of Scientific Data by Generative AI

2023-11-22 Thread Bert Gunter
All:
**OFF TOPIC** -- feel free to respond to me personally, but I will not
respond to on-list comments.

https://www.nature.com/articles/d41586-023-03635-w

Many of you will no doubt know of this already.  I hope it will be of
interest to all concerned with research replicability, integrity, and
the public's view of the value of scientific research.

Best to all (and happy Thanksgiving to US readers),

Bert

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cannot calculate confidence intervals NULL

2023-11-15 Thread Bert Gunter
I believe the problem is here:

cor1 <- cor(x1, y1, method="spearman")
 cor2 <- cor(x2, y2, method="spearman")

The x's and y's are not looked for in data (i.e. NSE) but in the
environment where the function was defined, which is standard evaluation.
Change the above to:

cor1 <- with(d, cor(x1, y1, method="spearman"))
 cor2 <- with(d, cor(x2, y2, method="spearman"))

and all should be fine.
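
Putting it together, a minimal sketch of the corrected statistic function
(only its body changes; the rest of the posted script stays as it is):

spearman_diff <- function(data, indices) {
  d <- data[indices, ]                      # the resampled rows must be the ones used
  with(d, cor(x1, y1, method = "spearman") -
          cor(x2, y2, method = "spearman"))
}
# bootstrap_results <- boot(data = data, statistic = spearman_diff, R = 1000)
# boot.ci(bootstrap_results, type = "all")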

-- Bert

On Wed, Nov 15, 2023 at 12:54 PM varin sacha via R-help <
r-help@r-project.org> wrote:

> R-Experts,
>
> Here below my R code working without error message but I don't get the
> results I am expecting.
> Here is the result I get:
>
> [1] "All values of t are equal to 0.28611928397257 \n Cannot calculate
> confidence intervals"
> NULL
>
> If someone knows how to solve my problem, really appreciate.
> Best,
> S
>
>
> #
> # Difference in Spearman rho
>
> library(boot)
>
>
> x1=c(4,6,5,7,8,4,2,3,5.5,6.7,5.5,3.5,2,1,3,5,6,3.5,2.5,2,1,2,3,2,1,2,3,4,3,4)
>
>
> y1=c(10,14,12.5,21,15,16,17.5,11,11.5,21,19,16,17.5,18,18.5,12,13,14,11,11,12,18,20,13,23,12,11,14,16,11)
>
> x2=c(5,3,4,2,1,1,1,2,3,4,5,4,3,2,1,3,4.5,4.5,5.5,6,5,4,7,8,3,4,2,5,4,3)
>
>
> y2=c(11,12,13,11,10,19,21,21,13,15,18,13,12,14,19,18.5,17.5,12.5,10,9,11,13,14,16,11,18,14,13,12,12)
>
> # Function to calculate the difference in Spearman coefficients
> pearson_diff <- function(data, indices) {
>
> # Sample the data
>   d <- data[indices, ]
>
>  # Calculate the Spearman correlation coefficients for every sample
>   cor1 <- cor(x1, y1, method="spearman")
>   cor2 <- cor(x2, y2, method="spearman")
>
> # Return the difference
>   return(cor1 - cor2)
> }
>
> # Create a data.frame with the data
> data <- data.frame(x1, y1, x2, y2)
>
> # Use the boot function to apply the bootstrap
> set.seed(123) # For reproducibility
> bootstrap_results <- boot(data = data, statistic = pearson_diff, R = 1000)
>
> # Calculate all the 95% confidence interval
> boot.ci(bootstrap_results, type = "all")
> ###
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] kindly unsubscribe

2023-11-15 Thread Bert Gunter
Please see the bottom of this and every message for the link to unsubscribe.

-- Bert

On Wed, Nov 15, 2023 at 10:25 AM Saikat Dutta Chowdhury <
saikatduttachowdh...@gmail.com> wrote:

> --
> Saikat Dutta Chowdhury
> Mobile: 8017650842
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can someone please have a look at this query on stackoverflow?

2023-11-15 Thread Bert Gunter
... and note also that there may be clipping options you can change/set to
approximate your desiderata.

-- Bert

On Wed, Nov 15, 2023 at 9:02 AM Bert Gunter  wrote:

> Well, as no one else has offered an answer ...
>
> I haven't looked at this closely, but could it not simply be the case that
> the aspect ratio set by "Landscape Mode" just differs from that of your
> display device? -- i.e., it is impossible to have the figure displayed in
> landscape ratio *and* simultaneously fill your display device?
>
> If this is obviously wrong, feel free to ignore without replying.
>
> Cheers,
> Bert
>
> On Tue, Nov 14, 2023 at 8:48 PM Ashim Kapoor 
> wrote:
>
>> Dear John,
>>
>> Many thanks for your reply.
>>
>> I wish 2 things :
>> 1. Landscape mode
>> 2. No wasted space on the sides when I maximise the pdf.
>>
>> When I download, open, and maximise the pdf it wastes space. Please
>> see the attached screenshot.
>>
>> Query: When you maximise the PDF does it occupy the full screen ?
>>
>> Best,
>> Ashim
>>
>> On Tue, Nov 14, 2023 at 5:14 PM John Kane  wrote:
>> >
>> > I ran the code from the answer and it seems to work well. It,
>> definitely, is giving a landscape output.
>> >
>> > ---
>> > title: "Testing landscape and aspect ratio"
>> > output:
>> >   pdf_document:
>> > number_sections: true
>> > classoption:
>> >   - landscape
>> >   - "aspectratio=169"
>> > header-includes:
>> >- \usepackage{dcolumn}
>> > documentclass: article
>> > geometry: margin=1.5cm
>> > ---
>> >
>> > ```{r, out.extra='keepaspectratio=true', out.height='100%',
>> out.width="100%"}
>> > plot(rnorm(100))
>> > ```
>> >
>> >
>> >
>> >
>> > On Mon, 13 Nov 2023 at 23:33, Ashim Kapoor 
>> wrote:
>> >>
>> >> Dear all,
>> >>
>> >> I have posted a query which has received a response but that is not
>> >> working on my computer.
>> >>
>> >> Here is the query:
>> >>
>> >>
>> https://stackoverflow.com/questions/77387434/pdf-from-rmarkdown-landscape-and-aspectratio-169
>> >>
>> >> Can someone please help me ?
>> >>
>> >> Best Regards,
>> >> Ashim
>> >>
>> >> __
>> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >> https://stat.ethz.ch/mailman/listinfo/r-help
>> >> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> >> and provide commented, minimal, self-contained, reproducible code.
>> >
>> >
>> >
>> > --
>> > John Kane
>> > Kingston ON Canada
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can someone please have a look at this query on stackoverflow?

2023-11-15 Thread Bert Gunter
Well, as no one else has offered an answer ...

I haven't looked at this closely, but could it not simply be the case that
the aspect ratio set by "Landscape Mode" just differs from that of your
display device? -- i.e., it is impossible to have the figure displayed in
landscape ratio *and* simultaneously fill your display device?

If this is obviously wrong, feel free to ignore without replying.

Cheers,
Bert

On Tue, Nov 14, 2023 at 8:48 PM Ashim Kapoor  wrote:

> Dear John,
>
> Many thanks for your reply.
>
> I wish 2 things :
> 1. Landscape mode
> 2. No wasted space on the sides when I maximise the pdf.
>
> When I download, open, and maximise the pdf it wastes space. Please
> see the attached screenshot.
>
> Query: When you maximise the PDF does it occupy the full screen ?
>
> Best,
> Ashim
>
> On Tue, Nov 14, 2023 at 5:14 PM John Kane  wrote:
> >
> > I ran the code from the answer and it seems to work well. It,
> definitely, is giving a landscape output.
> >
> > ---
> > title: "Testing landscape and aspect ratio"
> > output:
> >   pdf_document:
> > number_sections: true
> > classoption:
> >   - landscape
> >   - "aspectratio=169"
> > header-includes:
> >- \usepackage{dcolumn}
> > documentclass: article
> > geometry: margin=1.5cm
> > ---
> >
> > ```{r, out.extra='keepaspectratio=true', out.height='100%',
> out.width="100%"}
> > plot(rnorm(100))
> > ```
> >
> >
> >
> >
> > On Mon, 13 Nov 2023 at 23:33, Ashim Kapoor 
> wrote:
> >>
> >> Dear all,
> >>
> >> I have posted a query which has received a response but that is not
> >> working on my computer.
> >>
> >> Here is the query:
> >>
> >>
> https://stackoverflow.com/questions/77387434/pdf-from-rmarkdown-landscape-and-aspectratio-169
> >>
> >> Can someone please help me ?
> >>
> >> Best Regards,
> >> Ashim
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> >
> >
> > --
> > John Kane
> > Kingston ON Canada
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] make a lattice dotplot with symbol size proportional to a variable in the plotted dataframe

2023-11-08 Thread Bert Gunter
... and see also Section 3.5, Scope of Variables in the "R Language
Definition" manual that ships with R.

Cheers,
Bert
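
A minimal sketch of the scoping point discussed below, with made-up data:
with() evaluates its expression inside the data frame, so bare column names
(here used to scale the symbol size) are found there rather than in the
calling environment:

df <- data.frame(x = 1:5, y = (1:5)^2, n = c(10, 40, 20, 5, 30))
with(df, plot(x, y, cex = 2 * sqrt(n / max(n)), pch = 19))
# plot(x, y)   # would fail unless x and y also exist in the workspace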

On Wed, Nov 8, 2023 at 8:06 AM Deepayan Sarkar 
wrote:

> On Wed, 8 Nov 2023 at 10:56, Christopher W. Ryan via R-help
>  wrote:
> >
> > Very helpful, Deepayan, and educational. Thank you.
> >
> > What does NSE stand for?
>
> Non-standard evaluation, used widely in formula-interface functions as
> well as the tidyverse. with() in my example is a less nuanced version
> of this. See
>
> http://adv-r.had.co.nz/Computing-on-the-language.html
>
> https://developer.r-project.org/nonstandard-eval.pdf
>
> Best,
> -Deepayan
>
>
> > Thanks,
> > Chris
> >
> > Deepayan Sarkar wrote:
> > >
> > > --Chris Ryan
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Adding columns to a tibble based on a value in a different tibble

2023-11-04 Thread Bert Gunter
I think a simple reproducible example ("reprex") may be necessary for you
to get a useful reply. Questions with vague specifications such as yours
often result in going round and round with attempts to clarify what you
mean without a satisfactory answer. Clarification at the outset with a
reprex may save you and others a lot of frustration.

Cheers,
Bert
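
Absent a reprex, a minimal sketch of what one (and one possible answer) might
look like, with made-up tibbles and an assumed common `id` column:

library(dplyr)
main   <- tibble(id = c("a", "b", "c", "d"), info = 1:4)
lookup <- tibble(id = c("b", "d"), y = c(10, 20))
left_join(main, lookup, by = "id")              # unmatched ids get NA in y
# base-R equivalent:
# main$y <- lookup$y[match(main$id, lookup$id)]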

On Sat, Nov 4, 2023 at 1:41 AM Alessandro Puglisi <
alessandro.pugl...@gmail.com> wrote:

> Hi everyone,
>
> I have a tibble with various ids and associated information.
>
> I need to add a new column to this tibble that retrieves a specific 'y'
> value from a different tibble that has some of the mentioned ids in the
> first column and a 'y' value in the second one. If the id, and so the 'y'
> value is found, it will be included; otherwise, 'NA' will be used.
>
> Could you please help me?
>
> Thanks,
> Alessandro
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] I need to create new variables based on two numeric variables and one dichotomize conditional category variables.

2023-11-03 Thread Bert Gunter
Well, something like:

LAP <- ifelse(gender =='male', (WC-65)*TG, (WC-58)*TG)

The exact code depends on whether your variables are in a data frame or
list or whatever, which you failed to specify. If so, ?with  may be useful.
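
A minimal sketch of the data-frame case, with made-up numbers:

dat <- data.frame(WC = c(90, 80, 101), TG = c(1.5, 1.2, 2.1),
                  gender = c("male", "female", "male"))
dat$LAP <- with(dat, ifelse(gender == "male", (WC - 65) * TG, (WC - 58) * TG))
dat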

Cheers,
Bert



On Fri, Nov 3, 2023 at 3:43 AM Md. Kamruzzaman  wrote:

> Hello Everyone,
> I have three variables: Waist circumference (WC), serum triglyceride (TG)
> level and gender. Waist circumference and serum triglyceride are numeric and
> gender (male and female) is categorical. From these three variables, I want
> to calculate the "Lipid Accumulation Product (LAP) Index". The equation to
> calculate LAP is different for males and females. I am giving both equations
> below.
>
> LAP for male = (WC-65)*TG
> LAP for female = (WC-58)*TG
>
> My question is: how can I calculate the LAP and create a single new column?
>
> Your cooperation will be highly appreciated.
>
> Thanks in advance.
>
> With Regards
>
> **
>
> *Md Kamruzzaman*
>
> *PhD **Research Fellow (**Medicine**)*
> Discipline of Medicine and Centre of Research Excellence in Translating
> Nutritional Science to Good Health
> Adelaide Medical School | Faculty of Health and Medical Sciences
> The University of Adelaide
> Adelaide SA 5005
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Virus alert because of an R-help e-mail

2023-10-31 Thread Bert Gunter
No attachments. Most are deleted by  ETH mailman ... because they might
contain viruses.

-- Bert

On Tue, Oct 31, 2023 at 8:59 AM David Croll  wrote:

> I just received a virus warning from my e-mail provider, GMX. See the
> attached image below.
>
> The virus detection can be spurious - but the e-mail was automatically
> deleted by GMX.
>
> With the best regards,
>
>
> David
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [Tagged] Re: col.names in as.data.frame() ?

2023-10-28 Thread Bert Gunter
Jeff, et al.: but ...

Note that as.data.frame() *already* changes the matrix object by adding
column names of *its own choosing* when the matrix has none. So the issue
here is not *whether* col names should be added, but *what*/*how* they
should be. Unless you wish to extend your criticism to the current version
for its failure to adhere to your proscription.

Cheers,
Bert
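
A minimal sketch of the two behaviours -- default names invented by
as.data.frame(), versus names supplied via dimnames before converting:

m <- matrix(c("gaggle", "geese", "dule", "doves", "wake", "vultures"),
            ncol = 2, byrow = TRUE)
names(as.data.frame(m))                    # "V1" "V2"  -- names of R's own choosing
dimnames(m) <- list(NULL, c("collective", "category"))
names(as.data.frame(m))                    # "collective" "category"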

On Sat, Oct 28, 2023 at 11:55 AM Jeff Newmiller via R-help <
r-help@r-project.org> wrote:

> as.data.frame is a _converter_, while data.frame is a _constructor_.
> Changing the object contents is not what a conversion is for.
>
> On October 28, 2023 11:39:22 AM PDT, Boris Steipe <
> boris.ste...@utoronto.ca> wrote:
> >Thanks Duncan and Avi!
> >
> >That you could use NULL in a matrix() dimnames = list(...) argument
> wasn't clear to me. I thought that would be equivalent to a one-element
> list - and thereby define rownames. So that's good to know.
> >
> >The documentation could be more explicit - but it is probably more work
> to do that than just patch the code to honour a col.names argument. (At
> least I can't see a reason not to.)
> >
> >Thanks again!
> >:-)
> >
> >
> >
> >
> >> On Oct 28, 2023, at 14:24, avi.e.gr...@gmail.com wrote:
> >>
> >> Борис,
> >>
> >> Try this where you tell matrix the column names you want:
> >>
> >> nouns <- as.data.frame(
> >>  matrix(c(
> >>"gaggle",
> >>"geese",
> >>
> >>"dule",
> >>"doves",
> >>
> >>"wake",
> >>"vultures"
> >>  ),
> >>  ncol = 2,
> >>  byrow = TRUE,
> >>  dimnames=list(NULL, c("collective", "category"
> >>
> >> Result:
> >>
> >>> nouns
> >>   collective category
> >> 1     gaggle    geese
> >> 2       dule    doves
> >> 3       wake vultures
> >>
> >>
> >> The above simply names the columns earlier when creating the matrix.
> >>
> >> There are other ways and the way you tried LOOKS like it should work but
> >> fails for me with a message about it weirdly expecting three rows
> versus two
> >> which seems to confuse rows and columns. My version of R is recent and I
> >> wonder if there is a bug here.
> >>
> >> Consider whether you really need the data.frame created in a single
> >> statement or can you change the column names next as in:
> >>
> >>
> >>> nouns
> >>       V1       V2
> >> 1 gaggle    geese
> >> 2   dule    doves
> >> 3   wake vultures
> >>> colnames(nouns)
> >> [1] "V1" "V2"
> >>> colnames(nouns) <- c("collective", "category")
> >>> nouns
> >>   collective category
> >> 1     gaggle    geese
> >> 2       dule    doves
> >> 3       wake vultures
> >>
> >> Is there a known bug here or is the documentation wrong?
> >>
> >> -Original Message-
> >> From: R-help  On Behalf Of Boris Steipe
> >> Sent: Saturday, October 28, 2023 1:54 PM
> >> To: R. Mailing List 
> >> Subject: [R] col.names in as.data.frame() ?
> >>
> >> I have been trying to create a data frame from some structured text in a
> >> single expression. Reprex:
> >>
> >> nouns <- as.data.frame(
> >>  matrix(c(
> >>"gaggle",
> >>"geese",
> >>
> >>"dule",
> >>"doves",
> >>
> >>"wake",
> >>"vultures"
> >>  ), ncol = 2, byrow = TRUE),
> >>  col.names = c("collective", "category")
> >> )
> >>
> >> But ... :
> >>
> >>> str(nouns)
> >> 'data.frame': 3 obs. of  2 variables:
> >> $ V1: chr  "gaggle" "dule" "wake"
> >> $ V2: chr  "geese" "doves" "vultures"
> >>
> >> i.e. the col.names argument does nothing. From my reading of
> ?as.data.frame,
> >> my example should have worked.
> >>
> >> I know how to get the required result with colnames(), but I would like
> to
> >> understand why the idiom as written didn't work, and how I could have
> known
> >> that from the help file.
> >>
> >>
> >> Thanks!
> >> Boris
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> >__
> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
>
> --
> Sent from my phone. Please excuse my brevity.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Plot for 10 years extrapolation

2023-10-26 Thread Bert Gunter
Incidentally, if all you wanted to do was plot fitted values, the
predict method is kinda overkill, as it's just the fitted line from
the model. But I assume you wanted to plot CI's/PI's also, as the
example illustrated.

-- Bert

On Thu, Oct 26, 2023 at 1:56 PM Bert Gunter  wrote:
>
> from ?predict.lm:
>
> "predict.lm produces a vector of predictions or a matrix of
> predictions and bounds with column names fit, lwr, and upr if interval
> is set. "
>
> ergo:
> predict(model, dfuture, interval = "prediction")[,"fit"]  ## or [,1]
> as it's the first column in the returned matrix
>
> is your vector of predicted values that you can plot against
> dfuture$date however you would like, e.g. with different colors,
> symbols, or whatever. Exactly how you do this depends on what graphics
> package you are using. The example in ?predict.lm shows you how to do
> it with R's base graphics and overlaying prediction and confidence
> intervals.
>
> Cheers,
> Bert
>
> On Thu, Oct 26, 2023 at 11:27 AM varin sacha via R-help
>  wrote:
> >
> > Dear R-Experts,
> >
> > Here below is my R code, which works, but I don't know how to complete/finish
> > my R code to get the final plot with the extrapolation for 10 more years.
> >
> > Indeed, I try to extrapolate my data with a linear fit over the next 10 
> > years. So I create a date sequence for the next 10 years and store as a 
> > dataframe to make the prediction possible.
> > Now, I am trying to get the plot with the actual data (from year 2004 to 
> > 2018) and with the 10 more years extrapolation.
> >
> > Thanks for your help.
> >
> > 
> > date <-as.Date(c("2018-12-31", "2017-12-31", "2016-12-31", "2015-12-31", 
> > "2014-12-31", "2013-12-31", "2012-12-31", "2011-12-31", "2010-12-31", 
> > "2009-12-31", "2008-12-31", "2007-12-31", "2006-12-31", "2005-12-31", 
> > "2004-12-31"))
> >
> > value <-c(15348, 13136, 11733, 10737, 15674, 11098, 13721, 13209, 11099, 
> > 10087, 14987, 11098, 13421, 9023, 12098)
> >
> > model <- lm(value~date)
> >
> > plot(value~date ,col="grey",pch=20,cex=1.5,main="Plot")
> > abline(model,col="darkorange",lwd=2)
> >
> > dfuture <- data.frame(date=seq(as.Date("2019-12-31"), by="1 year", 
> > length.out=10))
> >
> > predict(model,dfuture,interval="prediction")
> > 
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plot for 10 years extrapolation

2023-10-26 Thread Bert Gunter
from ?predict.lm:

"predict.lm produces a vector of predictions or a matrix of
predictions and bounds with column names fit, lwr, and upr if interval
is set. "

ergo:
predict(model, dfuture, interval = "prediction")[,"fit"]  ## or [,1]
as it's the first column in the returned matrix

is your vector of predicted values that you can plot against
dfuture$date however you would like, e.g. with different colors,
symbols, or whatever. Exactly how you do this depends on what graphics
package you are using. The example in ?predict.lm shows you how to do
it with R's base graphics and overlaying prediction and confidence
intervals.

Cheers,
Bert
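
A minimal sketch of that overlay, reusing the poster's `model`, `value`,
`date` and `dfuture` objects from the code quoted below:

pred <- predict(model, dfuture, interval = "prediction")
plot(value ~ date, col = "grey", pch = 20, cex = 1.5, main = "Plot",
     xlim = range(c(date, dfuture$date)),
     ylim = range(c(value, pred)))
abline(model, col = "darkorange", lwd = 2)
lines(dfuture$date, pred[, "fit"], col = "darkorange", lwd = 2)
lines(dfuture$date, pred[, "lwr"], lty = 2)
lines(dfuture$date, pred[, "upr"], lty = 2)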

On Thu, Oct 26, 2023 at 11:27 AM varin sacha via R-help
 wrote:
>
> Dear R-Experts,
>
> Here below is my R code, which works, but I don't know how to complete/finish
> my R code to get the final plot with the extrapolation for 10 more years.
>
> Indeed, I try to extrapolate my data with a linear fit over the next 10 
> years. So I create a date sequence for the next 10 years and store as a 
> dataframe to make the prediction possible.
> Now, I am trying to get the plot with the actual data (from year 2004 to 
> 2018) and with the 10 more years extrapolation.
>
> Thanks for your help.
>
> 
> date <-as.Date(c("2018-12-31", "2017-12-31", "2016-12-31", "2015-12-31", 
> "2014-12-31", "2013-12-31", "2012-12-31", "2011-12-31", "2010-12-31", 
> "2009-12-31", "2008-12-31", "2007-12-31", "2006-12-31", "2005-12-31", 
> "2004-12-31"))
>
> value <-c(15348, 13136, 11733, 10737, 15674, 11098, 13721, 13209, 11099, 
> 10087, 14987, 11098, 13421, 9023, 12098)
>
> model <- lm(value~date)
>
> plot(value~date ,col="grey",pch=20,cex=1.5,main="Plot")
> abline(model,col="darkorange",lwd=2)
>
> dfuture <- data.frame(date=seq(as.Date("2019-12-31"), by="1 year", 
> length.out=10))
>
> predict(model,dfuture,interval="prediction")
> 
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Inquiry about bandwidth rescaling in Ksmooth

2023-10-26 Thread Bert Gunter
Apologies in advance if my comments don't help, in which case, no need
to respond,  but I noted in ?ksmooth:

"bandwidth
the bandwidth. The kernels are scaled so that their quartiles (viewed
as probability densities) are at ± 0.25*bandwidth." So, could this be
a source of the discrepancies you cited?

Given that ?ksmooth explicitly says:

"Note:
This function was implemented for compatibility with S, although it is
nowhere near as slow as the S function. Better kernel smoothers are
available in other packages such as KernSmooth."

One wonders why you bother with it at all? (That was rhetorical -- do
not answer).

Cheers,
Bert
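
A minimal sketch of checking that reading empirically; the rescaling factor
0.25 / qnorm(0.75) (about 0.37) is inferred from the quoted help text, not
from the ksmooth() source, so treat it as an assumption to be verified:

set.seed(1)
x <- sort(runif(200)); y <- sin(2 * pi * x) + rnorm(200, sd = 0.2)
h <- 0.05                                  # bandwidth on the kernel-sd scale
f <- 0.25 / qnorm(0.75)                    # assumed internal rescaling
k_raw      <- ksmooth(x, y, "normal", bandwidth = h,     x.points = x)
k_rescaled <- ksmooth(x, y, "normal", bandwidth = h / f, x.points = x)
# if the guess is right, k_rescaled (not k_raw) should track
# KernSmooth::locpoly(x, y, degree = 0, bandwidth = h) closely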

On Thu, Oct 26, 2023 at 11:06 AM Jan Failenschmid via R-help
 wrote:
>
> Dear Sir, Madam, or to whom this may concern,
>
> my name is Jan Failenschmid and I am a Ph.D. student at Tilburg University.
> For my project I have been looking into different types of kernel regression 
> estimators and corresponding R functions.
> While comparing different functions I noticed that stats::ksmooth returned 
> different estimates for the same bandwidth
> as other kernel regression estimators that should be equivalent (i.e. the 
> local polynomial estimators KernSmooth::locpoly and
> locpol::locpol with degree 0). However, when optimizing the bandwidth of 
> ksmooth separately using the same loss function, I find comparable estimates 
> to the other two estimators for a (larger) different bandwidth. To confirm 
> this, I wrote my own Nadaraya-Watson kernel regression estimator, which is 
> consistent with the two local polynomial estimators and shows the same 
> discordance with ksmooth.
>
> This led me to the suspicion that the bandwidth that is passed to ksmooth is
> rescaled or transformed within the function. Unfortunately, I was not able to 
> confirm this with either the code of the function or the documentation. It 
> would be of great help to me if you could clarify this for me.
>
> Thank you very much for your time and help in advance.
>
> Kind regards,
>
> Jan Failenschmid
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Create a call but evaluate only some elements

2023-10-25 Thread Bert Gunter
As you seem to have a need for this sort of capability (e.g. bquote),
see Section 6: "Computing on the Language" in the R Language
Definition manual. Actually, if you are interested in a concise
(albeit dense) overview of the R Language, you might consider going
through the whole manual.

Cheers,
Bert
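
The bquote() idiom in the reply quoted below is exactly that chapter's main
trick: whatever sits inside .( ) is evaluated, the rest is kept as an
unevaluated call. A minimal sketch:

mod  <- "y ~ x1 + x2"
expr <- bquote(lm(.(as.formula(mod)), data = dat))
expr                     # lm(y ~ x1 + x2, data = dat)
# lm_out <- eval(expr)   # evaluate only once `dat` exists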

On Wed, Oct 25, 2023 at 3:57 AM Shu Fai Cheung  wrote:
>
> Dear Iris,
>
> Many many thanks! This is exactly what I need! I have never heard
> about bquote(). This function will also be useful to me on other
> occasions.
>
> I still have a lot to learn about the R language ...
>
> Regards,
> Shu Fai
>
>
> On Wed, Oct 25, 2023 at 5:24 PM Iris Simmons  wrote:
> >
> > You can try either of these:
> >
> > expr <- bquote(lm(.(as.formula(mod)), dat))
> > lm_out5 <- eval(expr)
> >
> > expr <- call("lm", as.formula(mod), as.symbol("dat"))
> > lm_out6 <- eval(expr)
> >
> > but bquote is usually easier and good enough.
> >
> > On Wed, Oct 25, 2023, 05:10 Shu Fai Cheung  wrote:
> >>
> >> Hi All,
> >>
> >> I have a problem that may have a simple solution, but I am not
> >> familiar with creating calls manually.
> >>
> >> This is example calling lm()
> >>
> >> ``` r
> >> set.seed(1234)
> >> n <- 10
> >> dat <- data.frame(x1 = rnorm(n),
> >>   x2 = rnorm(n),
> >>   y = rnorm(n))
> >>
> >> lm_out <- lm(y ~ x1 + x2, dat)
> >> lm_out
> >> #>
> >> #> Call:
> >> #> lm(formula = y ~ x1 + x2, data = dat)
> >> #>
> >> #> Coefficients:
> >> #> (Intercept)   x1   x2
> >> #> -0.5755  -0.4151  -0.2411
> >> lm_out$call
> >> #> lm(formula = y ~ x1 + x2, data = dat)
> >> ```
> >>
> >> The call is stored, "lm(formula = y ~ x1 + x2, data = dat)", and names
> >> are not evaluated.
> >>
> >> I want to create a similar call, but only one of the elements is from a 
> >> string.
> >>
> >> ```r
> >> mod <- "y ~ x1 + x2"
> >> ```
> >>
> >> This is what I tried but failed:
> >>
> >> ```r
> >> lm_out2 <- do.call("lm",
> >>list(formula = as.formula(mod),
> >> data = dat))
> >> lm_out2
> >> #>
> >> #> Call:
> >> #> lm(formula = y ~ x1 + x2, data = structure(list(x1 = 
> >> c(-1.20706574938542,
> >> #> 0.27742924211066, 1.08444117668306, -2.34569770262935, 0.42912468881105,
> >> #> 0.506055892157574, -0.574739960134649, -0.546631855784187,
> >> -0.564451999093283,
> >> #> -0.890037829044104), x2 = c(-0.477192699753547, -0.998386444859704,
> >> #> -0.77625389463799, 0.0644588172762693, 0.959494058970771, 
> >> -0.110285494390774,
> >> #> -0.511009505806642, -0.911195416629811, -0.83717168026894, 
> >> 2.41583517848934
> >> #> ), y = c(0.134088220152031, -0.490685896690943, -0.440547872353227,
> >> #> 0.459589441005854, -0.693720246937475, -1.44820491038647, 
> >> 0.574755720900728,
> >> #> -1.02365572296388, -0.0151383003641817, -0.935948601168394)), class
> >> = "data.frame", row.names = c(NA,
> >> #> -10L)))
> >> #>
> >> #> Coefficients:
> >> #> (Intercept)   x1   x2
> >> #> -0.5755  -0.4151  -0.2411
> >> ```
> >>
> >> It does not have the formula, "as a formula": y ~ x1 + x2.
> >> However, the name "dat" is evaluated. Therefore, the call stored does
> >> not have the name 'dat', but has the evaluated content.
> >>
> >> The following fits the same model. However, the call stores the name,
> >> 'mod', not the evaluated result, y ~ x1 + x2.
> >>
> >> ```r
> >> lm_out3 <- lm(mod, data = dat)
> >> lm_out3
> >> #>
> >> #> Call:
> >> #> lm(formula = mod, data = dat)
> >> #>
> >> #> Coefficients:
> >> #> (Intercept)   x1   x2
> >> #> -0.5755  -0.4151  -0.2411
> >> ```
> >>
> >> The following method works. However, I have to do a dummy call,
> >> extract the stored call, and set formula to the result of
> >> as.formula(mod):
> >>
> >> ```r
> >> lm_out3 <- lm(mod, data = dat)
> >> lm_out3
> >> #>
> >> #> Call:
> >> #> lm(formula = mod, data = dat)
> >> #>
> >> #> Coefficients:
> >> #> (Intercept)   x1   x2
> >> #> -0.5755  -0.4151  -0.2411
> >>
> >> call1 <- lm_out3$call
> >> call1$formula <- as.formula(mod)
> >> lm_out4 <- eval(call1)
> >> lm_out4
> >> #>
> >> #> Call:
> >> #> lm(formula = y ~ x1 + x2, data = dat)
> >> #>
> >> #> Coefficients:
> >> #> (Intercept)   x1   x2
> >> #> -0.5755  -0.4151  -0.2411
> >> ```
> >>
> >> Is it possible to create the call directly, with only 'mod' evaluated,
> >> and other arguments, e.g., 'dat', not evaluated?
> >>
> >> Regards,
> >> Shu Fai
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide 
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> 

Re: [R] running crossvalidation many times MSE for Lasso regression

2023-10-22 Thread Bert Gunter
No error message shown. Please include the error message so that it is
not necessary to rerun your code. This might enable someone to see the
problem without running the code (e.g. downloading packages, etc.)

-- Bert
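
That said, two things in the posted script are likely culprits whatever the
error text: the fit comes from glmnet() rather than cv.glmnet() (so there is
no lambda.min component), and predict() is applied a second time to an object
that is already a matrix of predictions. A minimal sketch of one way the loop
could look, refitting on each training split and keeping the poster's data
frame T:

library(glmnet)
set.seed(1)
mse <- replicate(1000, {
  sam   <- sample(nrow(T), floor(0.667 * nrow(T)))
  train <- T[sam, ]
  test  <- T[-sam, ]
  cvfit <- cv.glmnet(as.matrix(train[, c("x1", "x2")]), train$y, alpha = 1)
  pred  <- predict(cvfit, newx = as.matrix(test[, c("x1", "x2")]),
                   s = "lambda.min")
  mean((test$y - pred)^2)                  # test-set MSE for this split
})
mean(mse)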

On Sun, Oct 22, 2023 at 1:36 PM varin sacha via R-help
 wrote:
>
> Dear R-experts,
>
> Here below my R code with an error message. Can somebody help me to fix this 
> error?
> Really appreciate your help.
>
> Best,
>
> 
> # MSE CROSSVALIDATION Lasso regression
>
> library(glmnet)
>
>
> x1=c(34,35,12,13,15,37,65,45,47,67,87,45,46,39,87,98,67,51,10,30,65,34,57,68,98,86,45,65,34,78,98,123,202,231,154,21,34,26,56,78,99,83,46,58,91)
> x2=c(1,3,2,4,5,6,7,3,8,9,10,11,12,1,3,4,2,3,4,5,4,6,8,7,9,4,3,6,7,9,8,4,7,6,1,3,2,5,6,8,7,1,1,2,9)
> y=c(2,6,5,4,6,7,8,10,11,2,3,1,3,5,4,6,5,3.4,5.6,-2.4,-5.4,5,3,6,5,-3,-5,3,2,-1,-8,5,8,6,9,4,5,-3,-7,-9,-9,8,7,1,2)
> T=data.frame(y,x1,x2)
>
> z=matrix(c(x1,x2), ncol=2)
> cv_model=glmnet(z,y,alpha=1)
> best_lambda=cv_model$lambda.min
> best_lambda
>
>
> # Create a list to store the results
> lst<-list()
>
> # This statement does the repetitions (looping)
> for(i in 1 :1000) {
>
> n=45
>
> p=0.667
>
> sam=sample(1 :n,floor(p*n),replace=FALSE)
>
> Training =T [sam,]
> Testing = T [-sam,]
>
> test1=matrix(c(Testing$x1,Testing$x2),ncol=2)
>
> predictLasso=predict(cv_model, newx=test1)
>
>
> ypred=predict(predictLasso,newdata=test1)
> y=T[-sam,]$y
>
> MSE = mean((y-ypred)^2)
> MSE
> lst[i]<-MSE
> }
> mean(unlist(lst))
> ##
>
>
>
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Create new data frame with conditional sums

2023-10-16 Thread Bert Gunter
Sorry, misstatements. It should (of course) read:

If one makes the reasonable assumption that Pct is much larger than
Cutoff, sorting Pct is the expensive part, e.g. O(n log2(n)) for
Quicksort (n = length Pct). I believe looping is O(n^2).
etc.

On Mon, Oct 16, 2023 at 7:48 AM Bert Gunter  wrote:
>
> If one makes the reasonable assumption that Pct is much larger than
> Cutoff, sorting Cutoff is the expensive part e.g O(nlog2(n)  for
> Quicksort (n = length Cutoff). I believe looping is O(n^2). Jeff's
> approach using findInterval may be faster. Of course implementation
> details matter.
>
> -- Bert
>
> On Mon, Oct 16, 2023 at 4:41 AM Leonard Mada  wrote:
> >
> > Dear Jason,
> >
> > The code could look something like:
> >
> > dummyData = data.frame(Tract=seq(1, 10, by=1),
> >  Pct = c(0.05,0.03,0.01,0.12,0.21,0.04,0.07,0.09,0.06,0.03),
> >  Totpop = c(4000,3500,4500,4100,3900,4250,5100,4700,4950,4800))
> >
> > # Define the cutoffs
> > # - allow for duplicate entries;
> > by = 0.03; # by = 0.01;
> > cutoffs <- seq(0, 0.20, by = by)
> >
> > # Create a new column with cutoffs
> > dummyData$Cutoff <- cut(dummyData$Pct, breaks = cutoffs,
> >  labels = cutoffs[-1], ordered_result = TRUE)
> >
> > # Sort data
> > # - we could actually order only the columns:
> > #   Totpop & Cutoff;
> > dummyData = dummyData[order(dummyData$Cutoff), ]
> >
> > # Result
> > cs = cumsum(dummyData$Totpop)
> >
> > # Only last entry:
> > # - I do not have a nice one-liner, but this should do it:
> > isLast = rev(! duplicated(rev(dummyData$Cutoff)))
> >
> > data.frame(Total = cs[isLast],
> >  Cutoff = dummyData$Cutoff[isLast])
> >
> >
> > Sincerely,
> >
> > Leonard
> >
> >
> > On 10/15/2023 7:41 PM, Leonard Mada wrote:
> > > Dear Jason,
> > >
> > >
> > > I do not think that the solution based on aggregate offered by GPT was
> > > correct. That quasi-solution only aggregates for every individual level.
> > >
> > >
> > > As I understand, you want the cumulative sum. The idea was proposed by
> > > Bert; you need only to sort first based on the cutoff (e.g. using an
> > > ordered factor). And then only extract the last value for each level.
> > If Pct is unique, then you can skip this last step and use directly
> > > the cumsum (but on the sorted data set).
> > >
> > >
> > > Alternatives: see the solutions with loops or with sapply.
> > >
> > >
> > > Sincerely,
> > >
> > >
> > > Leonard
> > >
> > >

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Create new data frame with conditional sums

2023-10-16 Thread Bert Gunter
If one makes the reasonable assumption that Pct is much larger than
Cutoff, sorting Cutoff is the expensive part, e.g. O(n log2(n)) for
Quicksort (n = length Cutoff). I believe looping is O(n^2). Jeff's
approach using findInterval may be faster. Of course implementation
details matter.

-- Bert

On Mon, Oct 16, 2023 at 4:41 AM Leonard Mada  wrote:
>
> Dear Jason,
>
> The code could look something like:
>
> dummyData = data.frame(Tract=seq(1, 10, by=1),
>  Pct = c(0.05,0.03,0.01,0.12,0.21,0.04,0.07,0.09,0.06,0.03),
>  Totpop = c(4000,3500,4500,4100,3900,4250,5100,4700,4950,4800))
>
> # Define the cutoffs
> # - allow for duplicate entries;
> by = 0.03; # by = 0.01;
> cutoffs <- seq(0, 0.20, by = by)
>
> # Create a new column with cutoffs
> dummyData$Cutoff <- cut(dummyData$Pct, breaks = cutoffs,
>  labels = cutoffs[-1], ordered_result = TRUE)
>
> # Sort data
> # - we could actually order only the columns:
> #   Totpop & Cutoff;
> dummyData = dummyData[order(dummyData$Cutoff), ]
>
> # Result
> cs = cumsum(dummyData$Totpop)
>
> # Only last entry:
> # - I do not have a nice one-liner, but this should do it:
> isLast = rev(! duplicated(rev(dummyData$Cutoff)))
>
> data.frame(Total = cs[isLast],
>  Cutoff = dummyData$Cutoff[isLast])
>
>
> Sincerely,
>
> Leonard
>
>
> On 10/15/2023 7:41 PM, Leonard Mada wrote:
> > Dear Jason,
> >
> >
> > I do not think that the solution based on aggregate offered by GPT was
> > correct. That quasi-solution only aggregates for every individual level.
> >
> >
> > As I understand, you want the cumulative sum. The idea was proposed by
> > Bert; you need only to sort first based on the cutoff (e.g. using an
> > ordered factor). And then only extract the last value for each level.
> > If Pct is unique, then you can skip this last step and use directly
> > the cumsum (but on the sorted data set).
> >
> >
> > Alternatives: see the solutions with loops or with sapply.
> >
> >
> > Sincerely,
> >
> >
> > Leonard
> >
> >

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Create new data frame with conditional sums

2023-10-14 Thread Bert Gunter
Well, here's one way to do it:
(dat is your example data frame)

Cutoff <- seq(0, .15, .01)
Pop <- with(dat, sapply(Cutoff, \(p)sum(Totpop[Pct >= p])))

I think there must be a more efficient way to do it with cumsum(), though.
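
A minimal sketch of that cumsum() idea -- sort by Pct, accumulate Totpop from
the top, then read each cutoff off with findInterval(); it should reproduce
the sapply() result above:

o    <- order(dat$Pct)
tot  <- rev(cumsum(rev(dat$Totpop[o])))   # tot[i]: total pop with Pct >= i-th smallest Pct
k    <- findInterval(Cutoff, dat$Pct[o], left.open = TRUE)   # Pct values strictly below each cutoff
Pop2 <- c(tot, 0)[k + 1]                  # 0 when every Pct falls below the cutoff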

Cheers,
Bert

On Sat, Oct 14, 2023 at 12:53 AM Jason Stout, M.D.  wrote:
>
> This seems like it should be simple but I can't get it to work properly.  I'm 
> starting with a data frame like this:
>
> Tract   Pct  Totpop
> 1      0.05    4000
> 2      0.03    3500
> 3      0.01    4500
> 4      0.12    4100
> 5      0.21    3900
> 6      0.04    4250
> 7      0.07    5100
> 8      0.09    4700
> 9      0.06    4950
> 10     0.03    4800
>
> And I want to end up with a data frame with two columns, a "Cutoff" column 
> that is a simple sequence of equally spaced cutoffs (let's say in this case 
> from 0-0.15 by 0.01) and a "Pop" column which equals the sum of "Totpop" in 
> the prior data frame in which "Pct" is greater than or equal to "cutoff."  So 
> in this toy example, this is what I want for a result:
>
>    Cutoff   Pop
> 1    0.00 43800
> 2    0.01 43800
> 3    0.02 39300
> 4    0.03 39300
> 5    0.04 31000
> 6    0.05 26750
> 7    0.06 22750
> 8    0.07 17800
> 9    0.08 12700
> 10   0.09 12700
> 11   0.10  8000
> 12   0.11  8000
> 13   0.12  8000
> 14   0.13  3900
> 15   0.14  3900
> 16   0.15  3900
>
> I can do this with a for loop but it seems there should be an easier, 
> vectorized way that would be more efficient.  Here is a reproducible example:
>
> dummydata<-data.frame(Tract=seq(1,10,by=1),
>                       Pct=c(0.05,0.03,0.01,0.12,0.21,0.04,0.07,0.09,0.06,0.03),
>                       Totpop=c(4000,3500,4500,4100,3900,4250,5100,4700,4950,4800))
> dfrm<-data.frame(matrix(ncol=2,nrow=0,dimnames=list(NULL,c("Cutoff","Pop"))))
> for (i in seq(0,0.15,by=0.01)) {
>   temp<-sum(dummydata[dummydata$Pct>=i,"Totpop"])
>   dfrm[nrow(dfrm)+1,]<-c(i,temp)
> }
>
> Jason Stout, MD, MHS
> Division of Infectious Diseases
> Dept of Medicine
> Duke University
> Box 102359-DUMC
> Durham, NC 27710
> FAX 919-681-7494
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R Gigs

2023-10-06 Thread Bert Gunter
May be an age gap here, but I assume "gigs" = freelance jobs. If so,
https://stat.ethz.ch/mailman/listinfo/r-sig-jobs
might be useful. As well as an online search in all the usual places.
Otherwise, please excuse my out-of-date ignorance.

Cheers,
Bert

On Fri, Oct 6, 2023 at 1:23 PM Fred Kwebiha  wrote:
>
> Dear Community,
>
> Where Can I get Gigs related to R programming language?
>
> Thanks in Advance for your help.
>
> *Best Regards,*
>
> *FRED KWEBIHA*
> *+256-782-746-154*
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] **Off Topic**, but perhaps of interest to many on this list

2023-10-06 Thread Bert Gunter
** Please Do Not Respond**
This is only FYI for those who care to follow the link below.

Explanation: Many questions that appear on this list are about how to
organize and format data -- unsurprising, as data structures are an
essential component of software and algorithm development in general,
and data science in particular. Those who are interested in such
queries and/or  the related area of reproducible research may find the
stupefying mess that biologist microscopists have to contend with to
be of interest:

(from Nature)
"How open-source software could finally get the world’s microscopes
speaking the same language"

https://www.nature.com/articles/d41586-023-03064-9

Best to all,
Bert

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problems Structuring My Data

2023-10-03 Thread Bert Gunter
I'll make an attempt to answer your question, but the result seems
extremely wasteful, as there appears to be only one REMNAME for each
FAILDATE. This means for my "solution", almost all of the entries are
empty (NULL, actually). So I assume I must misunderstand your
specification, but maybe showing you my misunderstanding will help you
and/or others to clarify it.

Anyway, calling the data frame you provided, d, my "solution" is
simply (I changed your variable names to fdate, name, and fdays to
save typing):

w <- with(d, tapply(fdays, list(factor(fdate), name), I))
## that's an I = "eye" at the end not an "ell"

w is a *matrix* of the fdays, with the rownames of the matrix your
dates, and the colnames your remnames.  The dimension of w is 640 x
43.  As I said, most of the entries are NULL.
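
For the pivot-table layout itself, a minimal sketch with made-up data (using
the shortened names above): reshape() to wide, or tapply() when one value per
date/name pair is expected:

d <- data.frame(fdate = as.Date("2000-01-01") + c(0, 0, 1, 2),
                name  = c("REM1", "REM2", "REM1", "REM3"),
                fdays = c(12, 34, 56, 78))
reshape(d, idvar = "fdate", timevar = "name", direction = "wide")
with(d, tapply(fdays, list(fdate, name), mean))   # matrix form, NA where absent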

If this *is* perchance what you want, then I suggest you stay with
your original data frame and rethink how to work with it more
efficiently. You should *not* be guided by imitating what one does in
Excel.

However, as is more likely, I've just misunderstood what you want, so
a bit of clarification might get you the answer you seek.

Bon chance.

Cheers,
Bert

On Tue, Oct 3, 2023 at 4:25 PM Bert Gunter  wrote:
>
> I won't attempt to answer your post here, but:
>
> #I don´t know why R is showing the dates as numbers
>
> Please read the Help files!
> You say FAILDATE has "type" = S3 class, "Date".
> The first thing ?Dates tells you in the "Details" is:
>
> "Dates are represented as the number of days since 1970-01-01, with
> negative values for earlier dates. They are always printed following
> the rules of the current Gregorian calendar, even though that calendar
> was not in use long ago (it was adopted in 1752 in Great Britain and
> its colonies)."
>
> That answers your question without ever needing to post here.
>
> Cheers,
> Bert
>
>
>
> On Tue, Oct 3, 2023 at 3:18 PM Paul Bernal  wrote:
> >
> > Dear friends,
> >
> > Hope you are all doing great. I am working with a dataset which is a subset
> > of the original one, that has the following columns: FAILDATE, REM_NAME,
> > and Days_At_Failure.
> >
> > I want to structure the data in such a way that I have the unique FAILDATE
> > in "%Y-%m-%d" format, then a column for each REM_NAME and then the
> > Days_At_Failure displayed for each REM_NAME and FAILDATE (just as if I was
> > using a pivot table).
> >
> > The structure I want is the following:
> > FAILDATE     REM1                REM2                REM3   ...   REM N
> > 2000-01-01   # days at failure   # days at failure   ...
> > 2000-01-02   # days at failure   # days at failure   ...
> > 2000-01-03   # days at failure   # days at failure   ...
> >
> > Here is the dput() of my dataframe:
> > #I don´t know why R is showing the dates as numbers
> > #when I do str, it actually shows that FAIL date has type Date
> >  dput(head(failure_subset2, n=1000))
> > structure(list(FAILDATE = structure(c(17597, 17597, 17347, 17334,
> > 17148, 17168, 17299, 17402, 17406, 17347, 17505, 17449, 17352,
> > 17931, 17424, 17439, 17406, 17292, 17390, 17373, 17259, 17561,
> > 17563, 17550, 17723, 17814, 17564, 17299, 17307, 17296, 17483,
> > 17644, 17394, 17360, 17850, 17744, 17719, 17712, 17710, 18048,
> > 18069, 17876, 17506, 17821, 18041, 17586, 18069, 18069, 18048,
> > 17899, 17899, 17759, 17732, 17822, 17771, 17821, 17837, 17824,
> > 17469, 17483, 17582, 17613, 18016, 18036, 18030, 17862, 17871,
> > 17899, 17651, 17684, 17844, 17632, 17784, 17855, 17764, 17915,
> > 18245, 18260, 18166, 18295, 18094, 18062, 18083, 18223, 18237,
> > 18197, 18284, 18289, 18218, 18218, 18298, 18297, 18299, 17910,
> > 18089, 18304, 18141, 18272, 18387, 18183, 18184, 18422, 18422,
> > 18422, 18038, 17988, 18413, 17836, 18328, 18230, 18011, 18011,
> > 17991, 18041, 18041, 18070, 18432, 18031, 18165, 18345, 18386,
> > 17899, 18374, 18427, 18098, 18416, 18397, 18458, 18126, 18126,
> > 18123, 18286, 17827, 18069, 18081, 18505, 18508, 18086, 18468,
> > 18482, 18107, 18146, 18371, 18368, 18186, 18270, 17772, 18054,
> > 17959, 18106, 18148, 18380, 18398, 17921, 18265, 18273, 18030,
> > 18473, 18166, 18006, 18006, 18006, 18000, 17938, 18155, 18175,
> > 18047, 18503, 18042, 18072, 17964, 18223, 17850, 17871, 18071,
> > 18174, 18154, 18153, 18344, 18384, 18512, 18112, 18131, 18085,
> > 18094, 18096, 18100, 18477, 17967, 17752, 17964, 18491, 18124,
> > 18115, 18166, 17912, 18489, 18087, 18130, 18170, 18169, 18175,
> > 18177, 18226, 18376, 18374, 1821

Re: [R] Problems Structuring My Data

2023-10-03 Thread Bert Gunter
I won't attempt to answer your post here, but:

#I don´t know why R is showing the dates as numbers

Please read the Help files!
You say FAILDATE has "type" = S3 class, "Date".
The first thing ?Dates tells you in the "Details" is:

"Dates are represented as the number of days since 1970-01-01, with
negative values for earlier dates. They are always printed following
the rules of the current Gregorian calendar, even though that calendar
was not in use long ago (it was adopted in 1752 in Great Britain and
its colonies)."

That answers your question without ever needing to post here.
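For a quick illustration (a small sketch, not using the full posted data):

as.Date(17597, origin = "1970-01-01")   # [1] "2018-03-07"
dput(as.Date("2018-03-07"))             # structure(17597, class = "Date")

So the numbers in the dput() output are simply the stored day counts; the
"Date" class makes them print as dates again.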

Cheers,
Bert



On Tue, Oct 3, 2023 at 3:18 PM Paul Bernal  wrote:
>
> Dear friends,
>
> Hope you are all doing great. I am working with a dataset which is a subset
> of the original one, that has the following columns: FAILDATE, REM_NAME,
> and Days_At_Failure.
>
> I want to structure the data in such a way that I have the unique FAILDATE
> in "%Y-%m-%d" format, then a column for each REM_NAME and then the
> Days_At_Failure displayed for each REM_NAME and FAILDATE (just as if I was
> using a pivot table).
>
> The structure I want is the following:
> FAILDATE  REM1 REM2  REM3  REM N
> 2000-01-01 # days at failure #days at
> failure
> 2000-01-02 # days at failure #days at
> failure
> 2000-01-03 # days at failure #days at
> failure
>
> Here is the dput() of my dataframe:
> #I don't know why R is showing the dates as numbers
> #when I do str, it actually shows that FAIL date has type Date
>  dput(head(failure_subset2, n=1000))
> structure(list(FAILDATE = structure(c(17597, 17597, 17347, 17334,
> 17148, 17168, 17299, 17402, 17406, 17347, 17505, 17449, 17352,
> 17931, 17424, 17439, 17406, 17292, 17390, 17373, 17259, 17561,
> 17563, 17550, 17723, 17814, 17564, 17299, 17307, 17296, 17483,
> 17644, 17394, 17360, 17850, 17744, 17719, 17712, 17710, 18048,
> 18069, 17876, 17506, 17821, 18041, 17586, 18069, 18069, 18048,
> 17899, 17899, 17759, 17732, 17822, 17771, 17821, 17837, 17824,
> 17469, 17483, 17582, 17613, 18016, 18036, 18030, 17862, 17871,
> 17899, 17651, 17684, 17844, 17632, 17784, 17855, 17764, 17915,
> 18245, 18260, 18166, 18295, 18094, 18062, 18083, 18223, 18237,
> 18197, 18284, 18289, 18218, 18218, 18298, 18297, 18299, 17910,
> 18089, 18304, 18141, 18272, 18387, 18183, 18184, 18422, 18422,
> 18422, 18038, 17988, 18413, 17836, 18328, 18230, 18011, 18011,
> 17991, 18041, 18041, 18070, 18432, 18031, 18165, 18345, 18386,
> 17899, 18374, 18427, 18098, 18416, 18397, 18458, 18126, 18126,
> 18123, 18286, 17827, 18069, 18081, 18505, 18508, 18086, 18468,
> 18482, 18107, 18146, 18371, 18368, 18186, 18270, 17772, 18054,
> 17959, 18106, 18148, 18380, 18398, 17921, 18265, 18273, 18030,
> 18473, 18166, 18006, 18006, 18006, 18000, 17938, 18155, 18175,
> 18047, 18503, 18042, 18072, 17964, 18223, 17850, 17871, 18071,
> 18174, 18154, 18153, 18344, 18384, 18512, 18112, 18131, 18085,
> 18094, 18096, 18100, 18477, 17967, 17752, 17964, 18491, 18124,
> 18115, 18166, 17912, 18489, 18087, 18130, 18170, 18169, 18175,
> 18177, 18226, 18376, 18374, 18217, 18226, 18135, 18136, 17942,
> 18099, 18031, 18032, 18107, 18041, 18062, 18078, 18087, 18249,
> 18081, 18231, 18195, 18192, 18213, 18209, 18156, 18157, 18157,
> 18158, 18159, 18053, 18221, 18185, 18311, 18239, 18258, 18390,
> 18390, 17997, 18197, 18095, 18145, 18101, 18194, 18260, 18260,
> 18202, 18447, 18450, 18088, 18249, 18206, 18290, 17995, 18270,
> 18282, 18251, 18157, 18094, 18437, 18299, 18333, 18035, 18146,
> 18010, 18395, 18204, 18259, 18311, 18335, 18444, 18444, 18453,
> 18453, 18192, 18098, 17969, 17680, 18147, 18147, 17977, 17996,
> 18001, 18399, 18432, 18354, 18147, 17912, 17995, 17995, 18221,
> 18428, 18447, 18452, 18264, 18321, 18334, 18386, 18156, 18371,
> 18345, 18156, 18164, 18029, 18433, 17602, 17931, 18290, 18339,
> 18424, 18052, 18052, 18166, 18540, 18783, 18166, 18607, 18606,
> 18606, 18594, 18592, 18805, 18648, 18768, 18887, 1, 18648,
> 18630, 18812, 18812, 18845, 18932, 18418, 18950, 18166, 18702,
> 18684, 18812, 18589, 18597, 18611, 18648, 18577, 19024, 19041,
> 19043, 18768, 19216, 18787, 19159, 18529, 18589, 18646, 18166,
> 18571, 18637, 18624, 18716, 18166, 18568, 18527, 18535, 18542,
> 18494, 18846, 18166, 18877, 18548, 18659, 18839, 18905, 18632,
> 18804, 18648, 18892, 18921, 18718, 18805, 18805, 18561, 18682,
> 18943, 18949, 18542, 18927, 18409, 19047, 19059, 18524, 18941,
> 18941, 18941, 18648, 19044, 18758, 19034, 18609, 18788, 18812,
> 18676, 19194, 18740, 18812, 18842, 18659, 19042, 18230, 19036,
> 19144, 19032, 18542, 19097, 18564, 18166, 18552, 18557, 18467,
> 19002, 19174, 18624, 18828, 18542, 18542, 18686, 18628, 18702,
> 19227, 19239, 19264, 17597, 17470, 17401, 17344, 17562, 17578,
> 17476, 17539, 17348, 17326, 17420, 17423, 17427, 17347, 17359,
> 17372, 17385, 17365, 17365, 17598, 17977, 17977, 17736, 17684,
> 17612, 17792, 17220, 17310, 

Re: [R] Question about R software and output

2023-10-03 Thread Bert Gunter
I am pretty sure you'll get more replies than mine, so just consider this
as part of the story.

Your understanding is confused/flawed.

1. R can be downloaded from hundreds/thousands of software repositories,
not just CRAN.

2. R can read/upload data in hundreds of different formats, not just
Excel's. R makes no use of Excel to read external files (I wasn't clear
what you meant here).

3. As Ben said, it is certainly possible that some R packages -- optional
add-ons extending R capabilities ---  communicate with and store data or
results on external servers. R,  itself, can run locally and can store
results either locally or externally. Like most software, it can also be
integrated as part of the infrastructure on a server for web applications.
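As a small illustration of points 2 and 3, a sketch with hypothetical local
file paths -- everything here is read from and written to the local disk:

dat <- read.csv("C:/myproject/mydata.csv")     # or readxl::read_excel("mydata.xlsx")
write.csv(dat, "C:/myproject/results.csv", row.names = FALSE)
saveRDS(dat, "C:/myproject/results.rds")       # results stay on the local machine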

If you have a specific question not addressed by these various replies, ask
it. You will most likely get a useful reply.

Cheers,
Bert

On Tue, Oct 3, 2023 at 7:17 AM Ferguson Charity (CEMINFERGUSON) <
charity.eminfergu...@gstt.nhs.uk> wrote:

> To whom it may concern,
>
>
>
> My understanding is that the R software is downloaded from a CRAN network
> and data is imported into it using Microsoft Excel for example. Could I
> please just double check whether any data or results from the output is
> held on external servers or is it just held on local files on the computer?
>
>
>
> Many thanks,
>
>
>
> Charity
>
>
>
> *
>
> The information contained in this message and or attachments is intended
> only for the
> person or entity to which it is addressed and may contain confidential
> and/or
> privileged material. Unless otherwise specified, the opinions expressed
> herein do not
> necessarily represent those of Guy's and St Thomas' NHS Foundation Trust or
> any of its subsidiaries. The information contained in this e-mail may be
> subject to
> public disclosure under the Freedom of Information Act 2000. Unless the
> information
> is legally exempt from disclosure, the confidentiality of this e-mail and
> any replies
> cannot be guaranteed.
>
> Any review, retransmission,dissemination or other use of, or taking of any
> action in
> reliance upon, this information by persons or entities other than the
> intended
> recipient is prohibited. If you received this in error, please contact the
> sender
> and delete the material from any system and destroy any copies.
>
> We make every effort to keep our network free from viruses. However, it is
> your
> responsibility to ensure that this e-mail and any attachments are free of
> viruses as
> we can take no responsibility for any computer virus which might be
> transferred by
> way of this e-mail.
>
>
> *
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lattice densityplot with weights

2023-09-29 Thread Bert Gunter
Unless I misunderstand...

See ?panel.densityplot

Lattice functions do their work through a host of panel functions,
typically passing their ... arguments to the panel functions.
panel.densityplot has a weights argument.
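For a single (ungrouped) packet, something along these lines should work --
an untested sketch, using the mydf defined in the quoted post below:

library(lattice)
dfA <- subset(mydf, name == "A")
densityplot(~ x, data = dfA, plot.points = FALSE,
            panel = function(x, ...) {
              panel.densityplot(x, weights = dfA$wt / sum(dfA$wt), ...)
            })

For the grouped version (groups = name) one would typically write a
panel.groups function that subsets the weights by 'subscripts' before
calling panel.densityplot().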

Cheers,
Bert




On Fri, Sep 29, 2023 at 3:32 AM Naresh Gurbuxani <
naresh_gurbux...@hotmail.com> wrote:

>
> density() function in R accepts weights as an input.  Using this
> function, one can calculate density and plot it.  Is it possible to
> combined these two operations in lattice densityplot()?
>
> mydf <- data.frame(name = "A", x = seq(-2.9, 2.9, by = 0.2), wt =
> diff(pnorm(seq(-3, 3, by = 0.2
> mydf <- rbind(mydf, data.frame(name = "B", x = mydf$x + 0.5, wt =
> mydf$wt))
> with(subset(mydf, name == "A"), density(x, weights = wt / sum(wt)) |>
> plot(xlim = c(-3, 3.5), xlab = "", main = "Density Plots"))
> with(subset(mydf, name == "B"), density(x, weights = wt / sum(wt)) |>
> lines(lty = 2, col = 2))
> grid()
> legend("topright", legend = c("A", "B"), col = c(1, 2), lty = c(1, 2),
> bty = "n")
>
> # I want to do something like this:
> # densityplot(~ x, weights = wt, groups = name, data = mydf, type = c("l",
> "g"))
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Print hypothesis warning- Car package

2023-09-16 Thread Bert Gunter
The factor names are legal, but the warnings tell you pretty clearly that
car doesn't like such things. So why don't you just use level names that
are more conventional?
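For example (a sketch), recoding the levels before fitting avoids the
operators entirely:

dat1$Expression <- factor(dat1$Expression,
                          levels = c("CD271-", "CD271+"),
                          labels = c("CD271neg", "CD271pos"))
mod <- aov(Viability ~ Treatment * Expression, data = dat1)
car::Anova(mod, type = 2)  # the printHypothesis warnings should no longer appear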

-- Bert

On Sat, Sep 16, 2023 at 1:40 PM Robert Baer  wrote:

> When doing Anova using the car package,  I get a print warning that is
> unexpected.  It seemingly involves have my flow cytometry factor levels
> named CD271+ and CD171-.  But I am not sure this warning should be
> intended behavior.  Any explanation about whether I'm doing something
> wrong? Why can't I have CD271+ and CD271- as factor levels?  Its legal
> text isn't it?
>
> library(car) mod = aov(Viability ~ Treatment*Expression, data = dat1)
> Anova(mod, type =2) Anova Table (Type II tests) Response: Viability Sum
> Sq Df F value Pr(>F) Treatment 19447.3 3 9.2942 0.0002927 *** Expression
> 2669.8 1 3.8279 0.0621394 . Treatment:Expression 2226.3 3 1.0640
> 0.3828336 Residuals 16739.3 24 --- Signif. codes: 0 ‘***’ 0.001 ‘**’
> 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Warning messages: 1: In printHypothesis(L,
> rhs, names(b)) : one or more coefficients in the hypothesis include
> arithmetic operators in their names; the printed representation of the
> hypothesis will be omitted 2: In printHypothesis(L, rhs, names(b)) : one
> or more coefficients in the hypothesis include arithmetic operators in
> their names; the printed representation of the hypothesis will be
> omitted 3: In printHypothesis(L, rhs, names(b)) : one or more
> coefficients in the hypothesis include arithmetic operators in their
> names; the printed representation of the hypothesis will be omitted
>
>
> The code to reproduce:
>
> ```
>
>
> dat1 <-structure(list(Treatment = structure(c(1L, 1L, 1L, 1L, 3L, 1L,
>1L, 1L, 1L, 2L, 2L, 2L,
> 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
>3L, 3L, 4L, 4L, 4L, 4L,
> 4L, 4L, 4L, 4L), levels = c("Control",
> "Dabrafenib", "Trametinib", "Combination"), class = "factor"),
>Expression = structure(c(2L, 2L, 2L, 2L, 2L, 1L,
> 1L, 1L,
> 1L, 2L, 2L, 2L, 2L, 1L,
> 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L,
> 1L, 2L, 2L, 2L, 2L, 1L,
> 1L, 1L, 1L), levels = c("CD271-",
> "CD271+"), class = "factor"),
>Viability = c(128.329809725159, 24.2360176821065,
> 76.3597924274457, 11.0128771862387, 21.4683836248318,
>  140.784162982894, 87.4303286565443,
> 118.181818181818, 53.603690178743,
>  51.2973284643475, 5.47760907168941,
> 27.1574091870075, 50.8360561214684,
>  56.5250816836441, 28.6949836632712,
> 93.2731116663463, 71.900826446281,
>  32.2314049586777, 24.2360176821065,
> 27.4649240822602, 24.0822602344801,
>  26.542379396502, 30.693830482414,
> 27.772438977513, 13.4729963482606,
>  8.24524312896406, 18.5469921199308,
> 13.9342686911397, 13.3192389006342,
>  19.9308091485681, 17.6244474341726,
> 16.2406304055353)),
>   row.names = c(NA,
> -32L),
>   class = c("tbl_df", "tbl", "data.frame"))
>
> mod = aov(Viability ~ Treatment*Expression, data = dat1)
> summary(mod)
> library(car)
> Anova(mod, type =2)
>
> ```
>
>
> > sessionInfo() R version 4.3.1 (2023-06-16 ucrt) Platform:
> x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 11 x64 (build
> 25951) Matrix products: default locale: [1] LC_COLLATE=English_United
> States.utf8 LC_CTYPE=English_United States.utf8
> LC_MONETARY=English_United States.utf8 [4] LC_NUMERIC=C
> LC_TIME=English_United States.utf8 time zone: America/Chicago tzcode
> source: internal attached base packages: [1] stats graphics grDevices
> utils datasets methods base other attached packages: [1] car_3.1-2
> carData_3.0-5 tidyr_1.3.0 readr_2.1.4 readxl_1.4.3 ggplot2_3.4.3
> dplyr_1.1.3 loaded via a namespace (and not attached): [1] crayon_1.5.2
> vctrs_0.6.3 cli_3.6.1 rlang_1.1.1 purrr_1.0.2 generics_0.1.3
> labeling_0.4.3 [8] bit_4.0.5 glue_1.6.2 colorspace_2.1-0 hms_1.1.3
> scales_1.2.1 fansi_1.0.4 grid_4.3.1 [15] cellranger_1.1.0 abind_1.4-5
> munsell_0.5.0 tibble_3.2.1 tzdb_0.4.0 lifecycle_1.0.3 compiler_4.3.1
> [22] pkgconfig_2.0.3 rstudioapi_0.15.0 farver_2.1.1 R6_2.5.1
> tidyselect_1.2.0 utf8_1.2.3 parallel_4.3.1 [29] vroom_1.6.3 pillar_1.9.0
> magrittr_2.0.3 bit64_4.0.5 tools_4.3.1 withr_2.5.0 gtable_0.3.4
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, 

Re: [R] graph in R with grouping letters from the turkey test with agricolae package

2023-09-14 Thread Bert Gunter
No graphs. The link is paywalled.

Bert Gunter

On Thu, Sep 14, 2023 at 10:55 AM Loop Vinyl  wrote:

> Yes, the data and the R code used are attached.
>
> I would like to produce the attached graph (graph1) with the R package
> agricolae, could someone give me an example with the attached data
> (vermiwash and Rcode_vermiwash)?
>
> Fig. 7, https://doi.org/10.1007/s42729-023-01295-3
>
> I expect an adapted graph (graphDoubleFactor) with the data (vermiwash and
> Rcode_vermiwash)
>
>
> Best regards
>
> Em ter., 12 de set. de 2023 às 18:54, Rui Barradas 
> escreveu:
>
> > Às 16:24 de 12/09/2023, Loop Vinyl escreveu:
> > > I would like to produce the attached graph (graph1) with the R package
> > > agricolae, could someone give me an example with the attached data
> > (data)?
> > >
> > > I expect an adapted graph (graph2) with the data (data)
> > >
> > > Best regards
> > >
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > Hello,
> >
> > There are no attached graphs, only data.
> > Can you post the code have you tried?
> >
> > Hope this helps,
> >
> > Rui Barradas
> >
> >
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to draw plot marks without label?

2023-09-05 Thread Bert Gunter
Luigi:

For base R graphics, you should always consult ?par for optional graphical
parameters when the high level graphics function doesn't appear to provide
the control you would like. I don't use base graphics -- i.e. i"m not
particularly facile with it -- but as you provided an example, I decided to
try to help. I think this is what you may want:

stripchart(
   Y ~ X, data = df,  ## better way to specify data  instead of $
   method = "jitter", offset=1/3,
   vertical = TRUE, las=1,
   pch=16, cex=2,
   ylab="Y", xlab="X",
   yaxt= "n",  ## don't plot y axes
   main="Example")

## Now add y axis on right with default tick mark locations
axis(side = 4, labels = FALSE)

How I got this (you may find this useful to modify the above if I didn't
get what you wanted) was as follows:

stripchart() does not seem to allow the direct control over the axes you
want. The 'axes' parameter help in ?stripchart sent me to?par to see if I
could specify an axis on the right instead of on the left, but it did not
seem to provide such an option. So I decided to use what the yaxt parameter
described to omit the y axis in the high level (stripchart) call, and then
add it back with a subsequent call to axis(). I just used default tick
locations in that call, but you can specify them explicitly as ?axis
describes.
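If you later need those same default positions (say, to add labels or
annotations), axTicks() returns the locations axis() would use -- a small
sketch, to be run after the plot above has been drawn:

yticks <- axTicks(side = 4)
axis(side = 4, at = yticks, labels = FALSE)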

Again, if this is not what you wanted, I hope you can use the above to
modify stripchart() to your desired specification.

Cheers,
Bert

On Tue, Sep 5, 2023 at 12:57 PM Luigi Marongiu 
wrote:

> I would like to draw a graph where the y-lables are missing but the
> marks still present.
> In this example, I get marks from 2 to 140 000 with increments of
> 20 000. I could use `plot(... yaxt="n"...)` combined with `axis(2,
> at..., label="")` but this needs to know exactly the sequence of marks
> provided by plot. But this sequence might change.
>
> Thus is there a way to plot the marks without labels?
> Also, is it possible to draw the marks on the right side?
>
> Thank you.
>
> ```
> y = c(42008, 19076, 150576, 48192, 26153, 37931, 36103, 17692,
>   61538,41027, 71052, 94571)
> df = data.frame(X = c(rep(0, 6), rep(25, 6)), Y = y)
> stripchart(df$Y ~ df$X,
>method = "jitter", offset=1/3,
>vertical = TRUE, las=1,
>pch=16, cex=2,
>ylab="Y", xlab="X",
>main="Example")
> ```
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Finding combination of states

2023-09-05 Thread Bert Gunter
Oh I liked that.

I was actually thinking about something similar, but couldn't figure it
out.  The idiom you showed is very clever imo and taught me something about
regexes that I never properly understood.
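A quick illustration of the backreference with a few made-up strings:

x <- c("BCAE", "BBAE", "BEEA")
grepl("(.)\\1", x)        # TRUE exactly when some character is immediately repeated
## [1] FALSE  TRUE  TRUE
x[!grepl("(.)\\1", x)]
## [1] "BCAE"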

Bert

On Tue, Sep 5, 2023, 01:04 Eric Berger  wrote:

> Hi Bert,
> I really liked your solution.
> In the spirit of code golf, I wondered if there is a shorter way to do
> the regular expression test.
> Kudos to my coding buddy GPT-4 for the following:
>
> You can replace your statement
>
> out[-grep(paste(paste0(states,states),  collapse = "|"),out)]
>
> by
>
> out[-grep("(.)\\1",out)]
>
> Best,
> Eric
>
> On Tue, Sep 5, 2023 at 3:08 AM Bert Gunter  wrote:
> >
> > ... and just for fun, here is a non-string version (more appropriate for
> complex state labels??):
> >
> > gvec <- function(ntimes, states, init, final, repeats = TRUE)
> >## ntimes: integer, number of unique times
> >## states: vector of unique states
> >## init: initial state
> >## final: final state
> > {
> >out <- cbind(init,
> > as.matrix(expand.grid(rep(list(states),ntimes -2 ))),final)
> >if(!repeats)
> >  out[ apply(out,1,\(x)all(x[-1] != x[-ntimes])), ]
> >else out
> > }
> >
> > yielding:
> >
> >
> > > gvec(4, letters[1:5], "b", "e", repeats = TRUE)
> >   init Var1 Var2 final
> >  [1,] "b"  "a"  "a"  "e"
> >  [2,] "b"  "b"  "a"  "e"
> >  [3,] "b"  "c"  "a"  "e"
> >  [4,] "b"  "d"  "a"  "e"
> >  [5,] "b"  "e"  "a"  "e"
> >  [6,] "b"  "a"  "b"  "e"
> >  [7,] "b"  "b"  "b"  "e"
> >  [8,] "b"  "c"  "b"  "e"
> >  [9,] "b"  "d"  "b"  "e"
> > [10,] "b"  "e"  "b"  "e"
> > [11,] "b"  "a"  "c"  "e"
> > [12,] "b"  "b"  "c"  "e"
> > [13,] "b"  "c"  "c"  "e"
> > [14,] "b"  "d"  "c"  "e"
> > [15,] "b"  "e"  "c"  "e"
> > [16,] "b"  "a"  "d"  "e"
> > [17,] "b"  "b"  "d"  "e"
> > [18,] "b"  "c"  "d"  "e"
> > [19,] "b"  "d"  "d"  "e"
> > [20,] "b"  "e"  "d"  "e"
> > [21,] "b"  "a"  "e"  "e"
> > [22,] "b"  "b"  "e"  "e"
> > [23,] "b"  "c"  "e"  "e"
> > [24,] "b"  "d"  "e"  "e"
> > [25,] "b"  "e"  "e"  "e"
> > >
> > > gvec(4, letters[1:5], "b", "e", repeats = FALSE)
> >   init Var1 Var2 final
> >  [1,] "b"  "c"  "a"  "e"
> >  [2,] "b"  "d"  "a"  "e"
> >  [3,] "b"  "e"  "a"  "e"
> >  [4,] "b"  "a"  "b"  "e"
> >  [5,] "b"  "c"  "b"  "e"
> >  [6,] "b"  "d"  "b"  "e"
> >  [7,] "b"  "e"  "b"  "e"
> >  [8,] "b"  "a"  "c"  "e"
> >  [9,] "b"  "d"  "c"  "e"
> > [10,] "b"  "e"  "c"  "e"
> > [11,] "b"  "a"  "d"  "e"
> > [12,] "b"  "c"  "d"  "e"
> > [13,] "b"  "e"  "d"  "e"
> >
> > :-)
> >
> > -- Bert
> >
> > On Mon, Sep 4, 2023 at 2:04 PM Bert Gunter 
> wrote:
> >>
> >> Well, if strings with repeats (as you defined them) are to be excluded,
> I think it's simple just to use regular expressions to remove them.
> >>
> >> e.g.
> >> g <- function(ntimes, states, init, final, repeats = TRUE)
> >>## ntimes: integer, number of uni

Re: [R] Finding combination of states

2023-09-04 Thread Bert Gunter
... and just for fun, here is a non-string version (more appropriate for
complex state labels??):

gvec <- function(ntimes, states, init, final, repeats = TRUE)
   ## ntimes: integer, number of unique times
   ## states: vector of unique states
   ## init: initial state
   ## final: final state
{
   out <- cbind(init,
as.matrix(expand.grid(rep(list(states),ntimes -2 ))),final)
   if(!repeats)
 out[ apply(out,1,\(x)all(x[-1] != x[-ntimes])), ]
   else out
}

yielding:


> gvec(4, letters[1:5], "b", "e", repeats = TRUE)
  init Var1 Var2 final
 [1,] "b"  "a"  "a"  "e"
 [2,] "b"  "b"  "a"  "e"
 [3,] "b"  "c"  "a"  "e"
 [4,] "b"  "d"  "a"  "e"
 [5,] "b"  "e"  "a"  "e"
 [6,] "b"  "a"  "b"  "e"
 [7,] "b"  "b"  "b"  "e"
 [8,] "b"  "c"  "b"  "e"
 [9,] "b"  "d"  "b"  "e"
[10,] "b"  "e"  "b"  "e"
[11,] "b"  "a"  "c"  "e"
[12,] "b"  "b"  "c"  "e"
[13,] "b"  "c"  "c"  "e"
[14,] "b"  "d"  "c"  "e"
[15,] "b"  "e"  "c"  "e"
[16,] "b"  "a"  "d"  "e"
[17,] "b"  "b"  "d"  "e"
[18,] "b"  "c"  "d"  "e"
[19,] "b"  "d"  "d"  "e"
[20,] "b"  "e"  "d"  "e"
[21,] "b"  "a"  "e"  "e"
[22,] "b"  "b"  "e"  "e"
[23,] "b"  "c"  "e"  "e"
[24,] "b"  "d"  "e"  "e"
[25,] "b"  "e"  "e"  "e"
>
> gvec(4, letters[1:5], "b", "e", repeats = FALSE)
  init Var1 Var2 final
 [1,] "b"  "c"  "a"  "e"
 [2,] "b"  "d"  "a"  "e"
 [3,] "b"  "e"  "a"  "e"
 [4,] "b"  "a"  "b"  "e"
 [5,] "b"  "c"  "b"  "e"
 [6,] "b"  "d"  "b"  "e"
 [7,] "b"  "e"  "b"  "e"
 [8,] "b"  "a"  "c"  "e"
 [9,] "b"  "d"  "c"  "e"
[10,] "b"  "e"  "c"  "e"
[11,] "b"  "a"  "d"  "e"
[12,] "b"  "c"  "d"  "e"
[13,] "b"  "e"  "d"  "e"

:-)

-- Bert

On Mon, Sep 4, 2023 at 2:04 PM Bert Gunter  wrote:

> Well, if strings with repeats (as you defined them) are to be excluded, I
> think it's simple just to use regular expressions to remove them.
>
> e.g.
> g <- function(ntimes, states, init, final, repeats = TRUE)
>## ntimes: integer, number of unique times
>## states: vector of unique states
>## init: initial state
>## final: final state
> {
> out <- do.call(paste0,c(init,expand.grid(rep(list(states), ntimes-2)),
> final))
> if(!repeats)
>out[-grep(paste(paste0(states,states),  collapse = "|"),out)]
> else out
> }
> So:
>
> > g(4, LETTERS[1:5], "B", "E", repeats = FALSE)
>  [1] "BCAE" "BDAE" "BEAE" "BABE" "BCBE" "BDBE" "BEBE" "BACE"
>  [9] "BDCE" "BECE" "BADE" "BCDE" "BEDE"
>
> Perhaps not the most efficient way to do this, of course.
>
> Cheers,
> Bert
>
>
> On Mon, Sep 4, 2023 at 12:57 PM Eric Berger  wrote:
>
>> My initial response was buggy and also used a deprecated function.
>> Also, it seems possible that one may want to rule out any strings where
>> the same state appears consecutively.
>> I say that such a string has a repeat.
>>
>> myExpand <- function(v, n) {
>>   do.call(tidyr::expand_grid, replicate(n, v, simplify = FALSE))
>> }
>>
>> no_repeat <- function(s) {
>>   v <- unlist(strsplit(s, NULL))
>>   sum(v[-1]==v[-length(v)]) == 0
>> }
>>
>> f <- function(states, nsteps, first, last, rm_repeat=TRUE) {
>>   if (nsteps < 3) stop("

Re: [R] Finding combination of states

2023-09-04 Thread Bert Gunter
Well, if strings with repeats (as you defined them) are to be excluded, I
think it's simple just to use regular expressions to remove them.

e.g.
g <- function(ntimes, states, init, final, repeats = TRUE)
   ## ntimes: integer, number of unique times
   ## states: vector of unique states
   ## init: initial state
   ## final: final state
{
out <- do.call(paste0,c(init,expand.grid(rep(list(states), ntimes-2)),
final))
if(!repeats)
   out[-grep(paste(paste0(states,states),  collapse = "|"),out)]
else out
}
So:

> g(4, LETTERS[1:5], "B", "E", repeats = FALSE)
 [1] "BCAE" "BDAE" "BEAE" "BABE" "BCBE" "BDBE" "BEBE" "BACE"
 [9] "BDCE" "BECE" "BADE" "BCDE" "BEDE"

Perhaps not the most efficient way to do this, of course.

Cheers,
Bert


On Mon, Sep 4, 2023 at 12:57 PM Eric Berger  wrote:

> My initial response was buggy and also used a deprecated function.
> Also, it seems possible that one may want to rule out any strings where
> the same state appears consecutively.
> I say that such a string has a repeat.
>
> myExpand <- function(v, n) {
>   do.call(tidyr::expand_grid, replicate(n, v, simplify = FALSE))
> }
>
> no_repeat <- function(s) {
>   v <- unlist(strsplit(s, NULL))
>   sum(v[-1]==v[-length(v)]) == 0
> }
>
> f <- function(states, nsteps, first, last, rm_repeat=TRUE) {
>   if (nsteps < 3) stop("nsteps must be at least 3")
> out <- paste(first,
>   myExpand(states, nsteps-2) |>
> apply(MAR=1, \(x) paste(x, collapse="")),
>   last, sep="")
>     if (rm_repeat) {
>   ok <- sapply(out, no_repeat)
>   out <- out[ok]
> }
> out
> }
>
> f(LETTERS[1:5],4,"B","E")
>
> #  [1] "BABE" "BACE" "BADE" "BCAE" "BCBE" "BCDE" "BDAE" "BDBE" "BDCE"
> "BEAE" "BEBE" "BECE" "BEDE"
>
> On Mon, Sep 4, 2023 at 10:33 PM Bert Gunter 
> wrote:
>
>> Sorry, my last line should have read:
>>
>> If neither this nor any of the other suggestions is what is desired, I
>> think the OP will have to clarify his query.
>>
>> Bert
>>
>> On Mon, Sep 4, 2023 at 12:31 PM Bert Gunter 
>> wrote:
>>
>>> I think there may be some uncertainty here about what the OP requested.
>>> My interpretation is:
>>>
>>> n different times
>>> k different states
>>> Any state can appear at any time in the vector of times and can be
>>> repeated
>>> Initial and final states are given
>>>
>>> So modifying Tim's expand.grid() solution a bit yields:
>>>
>>> g <- function(ntimes, states, init, final){
>>>## ntimes: integer, number of unique times
>>>## states: vector of unique states
>>>## init: initial state
>>>## final: final state
>>> do.call(paste0,c(init,expand.grid(rep(list(states), ntimes-2)), final))
>>> }
>>>
>>> e.g.
>>>
>>> > g(4, LETTERS[1:5], "B", "D")
>>>  [1] "BAAD" "BBAD" "BCAD" "BDAD" "BEAD" "BABD" "BBBD" "BCBD"
>>>  [9] "BDBD" "BEBD" "BACD" "BBCD" "BCCD" "BDCD" "BECD" "BADD"
>>> [17] "BBDD" "BCDD" "BDDD" "BEDD" "BAED" "BBED" "BCED" "BDED"
>>> [25] "BEED"
>>>
>>> If neither this nor any of the other suggestions is not what is desired,
>>> I think the OP will have to clarify his query.
>>>
>>> Cheers,
>>> Bert
>>>
>>> On Mon, Sep 4, 2023 at 9:25 AM Ebert,Timothy Aaron 
>>> wrote:
>>>
>>>> Does this work for you?
>>>>
>>>> t0<-t1<-t2<-LETTERS[1:5]
>>>> al2<-expand.grid(t0, t1, t2)
>>>> al3<-paste(al2$Var1, al2$Var2, al2$Var3)
>>>> al4 <- gsub(" ", "", al3)
>>>> head(al3)
>>>>
>>>> Tim
>>>>
>>>> -Original Message-
>>>> From: R-help  On Behalf Of Eric Berger
>>>> Sent: Monday, September 4, 2023 10:17 AM
>>>> To: Christofer Bogaso 
>>>> Cc: r-help 
>>>> Subject: Re: [R] Finding combination of states
>>>>
>>>> [External E

Re: [R] Finding combination of states

2023-09-04 Thread Bert Gunter
Sorry, my last line should have read:

If neither this nor any of the other suggestions is what is desired, I
think the OP will have to clarify his query.

Bert

On Mon, Sep 4, 2023 at 12:31 PM Bert Gunter  wrote:

> I think there may be some uncertainty here about what the OP requested. My
> interpretation is:
>
> n different times
> k different states
> Any state can appear at any time in the vector of times and can be repeated
> Initial and final states are given
>
> So modifying Tim's expand.grid() solution a bit yields:
>
> g <- function(ntimes, states, init, final){
>## ntimes: integer, number of unique times
>## states: vector of unique states
>## init: initial state
>## final: final state
> do.call(paste0,c(init,expand.grid(rep(list(states), ntimes-2)), final))
> }
>
> e.g.
>
> > g(4, LETTERS[1:5], "B", "D")
>  [1] "BAAD" "BBAD" "BCAD" "BDAD" "BEAD" "BABD" "BBBD" "BCBD"
>  [9] "BDBD" "BEBD" "BACD" "BBCD" "BCCD" "BDCD" "BECD" "BADD"
> [17] "BBDD" "BCDD" "BDDD" "BEDD" "BAED" "BBED" "BCED" "BDED"
> [25] "BEED"
>
> If neither this nor any of the other suggestions is not what is desired, I
> think the OP will have to clarify his query.
>
> Cheers,
> Bert
>
> On Mon, Sep 4, 2023 at 9:25 AM Ebert,Timothy Aaron  wrote:
>
>> Does this work for you?
>>
>> t0<-t1<-t2<-LETTERS[1:5]
>> al2<-expand.grid(t0, t1, t2)
>> al3<-paste(al2$Var1, al2$Var2, al2$Var3)
>> al4 <- gsub(" ", "", al3)
>> head(al3)
>>
>> Tim
>>
>> -Original Message-
>> From: R-help  On Behalf Of Eric Berger
>> Sent: Monday, September 4, 2023 10:17 AM
>> To: Christofer Bogaso 
>> Cc: r-help 
>> Subject: Re: [R] Finding combination of states
>>
>> [External Email]
>>
>> The function purrr::cross() can help you with this. For example:
>>
>> f <- function(states, nsteps, first, last) {
>>paste(first, unlist(lapply(purrr::cross(rep(list(v),nsteps-2)),
>> \(x) paste(unlist(x), collapse=""))), last, sep="") } f(LETTERS[1:5], 3,
>> "B", "E") [1] "BAE" "BBE" "BCE" "BDE" "BEE"
>>
>> HTH,
>> Eric
>>
>>
>> On Mon, Sep 4, 2023 at 3:42 PM Christofer Bogaso <
>> bogaso.christo...@gmail.com> wrote:
>> >
>> > Let say I have 3 time points.as T0, T1, and T2.(number of such time
>> > points can be arbitrary) In each time point, an object can be any of 5
>> > states, A, B, C, D, E (number of such states can be arbitrary)
>> >
>> > I need to find all possible ways, how that object starting with state
>> > B (say) at time T0, can be on state E (example) in time T2
>> >
>> > For example one possibility is BAE etc.
>> >
>> > Is there any function available with R, that can give me a vector of
>> > such possibilities for arbitrary number of states, time, and for a
>> > given initial and final (desired) states?
>> >
>> > ANy pointer will be very appreciated.
>> >
>> > Thanks for your time.
>> >
>> > __
>> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.r-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Finding combination of states

2023-09-04 Thread Bert Gunter
I think there may be some uncertainty here about what the OP requested. My
interpretation is:

n different times
k different states
Any state can appear at any time in the vector of times and can be repeated
Initial and final states are given

So modifying Tim's expand.grid() solution a bit yields:

g <- function(ntimes, states, init, final){
   ## ntimes: integer, number of unique times
   ## states: vector of unique states
   ## init: initial state
   ## final: final state
do.call(paste0,c(init,expand.grid(rep(list(states), ntimes-2)), final))
}

e.g.

> g(4, LETTERS[1:5], "B", "D")
 [1] "BAAD" "BBAD" "BCAD" "BDAD" "BEAD" "BABD" "BBBD" "BCBD"
 [9] "BDBD" "BEBD" "BACD" "BBCD" "BCCD" "BDCD" "BECD" "BADD"
[17] "BBDD" "BCDD" "BDDD" "BEDD" "BAED" "BBED" "BCED" "BDED"
[25] "BEED"

If neither this nor any of the other suggestions is not what is desired, I
think the OP will have to clarify his query.

Cheers,
Bert

On Mon, Sep 4, 2023 at 9:25 AM Ebert,Timothy Aaron  wrote:

> Does this work for you?
>
> t0<-t1<-t2<-LETTERS[1:5]
> al2<-expand.grid(t0, t1, t2)
> al3<-paste(al2$Var1, al2$Var2, al2$Var3)
> al4 <- gsub(" ", "", al3)
> head(al3)
>
> Tim
>
> -Original Message-
> From: R-help  On Behalf Of Eric Berger
> Sent: Monday, September 4, 2023 10:17 AM
> To: Christofer Bogaso 
> Cc: r-help 
> Subject: Re: [R] Finding combination of states
>
> [External Email]
>
> The function purrr::cross() can help you with this. For example:
>
> f <- function(states, nsteps, first, last) {
>paste(first, unlist(lapply(purrr::cross(rep(list(v),nsteps-2)),
> \(x) paste(unlist(x), collapse=""))), last, sep="") } f(LETTERS[1:5], 3,
> "B", "E") [1] "BAE" "BBE" "BCE" "BDE" "BEE"
>
> HTH,
> Eric
>
>
> On Mon, Sep 4, 2023 at 3:42 PM Christofer Bogaso <
> bogaso.christo...@gmail.com> wrote:
> >
> > Let say I have 3 time points.as T0, T1, and T2.(number of such time
> > points can be arbitrary) In each time point, an object can be any of 5
> > states, A, B, C, D, E (number of such states can be arbitrary)
> >
> > I need to find all possible ways, how that object starting with state
> > B (say) at time T0, can be on state E (example) in time T2
> >
> > For example one possibility is BAE etc.
> >
> > Is there any function available with R, that can give me a vector of
> > such possibilities for arbitrary number of states, time, and for a
> > given initial and final (desired) states?
> >
> > ANy pointer will be very appreciated.
> >
> > Thanks for your time.
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.r-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] aggregate formula - differing results

2023-09-04 Thread Bert Gunter
Ivan:
Just one perhaps extraneous comment.

You said that you were surprised that aggregate() and group_by() did not
have the same behavior. That is a misconception on your part. As you know,
the tidyverse recapitulates the functionality of many base R functions; but
it makes no claims to do so in exactly the same way and, indeed, often
makes deliberate changes to "improve" behavior. So if you wish to use both,
you should *expect* such differences, which, of course, are documented in
the man pages (and often elsewhere).
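A toy sketch of the specific difference at issue here (the formula method of
aggregate() applies its na.action -- na.omit by default -- to whole rows
before FUN ever sees the data):

d <- data.frame(g = c("a", "a", "b", "b"), x = c(1, NA, 3, 4), y = 1:4)
aggregate(cbind(x, y) ~ g, data = d, FUN = mean, na.rm = TRUE)
## row 2 is dropped entirely, so mean y for group "a" is 1
aggregate(cbind(x, y) ~ g, data = d, FUN = mean, na.rm = TRUE,
          na.action = na.pass)
## the NA reaches mean(), which drops it via na.rm = TRUE; mean y for "a" is 1.5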

Cheers,
Bert

On Mon, Sep 4, 2023 at 5:21 AM Ivan Calandra  wrote:

> Haha, got it now, there is an na.action argument (which defaults to
> na.omit) to aggregate() which is applied before calling mean(na.rm =
> TRUE). Thank you Rui for pointing this out.
>
> So running it with na.pass instead of na.omit gives the same results as
> dplyr::group_by()+summarise():
> aggregate(. ~ RAWMAT, data = my_data[-1], FUN = mean, na.rm = TRUE,
> na.action = na.pass)
>
> Cheers,
> Ivan
>
> On 04/09/2023 13:56, Rui Barradas wrote:
> > Às 12:51 de 04/09/2023, Ivan Calandra escreveu:
> >> Thanks Rui for your help; that would be one possibility indeed.
> >>
> >> But am I the only one who finds that behavior of aggregate()
> >> completely unexpected and confusing? Especially considering that
> >> dplyr::summarise() and doBy::summaryBy() deal with NAs differently,
> >> even though they all use mean(na.rm = TRUE) to calculate the group
> >> stats.
> >>
> >> Best wishes,
> >> Ivan
> >>
> >> On 04/09/2023 13:46, Rui Barradas wrote:
> >>> Às 10:44 de 04/09/2023, Ivan Calandra escreveu:
>  Dear useRs,
> 
>  I have just stumbled across a behavior in aggregate() that I cannot
>  explain. Any help would be appreciated!
> 
>  Sample data:
>  my_data <- structure(list(ID = c("FLINT-1", "FLINT-10",
>  "FLINT-100", "FLINT-101", "FLINT-102", "HORN-10", "HORN-100",
>  "HORN-102", "HORN-103", "HORN-104"), EdgeLength = c(130.75, 168.77,
>  142.79, 130.1, 140.41, 121.37, 70.52, 122.3, 71.01, 104.5),
>  SurfaceArea = c(1736.87, 1571.83, 1656.46, 1247.18, 1177.47,
>  1169.26, 444.61, 1791.48, 461.15, 1127.2), Length = c(44.384,
>  29.831, 43.869, 48.011, 54.109, 41.742, 23.854, 32.075, 21.337,
>  35.459), Width = c(45.982, 67.303, 52.679, 26.42, 25.149, 33.427,
>  20.683, 62.783, 26.417, 35.297), PLATWIDTH = c(38.84, NA, 15.33,
>  30.37, 11.44, 14.88, 13.86, NA, NA, 26.71), PLATTHICK = c(8.67, NA,
>  7.99, 11.69, 3.3, 16.52, 4.58, NA, NA, 9.35), EPA = c(78, NA, 78,
>  54, 72, 49, 56, NA, NA, 56), THICKNESS = c(10.97, NA, 9.36, 6.4,
>  5.89, 11.05, 4.9, NA, NA, 10.08), WEIGHT = c(34.3, NA, 25.5, 18.6,
>  14.9, 29.5, 4.5, NA, NA, 23), RAWMAT = c("FLINT", "FLINT", "FLINT",
>  "FLINT", "FLINT", "HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS",
>  "HORNFELS")), row.names = c(1L, 2L, 3L, 4L, 5L, 111L, 112L, 113L,
>  114L, 115L), class = "data.frame")
> 
>  1) Simple aggregation with 2 variables:
>  aggregate(cbind(Length, Width) ~ RAWMAT, data = my_data, FUN =
>  mean, na.rm = TRUE)
> 
>  2) Using the dot notation - different results:
>  aggregate(. ~ RAWMAT, data = my_data[-1], FUN = mean, na.rm = TRUE)
> 
>  3) Using dplyr, I get the same results as #1:
>  group_by(my_data, RAWMAT) %>%
> summarise(across(c("Length", "Width"), ~ mean(.x, na.rm = TRUE)))
> 
>  4) It gets weirder: using all columns in #1 give the same results
>  as in #2 but different from #1 and #3
>  aggregate(cbind(EdgeLength, SurfaceArea, Length, Width, PLATWIDTH,
>  PLATTHICK, EPA, THICKNESS, WEIGHT) ~ RAWMAT, data = my_data, FUN =
>  mean, na.rm = TRUE)
> 
>  So it seems it is not only due to the notation (cbind() vs. dot).
>  Is it a bug? A peculiar thing in my dataset? I tend to think this
>  could be due to some variables (or their names) as all notations
>  seem to agree when I remove some variables (although I haven't
>  found out which variable(s) is (are) at fault), e.g.:
> 
>  my_data2 <- structure(list(ID = c("FLINT-1", "FLINT-10",
>  "FLINT-100", "FLINT-101", "FLINT-102", "HORN-10", "HORN-100",
>  "HORN-102", "HORN-103", "HORN-104"), EdgeLength = c(130.75, 168.77,
>  142.79, 130.1, 140.41, 121.37, 70.52, 122.3, 71.01, 104.5),
>  SurfaceArea = c(1736.87, 1571.83, 1656.46, 1247.18, 1177.47,
>  1169.26, 444.61, 1791.48, 461.15, 1127.2), Length = c(44.384,
>  29.831, 43.869, 48.011, 54.109, 41.742, 23.854, 32.075, 21.337,
>  35.459), Width = c(45.982, 67.303, 52.679, 26.42, 25.149, 33.427,
>  20.683, 62.783, 26.417, 35.297), RAWMAT = c("FLINT", "FLINT",
>  "FLINT", "FLINT", "FLINT", "HORNFELS", "HORNFELS", "HORNFELS",
>  "HORNFELS", "HORNFELS")), row.names = c(1L, 2L, 3L, 4L, 5L, 111L,
>  112L, 113L, 114L, 115L), class = "data.frame")
> 
>  aggregate(cbind(EdgeLength, SurfaceArea, Length, Width) ~ RAWMAT,
>  

Re: [R] [Pkg-Collaboratos] BioShapes Almost-Package

2023-09-03 Thread Bert Gunter
1. R-package-devel is where queries about package protocols should go.

2. But...
"Is there a succinct, but sufficiently informative description of
documentation tools?"
"Writing R Extensions" (shipped with R) is *the* reference for R
documentation. Whether it's sufficiently "succinct" for you, I cannot say.

"I find that including the documentation in the source files is very
distracting."
?? R documentation (.Rd) files are separate from source (.R) files. Inline
documentation in source files is an "add-on" capability provided by
optional packages if one prefers to do this. Such packages parse the source
files to extract the documentation into the .Rd files. So I am not sure what you
mean here. Apologies if I have misunderstood.
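For instance (a sketch with a made-up function), a stand-alone .Rd skeleton
can be generated from an object and then edited separately from the .R
source:

myfun <- function(x, base = exp(1)) log(x, base = base)
prompt(myfun, filename = "myfun.Rd")   # writes a documentation template to myfun.Rd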

" I would prefer to have only basic comments in the source
files and an expanded documentation in a separate location."
If I understand you correctly, this is exactly what the R package process
specifies. Again, see the "Writing R Extensions" manual for details.

Also, if you wish to have your package on CRAN, it requires that the
package documents all functions in the package as specified by the "Writing
..." manual.

Again, further questions and elaboration should go to the R-package-devel
list, although I think the manual is really the authoritative resource to
follow.

Cheers,
Bert



On Sun, Sep 3, 2023 at 5:06 PM Leonard Mada via R-help 
wrote:

> Dear R-List Members,
>
> I am looking for collaborators to further develop the BioShapes
> almost-package. I added a brief description below.
>
> A.) BioShapes (Almost-) Package
>
> The aim of the BioShapes quasi-package is to facilitate the generation
> of graphical objects resembling biological and chemical entities,
> enabling the construction of diagrams based on these objects. It
> currently includes functions to generate diagrams depicting viral
> particles, liposomes, double helix / DNA strands, various cell types
> (like neurons, brush-border cells and duct cells), Ig-domains, as well
> as more basic shapes.
>
> It should offer researchers in the field of biological and chemical
> sciences a tool to easily generate diagrams depicting the studied
> biological processes.
>
> The package lacks a proper documentation and is not yet released on
> CRAN. However, it is available on GitHub:
> https://github.com/discoleo/BioShapes
>
> Although there are 27 unique cloners on GitHub, I am still looking for
> contributors and collaborators. I would appreciate any collaborations to
> develop it further. I can be contacted both by email and on GitHub.
>
>
> B.) Documentation Tools
>
> Is there a succinct, but sufficiently informative description of
> documentation tools?
> I find that including the documentation in the source files is very
> distracting. I would prefer to have only basic comments in the source
> files and an expanded documentation in a separate location.
>
> This question may be more appropriate for the R-package-devel list. I
> can move the 2nd question to that list.
>
> ###
>
> As the biological sciences are very vast, I would be very happy for
> collaborators on the development of this package. Examples with existing
> shapes are available in (but are unfortunately not documented):
>
> Man/examples/Examples.Man.R
> R/Examples.R
> R/Examples.Cells.R
> tests/experimental/*
>
>
> Many thanks,
>
> Leonard
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Book Recommendation

2023-08-28 Thread Bert Gunter
I presume you are familiar with the RSQL and RSQLite packages and their
vignettes.

Can't offer any help, but a point of clarification:
When you say, "teach accomplishing SQL in R," do you explicitly mean using
SQL syntax in R to manipulate data or do you mean just doing SQL-like types
of data manipulation in R? For the former, I assume you would be using the
above-mentioned packages -- or perhaps others that I don't know about like
them. For the latter, which I think would be subsumed under "data wrangling
in R" there are tons of packages, tutorials, and books out there that one
could search for under that rubric. If neither of the above, further
clarification might help you get a better answer.
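If it is the former, a minimal sketch of running literal SQL against a data
frame through an in-memory SQLite database (assuming DBI and RSQLite are
installed) might look like:

library(DBI)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "mtcars_tbl", mtcars)
dbGetQuery(con, "SELECT cyl, AVG(mpg) AS mean_mpg FROM mtcars_tbl GROUP BY cyl")
dbDisconnect(con)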

Cheers,
Bert

On Mon, Aug 28, 2023 at 8:47 AM Stephen H. Dawson, DSL via R-help <
r-help@r-project.org> wrote:

> Good Morning,
>
>
> I am doing some research to develop a new course where I teach. I am
> looking for a book to use in the course content to teach accomplishing
> SQL in R.
>
> Does anyone know of a book on this topic to recommend for consideration?
>
>
> Thank You,
> --
> *Stephen Dawson, DSL*
> /Executive Strategy Consultant/
> Business & Technology
> +1 (865) 804-3454
> http://www.shdawson.com
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Interpreting Results from LOF.test() from qpcR package

2023-08-20 Thread Bert Gunter
I would suggest that a simple plot of residuals vs. fitted values and
perhaps plots of residuals vs. the independent variables are almost always
more useful than omnibus LOF tests (many would disagree!). However, as Ben
noted, this is wandering outside R-Help's strict remit, and you would be
better served by statistics discussion/help sites rather than R-Help.
Though with this small a data set and this complex a model, I would be
surprised if there could be LOF unless it were glaringly obvious from
simple plots.
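A sketch of those plots, assuming 'fit' is a successfully fitted nls() object
for these data ('fit' is just a placeholder name):

plot(fitted(fit), residuals(fit), xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
plot(mod14data2_random$x, residuals(fit), xlab = "x", ylab = "Residuals")
abline(h = 0, lty = 2)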

Cheers,
Bert



-- Bert

On Sun, Aug 20, 2023 at 6:02 PM Paul Bernal  wrote:

> I am using LOF.test() function from the qpcR package and got the following
> result:
>
> > LOF.test(nlregmod3)
> $pF
> [1] 0.97686
>
> $pLR
> [1] 0.77025
>
> Can I conclude from the LOF.test() results that my nonlinear regression
> model is significant/statistically significant?
>
> Where my nonlinear model was fitted as follows:
> nlregmod3 <- nlsr(formula=y ~ theta1 - theta2*exp(-theta3*x), data =
> mod14data2_random,
>   start = list(theta1 = 0.37,
>theta2 = -exp(-1.8),
>theta3 = 0.05538))
> And the data used to fit this model is the following:
> dput(mod14data2_random)
> structure(list(index = c(14L, 27L, 37L, 33L, 34L, 16L, 7L, 1L,
> 39L, 36L, 40L, 19L, 28L, 38L, 32L), y = c(0.44, 0.4, 0.4, 0.4,
> 0.4, 0.43, 0.46, 0.49, 0.41, 0.41, 0.38, 0.42, 0.41, 0.4, 0.4
> ), x = c(16, 24, 32, 30, 30, 16, 12, 8, 36, 32, 36, 20, 26, 34,
> 28)), row.names = c(NA, -15L), class = "data.frame")
>
> Cheers,
> Paul
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Issues when trying to fit a nonlinear regression model

2023-08-20 Thread Bert Gunter
Basic algebra and exponentials/logs. I leave those details to you or
another HelpeR.
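(Spelling the algebra out anyway: if log(y - 0.37) = a + b*x, with a = -1.817
and b = -0.0554 from the intermediate fit quoted below, then

    y = 0.37 + exp(a) * exp(b*x)
      = 0.37 - (-exp(a)) * exp(-(-b) * x),

so matching y ~ theta1 - theta2*exp(-theta3*x) gives theta1 = 0.37,
theta2 = -exp(a) = -exp(-1.817), and theta3 = -b = +0.0554.)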

-- Bert

On Sun, Aug 20, 2023 at 12:17 PM Paul Bernal  wrote:

> Dear Bert,
>
> Thank you for your extremely valuable feedback. Now, I just want to
> understand why the signs for those starting values, given the following:
> > #Fiting intermediate model to get starting values
> > intermediatemod <- lm(log(y - .37) ~ x, data=mod14data2_random)
> > summary(intermediatemod)
>
> Call:
> lm(formula = log(y - 0.37) ~ x, data = mod14data2_random)
>
> Residuals:
> Min  1Q  Median  3Q Max
> -0.7946 -0.0908  0.0379  0.  0.5917
>
> Coefficients:
> Estimate Std. Error t value Pr(>|t|)
> (Intercept) -1.816930.25806   -7.04  8.8e-06 ***
> x   -0.055380.00964   -5.75  6.8e-05 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 0.323 on 13 degrees of freedom
> Multiple R-squared:  0.717, Adjusted R-squared:  0.696
> F-statistic:   33 on 1 and 13 DF,  p-value: 6.76e-05
>
> Kind regards,
> Paul
>
> El dom, 20 ago 2023 a las 14:07, Bert Gunter ()
> escribió:
>
>> Oh, sorry; I changed signs in the model, fitting
>> theta0 + theta1*exp(theta2*x)
>>
>> So for theta0 - theta1*exp(-theta2*x) use theta1 = -exp(-1.8) and theta2
>> = +.055 as starting values.
>>
>> -- Bert
>>
>>
>>
>>
>>
>> On Sun, Aug 20, 2023 at 11:50 AM Paul Bernal 
>> wrote:
>>
>>> Dear Bert,
>>>
>>> Thank you so much for your kind and valuable feedback. I tried finding
>>> the starting values using the approach you mentioned, then did the
>>> following to fit the nonlinear regression model:
>>> nlregmod2 <- nls(y ~ theta1 - theta2*exp(-theta3*x),
>>>   start =
>>> list(theta1 = 0.37,
>>>  theta2 = exp(-1.8),
>>>  theta3 = -0.05538), data=mod14data2_random)
>>> However, I got this error:
>>> Error in nls(y ~ theta1 - theta2 * exp(-theta3 * x), start = list(theta1
>>> = 0.37,  :
>>>   step factor 0.000488281 reduced below 'minFactor' of 0.000976562
>>> nlregmod2 <- nlxb(y ~ theta1 - theta2*exp(-theta3*x),
>>>   start =
>>> list(theta1 = 0.37,
>>>  theta2 = exp(-1.8),
>>>  theta3 = -0.05538), data=mod14data2_random)
>>> summary(nlregmod2)
>>> Object has try-error or missing parameters
>>> nlregmod2
>>> And I get some NA values when retrieving the statistics for the fitted
>>> model:
>>> residual sumsquares =  0.0022973  on  15 observations
>>> after  2235Jacobian and  2861 function evaluations
>>>   namecoeff  SE   tstat  pval  gradient
>>>JSingval
>>> theta1   9330.89NA NA NA   5.275e-11
>>>  967470
>>> theta2   9330.41NA NA NA  -5.318e-11
>>>   1.772
>>> theta3   -3.0032e-07NA NA NA   1.389e-05
>>>   8.028e-12
>>>
>>> Kind regards,
>>> Paul
>>>
>>>
>>> El dom, 20 ago 2023 a las 13:21, Bert Gunter ()
>>> escribió:
>>>
>>>> I got starting values as follows:
>>>> Noting that the minimum data value is .38, I fit the linear model log(y
>>>> - .37) ~ x to get intercept = -1.8 and slope = -.055. So I used .37,
>>>> exp(-1.8)  and -.055 as the starting values for theta0, theta1, and theta2
>>>> in the nonlinear model. This converged without problems.
>>>>
>>>> Cheers,
>>>> Bert
>>>>
>>>>
>>>> On Sun, Aug 20, 2023 at 10:15 AM Paul Bernal 
>>>> wrote:
>>>>
>>>>> Dear friends,
>>>>>
>>>>> This is the dataset I am currently working with:
>>>>> >dput(mod14data2_random)
>>>>> structure(list(index = c(14L, 27L, 37L, 33L, 34L, 16L, 7L, 1L,
>>>>> 39L, 36L, 40L, 19L, 28L, 38L, 32L), y = c(0.44, 0.4, 0.4, 0.4,
>>>>> 0.4, 0.43, 0.46, 0.49, 0.41, 0.41, 0.38, 0.42, 0.41, 0.4, 0.4
>>>>> ), x = c(16, 24, 32, 30, 30, 16, 12, 8, 36, 32, 36, 20, 26, 34,
>>>>> 28)), row.names = c(NA, -15L), class = "data.frame")
>>>>>
>>>>> I did the following to try to fit a nonlinear regression

Re: [R] Issues when trying to fit a nonlinear regression model

2023-08-20 Thread Bert Gunter
Oh, sorry; I changed signs in the model, fitting
theta0 + theta1*exp(theta2*x)

So for theta0 - theta1*exp(-theta2*x) use theta1 = -exp(-1.8) and theta2 =
+.055 as starting values.

-- Bert





On Sun, Aug 20, 2023 at 11:50 AM Paul Bernal  wrote:

> Dear Bert,
>
> Thank you so much for your kind and valuable feedback. I tried finding the
> starting values using the approach you mentioned, then did the following to
> fit the nonlinear regression model:
> nlregmod2 <- nls(y ~ theta1 - theta2*exp(-theta3*x),
>   start =
> list(theta1 = 0.37,
>  theta2 = exp(-1.8),
>  theta3 = -0.05538), data=mod14data2_random)
> However, I got this error:
> Error in nls(y ~ theta1 - theta2 * exp(-theta3 * x), start = list(theta1 =
> 0.37,  :
>   step factor 0.000488281 reduced below 'minFactor' of 0.000976562
> nlregmod2 <- nlxb(y ~ theta1 - theta2*exp(-theta3*x),
>   start =
> list(theta1 = 0.37,
>  theta2 = exp(-1.8),
>  theta3 = -0.05538), data=mod14data2_random)
> summary(nlregmod2)
> Object has try-error or missing parameters
> nlregmod2
> And I get some NA values when retrieving the statistics for the fitted
> model:
> residual sumsquares =  0.0022973  on  15 observations
> after  2235Jacobian and  2861 function evaluations
>   namecoeff  SE   tstat  pval  gradient
>  JSingval
> theta1   9330.89NA NA NA   5.275e-11
>967470
> theta2   9330.41NA NA NA  -5.318e-11
> 1.772
> theta3   -3.0032e-07NA     NA     NA   1.389e-05
> 8.028e-12
>
> Kind regards,
> Paul
>
>
> El dom, 20 ago 2023 a las 13:21, Bert Gunter ()
> escribió:
>
>> I got starting values as follows:
>> Noting that the minimum data value is .38, I fit the linear model log(y -
>> .37) ~ x to get intercept = -1.8 and slope = -.055. So I used .37,
>> exp(-1.8)  and -.055 as the starting values for theta0, theta1, and theta2
>> in the nonlinear model. This converged without problems.
>>
>> Cheers,
>> Bert
>>
>>
>> On Sun, Aug 20, 2023 at 10:15 AM Paul Bernal 
>> wrote:
>>
>>> Dear friends,
>>>
>>> This is the dataset I am currently working with:
>>> >dput(mod14data2_random)
>>> structure(list(index = c(14L, 27L, 37L, 33L, 34L, 16L, 7L, 1L,
>>> 39L, 36L, 40L, 19L, 28L, 38L, 32L), y = c(0.44, 0.4, 0.4, 0.4,
>>> 0.4, 0.43, 0.46, 0.49, 0.41, 0.41, 0.38, 0.42, 0.41, 0.4, 0.4
>>> ), x = c(16, 24, 32, 30, 30, 16, 12, 8, 36, 32, 36, 20, 26, 34,
>>> 28)), row.names = c(NA, -15L), class = "data.frame")
>>>
>>> I did the following to try to fit a nonlinear regression model:
>>>
>>> #First, Procedure to Find Starting (initial) Values For Theta1, Theta2,
>>> and
>>> Theta3
>>>
>>> mymod2 <- y ~ theta1 - theta2*exp(-theta3*x)
>>>
>>> strt2 <- c(theta1 = 1, theta2 = 2, theta3 = 3)
>>>
>>> trysol2<-nlxb(formula=mymod2, data=mod14data2_random, start=strt2,
>>> trace=TRUE)
>>> trysol2
>>> trysol2$coefficients[[3]]
>>>
>>> #Fitting nonlinear Regression Model Using Starting Values From Previous
>>> Part
>>> nonlinearmod2 <- nls(mymod2, start = list(theta1 =
>>> trysol2$coefficients[[1]],
>>>  theta2 = trysol2$coefficients[[2]],
>>>  theta3 = trysol2$coefficients[[3]]), data =
>>> mod14data2_random)
>>>
>>> And I got this error:
>>> Error in nlsModel(formula, mf, start, wts, scaleOffset = scOff,
>>> nDcentral =
>>> nDcntr) :
>>>   singular gradient matrix at initial parameter estimates
>>>
>>> Any idea on how to proceed in this situation? What could I do?
>>>
>>> Kind regards,
>>> Paul
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Issues when trying to fit a nonlinear regression model

2023-08-20 Thread Bert Gunter
I got starting values as follows:
Noting that the minimum data value is .38, I fit the linear model log(y -
.37) ~ x to get intercept = -1.8 and slope = -.055. So I used .37,
exp(-1.8)  and -.055 as the starting values for theta0, theta1, and theta2
in the nonlinear model. This converged without problems.
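
In code, the idea is roughly this (a sketch, assuming the data frame
mod14data2_random given in the post quoted below):

## linearizing fit to get rough starting values
lmfit <- lm(log(y - 0.37) ~ x, data = mod14data2_random)
coef(lmfit)   ## intercept about -1.8, slope about -0.055

## nonlinear fit started from those values
nls(y ~ theta0 + theta1*exp(theta2*x),
    start = list(theta0 = 0.37,
                 theta1 = exp(coef(lmfit)[[1]]),
                 theta2 = coef(lmfit)[[2]]),
    data = mod14data2_random)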

Cheers,
Bert


On Sun, Aug 20, 2023 at 10:15 AM Paul Bernal  wrote:

> Dear friends,
>
> This is the dataset I am currently working with:
> >dput(mod14data2_random)
> structure(list(index = c(14L, 27L, 37L, 33L, 34L, 16L, 7L, 1L,
> 39L, 36L, 40L, 19L, 28L, 38L, 32L), y = c(0.44, 0.4, 0.4, 0.4,
> 0.4, 0.43, 0.46, 0.49, 0.41, 0.41, 0.38, 0.42, 0.41, 0.4, 0.4
> ), x = c(16, 24, 32, 30, 30, 16, 12, 8, 36, 32, 36, 20, 26, 34,
> 28)), row.names = c(NA, -15L), class = "data.frame")
>
> I did the following to try to fit a nonlinear regression model:
>
> #First, Procedure to Find Starting (initial) Values For Theta1, Theta2, and
> Theta3
>
> mymod2 <- y ~ theta1 - theta2*exp(-theta3*x)
>
> strt2 <- c(theta1 = 1, theta2 = 2, theta3 = 3)
>
> trysol2<-nlxb(formula=mymod2, data=mod14data2_random, start=strt2,
> trace=TRUE)
> trysol2
> trysol2$coefficients[[3]]
>
> #Fitting nonlinear Regression Model Using Starting Values From Previous
> Part
> nonlinearmod2 <- nls(mymod2, start = list(theta1 =
> trysol2$coefficients[[1]],
>  theta2 = trysol2$coefficients[[2]],
>  theta3 = trysol2$coefficients[[3]]), data =
> mod14data2_random)
>
> And I got this error:
> Error in nlsModel(formula, mf, start, wts, scaleOffset = scOff, nDcentral =
> nDcntr) :
>   singular gradient matrix at initial parameter estimates
>
> Any idea on how to proceed in this situation? What could I do?
>
> Kind regards,
> Paul
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Determining Starting Values for Model Parameters in Nonlinear Regression

2023-08-19 Thread Bert Gunter
" need to determine starting (initial) values for the model parameters for
this nonlinear regression model," ...

what nonlinear regression model? Did html get stripped?

-- Bert

On Sat, Aug 19, 2023 at 12:38 PM Paul Bernal  wrote:

> Dear friends,
>
> Hope you are all doing well and having a great weekend.  I have data that
> was collected on specific gravity and spectrophotometer analysis for 26
> mixtures of NG (nitroglycerine), TA (triacetin), and 2 NDPA (2 -
> nitrodiphenylamine).
>
> In the dataset, x1 = %NG,  x2 = %TA, and x3 = %2 NDPA.
>
> The response variable is the specific gravity, and the rest of the
> variables are the predictors.
>
> This is the dataset:
> dput(mod14data_random)
> structure(list(Mixture = c(17, 14, 5, 1, 11, 2, 16, 7, 19, 23,
> 20, 6, 13, 21, 3, 18, 15, 26, 8, 22), x1 = c(69.98, 72.5, 77.6,
> 79.98, 74.98, 80.06, 69.98, 77.34, 69.99, 67.49, 67.51, 77.63,
> 72.5, 67.5, 80.1, 69.99, 72.49, 64.99, 75.02, 67.48), x2 = c(29,
> 25.48, 21.38, 19.85, 22, 18.91, 29.99, 19.65, 26.99, 29.49, 32.47,
> 20.35, 26.48, 31.47, 16.87, 27.99, 24.49, 31.99, 24.96, 30.5),
> x3 = c(1, 2, 1, 0, 3, 1, 0, 2.99, 3, 3, 0, 2, 1, 1, 3, 2,
> 3, 3, 0, 2), y = c(1.4287, 1.4426, 1.4677, 1.4774, 1.4565,
> 1.4807, 1.4279, 1.4684, 1.4301, 1.4188, 1.4157, 1.4686, 1.4414,
> 1.4172, 1.4829, 1.4291, 1.4438, 1.4068, 1.4524, 1.4183)), row.names =
> c(NA,
> -20L), class = "data.frame")
>
> I need to determine starting (initial) values for the model parameters for
> this nonlinear regression model, any ideas on how to accomplish this using
> R?
>
> Any help and/or guidance will be greatly appreciated.
>
> Thanks, beforehand, for your valuable and kindness.
>
> Best regards,
> Paul
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Numerical stability of: 1/(1 - cos(x)) - 2/x^2

2023-08-18 Thread Bert Gunter
"Are there any good indefinite (or much higher) precision packages"

A simple search on "arbitrary precision arithmetic in R" would have
immediately gotten you to the Rmpfr package.

See also:
https://cran.r-project.org/web/packages/Ryacas/vignettes/arbitrary-precision.html
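
For instance, a minimal sketch with Rmpfr (assuming the package is
installed; 200 bits is an arbitrary choice):

library(Rmpfr)
x <- mpfr("1e-4", precBits = 200)
1/(1 - cos(x)) - 2/x^2   ## about 1/6, in line with the series expansion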

-- Bert

On Fri, Aug 18, 2023 at 4:58 PM  wrote:

> This discussion is sooo familiar.
>
> If you want indefinite precision arithmetic, feel free to use a language
> and data type that supports it.
>
> Otherwise, only do calculations that fit in a safe zone.
>
> This is not just about this scenario. Floating point can work well when
> adding (or subtracting) two numbers of about the same size. But if one
> number is .123456789... and another is the same except raised to the -45th
> power of ten, then adding them effectively throws away the second number.
>
> This is a well-known problem for any finite binary representation of
> numbers. In the example given, yes, the smaller the number is, the worse
> the behavior in this case tends to be.
>
> There are many solutions and some are fairly expensive in terms of
> computation time and sometimes memory usage.
>
> Are there any good indefinite (or much higher) precision packages out
> there that would not only support the data type needed but also properly be
> used and passed along to the functions used to do complex calculations? No,
> I am not asking for indefinite precision complex numbers, but generally
> that would be a tuple of such numbers.
>
>
> -Original Message-
> From: R-help  On Behalf Of Bert Gunter
> Sent: Friday, August 18, 2023 7:06 PM
> To: Leonard Mada 
> Cc: R-help Mailing List ; Martin Maechler <
> maech...@stat.math.ethz.ch>
> Subject: Re: [R] Numerical stability of: 1/(1 - cos(x)) - 2/x^2
>
> "The ugly thing is that the error only gets worse as x decreases. The
> value neither drops to 0, nor does it blow up to infinity; but it gets
> worse in a continuous manner."
>
> If I understand you correctly, this is wrong:
>
> > x <- 2^(-20) ## considerably less than 1e-4 !!
> > y <- 1 - x^2/2;
> > 1/(1 - y) - 2/x^2
> [1] 0
>
> It's all about the accuracy of the binary approximation of floating point
> numbers (and their arithmetic)
>
> Cheers,
> Bert
>
>
> On Fri, Aug 18, 2023 at 3:25 PM Leonard Mada via R-help <
> r-help@r-project.org> wrote:
>
> > I have added some clarifications below.
> >
> > On 8/18/2023 10:20 PM, Leonard Mada wrote:
> > > [...]
> > > After more careful thinking, I believe that it is a limitation due to
> > > floating points:
> > > [...]
> > >
> > > The problem really stems from the representation of 1 - x^2/2 as shown
> > > below:
> > > x = 1E-4
> > > print(1 - x^2/2, digits=20)
> > > print(0.5, digits=20) # fails
> > > # 0.50003039
> >
> > The floating point representation of 1 - x^2/2 is the real culprit:
> > # 0.50003039
> >
> > The 3039 at the end is really an error due to the floating point
> > representation. However, this error blows up when inverting the value:
> > x = 1E-4;
> > y = 1 - x^2/2;
> > 1/(1 - y) - 2/x^2
> > # 1.215494
> > # should be 1/(x^2/2) - 2/x^2 = 0
> >
> >
> > The ugly thing is that the error only gets worse as x decreases. The
> > value neither drops to 0, nor does it blow up to infinity; but it gets
> > worse in a continuous manner. At least the reason has become now clear.
> >
> >
> > >
> > > Maybe some functions of type cos1p and cos1n would be handy for such
> > > computations (to replace the manual series expansion):
> > > cos1p(x) = 1 + cos(x)
> > > cos1n(x) = 1 - cos(x)
> > > Though, I do not have yet the big picture.
> > >
> >
> > Sincerely,
> >
> >
> > Leonard
> >
> > >
> > >
> > > On 8/17/2023 1:57 PM, Martin Maechler wrote:
> > >>>>>>> Leonard Mada
> > >>>>>>>  on Wed, 16 Aug 2023 20:50:52 +0300 writes:
> > >>  > Dear Iris,
> > >>  > Dear Martin,
> > >>
> > >>  > Thank you very much for your replies. I add a few comments.
> > >>
> > >>  > 1.) Correct formula
> > >>  > The formula in the Subject Title was correct. A small glitch
> > >> swept into
> > >>  > the last formula:
> > >>  > - 1/(cos(x) - 1) - 2/x^2
> > >>  > or
> > >>  > 1/(1 - c

Re: [R] Numerical stability of: 1/(1 - cos(x)) - 2/x^2

2023-08-18 Thread Bert Gunter
"Values of type 2^(-n) (and its binary complement) are exactly represented
as floating point numbers and do not generate the error. However, values
away from such special x-values will generate errors:"

That was exactly my point: The size of errors depends on the accuracy of
binary representation of floating point numbers and their arithmetic.

But you previously said:
"The ugly thing is that the error only gets worse as x decreases. The
value neither drops to 0, nor does it blow up to infinity; but it gets
worse in a continuous manner."

That is wrong and disagrees with what you say above.

-- Bert

On Fri, Aug 18, 2023 at 4:34 PM Leonard Mada  wrote:

> Dear Bert,
>
>
> Values of type 2^(-n) (and its binary complement) are exactly represented
> as floating point numbers and do not generate the error. However, values
> away from such special x-values will generate errors:
>
>
> # exactly represented:
> x = 9.53674316406250e-07
> y <- 1 - x^2/2;
> 1/(1 - y) - 2/x^2
>
> # almost exact:
> x = 9.536743164062502e-07
> y <- 1 - x^2/2;
> 1/(1 - y) - 2/x^2
>
> x = 9.536743164062498e-07
> y <- 1 - x^2/2;
> 1/(1 - y) - 2/x^2
>
> # the result behaves far better around values
> # which can be represented exactly,
> # but fails drastically for other values!
> x = 2^(-20) * 1.1
> y <- 1 - x^2/2;
> 1/(1 - y) - 2/x^2
> # 58672303 instead of 0!
>
>
> Sincerely,
>
>
> Leonard
>
>
> On 8/19/2023 2:06 AM, Bert Gunter wrote:
>
> "The ugly thing is that the error only gets worse as x decreases. The
> value neither drops to 0, nor does it blow up to infinity; but it gets
> worse in a continuous manner."
>
> If I understand you correctly, this is wrong:
>
> > x <- 2^(-20) ## considerably less than 1e-4 !!
> > y <- 1 - x^2/2;
> > 1/(1 - y) - 2/x^2
> [1] 0
>
> It's all about the accuracy of the binary approximation of floating point
> numbers (and their arithmetic)
>
> Cheers,
> Bert
>
>
> On Fri, Aug 18, 2023 at 3:25 PM Leonard Mada via R-help <
> r-help@r-project.org> wrote:
>
>> I have added some clarifications below.
>>
>> On 8/18/2023 10:20 PM, Leonard Mada wrote:
>> > [...]
>> > After more careful thinking, I believe that it is a limitation due to
>> > floating points:
>> > [...]
>> >
>> > The problem really stems from the representation of 1 - x^2/2 as shown
>> > below:
>> > x = 1E-4
>> > print(1 - x^2/2, digits=20)
>> > print(0.5, digits=20) # fails
>> > # 0.50003039
>>
>> The floating point representation of 1 - x^2/2 is the real culprit:
>> # 0.50003039
>>
>> The 3039 at the end is really an error due to the floating point
>> representation. However, this error blows up when inverting the value:
>> x = 1E-4;
>> y = 1 - x^2/2;
>> 1/(1 - y) - 2/x^2
>> # 1.215494
>> # should be 1/(x^2/2) - 2/x^2 = 0
>>
>>
>> The ugly thing is that the error only gets worse as x decreases. The
>> value neither drops to 0, nor does it blow up to infinity; but it gets
>> worse in a continuous manner. At least the reason has become now clear.
>>
>>
>> >
>> > Maybe some functions of type cos1p and cos1n would be handy for such
>> > computations (to replace the manual series expansion):
>> > cos1p(x) = 1 + cos(x)
>> > cos1n(x) = 1 - cos(x)
>> > Though, I do not have yet the big picture.
>> >
>>
>> Sincerely,
>>
>>
>> Leonard
>>
>> >
>> >
>> > On 8/17/2023 1:57 PM, Martin Maechler wrote:
>> >>>>>>> Leonard Mada
>> >>>>>>>  on Wed, 16 Aug 2023 20:50:52 +0300 writes:
>> >>  > Dear Iris,
>> >>  > Dear Martin,
>> >>
>> >>  > Thank you very much for your replies. I add a few comments.
>> >>
>> >>  > 1.) Correct formula
>> >>  > The formula in the Subject Title was correct. A small glitch
>> >> swept into
>> >>  > the last formula:
>> >>  > - 1/(cos(x) - 1) - 2/x^2
>> >>  > or
>> >>  > 1/(1 - cos(x)) - 2/x^2 # as in the subject title;
>> >>
>> >>  > 2.) log1p
>> >>  > Actually, the log-part behaves much better. And when it fails,
>> >> it fails
>> >>  > completely (which is easy to spot!).
>> >>
>> >>  > x = 1E-6
>> >>  > log(x) -log(1 - cos(x

Re: [R] Numerical stability of: 1/(1 - cos(x)) - 2/x^2

2023-08-18 Thread Bert Gunter
"The ugly thing is that the error only gets worse as x decreases. The
value neither drops to 0, nor does it blow up to infinity; but it gets
worse in a continuous manner."

If I understand you correctly, this is wrong:

> x <- 2^(-20) ## considerably less than 1e-4 !!
> y <- 1 - x^2/2;
> 1/(1 - y) - 2/x^2
[1] 0

It's all about the accuracy of the binary approximation of floating point
numbers (and their arithmetic)

Cheers,
Bert


On Fri, Aug 18, 2023 at 3:25 PM Leonard Mada via R-help <
r-help@r-project.org> wrote:

> I have added some clarifications below.
>
> On 8/18/2023 10:20 PM, Leonard Mada wrote:
> > [...]
> > After more careful thinking, I believe that it is a limitation due to
> > floating points:
> > [...]
> >
> > The problem really stems from the representation of 1 - x^2/2 as shown
> > below:
> > x = 1E-4
> > print(1 - x^2/2, digits=20)
> > print(0.5, digits=20) # fails
> > # 0.50003039
>
> The floating point representation of 1 - x^2/2 is the real culprit:
> # 0.50003039
>
> The 3039 at the end is really an error due to the floating point
> representation. However, this error blows up when inverting the value:
> x = 1E-4;
> y = 1 - x^2/2;
> 1/(1 - y) - 2/x^2
> # 1.215494
> # should be 1/(x^2/2) - 2/x^2 = 0
>
>
> The ugly thing is that the error only gets worse as x decreases. The
> value neither drops to 0, nor does it blow up to infinity; but it gets
> worse in a continuous manner. At least the reason has become now clear.
>
>
> >
> > Maybe some functions of type cos1p and cos1n would be handy for such
> > computations (to replace the manual series expansion):
> > cos1p(x) = 1 + cos(x)
> > cos1n(x) = 1 - cos(x)
> > Though, I do not have yet the big picture.
> >
>
> Sincerely,
>
>
> Leonard
>
> >
> >
> > On 8/17/2023 1:57 PM, Martin Maechler wrote:
> >>> Leonard Mada
> >>>  on Wed, 16 Aug 2023 20:50:52 +0300 writes:
> >>  > Dear Iris,
> >>  > Dear Martin,
> >>
> >>  > Thank you very much for your replies. I add a few comments.
> >>
> >>  > 1.) Correct formula
> >>  > The formula in the Subject Title was correct. A small glitch
> >> swept into
> >>  > the last formula:
> >>  > - 1/(cos(x) - 1) - 2/x^2
> >>  > or
> >>  > 1/(1 - cos(x)) - 2/x^2 # as in the subject title;
> >>
> >>  > 2.) log1p
> >>  > Actually, the log-part behaves much better. And when it fails,
> >> it fails
> >>  > completely (which is easy to spot!).
> >>
> >>  > x = 1E-6
> >>  > log(x) -log(1 - cos(x))/2
> >>  > # 0.3465291
> >>
> >>  > x = 1E-8
> >>  > log(x) -log(1 - cos(x))/2
> >>  > # Inf
> >>  > log(x) - log1p(- cos(x))/2
> >>  > # Inf => fails as well!
> >>  > # although using only log1p(cos(x)) seems to do the trick;
> >>  > log1p(cos(x)); log(2)/2;
> >>
> >>  > 3.) 1/(1 - cos(x)) - 2/x^2
> >>  > It is possible to convert the formula to one which is
> >> numerically more
> >>  > stable. It is also possible to compute it manually, but it
> >> involves much
> >>  > more work and is also error prone:
> >>
> >>  > (x^2 - 2 + 2*cos(x)) / (x^2 * (1 - cos(x)))
> >>  > And applying L'Hospital:
> >>  > (2*x - 2*sin(x)) / (2*x * (1 - cos(x)) + x^2*sin(x))
> >>  > # and a 2nd & 3rd & 4th time
> >>  > 1/6
> >>
> >>  > The big problem was that I did not expect it to fail for x =
> >> 1E-4. I
> >>  > thought it is more robust and works maybe until 1E-5.
> >>  > x = 1E-5
> >>  > 2/x^2 - 2E+10
> >>  > # -3.814697e-06
> >>
> >>  > This is the reason why I believe that there is room for
> >> improvement.
> >>
> >>  > Sincerely,
> >>  > Leonard
> >>
> >> Thank you, Leonard.
> >> Yes, I agree that it is amazing how much your formula suffers from
> >> (a generalization of) "cancellation" --- leading you to think
> >> there was a problem with cos() or log() or .. in R.
> >> But really R uses the system builtin libmath library, and the
> >> problem is really the inherent instability of your formula.
> >>
> >> Indeed your first approximation was not really much more stable:
> >>
> >> ## 3.) 1/(1 - cos(x)) - 2/x^2
> >> ## It is possible to convert the formula to one which is numerically
> >> more
> >> ## stable. It is also possible to compute it manually, but it
> >> involves much
> >> ## more work and is also error prone:
> >> ## (x^2 - 2 + 2*cos(x)) / (x^2 * (1 - cos(x)))
> >> ## MM: but actually, that approximation does not seem better (close
> >> to the breakdown region):
> >> f1 <- \(x) 1/(1 - cos(x)) - 2/x^2
> >> f2 <- \(x) (x^2 - 2 + 2*cos(x)) / (x^2 * (1 - cos(x)))
> >> curve(f1, 1e-8, 1e-1, log="xy", n=2^10)
> >> curve(f2, add = TRUE, col=2,   n=2^10)
> >> ## Zoom in:
> >> curve(f1, 1e-4, 1e-1, log="xy",n=2^9)
> >> curve(f2, add = TRUE, col=2,   n=2^9)
> >> ## Zoom in much more in y-direction:
> >> yl <- 1/6 + c(-5, 20)/10
> >> curve(f1, 1e-4, 1e-1, log="x", ylim=yl, n=2^9)
> >> abline(h = 1/6, lty=3, col="gray")
> >> curve(f2, add 

Re: [R] Timezone question

2023-08-17 Thread Bert Gunter
You may also find the package "lutz" to be of interest, although that may
be overkill for your needs.
(found by an internet search).
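
For a quick check without any extra package, base R can also report the
current offset directly (a small sketch):

format(Sys.time(), "%z")   ## signed offset from UTC as +-hhmm, e.g. "-0800" for PST
Sys.timezone()             ## Olson name of the current zone, e.g. "America/Los_Angeles"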

Cheers,
Bert


On Thu, Aug 17, 2023 at 1:31 PM Dennis Fisher  wrote:

> R 4.3.1
> OS X
>
> Colleagues
>
> Is there a simple way to determine the timezone offset for my present
> location.  For example, during standard time in the US, the offset from GMT
> is 8 hours in California.
>
> Dennis
>
> Dennis Fisher MD
> P < (The "P Less Than" Company)
> Phone / Fax: 1-866-PLessThan (1-866-753-7784)
> www.PLessThan.com 
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Questions about R

2023-08-17 Thread Bert Gunter
Incidentally, you might be interested in the banner shown when R starts up:

"R is free software and comes with ABSOLUTELY NO WARRANTY."

I believe this is standard for open source software (upon which a lot of
organizations depend). In any case, that might be the most definitive and
"official" answer you can get.

Bert

On Thu, Aug 17, 2023 at 9:17 AM Bert Gunter  wrote:

> This is a volunteer Help list for users of R, which is open source, so you
> can see all its code. I can answer no to your questions, unless you are
> using one of R's innumerable packages that interacts with the internet and
> to which the user may give personal information to enable the desired
> functionality (logins, etc.).  But of course how do you know that I am not
> some malevolent agent or organization wishing to mislead you for my own
> nefarious purposes?
>
> Cheers,
> Bert
>
> On Thu, Aug 17, 2023 at 8:37 AM Shaun Parr 
> wrote:
>
>>
>>
>> Sent from Outlook for Android<https://aka.ms/AAb9ysg>
>>
>>
>> Hi there,
>>
>> My name is Shaun and I work in an organisation where one of our users
>> wishes to install the R software and our process is to assess the safety of
>> anyone software prior to authorisation. I can’t seem to locate all the
>> information that we require on the webpage, so could someone kindly advise
>> me of the following information please?
>>
>> 1. Please can you confirm what user information the software collects
>> (E.g. Name, password, e-mail address, any Personally Identifiable
>> Information etc)?
>> 2. If any is collected, please can you confirm if the information
>> collected by the software stays locally on the device or if it is
>> transferred anywhere. If it is transferred, could you please advise where
>> it is transferred to (E.g. your own servers, or a third party data centre
>> such as Amazon Web Services or Azure)?
>> 3. Are there any third-party components installed within the software
>> and, if so, are these also kept up-to-date?
>>
>> If you could kindly advise this information, it would be really
>> appreciated, thank you 
>>
>>
>> Shaun
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Questions about R

2023-08-17 Thread Bert Gunter
This is a volunteer Help list for users of R, which is open source, so you
can see all its code. I can answer no to your questions, unless you are
using one of R's innumerable packages that interacts with the internet and
to which the user may give personal information to enable the desired
functionality (logins, etc.).  But of course how do you know that I am not
some malevolent agent or organization wishing to mislead you for my own
nefarious purposes?

Cheers,
Bert

On Thu, Aug 17, 2023 at 8:37 AM Shaun Parr  wrote:

>
>
> Sent from Outlook for Android
>
>
> Hi there,
>
> My name is Shaun and I work in an organisation where one of our users
> wishes to install the R software and our process is to assess the safety of
> anyone software prior to authorisation. I can’t seem to locate all the
> information that we require on the webpage, so could someone kindly advise
> me of the following information please?
>
> 1. Please can you confirm what user information the software collects
> (E.g. Name, password, e-mail address, any Personally Identifiable
> Information etc)?
> 2. If any is collected, please can you confirm if the information
> collected by the software stays locally on the device or if it is
> transferred anywhere. If it is transferred, could you please advise where
> it is transferred to (E.g. your own servers, or a third party data centre
> such as Amazon Web Services or Azure)?
> 3. Are there any third-party components installed within the software and,
> if so, are these also kept up-to-date?
>
> If you could kindly advise this information, it would be really
> appreciated, thank you 
>
>
> Shaun
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] OFF TOPIC: chatGPT glibly produces a lot of wrong answers?

2023-08-13 Thread Bert Gunter
**OFF TOPIC** but perhaps of interest to some on this list. I apologize in
advance to those who may be offended.

The byline:

"ChatGPT's odds of getting code questions correct are worse than a coin flip

But its suggestions are so annoyingly plausible"
*
from here:
https://www.theregister.com/2023/08/07/chatgpt_stack_overflow_ai/

Hmm... Perhaps not surprising. Sounds like some expert consultants I've
met. 

Just for amusement. I am ignorant about this and have no strongly held
views,

Cheers to all,
Bert

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Different TFIDF settings in test set prevent testing model

2023-08-11 Thread Bert Gunter
I know nothing about tf, etc., but can you not simply read in the whole
file into R and then randomly split using R? The training and test sets
would simply be defined by a single random sample of subscripts which is
either chosen or not.

e.g. (simplified example -- you would be subsetting the rows of your full
dataset):

> x<- 1:10
> samp <- sort(sample(x,5))
> x[samp] ## training
[1] 3 4 6 7 8
> x[-samp] ## test
[1]  1  2  5  9 10
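
Applied to your full data set read into a single data frame d, a rough
sketch of the 75/25 split would be:

set.seed(101)   ## any seed, just to make the split reproducible
idx   <- sample(nrow(d), size = round(0.75 * nrow(d)))
train <- d[idx, ]
test  <- d[-idx, ]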

Apologies if my ignorance means this can't work.

Cheers,
Bert


On Fri, Aug 11, 2023 at 7:17 AM James C Schopf  wrote:

> Hello, I'd be very grateful for your help.
>
> I randomly separated a .csv file with 1287 documents 75%/25% into 2 csv
> files, one for training an algorithm and the other for testing the
> algorithm.  I applied similar preprocessing, including TFIDF
> transformation, to both sets, but R won't let me make predictions on the
> test set due to a different TFIDF matrix.
> I get the error message:
>
> Error: variable 'text_tfidf' was fitted with type "nmatrix.67503" but type
> "nmatrix.27118" was supplied
>
> I'd greatly appreciate a suggestion to overcome this problem.
> Thanks!
>
>
> Here's my R codes:
>
> > library(tidyverse)
> > library(tidytext)
> > library(caret)
> > library(kernlab)
> > library(tokenizers)
> > library(tm)
> > library(e1071)
>
> ***LOAD TRAINING SET/959 rows with text in column1 and yes/no in column2
> (labelled M2)
> > url <- "D:/test/M2_75.csv"
> > d <- read_csv(url)
> ***CREATE TEXT CORPUS FROM TEXT COLUMN
> > train_text_corpus <- Corpus(VectorSource(d$Text))
> ***DEFINE TOKENS FOR EACH DOCUMENT IN CORPUS AND COMBINE THEM
> > tokenize_document <- function(doc) {
> + doc_tokens <- unlist(tokenize_words(doc))
> + doc_bigrams <- unlist(tokenize_ngrams(doc, n = 2))
> + doc_trigrams <- unlist(tokenize_ngrams(doc, n = 3))
> + all_tokens <- c(doc_tokens, doc_bigrams, doc_trigrams)
> + return(all_tokens)
> + }
> ***APPLY TOKENS TO DOCUMENTS
> > all_train_tokens <- lapply(train_text_corpus, tokenize_document)
> ***CREATE A DTM FROM THE TOKENS
> > train_text_dtm <-
> DocumentTermMatrix(Corpus(VectorSource(all_train_tokens)))
> ***TRANSFORM THE DTM INTO A TF-IDF MATRIX
> > train_text_tfidf <- weightTfIdf(train_text_dtm)
> ***CREATE A NEW DATA FRAME WITH M2 COLUMN FROM ORIGINAL DATA
> > trainData <- data.frame(M2 = d$M2)
> ***ADD NEW TFIDF transformed TEXT COLUMN NEXT TO DATA FRAME
> > trainData$text_tfidf <- I(as.matrix(train_text_tfidf))
> ***DEFINE THE ML MODEL
> > ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 2,
> classProbs = TRUE)
> ***TRAIN SVM
> > model_svmRadial <- train(M2 ~ ., data = trainData, method = "svmRadial",
> trControl = ctrl)
> ***SAVE SVM
> > saveRDS(model_svmRadial, file = "D:/SML/model_M23_svmRadial_UP.RDS")
>
> R code on my test set, which didn't work at last step:
>
> ***LOAD TEST SET/ 309 rows with text in column1 and yes/no in column2
> (labelled M2)
> > url <- "D:/test/M2_25.csv"
> > d <- read_csv(url)
> ***CREATE TEXT CORPUS FROM TEXT COLUMN
> > test_text_corpus <- Corpus(VectorSource(d$Text))
> ***DEFINE TOKENS FOR EACH DOCUMENT IN CORPUS AND COMBINE THEM
> > tokenize_document <- function(doc) {
>  doc_tokens <- unlist(tokenize_words(doc))
>  doc_bigrams <- unlist(tokenize_ngrams(doc, n = 2))
>  doc_trigrams <- unlist(tokenize_ngrams(doc, n = 3))
>  all_tokens <- c(doc_tokens, doc_bigrams, doc_trigrams)
>  return(all_tokens)
>  }
> ***APPLY TOKEN TO DOCUMENTS
> > all_test_tokens <- lapply(test_text_corpus, tokenize_document)
> ***CREATE A DTM FROM THE TOKENS
> > test_text_dtm <-
> DocumentTermMatrix(Corpus(VectorSource(all_test_tokens)))
> ***TRANSFORM THE DTM INTO A TF-IDF MATRIX
> > test_text_tfidf <- weightTfIdf(test_text_dtm)
> ***CREATE A NEW DATA WITH M2 COLUMN FROM ORIGINAL TEST DATA
> > testData <- data.frame(M2 = d$M2)
> ***ADD NEW TFIDF transformed TEXT COLUMN NEXT TO TEST DATA
> > testData$text_tfidf <- I(as.matrix(test_text_tfidf))
> ***LOAD OLD MODEL
> model_svmRadial <- readRDS("D:/SML/model_M2_75_svmRadial.RDS")
>  ***MAKE PREDICTIONS
> predictions <- predict(model_svmRadial, newdata = testData)
>
> This last line produces the error message:
>
> Error: variable 'text_tfidf' was fitted with type "nmatrix.67503" but type
> "nmatrix.27118" was supplied
>
> Please help.  Thanks!
>
>
>
>
>
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R library highcharter function highchart() execute with exception the apparmor read denied for /etc/passwd and /etc/group

2023-08-08 Thread Bert Gunter
If you don't get a satisfactory answer here in due course, you can try
contacting the package maintainer, who you can find via ?maintainer.
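
For example, a one-line sketch (the package needs to be installed locally
for this to work):

maintainer("highcharter")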

Cheers,
Bert

On Tue, Aug 8, 2023 at 7:50 AM Gu, Jay via R-help  wrote:
>
> Dears,
>
>
> I use the R library highcharter with ubuntu 18.04 and R 3.6.3. Recently, I 
> upgraded to ubuntu 20.04 and R 4.3.1. And the version of library highcharter 
> are both 0.9.4. Then I execute the function highchart() it always throw the 
> exception that child process has died. And I checked the /var/log/kern.log 
> and found below error:
>
> Aug 7 08:37:50 ip-172-31-27-249 kernel: [2251703.494866] audit: type=1400 
> audit(1691397470.399:739): apparmor="DENIED" operation="open" 
> profile="managedr-profile" name="/etc/passwd" pid=159930 comm="R" 
> requested_mask="r" denied_mask="r" fsuid=1000 ouid=0
> Aug 7 08:37:50 ip-172-31-27-249 kernel: [2251703.494871] audit: type=1400 
> audit(1691397470.399:740): apparmor="DENIED" operation="open" 
> profile="managedr-profile" name="/etc/group" pid=159930 comm="R" 
> requested_mask="r" denied_mask="r" fsuid=1000 ouid=0
>
> If I add below two lines in my apparmor profile it will resolve this issue. 
> But I don't like to expose these two files to end user as it has potential 
> risk.
> /etc/passwd r,
> /etc/group r,
>
> I'd like to know if there is any solution to fix it without giving the read 
> access for these two files /etc/passwd and /etc/group in the apparmor profile 
> as I did with ubuntu 18.04 and R 3.6.3. Thanks!
> Best Regards!
> Jay Gu
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] group consecutive dates in a row

2023-08-07 Thread Bert Gunter
Here is another way to obtain the day differences that are the argument
of rle(). It is perhaps more reliable in that it uses methods for
class POSIXct rather than depending on the underlying class structure
and conversion via as.numeric. In theory, the methods won't change or
any changes will be documented, whereas class implementations are
allowed to be fluid and undocumented at the R user level (e.g. the
Help system) afaik. In this case, I don't think that would happen, so
I am merely being pedantic, but I hope I do not offend by making the
point.

But I also wanted to say that the OP could have (imo) looked up the
methodology as I describe below, rather than post here. Or perhaps he
did, but got stymied because he did not know about rle(), a somewhat
esoteric base R function. In which case, the search query "how to find
runs of identical values in an R vector" immediately yielded a hit on
rle(). So my message is: **DO** first try to use R's internal Help
before posting, especially for base R related tasks: it is really a
superb resource (again imo). *

OK, enough sermonizing. Here's what I did. Since the data are POSIXct
class, I went to ?POSIXct and browsed through it until I found
"difftime for time intervals" in the *See Also* section. Following
that link to ?difftime showed me this was what was needed, which is:

difftime(mydf[-1,1], mydf[-nrow(mydf), 1], units = "days")
Time differences in days
[1] 1 1 6 8
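
From there, one rough sketch (of many possible ways) of the requested
two-column result, grouping the runs of consecutive days:

gaps  <- as.numeric(difftime(mydf[-1, 1], mydf[-nrow(mydf), 1], units = "days"))
grp   <- cumsum(c(TRUE, gaps != 1))        ## new group whenever the gap is not 1 day
idx   <- split(seq_len(nrow(mydf)), grp)   ## row indices of each run
first <- sapply(idx, function(i) i[1])
last  <- sapply(idx, function(i) i[length(i)])
out   <- data.frame(data_POSIX_init = mydf$data_POSIX[first],
                    data_POSIX_fin  = mydf$data_POSIX[last])
out$data_POSIX_fin[first == last] <- NA    ## NA when a "run" is a single day
out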

Cheers,
Bert

*... and I think it is likely that there are time series related
packages that also could be used, perhaps more immediately, to do what
was requested. But that would require a more diligent search. Though
with new LLM's and generative AI for Help systems becoming available,
the degree of diligence required is rapidly decreasing.


On Mon, Aug 7, 2023 at 9:52 AM Ben Bolker  wrote:
>
> rle(as.numeric(diff(mydf$data_POSIX)))  should get you started, I think?
>
> On 2023-08-07 12:41 p.m., Stefano Sofia wrote:
> > Dear R users,
> >
> > I have a data frame with a single column of POSIXct elements, like
> >
> >
> > mydf <- data.frame(data_POSIX=as.POSIXct(c("2012-02-05", "2012-02-06", 
> > "2012-02-07", "2012-02-13", "2012-02-21"), format = "%Y-%m-%d", 
> > tz="Etc/GMT-1"))
> >
> >
> > I need to transform it in a two-columns data frame where I can get rid of 
> > consecutive dates. It should appear like
> >
> >
> > data_POSIX_init data_POSIX_fin
> >
> > 2012-02-05 2012-02-07
> >
> > 2012-02-13 NA
> >
> > 2012-02-21 NA
> >
> >
> > I started with two "while cycles" and so on, but this is not an efficient 
> > way to do it.
> >
> > Could you please give me an hint on how to proceed?
> >
> >
> > Thank you for your precious attention and help
> >
> > Stefano
> >
> >
> >   (oo)
> > --oOO--( )--OOo--
> > Stefano Sofia PhD
> > Civil Protection - Marche Region - Italy
> > Meteo Section
> > Snow Section
> > Via del Colle Ameno 5
> > 60126 Torrette di Ancona, Ancona (AN)
> > Uff: +39 071 806 7743
> > E-mail: stefano.so...@regione.marche.it
> > ---Oo-oO
> >
> > 
> >
> >
> >
> >   [[alternative HTML version deleted]]
> >
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, 

Re: [R] Stacking matrix columns

2023-08-05 Thread Bert Gunter
Or just dim(x) <- NULL.
(as matrices in base R are just vectors with a dim attribute stored in
column major order)

ergo:

> x
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
> x<- 1:20  ## a vector
> is.matrix(x)
[1] FALSE
> dim(x) <- c(5,4)
> is.matrix(x)
[1] TRUE
> attributes(x)
$dim
[1] 5 4

> ## in painful and unnecessary detail as dim() should be used instead
> attr(x, "dim") <- NULL
> is.matrix(x)
[1] FALSE
> x
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

## well, you get it...
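
## for completeness: c() and as.vector() return the same stacked values
## in column-major order, though as a plain vector rather than a
## one-column matrix
c(x)
as.vector(x)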

-- Bert

On Sat, Aug 5, 2023 at 5:21 PM Iris Simmons  wrote:
>
> You could also do
>
> dim(x) <- c(length(x), 1)
>
> On Sat, Aug 5, 2023, 20:12 Steven Yen  wrote:
>
> > I wish to stack columns of a matrix into one column. The following
> > matrix command does it. Any other ways? Thanks.
> >
> >  > x<-matrix(1:20,5,4)
> >  > x
> >   [,1] [,2] [,3] [,4]
> > [1,]16   11   16
> > [2,]27   12   17
> > [3,]38   13   18
> > [4,]49   14   19
> > [5,]5   10   15   20
> >
> >  > matrix(x,ncol=1)
> >[,1]
> >   [1,]1
> >   [2,]2
> >   [3,]3
> >   [4,]4
> >   [5,]5
> >   [6,]6
> >   [7,]7
> >   [8,]8
> >   [9,]9
> > [10,]   10
> > [11,]   11
> > [12,]   12
> > [13,]   13
> > [14,]   14
> > [15,]   15
> > [16,]   16
> > [17,]   17
> > [18,]   18
> > [19,]   19
> > [20,]   20
> >  >
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Technical Help Request for "Version Differences" (i.e., CYTOFKIT package)

2023-08-04 Thread Bert Gunter
Dear Murat:

See the "posting guide" (link below) for how and what to post on this list.

I believe that in that guide it is recommended that you first update
your R version and packages to the latest versions (R 3.5 is rather
old now). You should probably reinstall R and all packages from the
main CRAN website:
https://cran.r-project.org/

Note that once you install R, you can read ?install.packages for how
to install the latest versions of all the packages you need that are
on CRAN.
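
For example (a sketch; substitute whichever CRAN packages you actually
need):

install.packages(c("devtools", "ggplot2", "plyr", "shiny"))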

Note that R and RStudio are different software products. So once R is
installed, you can download and install the packages you need from the
Posit website if you prefer, again with the latest versions.

If for some reason CRAN is not available to you or some of the
packages you need are not on CRAN(e.g. github only packages)  or won't
run on your computer, then the process will be more complicated, and
others will have to help you. CRAN has a rigorous process for assuring
compatibility of packages and R versions, so that packages not on CRAN
may not be compatible with the latest R version. All I can suggest is
that you avoid such packages **if possible** (I understand that it may
not be). You can search CRAN's "Task Views" (link on
https://cran.r-project.org/ )for alternative CRAN packages that may
have the functionality you seek.

Cheers,
Bert


On Fri, Aug 4, 2023 at 8:40 AM MURAT DELMAN via R-help
 wrote:
>
>
>
>
> Dear Ms./Mr.,
>
>
>
>
>
>
>
> If the email text and codes below are not properly displayed, you can 
> download the attached Word file, which has exactly the same content.
>
>
>
> I am a cytometrist and microscopist at Izmir Institute of Technology, 
> Integrated Research Center. I operate a cytometer and fluorescence microscope 
> to acquire data from our researchers’ samples and then give them analyzed 
> data. I need to dive into bulk data to present all of the information in the 
> sample. To do this, I have recently been trying to develop my skills in R, 
> RStudio, and Cytofkit package for cytometry (flow, spectral, mass, imaging 
> mass cytometry) data analysis. I am new in this area and must learn R for 
> data analysis.
>
>
>
>
>
> May I request help from you or one of your assistants who is competent in R, 
> RStudio, and R packages with the installation issues caused by “ version 
> differences ” between R itself, R packages, and their dependencies? I really 
> had to send this email to you anymore, because I could not run the package 
> “Cytofkit” and find a solution on the internet.
>
>
> It has been nearly 2 weeks since I have been trying to solve the errors and 
> warnings. I am in a vicious circle and cannot move forward anymore. This is 
> why I needed help from an expert competent in R programming.
>
>
>
> I tried to use the renv package, but I could not move forward because I am 
> not competent in R.
>
>
>
>
> I explained the errors that I faced during the installation below.
>
>
>
> I have two computers, both operating Windows 10, 64-bit.
>
>
>
> I installed R 3.5. 2 on the desktop (Windows in English) and R 3.5. 0 on the 
> laptop (Windows in Turkish).
>
>
>
> I know that you are very busy with your work, but may you (or your assistant) 
> help me by informing me how to install consistent versions of R, RStudio, and 
> packages/dependencies (for example, Cytofkit, ggplot2, Rtools, devtools, 
> plyr, shiny, GUI)?
>
>
>
>
>
> I guess I will need exact web page links " in an orderly manner" to download 
> the correct versions of everything.
>
>
>
> I would appreciate it if you could advise me or forward this email to one of 
> your assistants or experts.
>
>
>
>
>
>
> I really appreciate any help you can provide.
>
>
>
>
>
>
> Kind Regards
>
>
>
>
>
>
> Please find the technical details below;
>
>
>
>
>
> I used the installation order listed below;
>
>
>
> I downloaded R 3.5.0 and R 3.5.2 using the link and directory below.
>
>
>
> Link: [ https://www.freestatistics.org/cran/ | 
> https://www.freestatistics.org/cran/ ]
>
>
>
> Directory: Download R for Windows > install R for the first time > Previous 
> releases > R 3.5.2 (December, 2018) or R 3.5.0
>
>
>
>
>
> I downloaded RStudio from the link below;
>
>
>
> [ https://posit.co/download/rstudio-desktop/ | 
> https://posit.co/download/rstudio-desktop/ ]
>
>
>
>
>
>
> I downloaded packages via either [ 
> https://cran.rstudio.com/bin/windows/contrib/r-devel/ | 
> https://cran.rstudio.com/bin/windows/contrib/r-devel/ ] or the “install 
> function” tab in RStudio.
>
>
>
> Rtools : [ https://cran.r-project.org/bin/windows/Rtools/ | 
> https://cran.r-project.org/bin/windows/Rtools/ ]
>
>
>
> Devtools : [ https://cran.r-project.org/web/packages/devtools/index.html | 
> https://cran.r-project.org/web/packages/devtools/index.html ]
>
>
>
> ggplot2 : [ https://cran.r-project.org/web/packages/ggplot2/index.html | 
> https://cran.r-project.org/web/packages/ggplot2/index.html ]
>
>
>
> [ https://ggplot2.tidyverse.org/ | https://ggplot2.tidyverse.org/ ]

Re: [R] Style guide when using "R" in a title

2023-07-26 Thread Bert Gunter
https://www.r-project.org/logo/

Cheers,
Bert

On Wed, Jul 26, 2023 at 4:01 PM Wadsworth, Spencer G [STAT]
 wrote:
>
> Hello,
>
> I am working on a small booklet to be used with an existing statistics 
> textbook. The purpose of the booklet is to give worked through examples from 
> the textbook using R code and it will be made publicly available with the 
> textbook. The title of the booklet is "R Code Supplement for Basic 
> Engineering Data Collection and Analysis by Vardeman and Jobe".  Someone with 
> whom I'm working on the booklet said that the "R" in the title might need to 
> follow a specific style guide given by the R-project. Is this accurate? Is 
> there a particular font I should use?
>
> Thanks,
> Spencer
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotly question

2023-07-21 Thread Bert Gunter
As you apparently haven't received any responses yet, I'll try to
suggest something useful. However, I have absolutely zero experience
with plotly, so this is just from general principles and reading the
plot_ly Help file, which says for the "..." arguments:

"Arguments (i.e., attributes) passed along to the trace type. See
schema() for a list of acceptable attributes for a given trace type
(by going to traces -> type -> attributes). Note that attributes
provided at this level may override other arguments (e.g. plot_ly(x =
1:10, y = 1:10, color = I("red"), marker = list(color = "blue")))."

So I would **guess** that you need to go to ?schema to see if the
further attributes of your "gauge" type that you wish to change are
there.
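
For what it is worth, here is a hedged sketch of what that might look
like; the attribute names "bar" and "tickfont" are my reading of the
plotly.js indicator schema, so please verify them with schema():

library(plotly)
plot_ly(
  type  = "indicator",
  mode  = "gauge+number+delta",
  value = 2874,
  delta = list(reference = 4800),
  gauge = list(
    bar  = list(color = "orange"),   ## color of the value bar (green by default)
    axis = list(range = list(NULL, 5000),
                tickfont = list(family = "Courier New", size = 16))))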

Alternatively, plotly is a package from posit.co, formerly RStudio;
they have an extensive support site and community here:
https://posit.co/support/
So you may have success there.

Finally, I assume you have tried web searching appropriate search
queries, but if not, you should do so. It is sometimes surprising how
much you can find that way.

... and, again, apologies if my ignorance means my suggestions are useless.

Cheers,
Bert


On Fri, Jul 21, 2023 at 6:19 AM Thomas Subia via R-help
 wrote:
>
> Colleagues
>
> Here is my reproducible code
>
> plot_ly(
>   domain = list(x = c(0, 1), y = c(0, 1)),
>   value = 2874,
>   title = list(text = "Generic"),
>   type = "indicator",
>   mode = "gauge+number+delta",
>   delta = list(reference = 4800),
>   gauge = list(
> axis =list(range = list(NULL, 5000)),
> steps = list(
> list(range = c(0, 4800), color = "white"),
> list(range = c(4800, 6000), color = "red")),
> threshold = list(
> line = list(color = "black", width = 6),
> thickness = 0.75,
> value = 4800)))
>
> How can I change the indicator color from green to some other color?
>
> How can I change the typeface and font size of the speedometer tick mark font 
> size?
>
> Thomas Subia
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Off-topic: ChatGPT Code Interpreter

2023-07-17 Thread Bert Gunter
This is an **off-topic** post about the subject line, that I thought
might be of interest to the R Community. I hope this does not offend
anyone.

The widely known ChatGPT software now offers what  is called a "Code
Interpreter," that, among other things, purports to do "data
analysis."  (Search for articles with details.) One quote, from the
(online) NY Times, is:

"Arvind Narayanan, a professor of computer science at Princeton
University, cautioned that people should not become overly reliant on
code interpreter for data analysis as A.I. still produces inaccurate
results and misinformation.

'Appropriate data analysis requires just a lot of critical thinking
about the data,” he said.' "

Amen. ... Maybe.

(As this is off-topic, if you wish to reply to me, probably better to
do so privately).

Cheers to all,
Bert

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to change the y-axis to logarithmic in a barplot ggplot

2023-07-16 Thread Bert Gunter
Maria,

ggplot is part of the Posit -- formerly RStudio -- assortment of
contributed packages. They have their own support site here
 , which you *may* find useful for such
questions, also. Though ggplot queries *are* frequently posted and answered
on R-Help.

Cheers,
Bert

On Sun, Jul 16, 2023 at 3:58 PM Maria Lathouri via R-help <
r-help@r-project.org> wrote:

> I will find the ggplot help.
>
> But I have tried everything, including what you have suggested and nothing
> works.
>
> Kind regards,
> Maria
>
>
>
>
>
>
> On Sunday, 16 July 2023 at 11:22:36 p.m. GMT+1, CALUM POLWART  wrote:
>
>
>
>
>
> Try adding
>
> scale_y_log10()
>
> This is a general R help list. It's not a ggplot list and you are likely
> to be chased off to ggplot's package maintainers nominated support pages.
>
> But really a Google search should surely have found this?
>
>
>
> On Sun, 16 Jul 2023, 22:51 Maria Lathouri via R-help, <
> r-help@r-project.org> wrote:
> > Dear all,
> >
> > ggplot(fc, aes(x = Temp, y = mean, fill = Glass)) +
> > geom_bar(stat = "identity", position = "dodge", aes(y=log(mean)))
> > + theme_bw() + theme(panel.grid.major = element_blank(),
> panel.grid.minor = element_blank()) + theme(legend.position = c(0.45,
> 0.85), legend.title = element_blank())
> > + scale_fill_brewer(palette = "Dark2") + scale_color_brewer(palette =
> "Dark2") +
> >
> scale_y_log10()
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Base R Stats Package - quantile function

2023-07-11 Thread Bert Gunter
1. I highly doubt that anyone from the "R Core team" would respond to such
a request. That is emphatically **not** their job.

2. More to the point, this is *exactly* the sort of task that *you*, as a
student/practitioner of statistics and data analysis are expected to do for
yourself. Indeed, examples already are included in ?quantile, and
references are provided for you to follow up on on your own if you wish to
learn more.

3. Finally, your request is largely off topic here. Please read and follow
the posting guide linked below for learning about/posting on matters that
are on topic.

Cheers,
Bert

On Tue, Jul 11, 2023 at 12:19 PM a.chandh...@btinternet.com a.chandhial---
via R-help  wrote:

>
>
>
> Hi,
>
>
> In Base R Stats Package, the quantile function has 9 Type's:
>
> ?quantile
>
> I'd be very grateful if simple numerical examples (ideally from members
> of the R core team), for each of the 9 methods, both for EVEN and ODD
> numbered length's of series, be provided.
>
>
> thanks,
> Amarjit
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Getting an error calling MASS::boxcox in a function

2023-07-08 Thread Bert Gunter
Aha. Many thanks, John. Never would have gotten there on my own.

-- Bert

On Sat, Jul 8, 2023 at 2:01 PM John Fox  wrote:
>
> Hi Bert,
>
> On 2023-07-08 3:42 p.m., Bert Gunter wrote:
> >
> >
> > Thanks John.
> >
> > ?boxcox says:
> >
> > *
> > Arguments
> >
> > object
> >
> > a formula or fitted model object. Currently only lm and aov objects are 
> > handled.
> > *
> > I read that as saying that
> >
> > boxcox(lm(z+1 ~ 1),...)
> >
> > should run without error. But it didn't. And perhaps here's why:
> > BoxCoxLambda <- function(z){
> > b <- MASS:::boxcox.lm(lm(z+1 ~ 1), lambda = seq(-5, 5, length.out =
> > 61), plotit = FALSE)
> > b$x[which.max(b$y)]# best lambda
> > }
> >
> >> lambdas <- apply(dd,2 , BoxCoxLambda)
> > Error in NextMethod() : 'NextMethod' called from an anonymous function
> >
> > and, indeed, ?UseMethod says:
> > "NextMethod should not be called except in methods called by UseMethod
> > or from internal generics (see InternalGenerics). In particular it
> > will not work inside anonymous calling functions (e.g.,
> > get("print.ts")(AirPassengers))."
> >
> > BUT 
> > BoxCoxLambda <- function(z){
> >b <- MASS:::boxcox(z+1 ~ 1, lambda = seq(-5, 5, length.out = 61),
> > plotit = FALSE)
> >b$x[which.max(b$y)]# best lambda
> > }
> >
> >> lambdas <- apply(dd,2 , BoxCoxLambda)
> >> lambdas
> > [1] 0.167 0.167
>
> As it turns out, it's the update() step in boxcox.lm() that fails, and
> the update takes place because $y is missing from the lm object, so the
> following works:
>
> BoxCoxLambda <- function(z){
>  b <- boxcox(lm(z + 1 ~ 1, y=TRUE),
>  lambda = seq(-5, 5, length.out = 101),
>  plotit = FALSE)
>  b$x[which.max(b$y)]
> }
>
> >
> > The identical lambdas do not seem right to me;
>
> I think that's just an accident of the example (using the BoxCoxLambda()
> above):
>
>  > apply(dd, 2, BoxCoxLambda, simplify = TRUE)
> [1] 0.2 0.2
>
>  > dd[, 2]  <- dd[, 2]^3
>  > apply(dd, 2, BoxCoxLambda, simplify = TRUE)
> [1] 0.2 0.1
>
> Best,
>   John
>
> > nor do I understand why
> > boxcox.lm apparently throws the error while boxcox.formula does not
> > (it also calls NextMethod()) So I would welcome clarification to clear
> > my clogged (cerebral) sinuses. :-)
> >
> >
> > Best,
> > Bert
> >
> >
> > On Sat, Jul 8, 2023 at 11:25 AM John Fox  wrote:
> >>
> >> Dear Ron and Bert,
> >>
> >> First (and without considering why one would want to do this, e.g.,
> >> adding a start of 1 to the data), the following works for me:
> >>
> >> -- snip --
> >>
> >>   > library(MASS)
> >>
> >>   > BoxCoxLambda <- function(z){
> >> +   b <- boxcox(z + 1 ~ 1,
> >> +   lambda = seq(-5, 5, length.out = 101),
> >> +   plotit = FALSE)
> >> +   b$x[which.max(b$y)]
> >> + }
> >>
> >>   > mrow <- 500
> >>   > mcol <- 2
> >>   > set.seed(12345)
> >>   > dd <- matrix(rgamma(mrow*mcol, shape = 2, scale = 5), nrow = mrow, 
> >> ncol =
> >> +mcol)
> >>
> >>   > dd1 <- dd[, 1] # 1st column of dd
> >>   > res <- boxcox(lm(dd1 + 1 ~ 1), lambda = seq(-5, 5, length.out = 101),
> >> plotit
> >> +  = FALSE)
> >>   > res$x[which.max(res$y)]
> >> [1] 0.2
> >>
> >>   > apply(dd, 2, BoxCoxLambda, simplify = TRUE)
> >> [1] 0.2 0.2
> >>
> >> -- snip --
> >>
> >> One could also use the powerTransform() function in the car package,
> >> which in this context transforms towards *multi*normality:
> >>
> >> -- snip --
> >>
> >>   > library(car)
> >> Loading required package: carData
> >>
> >>   > powerTransform(dd + 1)
> >> Estimated transformation parameters
> >>        Y1        Y2
> >> 0.1740200 0.2089925
> >>
> >> I hope this helps,
> >>John
> >>
> >> --
> >>

Re: [R] Getting an error calling MASS::boxcox in a function

2023-07-08 Thread Bert Gunter
Thanks John.

?boxcox says:

*
Arguments

object

a formula or fitted model object. Currently only lm and aov objects are handled.
*
I read that as saying that

boxcox(lm(z+1 ~ 1),...)

should run without error. But it didn't. And perhaps here's why:
BoxCoxLambda <- function(z){
   b <- MASS:::boxcox.lm(lm(z+1 ~ 1), lambda = seq(-5, 5, length.out =
61), plotit = FALSE)
   b$x[which.max(b$y)]# best lambda
}

> lambdas <- apply(dd,2 , BoxCoxLambda)
Error in NextMethod() : 'NextMethod' called from an anonymous function

and, indeed, ?UseMethod says:
"NextMethod should not be called except in methods called by UseMethod
or from internal generics (see InternalGenerics). In particular it
will not work inside anonymous calling functions (e.g.,
get("print.ts")(AirPassengers))."

BUT 
BoxCoxLambda <- function(z){
  b <- MASS:::boxcox(z+1 ~ 1, lambda = seq(-5, 5, length.out = 61),
plotit = FALSE)
  b$x[which.max(b$y)]# best lambda
}

> lambdas <- apply(dd,2 , BoxCoxLambda)
> lambdas
[1] 0.167 0.167

The identical lambdas do not seem right to me; nor do I understand why
boxcox.lm apparently throws the error while boxcox.formula does not
(it also calls NextMethod()) So I would welcome clarification to clear
my clogged (cerebral) sinuses. :-)


Best,
Bert


On Sat, Jul 8, 2023 at 11:25 AM John Fox  wrote:
>
> Dear Ron and Bert,
>
> First (and without considering why one would want to do this, e.g.,
> adding a start of 1 to the data), the following works for me:
>
> -- snip --
>
>  > library(MASS)
>
>  > BoxCoxLambda <- function(z){
> +   b <- boxcox(z + 1 ~ 1,
> +   lambda = seq(-5, 5, length.out = 101),
> +   plotit = FALSE)
> +   b$x[which.max(b$y)]
> + }
>
>  > mrow <- 500
>  > mcol <- 2
>  > set.seed(12345)
>  > dd <- matrix(rgamma(mrow*mcol, shape = 2, scale = 5), nrow = mrow, ncol =
> +mcol)
>
>  > dd1 <- dd[, 1] # 1st column of dd
>  > res <- boxcox(lm(dd1 + 1 ~ 1), lambda = seq(-5, 5, length.out = 101),
> plotit
> +  = FALSE)
>  > res$x[which.max(res$y)]
> [1] 0.2
>
>  > apply(dd, 2, BoxCoxLambda, simplify = TRUE)
> [1] 0.2 0.2
>
> -- snip --
>
> One could also use the powerTransform() function in the car package,
> which in this context transforms towards *multi*normality:
>
> -- snip --
>
>  > library(car)
> Loading required package: carData
>
>  > powerTransform(dd + 1)
> Estimated transformation parameters
>        Y1        Y2
> 0.1740200 0.2089925
>
> I hope this helps,
>   John
>
> --
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> web: https://www.john-fox.ca/
>
> On 2023-07-08 12:47 p.m., Bert Gunter wrote:
> >
> > No, I'm afraid I'm wrong. Something went wrong with my R session and gave
> > me incorrect answers. After restarting, I continued to get the same error
> > as you did with my supposed "fix." So just ignore what I said and sorry for
> > the noise.
> >
> > -- Bert
> >
> > On Sat, Jul 8, 2023 at 8:28 AM Bert Gunter  wrote:
> >
> >> Try this for your function:
> >>
> >> BoxCoxLambda <- function(z){
> >> y <- z
> >> b <- boxcox(y + 1 ~ 1,lambda = seq(-5, 5, length.out = 61), plotit =
> >> FALSE)
> >> b$x[which.max(b$y)]# best lambda
> >> }
> >>
> >> ***I think*** (corrections and clarification strongly welcomed!) that `~`
> >> (the formula function) is looking for 'z' in the GlobalEnv, the caller of
> >> apply(), and not finding it. It finds 'y' here explicitly in the
> >> BoxCoxLambda environment.
> >>
> >> Cheers,
> >> Bert
> >>
> >>
> >>
> >> On Sat, Jul 8, 2023 at 4:28 AM Ron Crump via R-help 
> >> wrote:
> >>
> >>> Hi,
> >>>
> >>> Firstly, apologies as I have posted this on community.rstudio.com too.
> >>>
> >>> I want to optimise a Box-Cox transformation on columns of a matrix (ie, a
> >>> unique lambda for each column). So I wrote a function that includes the
> >>> call to MASS::boxcox in order that it can be applied to each column 
> >>> easily.
> >>> Except that I'm getting an error when calling the function. If I just
> >>> extract a column of the matrix and ru

Re: [R] Getting an error calling MASS::boxcox in a function

2023-07-08 Thread Bert Gunter
No, I'm afraid I'm wrong. Something went wrong with my R session and gave
me incorrect answers. After restarting, I continued to get the same error
as you did with my supposed "fix." So just ignore what I said and sorry for
the noise.

-- Bert

On Sat, Jul 8, 2023 at 8:28 AM Bert Gunter  wrote:

> Try this for your function:
>
> BoxCoxLambda <- function(z){
>y <- z
>b <- boxcox(y + 1 ~ 1,lambda = seq(-5, 5, length.out = 61), plotit =
> FALSE)
>b$x[which.max(b$y)]# best lambda
> }
>
> ***I think*** (corrections and clarification strongly welcomed!) that `~`
> (the formula function) is looking for 'z' in the GlobalEnv, the caller of
> apply(), and not finding it. It finds 'y' here explicitly in the
> BoxCoxLambda environment.
>
> Cheers,
> Bert
>
>
>
> On Sat, Jul 8, 2023 at 4:28 AM Ron Crump via R-help 
> wrote:
>
>> Hi,
>>
>> Firstly, apologies as I have posted this on community.rstudio.com too.
>>
>> I want to optimise a Box-Cox transformation on columns of a matrix (ie, a
>> unique lambda for each column). So I wrote a function that includes the
>> call to MASS::boxcox in order that it can be applied to each column easily.
>> Except that I'm getting an error when calling the function. If I just
>> extract a column of the matrix and run the code not in the function, it
>> works. If I call the function either with an extracted column (ie dd1 in
>> the reprex below) or in a call to apply I get an error (see the reprex
>> below).
>>
>> I'm sure I'm doing something silly, but I can't see what it is. Any help
>> appreciated.
>>
>> library(MASS)
>>
>> # Find optimised Lambda for Box-Cox transformation
>> BoxCoxLambda <- function(z){
>> b <- boxcox(lm(z+1 ~ 1), lambda = seq(-5, 5, length.out = 61), plotit
>> = FALSE)
>> b$x[which.max(b$y)]# best lambda
>> }
>>
>> mrow <- 500
>> mcol <- 2
>> set.seed(12345)
>> dd <- matrix(rgamma(mrow*mcol, shape = 2, scale = 5), nrow = mrow, ncol =
>> mcol)
>>
>> # Try it not using the BoxCoxLambda function:
>> dd1 <- dd[,1] # 1st column of dd
>> bb <- boxcox(lm(dd1+1 ~ 1), lambda = seq(-5, 5, length.out = 101), plotit
>> = FALSE)
>> print(paste0("1st column's lambda is ", bb$x[which.max(bb$y)]))
>> #> [1] "1st column's lambda is 0.2"
>>
>> # Calculate lambda for each column of dd
>> lambdas <- apply(dd, 2, BoxCoxLambda, simplify = TRUE)
>> #> Error in eval(predvars, data, env): object 'z' not found
>>
>> Created on 2023-07-08 with reprex v2.0.2
>>
>> Thanks for your time and help.
>>
>> Ron
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Getting an error calling MASS::boxcox in a function

2023-07-08 Thread Bert Gunter
Try this for your function:

BoxCoxLambda <- function(z){
   y <- z
   b <- boxcox(y + 1 ~ 1,lambda = seq(-5, 5, length.out = 61), plotit =
FALSE)
   b$x[which.max(b$y)]# best lambda
}

***I think*** (corrections and clarification strongly welcomed!) that `~`
(the formula function) is looking for 'z' in the GlobalEnv, the caller of
apply(), and not finding it. It finds 'y' here explicitly in the
BoxCoxLambda environment.
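
For background, a formula keeps a reference to the environment in which it
was created, and model.frame() looks there for any variable it cannot find
elsewhere. A quick sketch of my own (illustration only, not a fix):

f <- function() {
    y <- 1:10
    y ~ 1                 # formula created inside f()
}
form <- f()
environment(form)         # f()'s evaluation environment, not .GlobalEnv
lm(form)                  # 'y' is found through the formula's environment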

Cheers,
Bert



On Sat, Jul 8, 2023 at 4:28 AM Ron Crump via R-help 
wrote:

> Hi,
>
> Firstly, apologies as I have posted this on community.rstudio.com too.
>
> I want to optimise a Box-Cox transformation on columns of a matrix (ie, a
> unique lambda for each column). So I wrote a function that includes the
> call to MASS::boxcox in order that it can be applied to each column easily.
> Except that I'm getting an error when calling the function. If I just
> extract a column of the matrix and run the code not in the function, it
> works. If I call the function either with an extracted column (ie dd1 in
> the reprex below) or in a call to apply I get an error (see the reprex
> below).
>
> I'm sure I'm doing something silly, but I can't see what it is. Any help
> appreciated.
>
> library(MASS)
>
> # Find optimised Lambda for Box-Cox transformation
> BoxCoxLambda <- function(z){
> b <- boxcox(lm(z+1 ~ 1), lambda = seq(-5, 5, length.out = 61), plotit
> = FALSE)
> b$x[which.max(b$y)]# best lambda
> }
>
> mrow <- 500
> mcol <- 2
> set.seed(12345)
> dd <- matrix(rgamma(mrow*mcol, shape = 2, scale = 5), nrow = mrow, ncol =
> mcol)
>
> # Try it not using the BoxCoxLambda function:
> dd1 <- dd[,1] # 1st column of dd
> bb <- boxcox(lm(dd1+1 ~ 1), lambda = seq(-5, 5, length.out = 101), plotit
> = FALSE)
> print(paste0("1st column's lambda is ", bb$x[which.max(bb$y)]))
> #> [1] "1st column's lambda is 0.2"
>
> # Calculate lambda for each column of dd
> lambdas <- apply(dd, 2, BoxCoxLambda, simplify = TRUE)
> #> Error in eval(predvars, data, env): object 'z' not found
>
> Created on 2023-07-08 with reprex v2.0.2
>
> Thanks for your time and help.
>
> Ron
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

