[R] DPLYR Multiple Mutate Statements On Same DataFrame

2024-10-17 Thread Sparks, John
Hi R Helpers,

I have been looking for an example of how to execute different dplyr mutate 
statements on the same dataframe in a single step.  I show how to do what I 
want to do by going from df0 to df1 to df2 to df3 by applying a mutate 
statement to each dataframe in sequence, but I would like to know if there is a 
way to execute this in a single step; so simply go from df0 to df1 while 
executing all the transformations.   See example below.

Guidance would be appreciated.
--John J. Sparks, Ph.D.

library(dplyr)
df0<-structure(list(SeqNum = c(1L, 2L, 3L, 4L, 5L, 6L, 8L, 9L, 10L,
11L, 12L, 13L, 14L, 15L, 16L, 18L, 19L, 21L, 22L, 23L), MOSTYP = c(37L,
41L, 41L, 13L, 3L, 27L, 37L, 37L, 15L, 14L, 13L, 37L, 4L, 27L,
37L, 26L, 17L, 37L, 37L, 17L), MGEMOM = c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L),
MGODRK = c(3L, 2L, 2L, 3L, 4L, 2L, 2L, 2L, 3L, 4L, 3L, 2L,
3L, 1L, 2L, 3L, 4L, 4L, 3L, 3L), MOSHOO = c(7L, 7L, 7L, 2L,
9L, 4L, 7L, 7L, 2L, 2L, 2L, 7L, 9L, 4L, 7L, 4L, 2L, 7L, 7L,
2L), MRELGE = c(0L, 1L, 0L, 2L, 1L, 0L, 0L, 0L, 3L, 1L, 1L,
1L, 0L, 0L, 0L, 0L, 2L, 0L, 0L, 1L), MSKB2 = c(5L, 4L, 4L,
3L, 4L, 5L, 7L, 1L, 5L, 4L, 3L, 4L, 5L, 6L, 7L, 5L, 4L, 6L,
4L, 7L), MFWEKI = c(1L, 1L, 2L, 2L, 1L, 0L, 0L, 3L, 0L, 1L,
2L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 2L, 0L), MAANTH = c(3L, 4L,
4L, 4L, 4L, 5L, 2L, 6L, 2L, 4L, 4L, 4L, 4L, 2L, 2L, 4L, 3L,
3L, 3L, 2L), MHHUUR = c(2L, 2L, 4L, 2L, 2L, 3L, 0L, 3L, 2L,
2L, 2L, 3L, 1L, 6L, 0L, 2L, 2L, 0L, 2L, 2L), MSKA = c(1L,
0L, 4L, 2L, 2L, 3L, 0L, 3L, 2L, 0L, 2L, 3L, 1L, 5L, 0L, 0L,
1L, 0L, 0L, 1L), MAUT2 = c(2L, 4L, 4L, 3L, 4L, 5L, 5L, 3L,
2L, 3L, 3L, 4L, 4L, 3L, 5L, 2L, 3L, 3L, 2L, 3L), MFALLE = c(1L,
0L, 0L, 3L, 5L, 0L, 0L, 0L, 0L, 4L, 1L, 1L, 2L, 2L, 0L, 2L,
5L, 0L, 0L, 3L), MGEMLE = c(1L, 0L, 0L, 0L, 4L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 3L, 2L, 0L), MAUT1 = c(2L,
5L, 7L, 3L, 0L, 4L, 2L, 1L, 3L, 9L, 5L, 3L, 2L, 4L, 2L, 1L,
3L, 0L, 4L, 2L), MINKGE = c(2L, 4L, 2L, 2L, 0L, 2L, 2L, 1L,
3L, 0L, 1L, 4L, 2L, 2L, 2L, 5L, 1L, 0L, 3L, 1L), MOPLHO = c(1L,
0L, 0L, 0L, 0L, 2L, 2L, 1L, 2L, 0L, 0L, 1L, 0L, 0L, 2L, 0L,
0L, 0L, 0L, 0L), MGODPR = c(1L, 2L, 2L, 0L, 1L, 3L, 2L, 3L,
2L, 1L, 2L, 3L, 0L, 3L, 2L, 2L, 2L, 0L, 2L, 1L), MAUT0 = c(8L,
6L, 9L, 7L, 5L, 9L, 6L, 7L, 6L, 5L, 4L, 7L, 8L, 5L, 6L, 7L,
5L, 9L, 9L, 5L), MSKB1 = c(0L, 2L, 4L, 1L, 0L, 5L, 2L, 7L,
2L, 0L, 3L, 3L, 3L, 4L, 2L, 0L, 2L, 3L, 3L, 1L), MSKC = c(4L,
5L, 3L, 4L, 6L, 3L, 3L, 2L, 4L, 8L, 3L, 3L, 4L, 3L, 3L, 4L,
4L, 3L, 3L, 5L), PAANHA = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), PWAPAR = c(0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L), PPERSA = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), AMOTSC = c(0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L), APERSA = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), AWAPAR = c(1L,
1L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 1L,
1L, 0L, 1L, 1L), Resp = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)), row.names = c(NA,
20L), class = "data.frame")


df1<-df0 %>%
  mutate(across(starts_with('P'),~ifelse(.x==0,   0,
  ifelse(.x==1,   25,
  ifelse(.x==2,   75,
  ifelse(.x==3,  150,
  ifelse(.x==4,  350,
  ifelse(.x==5,  750,
  ifelse(.x==6, 3000,
  ifelse(.x==7, 7500,
  ifelse(.x==8,15000,
  ifelse(.x==9,3,
  -99))))))))))))

df2<-df1 %>%
mutate_at(vars(MRELGE:MSKC),~ifelse(.x==0,  0,
 ifelse(.x==1,  5,
  -99)))
df3<-df2 %>%
mutate_at(vars(MGODRK),~ifelse(.x==0,  0,
ifelse(.x==1,  5,
  -99)))
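
One way to collapse the three steps into one, sketched here under assumptions: a single mutate() call can hold several across() calls, each with its own column selection. The helper names recode_p and recode_small and the small demo data frame are illustrative and not from the original post; the recoding values are copied from the code above.

```r
library(dplyr)

# Hypothetical helper: recode 0..9 through a lookup table, any other value to -99.
# Values are copied from the nested ifelse() above (including 9 -> 3).
recode_p <- function(x) {
  lookup <- c(0, 25, 75, 150, 350, 750, 3000, 7500, 15000, 3)
  out <- lookup[match(x, 0:9)]
  out[is.na(out) & !is.na(x)] <- -99   # unmatched, non-missing values
  out
}

# Hypothetical helper: 0 -> 0, 1 -> 5, anything else -> -99
recode_small <- function(x) ifelse(x == 0, 0, ifelse(x == 1, 5, -99))

# Small stand-in for df0 (illustrative data only)
demo <- data.frame(
  PAANHA = c(0L, 2L, 9L, 12L),
  MRELGE = c(0L, 1L, 3L, 0L),
  MSKC   = c(1L, 0L, 4L, 1L),
  MGODRK = c(3L, 1L, 0L, 2L)
)

# All transformations in one mutate(): one across() per recoding rule
result <- demo %>%
  mutate(
    across(starts_with("P"), recode_p),
    across(c(MRELGE:MSKC, MGODRK), recode_small)
  )
result
```

Because the second and third steps apply the same rule, their column selections can share one across() call. With the real data, replacing demo with df0 should give the same columns as df3.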




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is there a sexy way ...?

2024-09-27 Thread Sorkin, John
"Sexy code" may get a job done and demonstrate the coder's knowledge of a 
programming language, but it often does so at the expense of code that is 
clear, easy to document (i.e. annotate what the code does), easy to read, and 
easy to understand. I fear that this is what this thread has produced: "sexy" 
but not easily understandable code. While I send kudos to all of you, remember 
that sometimes simpler, while not as sexy, can be better in the long run. ;)

John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine;
Associate Director for Biostatistics and Informatics, Baltimore VA Medical 
Center Geriatrics Research, Education, and Clinical Center;
PI Biostatistics and Informatics Core, University of Maryland School of 
Medicine Claude D. Pepper Older Americans Independence Center;
Senior Statistician University of Maryland Center for Vascular Research;

Division of Gerontology and Palliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382





From: R-help  on behalf of avi.e.gr...@gmail.com 

Sent: Friday, September 27, 2024 10:48 PM
To: 'Rolf Turner'; r-help@r-project.org
Subject: Re: [R] Is there a sexy way ...?

Rolf,

We need to be clear on what makes an answer sexy! LOL!

I decided it was sexy to do it in a way that nobody (normal) would and had
not suggested yet.

Here is an original version I will explain in a minute. Or, maybe best a bit
before. Here is the unformatted result, which is a tad hard to read but will
be made readable soon:

x <- list(`1` = c(7, 13, 1, 4, 10),
  `2` = c(2, 5,  14, 8, 11),
  `3` = c(6, 9, 15, 12, 3))

as.integer(unlist(strsplit(as.vector(paste(paste(x$`1`, x$`2`, x$`3`,
sep=","), collapse=",")), split=",")))

The result is: 7  2  6 13  5  9  1 14 15  4  8 12 10 11  3

After reading what others wrote, the following is more general one where any
number of vectors in a list can be handled:

as.integer(unlist(strsplit(as.vector(paste(do.call(paste, c(x, sep=",")),
collapse=",")), split=",")))

Perhaps a tad more readable is a version using the new pipe but for obvious
reasons, the dplyr/magrittr pipe works better for me than having to create
silly anonymous functions instead of using a period. You now have a
pipeline:

library(dplyr)

x %>%
  c(sep=",") %>%
  do.call(paste, .) %>%
  paste(collapse=",") %>%
  as.vector() %>%
  strsplit(split=",") %>%
  unlist() %>%
  as.integer()

And it returns the right answer!

- You start with x and pipe it as

- the first argument to c() and the second argument already in place is an
option to later use comma as a separator

- that is piped to a do.call() which takes that c() tuple and replaces the
second argument of period with it. You now have taken the original data and
made three text strings like so:
"7,2,6"   "13,5,9"  "1,14,15" "4,8,12"  "10,11,3"

- But you want all those strings collapsed into a single long string with
commas between the parts. Do another paste this time putting the substrings
together and collapsing with a comma. The result is:
"7,2,6,13,5,9,1,14,15,4,8,12,10,11,3"

- But that is not a vector and don't ask why!

- Now split that string at commas:
"7"  "2"  "6"  "13" "5"  "9"  "1"  "14" "15" "4"  "8"  "12" "10" "11" "3"

- and undo the odd list format it returns to flatten it back into a
character vector:
"7"  "2"  "6"  "13" "5"  "9"  "1"  "14" "15" "4"  "8"  "12" "10" "11" "3"

- Yep it looks the same but is subtly different. Time to make it into
integers or whatever:
7  2  6 13  5  9  1 14 15  4  8 12 10 11  3

Looked at after the fact, it seems so bloody obvious! And the chance of
someone else trying this approach, justifiably, is low, LOL!
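
The intermediate values listed above can be reproduced one step at a time in base R. This is only a sketch for checking the walk-through; the variable names are illustrative, and the final rbind() cross-check is an editorial assumption about a simpler route, not something taken from this message.

```r
x <- list(`1` = c(7, 13, 1, 4, 10),
          `2` = c(2, 5, 14, 8, 11),
          `3` = c(6, 9, 15, 12, 3))

# Paste the vectors element-wise: one string per position
rows <- do.call(paste, c(x, sep = ","))

# Collapse those strings into a single comma-separated string
joined <- paste(rows, collapse = ",")

# strsplit() returns a list with one character vector in it
pieces <- strsplit(joined, split = ",")

# Flatten the list, then convert the strings to integers
result <- as.integer(unlist(pieces))
result

# Cross-check: bind the vectors as rows and read the matrix column by column
interleaved <- as.integer(do.call(rbind, x))
identical(result, interleaved)
```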

One nice feature of the do.call is this can be extended like so:

x <- list(`1` = c(7, 13, 1, 4, 10),
  `2` = c(2, 5,  14, 8, 11),
  `3` = c(6, 9, 15, 12, 3),
  `4` = c( 101, 102, 103, 104, 105),
  `5` = c(-105, -104, -103, -102, -101))

Works fine and does this for the now five columns:

 [1]    7    2    6  101 -105   13    5    9  102 -104    1   14   15  103 -103    4    8   12  104 -102
[21]   10   11    3  105 -101

My apologies to all who expected a more serious post. I have been focusing
on Python lately and over there, some things are done differently albeit I
probably would be using the numpy and pandas packages to do this or even a
simple list comprehension using zip:

# Python, not R.
 [

Re: [R] (no subject)

2024-09-16 Thread John Kane
8, the values of the corresponding mean.
> > > I found this solution, where db10_means is the output dataset, db10 is
> my
> > > initial data.
> > >
> > > db10_means<-db10 %>%
> > >   group_by(groupid) %>%
> > >   mutate(across(starts_with("cp"), list(mean = mean)))
> > >
> > > It works perfectly, except that for NA values, where it replaces to all
> > > group members the NA, while in some cases, the group is made of some NA
> > and
> > > some values.
> > > So, when I have a group of two values and one NA, I would like that for
> > > those with a value, the mean is replaced, for those with NA, the NA is
> > > replaced.
> > > Here the mean function has not the na.rm=T option associated, but it
> > > appears that this solution cannot be implemented in this case. I am not
> > > even sure that this would be enough to solve my problem.
> > > Thanks for any help provided.
> > >
> > Hello,
> >
> > Your data is a mess, please don't post html, this is plain text only
> > list. Anyway, I managed to create a data frame by copying the data to a
> > file named "rhelp.txt" and then running
> >
> >
> >
> > db10 <- scan(file = "rhelp.txt", what = character())
> > header <- db10[1:4]
> > db10 <- db10[-(1:4)] |> as.numeric()
> > db10 <- matrix(db10, ncol = 4L, byrow = TRUE) |>
> >   as.data.frame() |>
> >   setNames(header)
> >
> > str(db10)
> > #> 'data.frame':25 obs. of  4 variables:
> > #>  $ cp1: num  1 5 3 7 10 5 2 4 8 10 ...
> > #>  $ cp2: num  10 2 1 4 4 5 6 4 4 15 ...
> > #>  $ role   : num  13 5 3 6 2 8 8 7 7 3 ...
> > #>  $ groupid: num  4 10 7 4 7 3 7 8 8 3 ...
> >
> >
> > And here is the data in dput format.
> >
> >
> >
> > db10 <-
> >   structure(list(
> >     cp1 = c(1, 5, 3, 7, 10, 5, 2, 4, 8, 10, 9, 2,
> >             2, 20, 9, 13, 3, 4, 4, 10, 17, 8, 3, 13, 10),
> >     cp2 = c(10, 2, 1, 4, 4, 5, 6, 4, 4, 15, 15, 10,
> >             4, 2, 11, 10, 14, 2, 4, 0, 20, 18, 4, 3, 9),
> >     role = c(13, 5, 3, 6, 2, 8, 8, 7, 7, 3, 10, 5,
> >              11, 5, 3, 13, 12, 15, 1, 3, 15, 10, 19, 5, 2),
> >     groupid = c(4, 10, 7, 4, 7, 3, 7, 8, 8, 3, 2, 5,
> >                 20, 12, 6, 4, 6, 7, 16, 7, 3, 7, 8, 20, 6)),
> >     class = "data.frame", row.names = c(NA, -25L))
> >
> >
> >
> > As for the problem, I am not sure if you want summarise instead of
> > mutate but here is a summarise solution.
> >
> >
> >
> > library(dplyr)
> >
> > db10 %>%
> >   group_by(groupid) %>%
> >   summarise(across(starts_with("cp"), ~ mean(.x, na.rm = TRUE)))
> >
> > # same result, summarise's new argument .by avoids the need to group_by
> > db10 %>%
> >   summarise(across(starts_with("cp"), ~ mean(.x, na.rm = TRUE)),
> >             .by = groupid)
> >
> >
> >
> > Can you post the expected output too?
> >
> > Hope this helps,
> >
> > Rui Barradas
> >
> >
> >
>
>
> --
>
> Francesca
>
>


-- 
John Kane
Kingston ON Canada
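
For the grouped-mean question quoted above, a minimal sketch follows. It assumes the goal is to keep mutate() (one row per original observation), give every non-missing member the group mean computed without the NAs, and leave NA where the original value was NA. The toy data frame is illustrative, not the poster's data.

```r
library(dplyr)

# Illustrative data: group 1 has two values and one NA
db <- data.frame(groupid = c(1, 1, 1, 2, 2),
                 cp1     = c(2, 4, NA, 5, 7))

db_means <- db %>%
  group_by(groupid) %>%
  mutate(across(starts_with("cp"),
                # group mean ignoring NAs, but keep NA where the value was NA
                ~ ifelse(is.na(.x), NA_real_, mean(.x, na.rm = TRUE)),
                .names = "{.col}_mean")) %>%
  ungroup()
db_means
```

The .names pattern reproduces the cp1_mean style of column name that list(mean = mean) creates in the original attempt.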



Re: [R] Linear regression and stand deviation at the Linux command line

2024-08-22 Thread Sorkin, John
Keith,
I suggest you begin by looking at this web page:

https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/lm

It will introduce you to the lm function, the function that performs linear 
regression, and the summary function, which returns some of the material you 
are looking for. The page comes complete with code that can be run via the web 
page. Once you have reviewed the web page, and hopefully tried to run the 
analysis you want to run, you can again ask the R help list for additional help.

There are other web pages that can help you, for example

https://www.statology.org/logistic-regression-in-r/#:~:text=How%20to%20Perform%20Logistic%20Regression%20in%20R%20%28Step-by-Step%29,Predictions%20...%205%20Step%205%3A%20Model%20Diagnostics%20

Take the first steps, show that you are trying and the R help list will be very 
helpful.
John
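
As a starting point for the command-line use asked about below, here is a rough sketch. It assumes a whitespace-separated file with no header, holding an index in the first column and the value in the second, as in the sample data in the question. The function names, the column names x and y, and the script name stats_cli.R are illustrative assumptions.

```r
# Hypothetical script; could be saved as stats_cli.R and run as:
#   Rscript stats_cli.R linear_regression_data.csv

regression_stats <- function(path) {
  d <- read.table(path, col.names = c("x", "y"))
  fit <- lm(y ~ x, data = d)
  list(intercept = unname(coef(fit)[1]),
       slope     = unname(coef(fit)[2]),
       r_squared = summary(fit)$r.squared)
}

spread_stats <- function(path) {
  d <- read.table(path, col.names = c("x", "y"))
  list(sd       = sd(d$y),
       variance = var(d$y),
       z_scores = (d$y - mean(d$y)) / sd(d$y))
}

# Demonstration with the regression sample from the question
demo_file <- tempfile(fileext = ".txt")
writeLines(c("1 20279", "2 899", "3 24747", "4 12564", "5 29543"), demo_file)
reg <- regression_stats(demo_file)
spr <- spread_stats(demo_file)
print(reg)
print(spr)

# When run through Rscript with a file name, report on that file as well
args <- commandArgs(trailingOnly = TRUE)
if (length(args) >= 1 && file.exists(args[1])) {
  print(regression_stats(args[1]))
  print(spread_stats(args[1]))
}
```

A comma-separated file with a header row would need read.csv() in place of read.table().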






From: R-help  on behalf of Keith Christian 

Sent: Thursday, August 22, 2024 3:07 PM
To: r-help@r-project.org
Subject: [R] Linear regression and stand deviation at the Linux command line

R List,

Please excuse this ultra-newbie post.
I looked at this page but it's a bit beyond me.
https://www2.kenyon.edu/Depts/Math/hartlaub/Math305%20Fall2011/R.htm

I'm interested in R construct(s) to be entered at the command
line that would output slope, y-intercept, and r-squared values read
from a csv or other filename entered at the command line, and the same
for standard deviation calculations, namely the standard deviation,
variance, and z-scores for every data point in the file.

E.g.
$ ((R function for linear regression here))slope, y-intercept, and
r-squared, other related stats that R seems most capable of
generating.
linear_regression_data.csv file contents (Are line numbers, commas,
etc. needed or no?)
1 20279
2 899
3 24747
4 12564
5 29543

$ ((R function for standard deviation here))standard deviation,
variance, z-scores, other related stats that R seems most capable of
generating.
standard_deviation_data.csv file contents (Are line numbers, commas,
etc. needed or no?)
1 16837
2 9498
3 31389
4 2365
5 17384

Many thanks,

--Keith



Re: [R] Manually calculating values from aov() result

2024-08-07 Thread John Fox

Dear Brian,

As Duncan mentioned, the terms type-I, II, and III sums of squares 
originated in SAS. The type-II and III SSs computed by the Anova() 
function in the car package take a different computational approach than 
in SAS, but in almost all cases produce the same results. (I slightly 
regret using the "type-*" terminology for car::Anova() because of the 
lack of exact correspondence to SAS.) The standard R anova() function 
computes type-I (sequential) SSs.


The focus, however, shouldn't be on the SSs, or how they're computed, 
but on the hypotheses that are tested. Briefly, the hypotheses for 
type-I tests assume that all terms later in the sequence are 0 in the 
population; type-II tests assume that interactions to which main effects 
are marginal (and higher-order interactions to which lower-order 
interactions are marginal) are 0. Type-III tests don't, e.g., assume 
that interactions to which a main effect are marginal are 0 in testing 
the main effect, which represents an average over levels of the 
factor(s) with which the factor in the main effect interact. The 
description of the hypotheses for type-III tests is even more complex if 
there are covariates. In my opinion, researchers are usually interested 
in the hypotheses for type-II tests.


These matters are described in detail, for example, in my applied 
regression text <https://www.john-fox.ca/AppliedRegression/index.html>.


I hope this helps,
 John
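
To make the point about sequential (type-I) sums of squares concrete, the sketch below refits the data from the original question (quoted further down) with the terms in both orders, using only base R. The interaction is left out to keep the comparison short; that simplification and the variable names are assumptions of this sketch.

```r
dat <- data.frame(
  A = c(-0.3960025, -0.3492880, -1.5893792, -1.4579074, -4.9214873,
        -0.8575018, -2.5551363, -0.9366557, -1.4307489, -0.3943704),
  B = c(2, 1, 2, 2, 1, 2, 2, 2, 2, 2),
  C = c(0, 1, 1, 1, 1, 1, 1, 0, 1, 1))

# Sequential SS depend on the order in which terms enter the model
ss_c_after_b <- anova(lm(A ~ B + C, data = dat))$"Sum Sq"[2]  # C adjusted for B
ss_c_first   <- anova(lm(A ~ C + B, data = dat))$"Sum Sq"[1]  # C entered first

# Between-group SS for C computed by hand, as in the original question
n_by_c    <- tapply(dat$A, dat$C, length)
mean_by_c <- tapply(dat$A, dat$C, mean)
by_hand   <- sum(n_by_c * (mean_by_c - mean(dat$A))^2)

c(after_B = ss_c_after_b, first = ss_c_first, by_hand = by_hand)
```

The hand calculation agrees with the fit in which C enters first; the value in the A ~ B * C table is the one for C after B.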

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/
--
On 2024-08-07 8:27 a.m., Brian Smith wrote:



Hi,

Thanks for this information. Is there any way to force R to use Type-1
SS? I think most textbooks use this only.

Thanks and regards,

On Wed, 7 Aug 2024 at 17:00, Duncan Murdoch  wrote:


On 2024-08-07 6:06 a.m., Brian Smith wrote:

Hi,

I have performed ANOVA as below

dat = data.frame(
'A' = c(-0.3960025, -0.3492880, -1.5893792, -1.4579074, -4.9214873,
-0.8575018, -2.5551363, -0.9366557, -1.4307489, -0.3943704),
'B' = c(2,1,2,2,1,2,2,2,2,2),
'C' = c(0,1,1,1,1,1,1,0,1,1))

summary(aov(A ~ B * C, dat))

However now I also tried to calculate SSE for factor C

Mean = sapply(split(dat, dat$C), function(x) mean(x$A))
N = sapply(split(dat, dat$C), function(x) dim(x)[1])

N[1] * (Mean[1] - mean(dat$A))^2 + N[2] * (Mean[2] - mean(dat$A))^2
#1.691

But in ANOVA table the sum-square for C is reported as 0.77.

Could you please help how exactly this C = 0.77 is obtained from aov()


Your design isn't balanced, so there are several ways to calculate the
SS for C.  What you have calculated looks like the "Type I SS" in SAS
notation, if I remember correctly, assuming that C enters the model
before B.  That's not what R uses; I think it is Type II SS.

For some details about this, see
https://mcfromnz.wordpress.com/2011/03/02/anova-type-ii-ss-explained/





[R] Create matrix with variable number of columns AND CREATE NAMES FOR THE COLUMNS

2024-07-01 Thread Sorkin, John
#I am trying to write code that will create a matrix with a variable number of 
columns where the #number of columns is 1+Grps
#I can do this:
NSims <- 4
Grps <- 5
DiffMeans <- matrix(nrow=NSims,ncol=1+Grps)
DiffMeans

#I have a problem when I try to name the columns of the matrix. I want the 
first column to be NSims, #and the other columns to be something like Value1, 
Value2, . . . Valuen where N=Grps

# I wrote a function to build a list of length Grps
createValuelist <- function(num_elements) {
  for (i in 1:num_elements) {
    cat("Item", i, "\n", sep = "")
  }
}
createValuelist(Grps)

# When I try to assign column names I receive an error:
#Error in dimnames(DiffMeans) <- list(NULL, c("NSim", createValuelist(Grps))) : 
# length of 'dimnames' [2] not equal to array extent
dimnames(DiffMeans) <- list(NULL,c("NSim",createValuelist(Grps)))
DiffMeans

# Thank you for your help!
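
# A possible fix, offered as a sketch: cat() only prints and returns NULL, so 
# createValuelist() contributes no names to the vector. paste0() returns a 
# character vector that can be used directly.

```r
NSims <- 4
Grps <- 5
DiffMeans <- matrix(nrow = NSims, ncol = 1 + Grps)

# "NSim" followed by Value1 ... ValueN, where N = Grps
colnames(DiffMeans) <- c("NSim", paste0("Value", seq_len(Grps)))
DiffMeans
```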







Re: [R] Regression performance when using summary() twice

2024-06-21 Thread John Fox

Dear Christian,

You're apparently using the glm.nb() function in the MASS package.

Your function is peculiar in several respects. For example, you specify 
the model formula as a character string and then convert it into a 
formula, but you could just pass the formula to the function -- the 
conversion seems unnecessary. Similarly, you compute the summary for the 
model twice rather than just saving it in a local variable in your 
function. And the form of the function output is a bit strange, but I 
suppose you have reasons for that.


The primary reason that your function is slow, however, is that the 
confidence intervals computed by confint() profile the likelihood, which 
requires refitting the model a number of times. If you're willing to use 
possibly less accurate Wald-based rather than likelihood-based 
confidence intervals, computed, e.g., by the Confint() function in the 
car package, then you could speed up the computation considerably.


Using a model fit by example(glm.nb),

library(MASS)
example(glm.nb)
microbenchmark::microbenchmark(
  Wald = car::Confint(quine.nb1, vcov.=vcov(quine.nb1),
   estimate=FALSE),
  LR = confint(quine.nb1)
)

which produces

Unit: microseconds
 expr       min       lq       mean    median       uq        max neval
 Wald   136.366   161.13   222.0872   184.541   283.72    386.466   100
   LR 87223.031 88757.09 95162.8733 95761.568 97672.23 182734.048   100


I hope this helps,
 John
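
The suggestions above could be combined roughly as follows. This is a sketch, not the poster's function: the name fit_and_report and the example formula are assumptions, and confint.default() from the stats package is used here for the Wald intervals in place of car::Confint(). The model itself is fitted once, inside glm.nb(); summary() is then called a single time and reused.

```r
library(MASS)

# Hypothetical rewrite: pass the formula directly, summarise once
fit_and_report <- function(formula, data) {
  model <- glm.nb(formula, data = data)   # the model is fitted here
  model_summary <- summary(model)         # computed once, reused below

  # Wald intervals: fast, possibly less accurate than profile likelihood
  coef_table <- cbind(model_summary$coefficients, confint.default(model))

  list(coefficients = as.data.frame(coef_table),
       text = capture.output(print(model_summary)))
}

out <- fit_and_report(Days ~ Eth + Sex, data = quine)
out$coefficients
```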
--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/
--
On 2024-06-21 10:38 a.m., c.bu...@posteo.jp wrote:


Hello,

I am not a regular R user; I come from Python, but I use R for
several special tasks.

Doing a regression analysis does cost some compute time. But I wonder
when this big, time-consuming algorithm is executed and whether it is done
twice in my special case.

It seems that calling "glm()" or similar does not execute the
time-consuming part of the regression code.
It seems it is done when calling "summary(model)".
Am I right so far?

If this is correct, I would say that in my case the regression is done
twice with the identical formula and data, which of course is
inefficient. See this code:

my_function <- function(formula_string, data) {
     formula <- as.formula(formula_string)
     model <- glm.nb(formula, data = data)

     result = cbind(summary(model)$coefficients, confint(model))
     result = as.data.frame(result)

     string_result = capture.output(summary(model))

     return(list(result, string_result))
     }

I do call summary() once to get the "$coefficients" and a second time
when capturing its output as a string.

If this really results in computing the regression twice, I ask myself if
there is an R way to make this more efficient?

Best regards,
Christian Buhtz



Re: [R] Column names of model.matrix's output with contrast.arg

2024-06-17 Thread John Fox

Dear Christophe and Ben,

Also see the car package for replacements for contr.treatment(), 
contr.sum(), and contr.helmert() -- e.g., help("contr.Sum", package="car").


These functions have been in the car package for more than two decades, 
and AFAIK, no one uses them (including myself). I didn't write a 
replacement for contr.poly() because the current coefficient labeling 
seemed reasonably transparent.


Best,
 John
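
The naming behaviour discussed in the messages below can be seen with a small base-R sketch. It reuses the outcome factor from the original example; giving the contrast matrix column names by hand is shown here as one possible workaround and is an editorial assumption, not a recommendation made in the thread.

```r
outcome <- factor(paste0("O", gl(3, 1, 9)))

# Default treatment contrasts: columns are named after the levels
X_default <- model.matrix(~ outcome)
colnames(X_default)

# contr.sum() sets no column names, so level indexes are used instead
X_sum <- model.matrix(~ outcome, contrasts.arg = list(outcome = "contr.sum"))
colnames(X_sum)

# Supplying a contrast matrix with its own column names restores level labels
cs <- contr.sum(levels(outcome))
colnames(cs) <- levels(outcome)[-nlevels(outcome)]
X_named <- model.matrix(~ outcome, contrasts.arg = list(outcome = cs))
colnames(X_named)
```

As noted further down in the thread, labelling sum contrasts with level names can mislead, because each coefficient compares a level with the average of all levels.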

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

--
On 2024-06-17 4:29 p.m., Ben Bolker wrote:



   It's sorta-kinda-obliquely-partially documented in the examples:

zapsmall(cP <- contr.poly(3)) # Linear and Quadratic

output:

             .L         .Q
[1,] -0.7071068  0.4082483
[2,]  0.0000000 -0.8164966
[3,]  0.7071068  0.4082483

FWIW the faux package provides better-named alternatives.


On 2024-06-17 4:25 p.m., Christophe Dutang wrote:

Thanks for your reply.

It might be good to document the naming convention in ?contrasts. It is 
hard to understand .L for linear, .Q for quadratic, .C for cubic and 
^n for other degrees.


For contr.sum, we could have used .Sum, .Sum…

Maybe the examples in ?model.matrix should use names in dd objects so 
that we observe when names are dropped.


Kind regards, Christophe



On 14 June 2024 at 11:45, peter dalgaard  wrote:

You're at the mercy of the various contr.XXX functions. They may or 
may not set the colnames on the matrices that they generate.


The rationale for (not) setting them is not perfectly transparent, 
but you obviously cannot use level names on contr.poly, so it uses 
.L, .Q, etc.


In MASS, contr.sdif is careful about labeling the columns with the 
levels that are being diff'ed.


For contr.treatment, there is a straightforward connection to 0/1 
dummy variables, so level names there are natural.


One could use levels in contr.sum and contr.helmert, but it might 
confuse users that comparisons are with the average of all levels or 
preceding levels. (It can be quite confusing when coding is +1 for 
male and -1 for female, so that the gender difference is twice the 
coefficient.)


-pd


On 14 Jun 2024, at 08:12 , Christophe Dutang  wrote:

Dear list,

Changing the default contrasts used in glm() made me aware of how 
model.matrix() sets column names.


With default contrasts, model.matrix() uses the level values to name 
the columns. However, with other contrasts, model.matrix() uses the 
level indexes. I don't see anything in the documentation related to 
this. It does not seem natural to have such behavior.


Any comment is welcome.

An example is below.

Kind regards, Christophe


#example from ?glm
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- paste0("O", gl(3,1,9))
treatment <- paste0("T", gl(3,3))

X3 <- model.matrix(counts ~ outcome + treatment)
X4 <- model.matrix(counts ~ outcome + treatment, contrasts = 
list("outcome"="contr.sum"))
X5 <- model.matrix(counts ~ outcome + treatment, contrasts = 
list("outcome"="contr.helmert"))


#check with original factor
cbind.data.frame(X3, outcome)
cbind.data.frame(X4, outcome)
cbind.data.frame(X5, outcome)

#same issue with glm
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
glm.D94 <- glm(counts ~ outcome + treatment, family = poisson(), 
contrasts = list("outcome"="contr.sum"))
glm.D95 <- glm(counts ~ outcome + treatment, family = poisson(), 
contrasts = list("outcome"="contr.helmert"))


coef(glm.D93)
coef(glm.D94)
coef(glm.D95)

#check linear predictor
cbind(X3 %*% coef(glm.D93), predict(glm.D93))
cbind(X4 %*% coef(glm.D94), predict(glm.D94))

-
Christophe DUTANG
LJK, Ensimag, Grenoble INP, UGA, France
ILB research fellow
Web: http://dutangc.free.fr



--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com





--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathe

[R] Can't compute row means of two columns of a dataframe.

2024-06-08 Thread Sorkin, John
I have a data frame with three columns, TotalInches, Low20, High20. For each 
row of the dataset, I am trying to compute the mean of Low20 and High20. 

xxxz <- structure(list(
  TotalInches = c(58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70,
                  71, 72, 73, 74, 75, 76),
  Low20 = c(84, 87, 90, 93, 96, 99, 102, 106, 109, 112, 116, 119, 122,
            126, 129, 133, 137, 141, 144),
  High20 = c(111, 115, 119, 123, 127, 131, 135, 140, 144, 148, 153, 157,
             162, 167, 171, 176, 181, 186, 191)),
  class = "data.frame", row.names = c(NA, -19L))
xxxz
str(xxxz)
xxxz$Average20 <- by(xxxz[,c("Low20","High20")],xxxz[,"TotalInches"],mean)
warnings()

When I run the code above, I don't get the means by row. I get the following 
warning messages, one for each row of the dataframe.

Warning messages:
1: In mean.default(data[x, , drop = FALSE], ...) :
  argument is not numeric or logical: returning NA
2: In mean.default(data[x, , drop = FALSE], ...) :
  argument is not numeric or logical: returning NA

Can someone tell me what I am doing wrong, and how I can compute the row means?

Thank you,
John
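
A possible way forward, sketched with a shortened copy of the data: by() hands each one-row subset to mean() as a data frame, and mean() does not average across the columns of a data frame, which is where the "argument is not numeric or logical" warning comes from. rowMeans() works on the two columns directly.

```r
# First three rows of the data above, for illustration
xxxz <- data.frame(TotalInches = c(58, 59, 60),
                   Low20       = c(84, 87, 90),
                   High20      = c(111, 115, 119))

# Mean of Low20 and High20 for each row
xxxz$Average20 <- rowMeans(xxxz[, c("Low20", "High20")])
xxxz
```

With only two columns, (xxxz$Low20 + xxxz$High20) / 2 gives the same values.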






Re: [R] Listing folders on One Drive

2024-05-20 Thread John Fox

Dear Nick,

See list.dirs(), which is documented in the same help file as list.files().

I hope this helps,
 John
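
A small sketch of how list.dirs() might be used for the cross-check described below. The two temporary directories stand in for the OneDrive and laptop copies of the "Scotland" folder; on a real machine the locally synced OneDrive path would be used instead, and that path is not known here.

```r
# Stand-ins for the two "Scotland" folders (illustrative only)
onedrive_root <- tempfile("Scotland_onedrive_")
laptop_root   <- tempfile("Scotland_laptop_")
for (river in c("Tay", "Spey", "Dee")) {
  dir.create(file.path(onedrive_root, river), recursive = TRUE)
}
for (river in c("Tay", "Spey")) {
  dir.create(file.path(laptop_root, river), recursive = TRUE)
}

# Immediate subfolders only, names without the leading path
onedrive_folders <- list.dirs(onedrive_root, full.names = FALSE, recursive = FALSE)
laptop_folders   <- list.dirs(laptop_root, full.names = FALSE, recursive = FALSE)

# Folders present in one place but not the other
missing_on_laptop   <- setdiff(onedrive_folders, laptop_folders)
missing_on_onedrive <- setdiff(laptop_folders, onedrive_folders)
missing_on_laptop
missing_on_onedrive
```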

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/
--
On 2024-05-20 9:36 a.m., Nick Wray wrote:



Hello I have lots of folders of individual Scottish river catchments on my
uni One Drive.  Each folder is labelled with the river name eg "Tay" and
they are all in a folder named "Scotland"
I want to list the folders on One Drive so that I can cross check that I
have them all against a list of folders on my laptop.
Can I somehow use list.files() - I've tried various things but none seem to
work...
Any help appreciated
Thanks Nick Wray



[R] Print date on y axis with month, day, and year

2024-05-09 Thread Sorkin, John
I am trying to use ggplot to plot the data below, using the R code that follows.
The dates (jdate) are printing as Mar 01, Mar 15, etc. I want to have the date
printed as MMM DD YYYY (or any other way that will show month, day, and year,
e.g. mm/dd/yy). How can I accomplish this?

library(ggplot2)

yyy <- structure(list(
  jdate = structure(c(19052, 19053, 19054, 19055, 19058, 19059, 19060,
                      19061, 19062, 19063, 19065, 19066, 19067, 19068,
                      19069, 19072, 19073, 19074, 19075, 19076, 19077,
                      19083, 19086, 19087, 19088, 19089, 19090, 19093,
                      19094, 19095), class = "Date"),
  Sum = c(1, 3, 9, 11, 13, 16, 18, 22, 26, 27, 30, 32, 35, 39, 41,
          43, 48, 51, 56, 58, 59, 63, 73, 79, 81, 88, 91, 93, 96, 103)),
  row.names = c(NA, 30L), class = "data.frame")
yyy
class(yyy$jdate)
ggplot(data = yyy[1:30, ], aes(as.Date(jdate, format = "%m-%d-%Y"), Sum)) + geom_point()
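One way to get the year onto the axis labels is ggplot2's date scale, which accepts strftime()-style format codes through its date_labels argument. This is a sketch on a shortened copy of the data, not a tested answer to the full problem. In the code above jdate is mapped to the x axis, so scale_x_date() is used here; scale_y_date() would be the counterpart if the dates were on the y axis.

```r
library(ggplot2)

# A shortened version of the poster's data; jdate is already of class "Date",
# so no as.Date() conversion is needed.
yyy <- data.frame(
  jdate = structure(c(19052, 19053, 19054, 19055, 19058, 19059), class = "Date"),
  Sum   = c(1, 3, 9, 11, 13, 16)
)

# %b = abbreviated month name, %d = day of month, %Y = four-digit year.
p <- ggplot(yyy, aes(jdate, Sum)) +
  geom_point() +
  scale_x_date(date_labels = "%b %d %Y")   # use "%m/%d/%y" for mm/dd/yy
p
```

The same format codes can be tried out on a single date with format(), e.g. format(yyy$jdate[1], "%m/%d/%y"), before putting them on the plot.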


Thank you 
John





Re: [R] x[0]: Can '0' be made an allowed index in R?

2024-04-23 Thread John Fox

Hello Peter,

Unless I too misunderstand your point, negative indices for removal do 
work with the Oarray package (though -0 doesn't work to remove the 0th 
element, since -0 == 0 -- perhaps what you meant):


> library(Oarray)

> v <- Oarray(1:10, offset=0)

> v
[0,] [1,] [2,] [3,] [4,] [5,] [6,] [7,] [8,] [9,]
   1    2    3    4    5    6    7    8    9   10

> dim(v)
[1] 10

> v[-1]
[1]  1  3  4  5  6  7  8  9 10

> v[-0]
[1] 1

Best,
 John

On 2024-04-23 9:03 a.m., Peter Dalgaard via R-help wrote:


Doesn't sound like you got the point. x[-1] normally removes the first element. 
With 0-based indices, this cannot work.
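The point can be seen in base R with ordinary 1-based vectors. In a 0-based scheme, "drop element 0" would have to be written x[-0], but -0 and 0 are the same number, so that request cannot be told apart from x[0]. A small illustration (the vector x is made up):

```r
x <- 11:20

first_dropped  <- x[-1]   # negative index: drop the first element
zero_index     <- x[0]    # index 0 selects nothing in base R
neg_zero_index <- x[-0]   # same as x[0], because -0 == 0

first_dropped
zero_index
identical(zero_index, neg_zero_index)
```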

- pd


On 22 Apr 2024, at 17:31 , Ebert,Timothy Aaron  wrote:

You could have negative indices. There are two ways to do this.
1) provide a large offset.
Offset <- 30
for (i in -29:120) { print(df[i + Offset]) }


2) use absolute values if all indices are negative.
for (i in -200:-1) { print(df[abs(i)]) }

Tim



-Original Message-
From: R-help  On Behalf Of Peter Dalgaard via 
R-help
Sent: Monday, April 22, 2024 10:36 AM
To: Rolf Turner 
Cc: R help project ; Hans W 
Subject: Re: [R] x[0]: Can '0' be made an allowed index in R?

Heh. Did anyone bring up negative indices yet?

-pd


On 22 Apr 2024, at 10:46 , Rolf Turner  wrote:


See fortunes::fortune(36).

cheers,

Rolf Turner

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Stats. Dep't. (secretaries) phone:
+64-9-373-7599 ext. 89622
Home phone: +64-9-480-4619



--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 
Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] Question regarding reservoir volume and water level

2024-04-07 Thread Sorkin, John
Aside from the fact that the original question might well be a class exercise
(or homework), the question is unanswerable with the data the original poster
provided. One needs to know the dimensions of the reservoir, above and
below the current waterline. Are the sides, above and below the waterline,
smooth? Is the region currently above the waterline that can store water a
mirror image of the region below the waterline? Does the region above the
reservoir include a floodplain? Will the additional water go into the
floodplain?

The lack of required detail in the question posed by the original poster
suggests that there are strong assumptions, assumptions that typically would be
made in a classroom example or exercise.
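If the DEM itself is available, the geometry can be read from it rather than assumed. The following is only a sketch under stated assumptions: the elevation matrix `elev` and the 100 m cell size are invented stand-ins for the poster's DEM, and the helper names `volume_at()` and `level_for()` are hypothetical. The stored volume at a given water level is the summed depth of water over every cell lying below that level, and the level for a target volume is found numerically with uniroot().

```r
# Made-up bed elevations (m) standing in for the DEM; range matches the post.
elev      <- matrix(seq(1230, 1267, length.out = 400), nrow = 20)
cell_area <- 100 * 100   # assumed cell size: 100 m x 100 m

# Stored volume (m^3) when the water surface sits at 'level'.
volume_at <- function(level, elev, cell_area) {
  sum(pmax(level - elev, 0)) * cell_area
}

# Water level (m) that yields a target volume, found by root finding.
level_for <- function(target, elev, cell_area) {
  uniroot(function(l) volume_at(l, elev, cell_area) - target,
          interval = range(elev), tol = 1e-8)$root
}

v_now    <- volume_at(1240, elev, cell_area)       # volume at the current level
v_1250   <- volume_at(1250, elev, cell_area)       # volume if the level rises to 1250 m
l_double <- level_for(2 * v_now, elev, cell_area)  # level at which the volume doubles
```

Because the wetted area grows as the level rises, the level at which the volume doubles depends on the shape of the basin, which is why the DEM (or some stated assumption about it) is needed.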

John






From: R-help  on behalf of Rui Barradas 

Sent: Sunday, April 7, 2024 10:53 AM
To: javad bayat; R-help
Subject: Re: [R] Question regarding reservoir volume and water level

At 13:27 on 07/04/2024, javad bayat wrote:
> Dear all;
> I have a question about the water level of a reservoir when the volume
> changes or doubles.
> There is a DEM file with the highest elevation 1267 m. The lowest elevation
> is 1230 m. The current volume of the reservoir is 7,000,000 m3 at 1240 m.
> Now I want to know what the volume would be if the water level rises to
> 1250 m, or what the water level would be if the volume doubled (14,000,000
> m3).
>
> Is there any way to write codes to do this in R?
> I would be more than happy if anyone could help me.
> Sincerely
>
Hello,

This is a simple rule of three.
If you know the level l the argument doesn't need to be named but if you
know the volume v then it must be named.


water_level <- function(l, v, level = 1240, volume = 7e6) {
  if (missing(v)) {
    volume * l / level
  } else {
    level * v / volume
  }
}

lev <- 1250
vol <- 14e6

water_level(l = lev)
#> [1] 7056452
water_level(v = vol)
#> [1] 2480


Hope this helps,

Rui Barradas




Re: [R] Rtools and things dependent on it

2024-02-23 Thread Sorkin, John
David,

I greatly appreciate the explanation you gave regarding Rtools providing tools
available in Linux distros, but not found in Windows. (I am using a Windows
system). Does this mean that Linux users don't need to use Rtools when they
want to compile R code?

Additionally, thank you for the information about what I should read. I will
look at the material again, and hopefully the material you suggest I
read will be more understandable.

John

P.S. This email should be in txt format, not html. I sent it from my desktop
Windows machine, which provides more options than does my iPhone.








From: David Winsemius 
Sent: Friday, February 23, 2024 8:14 PM
To: Sorkin, John
Cc: avi.e.gr...@gmail.com; r-help@r-project.org
Subject: Re: [R] Rtools and things dependent on it


On 2/23/24 16:28, Sorkin, John wrote:
David,
My apologies regarding the format of my email. I am replying using my iPhone, 
and I can’t find a way to switch from what I suspect is html to txt format.
The link you sent told me that R tools allows compilation of code.


It's specifically designed to provide the code tools missing in Windows that
would otherwise have been provided by a typical Linux distro. More
expansively, it allows compilation of code written in C and/or Fortran using
the compiler version that was used to build the matching R version and allows
it to be called by the routines written in R that bind a package together.
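One way to see whether those build tools are visible to an R session is to look for them on the PATH. This is a minimal sketch using only base R; the three tool names are illustrative, picked because Rtools supplies a make, a C compiler and a Fortran compiler. An empty string in the result means the tool was not found.

```r
# Full paths of the build tools as seen by this R session ("" = not found).
tools <- Sys.which(c("make", "gcc", "gfortran"))
tools

# TRUE if 'make' is visible from R.
has_make <- nzchar(tools[["make"]])
has_make

# On Windows or macOS, requesting a binary avoids compilation altogether.
# Not run here because it needs network access:
# install.packages("janitor", type = "binary")
```

If the tools are installed but the result is empty, the PATH seen by R is the first thing to check.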

This is good to know, but beyond this important fact, the rest of the material 
was close to unintelligible.

The phrase "the rest of the material" is not specific enough to offer more 
explanation. You should quote material that is beyond your understanding. You 
should only be reading the sections named: "Installing Rtools43" and "Building 
packages from source using Rtools43". I doubt that material further on would be 
relevant.

--

David

I doubt this is the fault of the author; it is probably because I lack some
basic knowledge. Can you suggest some more basic material I can read? Please
note: I am not computer naive, I am simply missing basic knowledge of the
material discussed in the web page.
Thank you,
John
John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric 
Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to 
faxing)

On Feb 23, 2024, at 7:01 PM, David Winsemius <dwinsem...@comcast.net> wrote:


On 2/23/24 14:34, avi.e.gr...@gmail.com wrote:
This may be a dumb question and the answer may make me feel dumber.

I have had trouble for years with R packages wanting Rtools on my machine
and not being able to use it. Many packages are fine as binaries are
available. I have loaded Rtools and probably need to change my PATH or
something.


I suppose making sure that whatever directory holds your Rtools code is
on your path would be a good idea. I wondered if there's an environment
variable that could be set, but reading the page on using Rtools did not
mention one until I got down to the section on building R from source
which is surely NOT what you want to do. You should read the
information on installation and building packages from source at
https://cran.r-project.org/bin/windows/base/howto-R-devel.html, which
includes this sentence:

"It is recommended to use the defaults and install into c:/rtools43.
When done that way, Rtools43 may be used in the same R session which
installed it or which was started before Rtools43 was installed."


But I recently suggested to someone that they might want to use the tabyl()
function in the janitor package that I find helpful. I get a warning when I
install it about Rtools but it works fine. When they install it, it fails. I
assumed they would get it from CRAN the same way I did as we are both using
Windows and from within RSTUDIO.

In the past, I have run into other packages I could not use and just moved
on but it seems like time to see if this global problem has a work-around.

And, in particular, I have the latest versions of 

Re: [R] Rtools and things dependent on it

2024-02-23 Thread Sorkin, John
Avi ,
Your question is not dumb. Let me ask a more fundamental question: what is
Rtools, what does it do, and how is it used? From time to time, I receive a
message when I download a package saying I need Rtools. When I receive the
message, I don't know what I should do, other than download Rtools.
John
John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric 
Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to 
faxing)

On Feb 23, 2024, at 5:34 PM, avi.e.gr...@gmail.com wrote:

This may be a dumb question and the answer may make me feel dumber.

I have had trouble for years with R packages wanting Rtools on my machine
and not being able to use it. Many packages are fine as binaries are
available. I have loaded Rtools and probably need to change my PATH or
something.

But I recently suggested to someone that they might want to use the tabyl()
function in the janitor package that I find helpful. I get a warning when I
install it about Rtools but it works fine. When they install it, it fails. I
assumed they would get it from CRAN the same way I did as we are both using
Windows and from within RSTUDIO.

In the past, I have run into other packages I could not use and just moved
on but it seems like time to see if this global problem has a work-around.

And, in particular, I have the latest versions of both R and RSTUDIO which
can be a problem when other things are not as up-to-date.

Or, maybe some people with R packages could be convinced to make binaries
available in the first place?

Avi




Re: [R] ggarrange & legend

2024-02-05 Thread John Kane
Blast it, I hit send by accident. Anyway, the code above is a MWE.

I don't see any obvious way to move the legend.
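One possible workaround, offered as a sketch rather than a confirmed answer: ggplot2's theme() has a legend.justification setting that shifts a top legend to the right, so the legend can be kept on the right-hand panel only and the two panels combined without common.legend. The plots below are built from ToothGrowth as stand-ins for the poster's p1 and p2, which were never posted. How ggarrange() lays out the result has not been checked against the original plots.

```r
library(ggplot2)
library(ggpubr)

data("ToothGrowth")
df <- ToothGrowth
df$dose <- as.factor(df$dose)

# Both panels map dose to fill, so a single legend can serve for both.
p1 <- ggboxplot(df, x = "dose", y = "len", fill = "dose", palette = "jco") +
  theme(legend.position = "none")            # no legend on the left panel
p2 <- ggdensity(df, x = "len", fill = "dose", palette = "jco") +
  theme(legend.position = "top",
        legend.justification = "right")      # top legend, pushed to the right

fig <- ggarrange(p1, p2, ncol = 2, align = "h", labels = c("(A)", "(B)"))
fig
```

The align = "h" argument asks ggarrange() to line the panels up horizontally, since only one of them carries a legend.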


On Mon, 5 Feb 2024 at 09:13, John Kane  wrote:

> I'm sorry but that is not a working example.
>
> A working example needs to create the plots being used.
>
> For example, stealing some code from
> https://rpkgs.datanovia.com/ggpubr/reference/ggarrange.html
> #=
>
> library(ggpubr)
> data("ToothGrowth")
> df <- ToothGrowth
> df$dose <- as.factor(df$dose)
> # Box plot
> bxp <- ggboxplot(df, x = "dose", y = "len", color = "dose", palette = "jco")
> # Density plot
> dens <- ggdensity(df, x = "len", fill = "dose", palette = "jco")
>
> mylist<-list(bxp, dens)
>
> dev.new(width=28, height=18)
>
> fig1<- ggarrange(plotlist=mylist, common.legend = TRUE, legend="top", labels 
> = c("(A)", "(B)"), font.label = list(size = 18, color = "black"), ncol=2)
>
> fig1
> #=
>
>
> On Mon, 5 Feb 2024 at 08:44,  wrote:
>
>> Dear John Kane
>>
>> Dear R community
>>
>>
>>
>> Here my working example
>>
>>1. Example that is working with legend=”top”. However, as mentioned,
>>the legend is in the middle of the top axis.
>>
>> mylist<-list(p1, p2)
>>
>> dev.new(width=28, height=18)
>>
>> fig1<- ggarrange(plotlist=mylist, common.legend = TRUE, legend="top",
>> labels = c("(A)", "(B)"), font.label = list(size = 18, color = "black"),
>> ncol=2)
>>
>> fig1
>>
>>
>>
>>1. My question is how I can position the legend on the topright of
>>the top axis. However, “topright” is not a common label for legend in
>>ggarrange (but in other plot functions), so legend =”topright” is not
>>working.
>>
>> mylist<-list(p1, p2)
>>
>> dev.new(width=28, height=18)
>>
>> fig1<- ggarrange(plotlist=mylist, common.legend = TRUE,
>> legend="topright", labels = c("(A)", "(B)"), font.label = list(size = 18,
>> color = "black"), ncol=2)
>>
>> fig1
>>
>>
>>
>> Kind regards
>>
>> Sibylle
>>
>>
>>
>> *From:* John Kane 
>> *Sent:* Monday, February 5, 2024 1:59 PM
>> *To:* sibylle.stoec...@gmx.ch
>> *Cc:* r-help@r-project.org
>> *Subject:* Re: [R] ggarrange & legend
>>
>>
>>
>> Could you supply us with a MWE (minimal working example)of what you have
>> so far?
>>
>> Thanks.
>>
>>
>>
>> On Mon, 5 Feb 2024 at 05:00, SIBYLLE STÖCKLI via R-help <
>> r-help@r-project.org> wrote:
>>
>> Dear R community
>>
>> It is possible to adjust the legend in combined ggplots using ggarrange
>> with
>> be positions top, bottom, left and right.
>> My question: Is there a function to change the position of the legend to
>> topright or bottomleft? Right and top etc are in the middle of the axis.
>>
>> Kind regards
>> Sibylle
>>
>>
>>
>>
>> --
>>
>> John Kane
>> Kingston ON Canada
>>
>
>
> --
> John Kane
> Kingston ON Canada
>


-- 
John Kane
Kingston ON Canada



Re: [R] ggarrange & legend

2024-02-05 Thread John Kane
I'm sorry but that is not a working example.

A working example needs to create the plots being used.

For example, stealing some code from
https://rpkgs.datanovia.com/ggpubr/reference/ggarrange.html
#=

library(ggpubr)
data("ToothGrowth")
df <- ToothGrowth
df$dose <- as.factor(df$dose)
# Box plot
bxp <- ggboxplot(df, x = "dose", y = "len", color = "dose", palette = "jco")
# Density plot
dens <- ggdensity(df, x = "len", fill = "dose", palette = "jco")

mylist<-list(bxp, dens)

dev.new(width=28, height=18)

fig1<- ggarrange(plotlist=mylist, common.legend = TRUE, legend="top",
labels = c("(A)", "(B)"), font.label = list(size = 18, color =
"black"), ncol=2)

fig1
#=


On Mon, 5 Feb 2024 at 08:44,  wrote:

> Dear John Kane
>
> Dear R community
>
>
>
> Here my working example
>
>1. Example that is working with legend=”top”. However, as mentioned,
>the legend is in the middle of the top axis.
>
> mylist<-list(p1, p2)
>
> dev.new(width=28, height=18)
>
> fig1<- ggarrange(plotlist=mylist, common.legend = TRUE, legend="top",
> labels = c("(A)", "(B)"), font.label = list(size = 18, color = "black"),
> ncol=2)
>
> fig1
>
>
>
>1. My question is how I can position the legend on the topright of the
>top axis. However, “topright” is not a common label for legend in ggarrange
>(but in other plot functions), so legend =”topright” is not working.
>
> mylist<-list(p1, p2)
>
> dev.new(width=28, height=18)
>
> fig1<- ggarrange(plotlist=mylist, common.legend = TRUE, legend="topright",
> labels = c("(A)", "(B)"), font.label = list(size = 18, color = "black"),
> ncol=2)
>
> fig1
>
>
>
> Kind regards
>
> Sibylle
>
>
>
> *From:* John Kane 
> *Sent:* Monday, February 5, 2024 1:59 PM
> *To:* sibylle.stoec...@gmx.ch
> *Cc:* r-help@r-project.org
> *Subject:* Re: [R] ggarrange & legend
>
>
>
> Could you supply us with a MWE (minimal working example)of what you have
> so far?
>
> Thanks.
>
>
>
> On Mon, 5 Feb 2024 at 05:00, SIBYLLE STÖCKLI via R-help <
> r-help@r-project.org> wrote:
>
> Dear R community
>
> It is possible to adjust the legend in combined ggplots using ggarrange
> with
> be positions top, bottom, left and right.
> My question: Is there a function to change the position of the legend to
> topright or bottomleft? Right and top etc are in the middle of the axis.
>
> Kind regards
> Sibylle
>
>
>
>
> --
>
> John Kane
> Kingston ON Canada
>


-- 
John Kane
Kingston ON Canada



Re: [R] ggarrange & legend

2024-02-05 Thread John Kane
Could you supply us with a MWE (minimal working example)of what you have so
far?
Thanks.

On Mon, 5 Feb 2024 at 05:00, SIBYLLE STÖCKLI via R-help <
r-help@r-project.org> wrote:

> Dear R community
>
> It is possible to adjust the legend in combined ggplots using ggarrange
> with
> be positions top, bottom, left and right.
> My question: Is there a function to change the position of the legend to
> topright or bottomleft? Right and top etc are in the middle of the axis.
>
> Kind regards
> Sibylle
>
>


-- 
John Kane
Kingston ON Canada



Re: [R] DrDimont package in R

2024-02-05 Thread John Kane
Nothing got through.  Try plain text rather than HTML.

On Mon, 5 Feb 2024 at 06:04, Anas Jamshed  wrote:

>
>
> [[alternative HTML version deleted]]
>
>


-- 
John Kane
Kingston ON Canada



Re: [R] Use of geometric mean .. in good data analysis

2024-01-23 Thread John via R-help
I've advised people consulting me that if their data is loaded with 
zeros, while they are absolutely certain that something should be where 
the zeros are, then they either need a better measuring tool, or to 
carefully document the results of limits on detectability and then note 
what fraction of the data is really below instrument limits.  It's 
important information as it stands, but they don't want to go writing 
fairy tales based on things not seen.

On 1/22/24 12:57, Jeff Newmiller via R-help wrote:

> Still OT... but here is my own (I think previously mentioned here) rant on
> people thrashing about with log transformation and an all-too-common kludge
> to deal with zeros mixed among small numbers:
> https://gist.github.com/jdnewmil/99301a88de702ad2fcbaef33326b08b4
>
> OP perhaps posting a link here to your question posed wherever you end up 
> with it will help shorten this thread.
>
> On January 22, 2024 12:23:20 PM PST, Bert Gunter  
> wrote:
>> Ah LOD's, typically LLOD's ("lower limits of detection").
>>
>> Disclaimer: I am *NOT* in any sense an expert on such matters. What follows
>> are just some comments based on my personal experience. Please filter
>> accordingly. Also, while I kept it on list as Martin suggested it might be
>> useful to do so, most folks probably can safely ignore the rant that
>> follows as off topic and not of interest. So you've been warned!!
>>
>> The rant:
>> My experience is: data that contain a "bunch" of values that are, e.g.
>> below a LLOD, are frequently reported and/or analyzed by various ad hoc,
>> and imho, uniformly bad methods. e.g.:
>>
>> 1) The censored values are recorded and analyzed as at the LLOD;
>> 2) The censored values are recorded and analyzed at some arbitrary value
>> below the LLOD, like LLOD/2;
>> 3) The censored values are "imputed" by ad hoc methods, e.g. uniform
>> random values between 0 and the LLOD for left censoring.
>>
>> To repeat, *IMO*, all of this is junk and will produce misleading
>> statistical results. Whether they mislead enough to substantively affect
>> the science or regulatory decisions depends on the specifics of the
>> circumstances. I accept no general claim as to their innocuousness.
>>
>> Further:
>>
>> a) When you have a "lot" of values -- 50%? 75%?, 25%? -- face facts: you
>> have (practically) no useful information from the values that you do have
>> to infer what the distribution of values that you don't have looks like.
>> All one can sensibly do is say that x% of the values are below a LOD and
>> here's the distribution of what lies above. Presumably, if you have such
>> data conditional on covariates with the obvious intent to determine the
>> relationship to those covariates, you could analyze the percentages of
>> LLOD's and known values separately. There are undoubtedly more
>> sophisticated methods out there, so this is where you need to go to the
>> literature to see what might suit; though I think it will still have to
>> come down to looking at these separately (e.g. with extra parameters to
>> account for unmeasurable values). Another way of saying this is: any
>> analysis which treats all the data as arising from a single distribution
>> will depend more on the assumptions you make than on the data. So good luck
>> with that!
>>
>> b) If you have a "modest" amount of (known) censoring -- 5%?, 20%? 10%? --
>> methods for the analysis of censored data should be useful. My
>> understanding is that MI (multiple imputation) is regarded as a generally
>> useful approach, and there are many R packages that can do various flavors
>> of this. Again, you should consult the literature: there are very likely
>> nontechnical reviews of this topic, too, as well as online discussions and
>> tutorials.
>>
>> So if you are serious about dealing with this and have a lot of data with
>> these issues, my advice would be to stop looking for ad hoc advice and dig
>> into the literature: it's one of the many areas of "data science" where
>> seemingly simple but pervasive questions require complex answers.
>>
>> And, again, heed my personal caveats.
>>
>> Thus endeth my rant.
>>
>> Cheers to all,
>> Bert
>>
>>
>>
>> On Mon, Jan 22, 2024 at 9:29 AM Rich Shepard
>> wrote:
>>
>>> On Mon, 22 Jan 2024, Martin Maechler wrote:
>>>
>>>> I think it is a good question, not really only about geo-chemistry, but
>>>> about statistics in applied sci

Re: [R] Use of geometric mean .. in good data analysis

2024-01-22 Thread John Fox

Dear Martin,

Helpful general advice, although it's perhaps worth mentioning that the 
geometric mean, defined e.g. naively as prod(x)^(1/length(x)), is 
necessarily 0 if there are any 0 values in x. That is, the geometric 
mean "works" in this case but isn't really informative.
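A short illustration of this point, using the naive definition quoted above on made-up numbers (the helper name `geo_mean_naive` is hypothetical):

```r
# Naive geometric mean, exactly as defined in the text.
geo_mean_naive <- function(x) prod(x)^(1 / length(x))

x_pos  <- c(1, 10, 100)
x_zero <- c(0, 10, 100)
x_big  <- c(0, 10, 1e6)

gm_pos  <- geo_mean_naive(x_pos)    # cube root of 1000
gm_zero <- geo_mean_naive(x_zero)   # a single zero makes the product zero
gm_big  <- geo_mean_naive(x_big)    # still zero, whatever the other values are

c(gm_pos, gm_zero, gm_big)
```

The last two results are identical, which is the sense in which the value is computable but uninformative once a zero is present.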


Best,
 John
--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2024-01-22 12:18 p.m., Martin Maechler wrote:



Rich Shepard
 on Mon, 22 Jan 2024 07:45:31 -0800 (PST) writes:


 > A statistical question, not specific to R.  I'm asking for
 > a pointer for a source of definitive descriptions of what
 > types of data are best summarized by the arithmetic,
 > geometric, and harmonic means.

In spite of being off-topic:

I think it is a good question, not really only about
geo-chemistry, but about statistics in applied sciences (and
engineering for that matter).

Something I am sure good applied statisticians of the 1980s and
1990s would all know the answer to:

To use the geometric mean instead of the arithmetic mean
is basically *equivalent* to first log-transforming the data
and then working with the transformed data:
not just for computing averages, but for more relevant modelling,
inference, etc.

John W Tukey (and several other of the grands of the time)
had the log transform among the  "First aid transformations":

If the data for a continuous variable must all be positive it is
also typically the case that the distribution is considerably
skewed to the right.
In such a case behave as a good human who sees another human in
health distress: apply First Aid -- do the things you learned to
do quickly without too much thought, because things must happen
fast ---to hopefully save the other's life.

Here: log-transform all such variables without further ado,
and only afterwards start your (exploratory and further) data analysis.

Now,  mean(log(y)) = log(geometricmean(y)),
where mean() is the arithmetic mean as in R
{mathematically; on the computer you need all.equal(), not '==' !!}
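
A minimal check of this identity in R, as a sketch only: geometric_mean is a hypothetical helper defined here (base R has no such function), and the simulated log-normal data are purely illustrative.

```r
# Hypothetical helper: geometric mean via the product form
geometric_mean <- function(y) prod(y)^(1 / length(y))

set.seed(42)
y <- rlnorm(50, meanlog = 1, sdlog = 0.8)  # positive, right-skewed data

lhs <- mean(log(y))             # arithmetic mean on the log scale
rhs <- log(geometric_mean(y))   # log of the geometric mean

# Compare with a tolerance rather than exact equality, because the two
# sides are computed along different floating-point paths
same <- isTRUE(all.equal(lhs, rhs))
print(same)
```

For long vectors the product form can overflow or underflow, which is a practical reason to prefer exp(mean(log(y))).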

I.e., according to Tukey and all the other experienced applied
statisticians of the past, the geometric mean is the "best thing"
to do for such positive right-skewed data   in the same sense
that the log-transform is the best "a priori" transformation for
such data -- with the one advantage even that you need to fiddle
with zeroes when log-transforming, whereas the geometric mean
works already for zeroes.

Martin


 > As an aquatic ecologist I see regulators apply the
 > geometric mean to geochemical concentrations rather than
 > using the arithmetic mean. I want to know whether the
 > geometric mean of a set of chemical concentrations (e.g.,
 > in mg/L) is an appropriate representation of the expected
 > value. If not, I want to explain this to non-technical
 > decision-makers; if so, I want to understand why my
 > assumption is wrong.

 > TIA,

 > Rich

 > __
 > R-help@r-project.org mailing list -- To UNSUBSCRIBE and
 > more, see https://stat.ethz.ch/mailman/listinfo/r-help
 > PLEASE do read the posting guide
 > http://www.R-project.org/posting-guide.html and provide
 > commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] Is there any design based two proportions z test?

2024-01-18 Thread John Fox

Dear Md Kamruzzaman,

I've copied this response to the r-help list, where you originally asked 
your question. That way, other people can follow the conversation if 
they're interested, and there will be a record of the solution. Please 
keep r-help in the loop.


See below:

On 2024-01-17 9:47 p.m., Md. Kamruzzaman wrote:



Dear John
Thank you so much for your reply.

I have calculated the 95%CI of the separate two proportions by using the 
survey package. The code is given below.


svyby(~Diabetes_Cate, ~Year, nhc, svymean, na=TRUE)

Here: nhc is the weighted survey data.


I understand your point that it is possible to calculate the 95%CI of 
the proportional difference manually.  It is time consuming, that's why 
I was looking for a function with a design effect to calculate this 
easily.  I couldn't find this kind of function.



However, it will be okay for me to calculate this manually, if there are 
no functions like this.


If you intend to do this computation once, it's not terribly time 
consuming. If you intend to do it repeatedly, you can write a simple 
function to do the calculation, probably in less time than it takes to 
search for one.





For manual calculation, could you please share the formula to calculate 
the 95%CI of the proportional difference?


Here's a simple function to compute the confidence interval, assuming 
that the normal distribution is used. The formula is based on the 
elementary result that the variance of the difference of two independent 
random variables is the sum of their variances, plus the observation 
that the width of the confidence interval is 2*z*SE, where z is the 
normal quantile corresponding to the confidence level (e.g., 1.96 for a 
95% CI).


ciDiff <- function(ci1, ci2, level=0.95){
  p1 <- mean(ci1)
  p2 <- mean(ci2)
  z <- qnorm((1 - level)/2, lower.tail=FALSE)
  se1 <- (ci1[2] - ci1[1])/(2*z)
  se2 <- (ci2[2] - ci2[1])/(2*z)
  seDiff <- sqrt(se1^2 + se2^2)
  (p1 - p2) + c(-z, z)*seDiff
}




Example: Prevalence of Diabetes:
  2011: 11.0 (95%CI 10.1-11.9)
  2017: 10.1 (95%CI 9.4-10.9)
  Diff: 0.9% (95%CI: ??)


These are percentages, not proportions, but you can use either:

> ciDiff(c(10.1, 11.9), c(9.4, 10.9))
[1] -0.3215375  2.0215375

> ciDiff(c(.101, .119), c(.094, .109))
[1] -0.003215375  0.020215375

You'll want more significant digits in the inputs to get sufficiently 
precise results.


Since I did this quickly, if I were you I'd check the results manually.
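
One way to carry out such a manual check, and to get an approximate z test along the way, is to repeat the calculation step by step. This is only a sketch: like ciDiff() above, it assumes normal-theory intervals and independent estimates, and all variable names are illustrative.

```r
z <- qnorm(0.975)            # normal quantile for a 95% CI

ci2011 <- c(10.1, 11.9)      # reported 95% CI for 2011 (percent)
ci2017 <- c(9.4, 10.9)       # reported 95% CI for 2017 (percent)

# Recover each standard error from the interval width (width = 2 * z * SE)
se2011 <- diff(ci2011) / (2 * z)
se2017 <- diff(ci2017) / (2 * z)

# Difference of the interval midpoints and its standard error
est_diff <- mean(ci2011) - mean(ci2017)
se_diff  <- sqrt(se2011^2 + se2017^2)

# Confidence interval for the difference; should agree with ciDiff()
ci_diff <- est_diff + c(-1, 1) * z * se_diff

# Approximate two-sided z test of "no difference"
z_stat  <- est_diff / se_diff
p_value <- 2 * pnorm(-abs(z_stat))

print(round(ci_diff, 4))
print(round(c(z = z_stat, p = p_value), 3))
```

Because the inputs are rounded interval limits, the test statistic is only as precise as those limits; standard errors taken directly from the survey package would be preferable.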

Best,
 John


With Kind Regards

Md Kamruzzaman




Re: [R] Is there any design based two proportions z test?

2024-01-17 Thread John Fox

Dear Md Kamruzzaman,

To answer your second question first, you could just use the svychisq() 
function. The difference-of-proportion test is equivalent to a chisquare 
test for the 2-by-2 table.


You don't say how you computed the confidence intervals for the two 
separate proportions, but if you have their standard errors (and if not, 
you should be able to infer them from the confidence intervals) you can 
compute the variance of the difference as the sum of the variances 
(squared standard errors), because the two proportions are independent, 
and from that the confidence interval for their difference.


I hope this helps,
John
--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2024-01-16 10:21 p.m., Md. Kamruzzaman wrote:

Hello Everyone,
I was analysing big survey data using survey packages on RStudio. Survey
package allows survey data analysis with the design effect.The survey
package included functions for all other statistical analysis except
two-proportion z tests.

I was trying to calculate the difference in prevalence of Diabetes and
Prediabetes between the year 2011 and 2017 (with 95%CI). I was able to
calculate the weighted prevalence of diabetes and prediabetes in the Year
2011 and 2017 and just subtracted the prevalence of 2011 from the
prevalence of 2017 to get the difference in prevalence. But I could not
calculate the 95%CI of the difference in prevalence considering the weight
of the survey data.

I was also trying to see if this difference in prevalence is statistically
significant. I could do it using the simple two-proportion z test without
considering the weight of the sample. But I want to do it considering the
weight of the sample.


Example: Prevalence of Diabetes:
  2011: 11.0 (95%CI 10.1-11.9)
  2017: 10.1 (95%CI 9.4-10.9)
  Diff: 0.9% (95%CI: ??)
  Proportion Z test P Value: ??
Your cooperation will be highly appreciated.

Thanks in advance.

With Regards

Md Kamruzzaman
PhD Research Fellow (Medicine)
Discipline of Medicine and Centre of Research Excellence in Translating
Nutritional Science to Good Health
Adelaide Medical School | Faculty of Health and Medical Sciences
The University of Adelaide
Adelaide SA 5005


Re: [R] arrow on contour line

2024-01-12 Thread John Kane
Something like this should work.  You will need to fiddle around with the
actual coordinates etc. I just stuck an arrow in what seemed like a handy
place.
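
A minimal sketch of one way to place such arrows with ggplot2, not necessarily the code originally sent with this reply. It assumes the two functions from the question below (x*y at level 2 and x^2 + y^2 at level 6), picks the point on each contour where x = y, and points the arrow along the analytic gradient. The helper arrow_df and the arrow length of 0.1 are illustrative choices.

```r
library(ggplot2)

x <- seq(1, 2, length.out = 100)
d1 <- expand.grid(X1 = x, X2 = x)
d1$Z <- d1$X1 * d1$X2            # myf(x, y) = x * y
d2 <- expand.grid(X1 = x, X2 = x)
d2$Z <- d2$X1^2 + d2$X2^2        # myg(x, y) = x^2 + y^2

# Hypothetical helper: segment from a point along the normalised gradient
arrow_df <- function(pt, grad, len = 0.1) {
  g <- grad / sqrt(sum(grad^2))
  data.frame(x = pt[1], y = pt[2],
             xend = pt[1] + len * g[1], yend = pt[2] + len * g[2])
}

# Points on the diagonal: x*y = 2 at (sqrt(2), sqrt(2));
# x^2 + y^2 = 6 at (sqrt(3), sqrt(3))
p1 <- rep(sqrt(2), 2)
p2 <- rep(sqrt(3), 2)
arrow_pts <- rbind(
  arrow_df(p1, c(p1[2], p1[1])),  # gradient of x*y is (y, x)
  arrow_df(p2, 2 * p2)            # gradient of x^2 + y^2 is (2x, 2y)
)

p <- ggplot(d1, aes(x = X1, y = X2)) +
  stat_contour(aes(z = Z), breaks = 2) +
  stat_contour(data = d2, aes(z = Z), breaks = 6) +
  geom_segment(data = arrow_pts,
               aes(x = x, y = y, xend = xend, yend = yend),
               arrow = arrow(length = unit(0.2, "cm")))
print(p)
```

For a function without a convenient closed-form gradient, the direction could instead be approximated by finite differences at the chosen point.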

On Wed, 10 Jan 2024 at 19:13, Deepankar Basu  wrote:

> Hello,
>
> I am drawing contour lines for a function of 2 variables at one level of
> the value of the function and want to include a small arrow in any
> direction of increase of the function. Is there some way to do that?
>
> Below is an example that creates the contour lines. How do I add one small
> arrow on each line in the direction of increase of the function (at some
> central point of the contour line)? Any direction will do, but perhaps the
> direction of the gradient will be the best.
>
> Thanks in advance.
> DB
>
> 
>
> library(tidyverse)
>
> x <- seq(1,2,length.out=100)
> y <- seq(1,2,length.out=100)
>
> myf <- function(x,y) {x*y}
> myg <- function(x,y) {x^2 + y^2}
>
> d1 <- expand.grid(X1 = x, X2 = y) %>%
>   mutate(Z = myf(X1,X2)) %>%
>   as.data.frame()
>
> d2 <- expand.grid(X1 = x, X2 = y) %>%
>   mutate(Z = myg(X1,X2)) %>%
>   as.data.frame()
>
> ggplot(data = d1, aes(x=X1,y=X2,z=Z))+
>   stat_contour(breaks = c(2)) +
>   stat_contour(data=d2, aes(x=X1,y=X2,z=Z), breaks=c(6))
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
John Kane
Kingston ON Canada


Re: [R] Truncated plots

2024-01-09 Thread John Kane
If it looks to be a very specific RStudio/ggplot2 problem then
https://community.rstudio.com is probably the place to ask.

What happens if she does as Duncan suggests or if she exports the file?

Come to think of it, is she getting the same result if she clicks on Zoom
in the plot window? The standard plot window in RStudio can distort the
actual image due to size restrictions.

On Tue, 9 Jan 2024 at 12:47, Ivan Krylov via R-help 
wrote:

> On Tue, 9 Jan 2024 16:42:32 +
> Nick Wray  wrote:
>
> > she has a problem with R studio on her laptop
>
> Does the problem happen with plain R, without Rstudio?
>
> What's the student's sessionInfo()?
>
> > I have a screenshot which could email if anyone needs to see what it
> > looks like.
>
> I think that PNG screenshots are allowed on the mailing list, so it
> could be very helpful if you attached an appropriately cropped
> screenshot.
>
> --
> Best regards,
> Ivan
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
John Kane
Kingston ON Canada


[R] Amelia. Imputation of time-series data

2024-01-05 Thread Sorkin, John
Colleagues,

I have started working with Amelia, with the aim of imputing missing data for 
time-series data. 

Although I have succeeded in getting Amelia to perform the imputation, I have 
not found any documentation describing how Amelia imputes time-series data. I 
have read the basic Amelia documentation, but it does not address how 
time-series data are imputed. The documentation describes general imputation 
where there is no serial auto correlation of repeated observations from the 
same subject. Does Amelia incorporate the serial autocorrelation in the 
imputation procedure? Can someone direct me to documentation that explains the 
imputation method? 

Thank you,
John


John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine;

Associate Director for Biostatistics and Informatics, Baltimore VA Medical 
Center Geriatrics Research, Education, and Clinical Center; 

PI Biostatistics and Informatics Core, University of Maryland School of 
Medicine Claude D. Pepper Older Americans Independence Center;

Senior Statistician University of Maryland Center for Vascular Research;

Division of Gerontology and Palliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382




[R] Obtaining a value of pie in a zero inflated model (fm-zinb2)

2024-01-04 Thread Sorkin, John
I am running a zero inflated regression using the zeroinfl function similar to 
the model below:
  
 fm_zinb2 <- zeroinfl(art ~ . | ., data = bioChemists, dist = "poisson")
summary(fm_zinb2)

I have three questions:

1) How can I obtain a value for the parameter pi, which is the fraction of the 
population that is in the zero-inflated model vs. the fraction in the count 
model? 

2) For any particular subject, how can I determine if the subject is in the 
portion of the population that contributes a zero count because the subject is 
in the group of subjects who have structural zero responses vs. the subject 
being in the portion of the population who can contribute a zero or a non-zero 
response?

3) zero inflated models can be solved using closed form solutions, or using 
iterative methods. Which method is used by fm_zinb2?

Thank you,
John

John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine;

Associate Director for Biostatistics and Informatics, Baltimore VA Medical 
Center Geriatrics Research, Education, and Clinical Center; 

PI Biostatistics and Informatics Core, University of Maryland School of 
Medicine Claude D. Pepper Older Americans Independence Center;

Senior Statistician University of Maryland Center for Vascular Research;

Division of Gerontology and Palliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382




Re: [R] Advice on starting to analyze smokestack emissions?

2023-12-17 Thread Sorkin, John
Kevin,
I would like to be in touch with you. I am pursuing a research project similar 
to yours. Perhaps we can help each other.
John
jsor...@som.umaryland.edu

John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine;

Associate Director for Biostatistics and Informatics, Baltimore VA Medical 
Center Geriatrics Research, Education, and Clinical Center;

PI Biostatistics and Informatics Core, University of Maryland School of 
Medicine Claude D. Pepper Older Americans Independence Center;

Senior Statistician University of Maryland Center for Vascular Research;

Division of Gerontology and Palliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382





From: R-help  on behalf of Kevin Zembower via 
R-help 
Sent: Saturday, December 16, 2023 2:06 PM
To: R-help email list
Subject: Re: [R] Advice on starting to analyze smokestack emissions?

Just to follow up on this thread, I didn't experience any problems
accessing the air monitoring data with the RAQSAPI package that I
anticipated from the US EPA's Air Quality System (AQS) Data Mart
database website. I didn't have to qualify with an agency affiliation
at all, just an email address.

Thanks again, Karl, for suggesting this.

-Kevin

On Fri, 2023-12-15 at 08:29 -0500, Kevin Zembower wrote:
> Bert, Tim, Karl and Richard, thank you all for your suggestions and
> help.
>
> I will try the R-sig-ecology list.
>
> Karl, I wasn't aware of the RAQSAPI package, but it looked promising.
> However, when I went to the source of the data it uses, the United
> States Environmental Protection Agency’s (US EPA) Air Quality System
> (AQS) Data Mart database, it looks like interactive access to the
> data
> is restricted to those who can document a professional agency
> affiliation. I don't have that. I'll work with the package to see if
> this is true regarding obtaining the data through it. Thanks for the
> suggestion.
>
> Richard, the Canada study of crematoriums was very useful. Thanks.
>
> Thanks, again, all, for your help.
>
> -Kevin




[R] Convert two-dimensional array into a three-dimensional array.

2023-12-08 Thread Sorkin, John
Colleagues

I want to convert a 10x2 array:
# create a 10x2 matrix.
datavals <- matrix(nrow=10,ncol=2)
datavals[,] <- rep(c(1,2),10)+c(rnorm(10),rnorm(10))
datavals

into a three-dimensional array, ThreeDArray, with dim c(10,2,10).

The values stored along ThreeDArray's first dimension will be the data stored 
in datavals.
ThreeDArray[i,,] <- datavals[i,]

The values stored along ThreeDArray's second dimension will be the data stored 
in datavals.
ThreeDArray[,j,] <- datavals[,j]

The data stored in ThreeDArray[,,1] will be 1, 
The data stored in ThreeDArray[,,2] will be 2.
 . . . 
The data stored in ThreeDArray[,,10] will be 10.

I have no idea how to code the conversion of the 10x2 matrix into a 10,2,10 
array.
I may be able to accomplish my mission by coding each line of the plan described 
above,
but there has to be a more efficient and elegant way to accomplish my goal.
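
One possible reading of the plan above is that every slice ThreeDArray[,,k] should hold a copy of the 10x2 matrix. Under that assumption (it is only a guess at the intent), array() does the replication by recycling the matrix values; the random data below are illustrative.

```r
set.seed(1)
datavals <- matrix(rnorm(20), nrow = 10, ncol = 2)  # illustrative 10x2 matrix

# array() recycles the 20 values until all 10 * 2 * 10 cells are filled,
# so each of the 10 slices along the third dimension is a copy of datavals
ThreeDArray <- array(datavals, dim = c(10, 2, 10))

print(dim(ThreeDArray))

# Check: every slice equals the original matrix
slices_match <- all(vapply(
  1:10,
  function(k) isTRUE(all.equal(ThreeDArray[, , k], datavals)),
  logical(1)
))
print(slices_match)
```

If each slice should instead hold different values (for example, one slice per time point), the slices would need to be filled individually or the data supplied in the right order.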

Many thanks for your help!
John




John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine;

Associate Director for Biostatistics and Informatics, Baltimore VA Medical 
Center Geriatrics Research, Education, and Clinical Center; 

PI Biostatistics and Informatics Core, University of Maryland School of 
Medicine Claude D. Pepper Older Americans Independence Center;

Senior Statistician University of Maryland Center for Vascular Research;

Division of Gerontology and Palliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382




[R] Convert character date time to R date-time variable.

2023-12-07 Thread Sorkin, John
Colleagues,

I have a matrix of character data that represents date and time. The format of 
each element of the matrix is 
"2020-09-17_00:00:00"
How can I convert the elements into a valid R date-time constant?
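
One possible approach, sketched with made-up values in the stated format, is as.POSIXct() with an explicit format string. The time zone "UTC" is an assumption, since the source of the timestamps is not stated.

```r
# Illustrative character matrix in the format "YYYY-MM-DD_HH:MM:SS"
x <- matrix(c("2020-09-17_00:00:00", "2020-09-17_06:30:00",
              "2020-09-18_12:00:00", "2020-09-19_23:59:59"), nrow = 2)

# The underscore is matched literally in the format string.
# tz = "UTC" is an assumption; use the zone the data were recorded in.
dt <- as.POSIXct(as.vector(x), format = "%Y-%m-%d_%H:%M:%S", tz = "UTC")

print(dt)
print(dim(x))   # keep the dimensions: dt is a plain vector in column-major order
```

The result is a vector rather than a matrix; element [i, j] of the original matrix corresponds to dt[i + (j - 1) * nrow(x)].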

Thank you,
John



John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine;

Associate Director for Biostatistics and Informatics, Baltimore VA Medical 
Center Geriatrics Research, Education, and Clinical Center; 

PI Biostatistics and Informatics Core, University of Maryland School of 
Medicine Claude D. Pepper Older Americans Independence Center;

Senior Statistician University of Maryland Center for Vascular Research;

Division of Gerontology and Palliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382




Re: [R] adding "Page X of XX" to PDFs

2023-12-02 Thread John Kane
https://community.rstudio.com/t/total-number-of-pages-in-quarto-pdf/177316/2

On Sat, 2 Dec 2023 at 09:39, Dennis Fisher  wrote:

> OS X
> R 4.3.1
>
> Colleagues
>
> I often create multipage PDFs [pdf()] in which the text "Page X" appears
> in the margin.  These PDFs are created automatically using a massive R
> script.
>
> One of my clients requested that I change this to:
> Page X of XX
> where XX is the total number of pages.
>
> I don't know the number of expected pages so I can't think of any clever
> way to do this.  I suppose that I could create the PDF, find out the number
> of pages, then have a second pass in which the R script was fed the number
> of pages.  However, there is one disadvantage to this -- the original PDF
> contains a timestamp on each page -- the new version would have a different
> timestamp -- so I would prefer to not use this approach.
>
> Has anyone thought of some terribly clever way to solve this problem?
>
> Dennis
>
> Dennis Fisher MD
> P < (The "P Less Than" Company)
> Phone / Fax: 1-866-PLessThan (1-866-753-7784)
> www.PLessThan.com
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
John Kane
Kingston ON Canada


Re: [R] ggplot adjust two y-axis

2023-11-24 Thread John Kane
https://r-graph-gallery.com/line-chart-dual-Y-axis-ggplot2.html
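
The rescaling idea discussed in the quoted messages below can be sketched in ggplot2 as follows. The data frame is made up; only the column names (BFF, Wert, Studien_Flaeche) and the factor of 50 come from the original question, and the helper column Wert_plot is hypothetical. The large series is divided by the factor before plotting, and sec_axis() multiplies the axis labels back.

```r
library(ggplot2)

scale_factor <- 50   # ratio between the two scales (roughly 2500 / 50)

# Illustrative data in the layout used in the question
Fig1 <- data.frame(
  BFF = rep(c("A", "B", "C"), times = 2),
  Studien_Flaeche = rep(c("Studien", "Flaeche"), each = 3),
  Wert = c(50, 6, 16, 2237, 611, 403)
)

# Shrink the large series so both fit on the primary axis
Fig1$Wert_plot <- ifelse(Fig1$Studien_Flaeche == "Flaeche",
                         Fig1$Wert / scale_factor,
                         Fig1$Wert)

p <- ggplot(Fig1, aes(BFF, Wert_plot, fill = Studien_Flaeche)) +
  geom_col(position = "dodge") +
  scale_y_continuous(
    name = "First Axis",
    # the secondary axis undoes the division, so its labels are in original units
    sec.axis = sec_axis(~ . * scale_factor, name = "Second Axis")
  ) +
  scale_fill_brewer(palette = "Set1")
print(p)
```

The secondary axis is only a relabelling of the primary one, so the bars of the rescaled series must be read against the right-hand axis.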

On Fri, 24 Nov 2023 at 12:08, Charles-Édouard Giguère 
wrote:

> Hi,
> Just find a scaling factor that would make the two sets of data comparable.
> Here I divided the second row by 5 and did the same for the second axis.
> Charles-Édouard
> F1 <- as.table(matrix(c(50,11,6,17,16,3,1,2237,611,403,240,280,0,0), 2,7))
> barplot(F1, beside = TRUE, col = c("blue", "grey"))
> axis(2, at = c(0,10,20,30,40,50,60), labels = c(0,10,20,30,40,50,60))
> axis(4, at = c(0,500,1000,1500,2000,2500),
>      labels = c(0,500,1000,1500,2000,2500))
>
> -Original Message-
> From: sibylle.stoec...@gmx.ch
> Sent: November 24, 2023 11:27
> To: 'Charles-Édouard Giguère' ; r-help@r-project.org
> Subject: RE: [R] ggplot adjust two y-axis
>
> Dear Charles-Edouard
>
> Thanks a lot. Yes indeed barplot sounds excellent.
>
> Unfortunately, the scale of the smaller axis is fixed, even if I am able to
> draw two axes. The idea is to expand the scale to match the scale of the
> second axis for comparison.
> F1 <- as.table(matrix(c(50,11,6,17,16,3,1,2237,611,403,240,280,0,0), 2,7))
> barplot(F1, beside = TRUE, col = c("blue", "grey"))
> axis(2, at = c(0,10,20,30,40,50,60), labels = c(0,10,20,30,40,50,60))
> axis(4, at = c(0,500,1000,1500,2000,2500),
>      labels = c(0,500,1000,1500,2000,2500))
>
> Kind regards
> Sibylle
>
>
> -Original Message-
> From: Charles-Édouard Giguère 
> Sent: Friday, November 24, 2023 3:57 PM
> To: sibylle.stoec...@gmx.ch; r-help@r-project.org
> Subject: RE: [R] ggplot adjust two y-axis
>
> Hi,
> I don't know the axis mechanism well enough in ggplot, but using the
> original barplot function you can add an axis on the right using the
> axis function.
>
> Here is an example:
>
> test <- as.table(matrix(c(2,10,3,11), 2,2))
> barplot(test, beside = TRUE,
>         col = scales::brewer_pal(palette = 1)(2))
> axis(4, at = c(0, 5, 10), labels = c(0,50,100))
>
>
> -Original Message-
> From: sibylle.stoec...@gmx.ch
> Sent: November 24, 2023 09:27
> To: 'Charles-Édouard Giguère' ; r-help@r-project.org
> Subject: RE: [R] ggplot adjust two y-axis
>
> Dear Charles-Edouard
>
> Thanks a lot.
> So is there no way in R to simply have one ggplot with two axes, as in Excel
> (attachment)?
>
> Kind regards
> Sibylle
>
> -Original Message-
> From: Charles-Édouard Giguère 
> Sent: Friday, November 24, 2023 3:14 PM
> To: sibylle.stoec...@gmx.ch; r-help@r-project.org
> Subject: RE: [R] ggplot adjust two y-axis
>
> You could also use more simply facet_wrap(~ Studien_Flaeche).
> Charles-Édouard
>
> -Original Message-
> From: Charles-Édouard Giguère
> Sent: November 24, 2023 09:11
> To: sibylle.stoec...@gmx.ch; r-help@r-project.org
> Subject: RE: [R] ggplot adjust two y-axis
>
> Hi Sibylle,
> For that kind of data with two different scales, I generally use two graphs
> that I name gg1 and gg2 and join them using gridExtra::grid.arrange(gg1,
> gg2). This way, the red part of your graph is easier to interpret.
> Have a nice day,
> Charles-Édouard
>
> -Original Message-
> From: R-help  on behalf of sibylle.stoec...@gmx.ch
> Sent: November 24, 2023 05:52
> To: r-help@r-project.org
> Subject: [R] ggplot adjust two y-axis
>
> Dear R-users
>
> Is it possible to adjust two y-axis in a ggplot differently?
> - First y axis (0-60)
> - Second y axis (0-2500)
>
>
> ### Figure 1
> ggplot(Fig1,aes(BFF,Wert,fill=Studien_Flaeche))+
>   geom_bar(stat="identity",position='dodge')+
>   scale_y_continuous(name="First Axis", sec.axis=sec_axis(trans=~.*50,
> name="Second Axis"))+
>   scale_fill_brewer(palette="Set1")
>
> Thanks a lot
> Sibylle
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
John Kane
Kingston ON Canada


Re: [R] Can someone please have a look at this query on stackoverflow?

2023-11-14 Thread John Kane
I ran the code from the answer and it seems to work well. It, definitely,
is giving a landscape output.

---
title: "Testing landscape and aspect ratio"
output:
  pdf_document:
    number_sections: true
classoption:
  - landscape
  - "aspectratio=169"
header-includes:
  - \usepackage{dcolumn}
documentclass: article
geometry: margin=1.5cm
---
```{r, out.extra='keepaspectratio=true', out.height='100%', out.width="100%"}
plot(rnorm(100))
```




On Mon, 13 Nov 2023 at 23:33, Ashim Kapoor  wrote:

> Dear all,
>
> I have posted a query which has received a response but that is not
> working on my computer.
>
> Here is the query:
>
>
> https://stackoverflow.com/questions/77387434/pdf-from-rmarkdown-landscape-and-aspectratio-169
>
> Can someone please help me ?
>
> Best Regards,
> Ashim
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
John Kane
Kingston ON Canada



[R] Error running gee function. I neither understand the error message, nor know what needs to be done the get the gee to run

2023-10-25 Thread Sorkin, John
Colleagues,

I am receiving several error messages from the gee function. I don't understand 
the ideas the error messages are trying to impart, and I don't know how to debug 
or correct the error. The error messages follow:


> fitgee <- gee(HipFlex ~ StepHeight,data=datashort,id=PID,corstr="exchangeable",na.action=na.omit)
Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
running glm to get initial regression estimate
(Intercept)  StepHeight
   1.400319   58.570236
Error in gee(HipFlex ~ StepHeight, data = datashort, id = PID, corstr = "exchangeable",  :
  NA/NaN/Inf in foreign function call (arg 3)
In addition: Warning message:
In gee(HipFlex ~ StepHeight, data = datashort, id = PID, corstr = "exchangeable",  :
  NAs introduced by coercion

Of note, when the analysis is run using lm, there is no problem. My full data 
and code follow:
Thank you,
John


CODE:

if (!require(gee)) {install.packages("gee")}
library(gee)

datashort <- structure(list(HipFlex =   c(1.95, 2.07,  1.55,  0.44,  0.23, 0.41,
 0.22, 4.61, 10.02,  1.08, 1.43, 1.82,  0.34,  0.77,  0.22, 1.06,
 0.13, 0.36,  2.84,  5.2, 12.27, 1.37,  2.33,  3.48,  4.76, 1.92,  2.09,
 4.67, 2.94,  0.75,  0.11, 3.56, 1.63,  0.8,   1.54,  5.06, NA,5.41,
 6.18, 3.75,  3.12, 17.43, 3.18, 0.85, 14.54, 14.34, 21.92, 4.91,
 1.52, 0.38,  0.43,  0.47, 0.56, 6.4,  12.4,   3.98,  0.57, 1.84, 12.06,
 0.45, 8.16,  0.02,  0,0.05, 0.52,  0.11,  0.48,  1.5,  3.29,  2.58,
 2.07, 6.06,  1.46,  1.06, 3.82, 1.09,  2.86,  3.47,  2.22, 1.89, NA,
 3.48, 6.38,  3.58,  1.83, 2.8,  8.28,  7.15,  4.77,  4.93, 0, 0.11,
 1.99, 2.01,  2.3,   1.24, 1.33, 2, 1.01), PID = c("HIPS004", "HIPS004",
 "HIPS005", "HIPS005", "HIPS005", "HIPS006", "HIPS006", "HIPS008",
 "HIPS010", "HIPS024", "HIPS024", "HIPS024", "HIPS025", "HIPS028",
 "HIPS028", "HIPS030", "HIPS030", "HIPS030", "HIPS035", "HIPS035",
 "HIPS035", "HIPS036", "HIPS036", "HIPS037", "HIPS044", "HIPS047",
 "HIPS047", "HIPS056", "HIPS056", "HIPS057", "HIPS057", "HIPS057",
 "HIPS058", "HIPS059", "HIPS059", "HIPS061", "HIPS062", "HIPS062",
 "HIPS062", "HIPS064", "HIPS074", "HIPS079", "HIPS084", "HIPS089",
 "HIPS090", "HIPS090", "HIPS090", "HIPS091", "HIPS091", "HIPS092",
 "HIPS092", "HIPS092", "HIPS001", "HIPS001", "HIPS001", "HIPS004",
 "HIPS004", "HIPS004", "HIPS005", "HIPS005", "HIPS005", "HIPS006",
 "HIPS006", "HIPS008", "HIPS022", "HIPS024", "HIPS028", "HIPS030",
 "HIPS035", "HIPS036", "HIPS036", "HIPS039", "HIPS044", "HIPS047",
 "HIPS051", "HIPS056", "HIPS058", "HIPS058", "HIPS059", "HIPS059",
 "HIPS062", "HIPS062", "HIPS062", "HIPS069", "HIPS069", "HIPS071",
 "HIPS074", "HIPS079", "HIPS084", "HIPS084", "HIPS085", "HIPS089",
 "HIPS090", "HIPS091", "HIPS091", "HIPS091", "HIPS092", "HIPS092",
 "HIPS093"), StepHeight =  c(0.005, 0.008, 0.072, 0.003, 0.014,
 0.01,  0.027, 0.074, 0.128, 0.048, 0.036, 0.024, 0.021, 0.026,
 0.03,  0.004, 0.006, 0.006, 0.011, 0.006, 0.053, 0.028, 0.073,
 0.041, 0.005, 0.007, 0.013, 0.012, 0.021, 0.053, 0.013, 0.071,
 0.012, 0.016, 0.023, 0.024, 0.011, 0.019, 0.014, 0.022, 0.011,
 0.129, 0.03,  0.012, 0.062, 0.145, 0.077, 0.028, 0.006, 0.019,
 0.008, 0.006, 0.034, 0.109, 0.09,  0.005, 0.016, 0.005, 0.257,
 0.011, 0.205, 0.01,  0.017, 0.039, 0.01,  0.016, 0.043, 0.004,
 0.008, 0.04,  0.068, 0.006, 0.008, 0.005, 0.097, 0.015, 0.016,
 0.01,  0.021, 0.008, 0.01,  0.006, 0.016, 0.021, 0.012, 0.009,
 0.032, 0.055, 0.006, 0.066, 0.018, 0.01,  0.018, 0.017, 0.015,
 0.01,  0.017, 0.02,  0.022)), class = "data.frame", row.names = c(4L,
  5L,   6L,7L,   8L,  10L,  12L,  14L,  19L,  29L,  30L,  31L, 33L, 41L,
  43L,  44L,  45L,  46L,  47L,  48L,  51L,  52L,  53L,  58L,  62L, 65L, 67L,
  70L,  72L,  74L,  75L,  77L,  79L,  82L,  83L,  86L,  88L,  89L, 90L, 93L,
 109L, 114L, 117L, 129L, 131L, 132L, 133L, 134L, 135L, 136L, 137L,
 138L, 142L, 143L, 144L, 145L, 146L, 147L, 148L, 149L, 150L, 151L,
 152L, 155L, 165L

[R] by function does not separate output from function with multiple parts

2023-10-24 Thread Sorkin, John
Colleagues,

I have written an R function (see fully annotated code below), with which I 
want to process a dataframe within levels of the variable StepType. My program 
works, it processes the data within levels of StepType, but the usual headers 
that separate the output by levels of StepType are at the end of the listing 
rather than being used as separators, i.e. I get

Regression results StepType First
Contrast results StepType First
Regression results StepType Second
Contrast results StepType Second

and only after the results are displayed do I get the usual separators:
mydata$StepType: First
NULL
-- 
mydata$StepType: Second
NULL


What I want to get is output that includes the separators i.e., 

mydata$StepType: First
Regression results StepType First
Contrast results StepType First
-- 
mydata$StepType: Second
Regression results StepType Second
Contrast results StepType Second

Can you help me get the separators included in the printed output?
Thank you, 
John


# Create Dataframe #

mydata <- structure(list(HipFlex = c(19.44, 4.44, 3.71, 1.95, 2.07, 1.55, 
  0.44, 0.23, 2.15, 0.41, 2.3, 0.22, 2.08, 4.61, 4.19, 5.65, 2.73, 
  1.46, 10.02, 7.41, 6.91, 5.28, 9.56, 2.46, 6, 3.85, 6.43, 3.73, 
  1.08, 1.43, 1.82, 2.22, 0.34, 5.11, 0.94, 0.98, 2.04, 1.73, 0.94, 
  18.41, 0.77, 2.31, 0.22, 1.06, 0.13, 0.36, 2.84, 5.2, 2.39, 2.99),
   jSex = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
  1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
  1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
  2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L), levels = c("Male", "Female"), class = 
"factor")), 
  row.names = c(NA, 50L), class = "data.frame")

mydata[,"StepType"] <- rep(c("First","Second"),25)
mydata

# END Create Dataframe #



# Define function to be run#

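# contrast() is not a base R function; this code assumes that a package
# providing it for lm fits (for example, the contrast package) is attached.
DoReg <- function(x){
  fit0 <- lm(as.numeric(HipFlex) ~ jSex, data=x)
  print(summary(fit0))

  cat("\nMale\n")
  print(contrast(fit0,
                 list(jSex="Male")))

  cat("\nFemale\n")
  print(contrast(fit0,
                 list(jSex="Female")))

  cat("\nDifference\n")
  print(contrast(fit0,
                 a=list(jSex="Male"),
                 b=list(jSex="Female")))
}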

# END Define function to be run#


#
# Run function within levels of Steptype#
#
by(mydata,mydata$StepType,DoReg)
#
# END Run function within levels of Steptype#
#
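
A minimal sketch of one way to get the separators in the desired position, under the assumption that the ordering arises because by() first evaluates the function for every level (so anything printed inside the function appears immediately) and only afterwards prints its own headers together with the returned values. Looping over the levels and printing the header by hand sidesteps this. The data frame toy and the functions DoRegToy and print_by_level are hypothetical stand-ins, not the posted code.

```r
# Hypothetical stand-in for the posted data frame.
set.seed(1)
toy <- data.frame(
  HipFlex  = rnorm(20, mean = 3),
  jSex     = factor(rep(c("Male", "Female"), each = 10)),
  StepType = rep(c("First", "Second"), 10)
)

# Simplified stand-in for DoReg(): prints results, returns the fit invisibly.
DoRegToy <- function(x) {
  fit0 <- lm(HipFlex ~ jSex, data = x)
  print(coef(summary(fit0)))
  invisible(fit0)
}

# Print the separator first, then the results for that level.
print_by_level <- function(data) {
  for (lev in unique(data$StepType)) {
    cat("mydata$StepType:", lev, "\n")
    DoRegToy(data[data$StepType == lev, ])
    cat("--\n")
  }
}

print_by_level(toy)
```

Returning the fit invisibly also avoids the trailing NULL that by() shows when the function's last value is NULL.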




John David Sorkin M.D., Ph.D.
Professor of Medicine, University of Maryland School of Medicine;
Associate Director for Biostatistics and Informatics, Baltimore VA Medical 
Center Geriatrics Research, Education, and Clinical Center; 
PI Biostatistics and Informatics Core, University of Maryland School of 
Medicine Claude D. Pepper Older Americans Independence Center;
Senior Statistician University of Maryland Center for Vascular Research;
Division of Gerontology and Palliative Care,
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
Cell phone 443-418-5382





Re: [R] download.file strict certificate revocation check

2023-10-05 Thread John Neset
Ivan,
SSL connect error & we definitely have MITM doing certificate interference.
No change with True or False with R_LIBCURL_SSL_REVOKE_BEST_EFFORT
Environment variable results should be attached.

-Original Message-
From: Ivan Krylov 
Sent: Wednesday, October 4, 2023 8:52 AM
To: John Neset 
Cc: r-help@R-project.org
Subject: Re: [R] download.file strict certificate revocation check




On Wed, 4 Oct 2023 13:09:47 +0000
John Neset wrote:

> Trying to do this, reference FAQ-
> 2.18 The Internet download functions fail.
> (c) A MITM proxy (typically in enterprise environments) makes it
> impossible to validate that certificates haven't been revoked. One can
> switch to only best effort revocation checks via an environment
> variable: see ?download.file.

Here's what help(download.file) has to say:

>> On Windows with ‘method = "libcurl"’, when R was linked with
>> ‘libcurl’ with ‘Schannel’ enabled, the connection fails if it
>> cannot be established that the certificate has not been revoked.
>> Some MITM proxies present particularly in corporate environments
>> do not work with this behavior. It can be changed by setting
>> environment variable ‘R_LIBCURL_SSL_REVOKE_BEST_EFFORT’ to
>> ‘TRUE’, with the consequence of reducing security.

Does it help to Sys.setenv(...) this environment variable before downloading? 
If not, please provide your sessionInfo() and the full error message.
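
A minimal sketch of the suggestion above, assuming a Windows build of R whose libcurl uses Schannel, as described in the quoted help text. The URL and file name in the commented lines are placeholders.

```r
# Set the variable named in ?download.file before any download in this session.
Sys.setenv(R_LIBCURL_SSL_REVOKE_BEST_EFFORT = "TRUE")

# Confirm what the running R process sees.
Sys.getenv("R_LIBCURL_SSL_REVOKE_BEST_EFFORT")

# Later downloads would then request libcurl explicitly (not run here):
# download.file("https://example.org/file.zip", "file.zip", method = "libcurl")
# update.packages(ask = FALSE)
```

Sys.setenv() only affects the current session; to make the setting persistent, the same NAME=value line can be placed in an .Renviron file.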

--
Best regards,
Ivan


[R] download.file strict certificate revocation check

2023-10-04 Thread John Neset
How do I turn off the strict certificate revocation check in download.file 
when downloading and updating packages?
I clearly made an attempt at this, but failed miserably.

Trying to do this, reference FAQ-
2.18 The Internet download functions fail.
(c) A MITM proxy (typically in enterprise environments) makes it impossible to 
validate that certificates haven't been revoked. One can switch to only best 
effort revocation checks via an environment variable: see ?download.file.


Re: [R] Grouping by Date and showing count of failures by date

2023-09-30 Thread John Kane
To follow up on Rui Barradas's post, I do not think PivotTable is an R
command.

You may be thinking of the "pivot_longer" and "pivot_wider" functions in
the {tidyr} package, which is part of {tidyverse}.
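
For the count the original post asks for, pivoting may not be needed at all. A minimal base R sketch, assuming FAILDATE is (or can be converted to) a Date and that each row is one failure identified by WONUM; the data frame below is a hypothetical stand-in, since the real data were not attached.

```r
# Hypothetical stand-in for failuredf: one row per work order.
failuredf <- data.frame(
  WONUM    = c("W1", "W2", "W3", "W4", "W5"),
  FAILDATE = as.Date(c("2023-01-05", "2023-01-20", "2023-02-03",
                       "2023-02-14", "2023-02-28"))
)

# Year-month label for each failure.
failuredf$Failure_Date_Period <- format(failuredf$FAILDATE, "%Y-%m")

# One row per period with the number of failures in it.
counts <- aggregate(WONUM ~ Failure_Date_Period, data = failuredf, FUN = length)
names(counts)[2] <- "FailCounts"
counts
```

If FAILDATE is stored as text, it would first need as.Date() with a format string that matches the data.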

On Sat, 30 Sept 2023 at 07:03, Rui Barradas  wrote:

> At 21:29 on 29/09/2023, Paul Bernal wrote:
> > Dear friends,
> >
> > Hope you are doing great. I am attaching the dataset I am working with
> > because, when I tried to dput() it, I was not able to copy the entire
> > result from dput(), so I apologize in advance for that.
> >
> > I am interested in creating a column named Failure_Date_Period that has
> the
> > FAILDATE but formatted as YYYY_MM. Then I want to count the number of
> > failures (given by column WONUM) and just have a dataframe that has the
> > FAILDATE and the count of WONUM.
> >
> > I tried this:
> > pt <- PivotTable$new()
> > pt$addData(failuredf)
> > pt$addColumnDataGroups("FAILDATE")
> > pt <- PivotTable$new()
> > pt$addData(failuredf)
> > pt$addColumnDataGroups("FAILDATE")
> > pt$defineCalculation(calculationName = "FailCounts",
> > summariseExpression="n()")
> > pt$renderPivot()
> >
> > but I was not successful. Bottom line, I need to create a new dataframe
> > that has the number of failures by FAILDATE, but in YYYY-MM format.
> >
> > Any help and/or guidance will be greatly appreciated.
> >
> > Kind regards,
> > Paul
> Hello,
>
> No data is attached. Maybe try
>
> dput(head(failuredf, 30))
>
> ?
>
> And where can we find non-base PivotTable? Please start the scripts with
> calls to library() when using non-base functionality.
>
> Hope this helps,
>
> Rui Barradas
>
>


-- 
John Kane
Kingston ON Canada



Re: [R] car::deltaMethod() fails when a particular combination of categorical variables is not present

2023-09-26 Thread John Fox

Dear Michael,

My previous response was inaccurate: First, linearHypothesis() *is* able 
to accommodate aliased coefficients by setting the argument singular.ok 
= TRUE:


> linearHypothesis(minimal_model, "bt2 + csent + bt2:csent = 0",
+  singular.ok=TRUE)

Linear hypothesis test:
bt2  + csent  + bt2:csent = 0

Model 1: restricted model
Model 2: a ~ b * c

  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     16 9392.1
2     15 9266.4  1    125.67 0.2034 0.6584

Moreover, when there is an empty cell, this F-test is (for a reason that 
I haven't worked out, but is almost surely due to how the rank-deficient 
model is parametrized) *not* equivalent to the t-test for the 
corresponding coefficient in the raveled version of the two factors:


> df$bc <- factor(with(df, paste(b, c, sep=":")))
> m <- lm(a ~ bc, data=df)
> summary(m)

Call:
lm(formula = a ~ bc, data = df)

Residuals:
    Min      1Q  Median      3Q     Max
-57.455 -11.750   0.439  14.011  37.545

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    20.50      17.57   1.166   0.2617
bct1:unsent    37.50      24.85   1.509   0.1521
bct2:other     32.00      24.85   1.287   0.2174
bct2:sent      17.17      22.69   0.757   0.4610  <<< cf. F = 0.2034, p = 0.6584
bct2:unsent    38.95      19.11   2.039   0.0595

Residual standard error: 24.85 on 15 degrees of freedom
Multiple R-squared:  0.2613,    Adjusted R-squared:  0.06437
F-statistic: 1.327 on 4 and 15 DF,  p-value: 0.3052

In the full-rank case, however, what I said is correct -- that is, the 
F-test for the 1 df hypothesis on the three coefficients is equivalent 
to the t-test for the corresponding coefficient when the two factors are 
raveled:


> linearHypothesis(minimal_model_fixed, "bt2 + csent + bt2:csent = 0")

Linear hypothesis test:
bt2  + csent  + bt2:csent = 0

Model 1: restricted model
Model 2: a ~ b * c

  Res.Df    RSS Df Sum of Sq      F Pr(>F)
1     15 9714.5
2     14 9194.4  1    520.08 0.7919 0.3886

> df_fixed$bc <- factor(with(df_fixed, paste(b, c, sep=":")))
> m <- lm(a ~ bc, data=df_fixed)
> summary(m)

Call:
lm(formula = a ~ bc, data = df_fixed)

Residuals:
    Min      1Q  Median      3Q     Max
-57.455 -11.750   0.167  14.011  37.545

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   64.000     25.627   2.497   0.0256
bct1:sent    -43.500     31.387  -1.386   0.1874
bct1:unsent  -12.000     36.242  -0.331   0.7455
bct2:other   -11.500     31.387  -0.366   0.7195
bct2:sent    -26.333     29.591  -0.890   0.3886 << cf.
bct2:unsent   -4.545     26.767  -0.170   0.8676

Residual standard error: 25.63 on 14 degrees of freedom
Multiple R-squared:  0.2671,    Adjusted R-squared:  0.005328
F-statistic:  1.02 on 5 and 14 DF,  p-value: 0.4425

So, to summarize:

(1) You can use linearHypothesis() with singular.ok=TRUE to test the 
hypothesis that you specified, though I suspect that this hypothesis 
probably isn't testing what you think in the rank-deficient case. I 
suspect that the hypothesis that you want to test is obtained by 
raveling the two factors.


(2) There is no reason to use deltaMethod() for a linear hypothesis, but 
there is also no intrinsic reason that deltaMethod() shouldn't be able 
to handle a rank-deficient model. We'll probably fix that.
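
If the goal is a confidence interval for the predicted mean of one occupied cell, the raveled model can be sketched with base R alone. This is an illustration under the assumption that the cell mean for 't2' with 'sent' is the quantity of interest; it reuses the data from the original post and does not call car.

```r
outcomes <- c(91,2,60,53,38,78,48,33,97,41,64,84,64,8,66,41,52,18,57,34)
persontype <- c("t2","t2","t2","t2","t2","t2","t2","t1","t2","t2",
                "t2","t2","t1","t1","t2","t2","t1","t2","t2","t2")
arm_letter <- c("unsent","unsent","unsent","unsent","sent","unsent","unsent",
                "sent","unsent","unsent","other","unsent","unsent","sent",
                "unsent","other","unsent","sent","sent","unsent")
df <- data.frame(a = outcomes, b = persontype, c = arm_letter)

# Ravel the two factors into one factor; only occupied cells become levels.
df$bc <- factor(with(df, paste(b, c, sep = ":")))
m <- lm(a ~ bc, data = df)

# Predicted mean and 95% confidence interval for the cell t2:sent.
cell <- data.frame(bc = factor("t2:sent", levels = levels(df$bc)))
ci <- predict(m, newdata = cell, interval = "confidence")
ci
```

Because the raveled model is full rank, the fitted value is simply the observed mean of that cell, and the interval uses the pooled residual variance.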


My apologies for the confusion,
 John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/


Re: [R] car::deltaMethod() fails when a particular combination of categorical variables is not present

2023-09-26 Thread John Fox

Dear Michael,

You're testing a linear hypothesis, so there's no need to use the delta 
method, but the linearHypothesis() function in the car package also 
fails in your case:


> linearHypothesis(minimal_model, "bt2 + csent + bt2:csent = 0")
Error in linearHypothesis.lm(minimal_model, "bt2 + csent + bt2:csent = 0") :
there are aliased coefficients in the model.

One work-around is to ravel the two factors into a single factor with 5 
levels:


> df$bc <- factor(with(df, paste(b, c, sep=":")))
> df$bc
 [1] t2:unsent t2:unsent t2:unsent t2:unsent t2:sent   t2:unsent
 [7] t2:unsent t1:sent   t2:unsent t2:unsent t2:other  t2:unsent
[13] t1:unsent t1:sent   t2:unsent t2:other  t1:unsent t2:sent
[19] t2:sent   t2:unsent
Levels: t1:sent t1:unsent t2:other t2:sent t2:unsent

> m <- lm(a ~ bc, data=df)
> summary(m)

Call:
lm(formula = a ~ bc, data = df)

Residuals:
    Min      1Q  Median      3Q     Max
-57.455 -11.750   0.439  14.011  37.545

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    20.50      17.57   1.166   0.2617
bct1:unsent    37.50      24.85   1.509   0.1521
bct2:other     32.00      24.85   1.287   0.2174
bct2:sent      17.17      22.69   0.757   0.4610
bct2:unsent    38.95      19.11   2.039   0.0595

Residual standard error: 24.85 on 15 degrees of freedom
Multiple R-squared:  0.2613,    Adjusted R-squared:  0.06437
F-statistic: 1.327 on 4 and 15 DF,  p-value: 0.3052

Then the hypothesis is tested directly by the t-value for the 
coefficient bct2:sent.


I hope that this helps,
 John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2023-09-26 1:12 a.m., Michael Cohn wrote:


I'm running a linear regression with two categorical predictors and their
interaction. One combination of levels does not occur in the data, and as
expected, no parameter is estimated for it. I now want to significance test
a particular combination of levels that does occur in the data (ie, I want
to get a confidence interval for the total prediction at given levels of
each variable).

In the past I've done this using car::deltaMethod() but in this dataset
that does not work, as shown in the example below: The regression model
gives the expected output, but deltaMethod() gives this error:

error in t(gd) %*% vcov. : non-conformable arguments

I believe this is because there is no parameter estimate for when the
predictors have the values 't1' and 'other'. In the df_fixed dataframe,
putting one person into that combination of categories causes deltaMethod()
to work as expected.

I don't know of any theoretical reason that missing one interaction
parameter estimate should prevent getting a confidence interval for a
different combination of predictors. Is there a way to use deltaMethod() or
some other function to do this without changing my data?

Thank you,

- Michael Cohn
Vote Rev (http://voterev.org)


Demonstration:
--

library(car)
# create dataset with outcome and two categorical predictors
outcomes <- c(91,2,60,53,38,78,48,33,97,41,64,84,64,8,66,41,52,18,57,34)
persontype <-
c("t2","t2","t2","t2","t2","t2","t2","t1","t2","t2","t2","t2","t1","t1","t2","t2","t1","t2","t2","t2")
arm_letter <-
c("unsent","unsent","unsent","unsent","sent","unsent","unsent","sent","unsent","unsent","other","unsent","unsent","sent","unsent","other","unsent","sent","sent","unsent")
df <- data.frame(a = outcomes, b=persontype, c=arm_letter)

# note: there are no records with the combination 't1' + 'other'
table(df$b,df$c)


#regression works as expected
minimal_formula <- formula("a ~ b*c")
minimal_model <- lm(minimal_formula, data=df)
summary(minimal_model)

# use deltaMethod() to get a prediction for individuals with the combination
# 't2' and 'sent'
# deltaMethod() fails with "error in t(gd) %*% vcov. : non-conformable
# arguments."
deltaMethod(minimal_model, "bt2 + csent + `bt2:csent`", rhs=0)

# duplicate the dataset and change one record to be in the previously empty
# cell
df_fixed <- df
df_fixed[c(13),"c"] <- 'other'
table(df_fixed$b,df_fixed$c)

#deltaMethod() now works
minimal_model_fixed <- lm(minimal_formula, data=df_fixed)
deltaMethod(minimal_model_fixed, "bt2 + csent + `bt2:csent`", rhs=0)



Re: [R] Print hypothesis warning- Car package

2023-09-18 Thread John Fox

Hi Peter,

On 2023-09-18 10:08 a.m., peter dalgaard wrote:

> Also, I would guess that the code precedes the use of backticks in
> non-syntactic names.

Indeed, by more than a decade (though modified in the interim).

> Could they be deployed here?


I don't think so, at least not without changing how the function works.

The problem doesn't occur when the hypothesis is specified symbolically 
as a character vector, including in equation form, only when the 
hypothesis matrix is given directly, in which case linearHypothesis() 
tries to construct the equation-form representation, again as character 
vectors. Its inability to do so when the coefficient names include 
arithmetic operators doesn't, I think, require a warning or even a 
message: the symbolic representation of the hypothesis can simply be 
omitted. The numeric results reported are entirely unaffected.


I've made this change and will commit it to the next version of the car 
package.


Thank you for the suggestion,
 John




> - Peter



Re: [R] Print hypothesis warning- Car package

2023-09-17 Thread John Fox

Dear Robert,

Anova() calls linearHypothesis(), also in the car package, to compute 
sums of squares and df, supplying appropriate hypothesis matrices. 
linearHypothesis() usually tries to express the hypothesis matrix in 
symbolic equation form for printing, but won't do this if coefficient 
names include arithmetic operators, in your case - and +, which can 
confuse it.
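
A minimal base R sketch of how such coefficient names arise, assuming default treatment contrasts. The toy data frame is hypothetical and only borrows the factor-level names from the post; it does not call car.

```r
# Hypothetical balanced toy data using the factor levels from the post.
set.seed(42)
toy <- expand.grid(
  Treatment  = factor(c("Control", "Dabrafenib")),
  Expression = factor(c("CD271-", "CD271+"), levels = c("CD271-", "CD271+")),
  rep        = 1:4
)
toy$Viability <- rnorm(nrow(toy), mean = 100, sd = 10)

mod <- lm(Viability ~ Treatment * Expression, data = toy)

# Coefficient names are built by pasting factor name and level together.
coef_names <- names(coef(mod))
coef_names

# Names that contain an arithmetic operator character.
has_operator <- grepl("[-+*/]", coef_names)
coef_names[has_operator]
```

Factor levels such as "CD271+" are legal text, but once pasted into a coefficient name the trailing "+" is indistinguishable from an operator when a hypothesis is written out as an equation.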


The symbolic form of the hypothesis isn't really relevant for Anova(), 
which doesn't use the printed representation of each hypothesis, and so, 
despite the warnings, you get the correct ANOVA table. In your case, 
where the data are balanced, with 4 cases per cell, Anova(mod) and 
summary(mod) are equivalent, which makes me wonder why you would use 
Anova() in the first place.


To elaborate a bit, linearHypothesis() does tolerate arithmetic 
operators in coefficient names if you specify the hypothesis 
symbolically rather than as a hypothesis matrix. For example, to test, 
the interaction:


--- snip 

> linearHypothesis(mod,
+  c("TreatmentDabrafenib:ExpressionCD271+ = 0",
+"TreatmentTrametinib:ExpressionCD271+ = 0",
+"TreatmentCombination:ExpressionCD271+ = 0"))
Linear hypothesis test

Hypothesis:
TreatmentDabrafenib:ExpressionCD271+ = 0
TreatmentTrametinib:ExpressionCD271+ = 0
TreatmentCombination:ExpressionCD271+ = 0

Model 1: restricted model
Model 2: Viability ~ Treatment * Expression

  Res.Df   RSS Df Sum of Sq     F Pr(>F)
1     27 18966
2     24 16739  3    2226.3 1.064 0.3828

--- snip 

Alternatively:

--- snip 

> H <- matrix(0, 3, 8)
> H[1, 6] <- H[2, 7] <- H[3, 8] <- 1
> H
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    0    0    0    0    0    1    0    0
[2,]    0    0    0    0    0    0    1    0
[3,]    0    0    0    0    0    0    0    1

> linearHypothesis(mod, H)
Linear hypothesis test

Hypothesis:


Model 1: restricted model
Model 2: Viability ~ Treatment * Expression

  Res.Df   RSS Df Sum of Sq     F Pr(>F)
1     27 18966
2     24 16739  3    2226.3 1.064 0.3828
Warning message:
In printHypothesis(L, rhs, names(b)) :
  one or more coefficients in the hypothesis include
 arithmetic operators in their names;
  the printed representation of the hypothesis will be omitted

--- snip 

There's no good reason that linearHypothesis() should try to express 
each hypothesis symbolically for Anova(), since Anova() doesn't use that 
information. When I have some time, I'll arrange to avoid the warning.


Best,
 John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/
On 2023-09-16 4:39 p.m., Robert Baer wrote:

When doing Anova using the car package,  I get a print warning that is
unexpected.  It seemingly involves having my flow cytometry factor levels
named CD271+ and CD271-.  But I am not sure this warning should be
intended behavior.  Any explanation about whether I'm doing something
wrong? Why can't I have CD271+ and CD271- as factor levels?  It's legal
text, isn't it?

library(car)
mod = aov(Viability ~ Treatment*Expression, data = dat1)
Anova(mod, type =2)

Anova Table (Type II tests)

Response: Viability
                      Sum Sq Df F value    Pr(>F)
Treatment            19447.3  3  9.2942 0.0002927 ***
Expression            2669.8  1  3.8279 0.0621394 .
Treatment:Expression  2226.3  3  1.0640 0.3828336
Residuals            16739.3 24
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Warning messages:
1: In printHypothesis(L, rhs, names(b)) :
  one or more coefficients in the hypothesis include
  arithmetic operators in their names;
  the printed representation of the hypothesis will be omitted
2: In printHypothesis(L, rhs, names(b)) :
  one or more coefficients in the hypothesis include
  arithmetic operators in their names;
  the printed representation of the hypothesis will be omitted
3: In printHypothesis(L, rhs, names(b)) :
  one or more coefficients in the hypothesis include
  arithmetic operators in their names;
  the printed representation of the hypothesis will be omitted


The code to reproduce:

```


dat1 <-structure(list(Treatment = structure(c(1L, 1L, 1L, 1L, 3L, 1L,
1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L), levels = c("Control",
"Dabrafenib", "Trametinib", "Combination"), class = "factor"),
Expression = structure(c(2L, 2L, 2L, 2L, 2L, 1L,
1L, 1L,
 1L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L,
 1L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, 1L), levels = c("CD271-",
"CD271+"), class = "factor"),
  

[R] Theta from negative binomial regression and power_NegativeBinomial from PASSED

2023-09-14 Thread Sorkin, John
Colleagues,

I want to use the power_NegativeBinomial function from the PASSED library. The 
function requires a value for a parameter theta. The meaning of theta is not 
given in the documentation (at least I can't find it) of the function. Further, 
the descriptions of the negative binomial distribution that I am familiar with 
do not mention theta as being a parameter of the distribution. I noticed that 
when one runs the glm.nb function to perform a negative binomial regression one 
obtains a value for theta. This leads to two questions:

  1.  Is the theta required by the power_NegativeBinomial function the theta 
that is produced by the glm.nb function?
  2.  What is theta, and how does it relate to the parameters of the negative 
binomial distribution?
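
A minimal sketch bearing on the second question. In MASS::glm.nb, theta is the dispersion parameter of the mean-dispersion form of the negative binomial, with variance mu + mu^2/theta; it corresponds to the size argument of dnbinom() when the distribution is specified through mu. Whether power_NegativeBinomial uses the same theta is not confirmed here and should be checked against its help page. The values of theta and mu below are arbitrary.

```r
theta <- 2   # dispersion ("size") parameter
mu    <- 5   # mean

# Mean and variance computed numerically from the probability mass function.
x <- 0:5000
p <- dnbinom(x, size = theta, mu = mu)
nb_mean <- sum(x * p)
nb_var  <- sum((x - nb_mean)^2 * p)

# Compare with the closed form mu + mu^2 / theta.
c(mean = nb_mean, variance = nb_var, closed_form = mu + mu^2 / theta)
```

In the classical (size, prob) form, the same distribution has prob = theta / (theta + mu), so a large theta approaches a Poisson and a small theta means strong overdispersion.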

Thank you,
John



Re: [R] R coding errors

2023-09-08 Thread John Kane
Can you supply us with some sample data?

A handy way to supply sample data is the dput() function.  Just do
dput(mydata), where *mydata* is your data, then copy the output and paste
it here.  For a large dataset, something like dput(head(mydata, 100))
should supply the data we need.
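
A small sketch of what that looks like in practice. The data frame below is a made-up stand-in for *mydata*; the point is only that the text printed by dput() can be pasted into an email and evaluated by the reader to rebuild the same object.

```r
# Hypothetical stand-in for the poster's data
mydata <- data.frame(id = 1:3,
                     score = c(2.5, 3, 4.5),
                     group = factor(c("a", "b", "a")))

# dput() prints R code that recreates the object
dput(mydata)

# For a large data set, share only the first rows
dput(head(mydata, 2))

# Round trip: evaluating the printed text rebuilds the data frame
txt <- paste(capture.output(dput(mydata)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
all.equal(mydata, rebuilt)
```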

On Fri, 8 Sept 2023 at 10:42, PROFESOR MADYA DR NORHAYATI BAHARUN <
norha...@uitm.edu.my> wrote:

> Hi Sir,
>
> Could you please help me on the following errors:
>
> STEPS TO MIX IRT_RMT APPROACH IN R
>
> #1- Load required libraries
> library(eRm)
> library(ltm)
> library(mirt)
> library(psych)
>
> HT <- read.csv("C:/Users/User/Dropbox/Analysis R_2023/HT2.csv")
> str(HT)
>
> #2- Load or create your data matrix
> response_columns <- HT[, 1:ncol(HT)]
> response_matrix <- as.matrix(response_columns)
>
> #3- Fit IRT model
> irt_model <- gpcm(response_matrix)
> irt_model
> summary(irt_model)
>
> #4- Fit Rasch model
> rasch_model <- PCM(response_matrix)
> rasch_model
> summary(rasch_model)
>
> #5- Compare item parameter estimates between IRT and Rasch models
> irt_item_parameters <- coef(irt_model)
> rasch_item_parameters <- coef(rasch_model)
>
> #6- Compare person ability estimates between IRT and Rasch models
> #TRY1
> irt_person_abilities <- fscores(irt_model)###ERROR###
> #TRY2
> #a(IRT)- Fit your GPCM model using ltm
> irt_model <- gpcm(data = HT, constraint = "1PL")
>
> #a(IRT)- Calculate factor scores (IRT person abilities)
> irt_person_abilities <- factor.scores(irt_model) ###ERROR###
> irt_person_abilities_dim1 <- factor.scores(gpcm_model, f = 1)
>
> #TRY3
> # Fit your GPCM model using mirt
> irt_model <- mirt(data = HT, model = "gpcm", itemtype = "graded")
>  ###ERROR###
>
> # Calculate factor scores for dimension 1 (adjust the dimension as needed)
> irt_person_abilities_dim1 <- fscores(gpcm_model, method = "EAP", dims = 1)
>
>
> #b(RMT)
> rasch_person_abilities <- person.parameter(rasch_model)$theta
>
>
> #7- Perform model comparison using fit statistics (e.g., AIC, BIC)
> irt_aic <- AIC(irt_model)
> rasch_aic <- AIC(rasch_model)
>
> irt_bic <- BIC(irt_model)
> rasch_bic <- BIC(rasch_model)
>
> #8- Print or visualize the results for comparison
> print("Item Parameter Estimates:")
> print(irt_item_parameters)
> print(rasch_item_parameters)
>
> print("Person Ability Estimates:")
> print(irt_person_abilities)  ###ERROR###
> print(rasch_person_abilities)
>
> print("Model Fit Statistics:")
> print(paste("IRT AIC:", irt_aic))
> print(paste("Rasch AIC:", rasch_aic))
>
> print(paste("IRT BIC:", irt_bic))
> print(paste("Rasch BIC:", rasch_bic))
>
> Hope to get your response.
>
> Many thanks.
>
> Regards,
>
> Norhayati
>


-- 
John Kane
Kingston ON Canada



Re: [R] Issue with gc() on Ubuntu 20.04

2023-08-27 Thread John Logsdon

On 27-08-2023 21:02, Ivan Krylov wrote:

On Sun, 27 Aug 2023 19:54:23 +0100
John Logsdon  wrote:


Not so although it did lower the gc() time to 95.84%.

This was on a 16 core Threadripper 1950X box so I was intending to
use library parallel but I tried it on my lowly windows box that is
years old and got it down to 88.07%.


Does the Windows box have the same version of R on it?



Yes, they are both 4.3.1


The only thing I can think of is that there are quite a lot of cases
where a function is generated on the fly as in:

eval(parse(t=paste("dprob <-
function(x,l,s){",dist.functions[2,][dist.functions[1,]==distn],"(x,l,s)}",sep="")))


This isn't very idiomatic. If you need dprob to call the function named
in dist.functions[2,][dist.functions[1,]==distn], wouldn't it be easier
for R to assign that function straight to dprob?

dprob <- get(dist.functions[2,][dist.functions[1,]==distn])

This way, you avoid the need to parse the code, which is typically not
the fastest part of a programming language.

(Generally in R and other programming languages with recursive data
structures, storing variable names in other variables is not very
efficient. Why not put functions directly into a list?)
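
A minimal sketch of both suggestions, using base R density functions as stand-ins. The layout of dist.functions (distribution labels in row 1, function names in row 2) is inferred from the indexing in the thread, and its contents here are hypothetical.

```r
# Hypothetical lookup table: labels in row 1, function names in row 2
dist.functions <- rbind(c("weibull", "lognormal", "gamma"),
                        c("dweibull", "dlnorm", "dgamma"))
distn <- "gamma"

# Look the function up by name instead of building and parsing source text
fname <- dist.functions[2, ][dist.functions[1, ] == distn]
dprob <- get(fname, mode = "function")

# Alternative: store the functions themselves in a named list
densities <- list(weibull = dweibull, lognormal = dlnorm, gamma = dgamma)
dprob2 <- densities[[distn]]

dprob(1, 2, 3)   # same call as dgamma(1, 2, 3)
```

Either form gives a dprob(x, l, s) that passes its arguments positionally to the chosen density, with no parsing step.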



Agreed but this statement and other similar ones are only assigned once 
in an outer loop.



Rprof() samples the whole call stack. Can you find out which functions
result in a call to gc()? I haven't experimented with a wide sample of
R code, but I don't usually encounter gc() as a major entry in my
Rprof() outputs.


From the first table, removing all the system functions, it suggests 
that the function do.combx() is mainly guilty.  I have recoded that and 
gc() no longer appears - as it shouldn't with it switched off!  One 
difference was that the new code used the built in combn function while 
the old code used gtools::combinations.  I need gtools::permutations 
elsewhere but that is not time critical.
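
As a rough illustration of the base R route mentioned above: combn() from the utils package enumerates combinations without needing gtools. The values of n and r are made up, and the actual do.combx() function is not shown in the thread, so nothing here reflects its real code.

```r
n <- 5
r <- 3

# combn() returns one combination per column; transpose for one per row
combs <- t(combn(n, r))
dim(combs)   # choose(n, r) rows, r columns

# A function can also be applied to each combination as it is generated,
# which avoids holding the full matrix of combinations in memory
sums <- combn(n, r, FUN = sum)
```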


Thanks Ivan for making me think!

--
John Logsdon
Quantex Research Ltd
m:+447717758675/h:+441614454951



[R] Issue with gc() on Ubuntu 20.04

2023-08-27 Thread John Logsdon

Folks

I have come across an issue with gc() hogging the processor according to 
Rprof.


Platform is Ubuntu 20.04 all up to date
R version 4.3.1
libraries: survival, MASS, gtools and openxlsx.

With default gc.auto options, the profiler notes the garbage collector 
as self.pct 99.39%.


So I have tried switching it off using options(gc.auto=Inf) in the R 
session before running my program using source().


This lowered self.pct to 99.36.  Not much there.

After some pondering, I added an options(gc.auto=Inf) at the beginning 
of each function, not resetting it at exit, but expecting the offending 
function(s) to plead guilty.


Not so although it did lower the gc() time to 95.84%.

This was on a 16 core Threadripper 1950X box so I was intending to use 
library parallel but I tried it on my lowly windows box that is years 
old and got it down to 88.07%.


The only thing I can think of is that there are quite a lot of cases 
where a function is generated on the fly as in:


eval(parse(t=paste("dprob <- 
function(x,l,s){",dist.functions[2,][dist.functions[1,]==distn],"(x,l,s)}",sep="")))


I haven't added the options to any of these.

The highest time used by any of my functions is 0.05% - the rest is 
dominated by gc().
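
One way to cross-check the Rprof figures is to time the collector directly. The sketch below wraps an expression and reports elapsed time alongside the time that base R's gc.time() attributes to garbage collection. The helper name gc_share() and the toy workload are assumptions for illustration. Note also that options() accepts arbitrary names without complaint and, as far as I can tell, gc.auto is not among the options listed in ?options, so setting it may simply have no effect.

```r
# Hypothetical helper: elapsed time of expr and elapsed time spent in gc
gc_share <- function(expr) {
  gc_before <- gc.time()
  # gcFirst = FALSE so system.time() does not add a collection of its own
  elapsed <- system.time(expr, gcFirst = FALSE)[["elapsed"]]
  gc_elapsed <- unname((gc.time() - gc_before)[3])
  c(elapsed = elapsed, gc_elapsed = gc_elapsed)
}

# Toy workload that allocates many vectors
res <- gc_share({
  out <- vector("list", 2000)
  for (i in seq_along(out)) out[[i]] <- rnorm(1000)
  invisible(NULL)
})
res
```

Wrapping individual function calls this way gives a second opinion on which one triggers most of the collections.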


There may not be much point in parallelising the code until I can reduce 
the garbage collection.


I am not short of memory and would like to disable it fully but despite 
adding to all routines, I haven't managed to do this yet.


Can anyone advise me?

And why is the Linux version so much worse than Windows?

TIA

--
John Logsdon
Quantex Research Ltd
m:+447717758675/h:+441614454951



Re: [R] Determining Starting Values for Model Parameters in Nonlinear Regression

2023-08-19 Thread John Fox

Dear John, John, and Paul,

In this case, one can get start values by just fitting

> lm(1/y ~ x1 + x2 + x3 - 1, data=mydata)

Call:
lm(formula = 1/y ~ x1 + x2 + x3 - 1, data = mydata)

Coefficients:
     x1       x2       x3
0.00629  0.00868  0.00803

Of course, the errors enter this model differently, so this isn't the 
same as the nonlinear model, but the regression coefficients are very 
close to the estimates for the nonlinear model.
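
The idea can be sketched end to end as follows: fit the linearised model, then hand its coefficients to a nonlinear fit as starting values. The data are those posted in the thread. Base R's nls() is used here in place of nlsr::nlxb() only so that the sketch needs no extra package; that substitution is mine, not something from the thread.

```r
mydata <- data.frame(
  x1 = c(69.98, 72.5, 77.6, 79.98, 74.98, 80.06, 69.98, 77.34, 69.99, 67.49,
         67.51, 77.63, 72.5, 67.5, 80.1, 69.99, 72.49, 64.99, 75.02, 67.48),
  x2 = c(29, 25.48, 21.38, 19.85, 22, 18.91, 29.99, 19.65, 26.99, 29.49, 32.47,
         20.35, 26.48, 31.47, 16.87, 27.99, 24.49, 31.99, 24.96, 30.5),
  x3 = c(1, 2, 1, 0, 3, 1, 0, 2.99, 3, 3, 0, 2, 1, 1, 3, 2, 3, 3, 0, 2),
  y  = c(1.4287, 1.4426, 1.4677, 1.4774, 1.4565, 1.4807, 1.4279, 1.4684,
         1.4301, 1.4188, 1.4157, 1.4686, 1.4414, 1.4172, 1.4829, 1.4291,
         1.4438, 1.4068, 1.4524, 1.4183)
)

# Step 1: 1/y is linear in the parameters, so lm() gives starting values
lin_fit <- lm(1/y ~ x1 + x2 + x3 - 1, data = mydata)
strt <- setNames(coef(lin_fit), c("Beta1", "Beta2", "Beta3"))

# Step 2: use them to start the nonlinear fit on the original scale
nl_fit <- nls(y ~ 1/(Beta1*x1 + Beta2*x2 + Beta3*x3),
              data = mydata, start = strt)

cbind(linear_start = strt, nls_estimate = coef(nl_fit))
```

The two columns should be close but not identical, because the errors enter the two models differently.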


Best,
 John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/


Re: [R] Determining Starting Values for Model Parameters in Nonlinear Regression

2023-08-19 Thread Sorkin, John
Colleagues,

At the risk of starting a forest fire, or perhaps a brush fire, while it is 
good to see that nlxb can find a solution from arbitrary starting values, I 
think Paul’s question has merit despite Professor Nash’s excellent and helpful 
observation.

Although non-linear algorithms can converge, they can converge to a false 
solution if starting values are sub-optimally specified. When possible, I try 
to specify thought-out starting values. Would it make sense to plot y as a 
function of (x1, x2) at different values of x3 to get a sense of possible 
starting values? Or perhaps to use the median values of x1, x2, and x3 as starting 
values? Comparing results from different starting values can give some 
confidence that the solution obtained using arbitrary starting values is 
likely “correct”.
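
The check described above, fitting from several starting points and comparing the answers, might be sketched like this. It uses the data from the thread and base R's nls() rather than nlxb(), which is my substitution to avoid a package dependency. The three starting vectors are illustrative choices: the linearised fit of 1/y, a crude equal-coefficient guess based on x1 + x2 + x3 being about 100, and the deliberately crude start used later in the thread. nls() can fail from poor starts, so each fit is wrapped in tryCatch().

```r
mydata <- data.frame(
  x1 = c(69.98, 72.5, 77.6, 79.98, 74.98, 80.06, 69.98, 77.34, 69.99, 67.49,
         67.51, 77.63, 72.5, 67.5, 80.1, 69.99, 72.49, 64.99, 75.02, 67.48),
  x2 = c(29, 25.48, 21.38, 19.85, 22, 18.91, 29.99, 19.65, 26.99, 29.49, 32.47,
         20.35, 26.48, 31.47, 16.87, 27.99, 24.49, 31.99, 24.96, 30.5),
  x3 = c(1, 2, 1, 0, 3, 1, 0, 2.99, 3, 3, 0, 2, 1, 1, 3, 2, 3, 3, 0, 2),
  y  = c(1.4287, 1.4426, 1.4677, 1.4774, 1.4565, 1.4807, 1.4279, 1.4684,
         1.4301, 1.4188, 1.4157, 1.4686, 1.4414, 1.4172, 1.4829, 1.4291,
         1.4438, 1.4068, 1.4524, 1.4183)
)
mymod <- y ~ 1/(Beta1*x1 + Beta2*x2 + Beta3*x3)
beta_names <- c("Beta1", "Beta2", "Beta3")

# Crude guess: if all betas were equal, y would be about 1 / (beta * 100)
crude <- 1 / (mean(mydata$y) * 100)

starts <- list(
  linearised = setNames(coef(lm(1/y ~ x1 + x2 + x3 - 1, data = mydata)),
                        beta_names),
  equal      = setNames(rep(crude, 3), beta_names),
  poor       = c(Beta1 = 1, Beta2 = 2, Beta3 = 3)  # may fail with nls()
)

# Fit from each start; a failed fit is recorded as NULL
fits <- lapply(starts, function(s)
  tryCatch(coef(nls(mymod, data = mydata, start = s)),
           error = function(e) NULL))
converged <- !vapply(fits, is.null, logical(1))

# One row per start that converged; rows that agree support the solution
do.call(rbind, fits[converged])
```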

I freely admit that my experience (and thus expertise) using non-linear 
solutions is limited. Please do not flame me, I am simply urging caution.

John

John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric 
Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to 
faxing)

On Aug 19, 2023, at 4:35 PM, J C Nash <profjcn...@gmail.com> wrote:

Why bother. nlsr can find a solution from very crude start.

Mixture <- c(17, 14, 5, 1, 11, 2, 16, 7, 19, 23, 20, 6, 13, 21, 3, 18, 15, 26, 
8, 22)
x1 <- c(69.98, 72.5, 77.6, 79.98, 74.98, 80.06, 69.98, 77.34, 69.99, 67.49, 
67.51, 77.63,
   72.5, 67.5, 80.1, 69.99, 72.49, 64.99, 75.02, 67.48)
x2 <- c(29, 25.48, 21.38, 19.85, 22, 18.91, 29.99, 19.65, 26.99, 29.49, 32.47,
   20.35, 26.48, 31.47, 16.87, 27.99, 24.49, 31.99, 24.96, 30.5)
x3 <- c(1, 2, 1, 0, 3, 1, 0, 2.99, 3, 3, 0, 2, 1, 1, 3, 2,
   3, 3, 0, 2)
y <- c(1.4287, 1.4426, 1.4677, 1.4774, 1.4565,
  1.4807, 1.4279, 1.4684, 1.4301, 1.4188, 1.4157, 1.4686, 1.4414,
  1.4172, 1.4829, 1.4291, 1.4438, 1.4068, 1.4524, 1.4183)
mydata<-data.frame(Mixture, x1, x2, x3, y)
mydata
mymod <- y ~ 1/(Beta1*x1 + Beta2*x2 + Beta3*x3)
library(nlsr)
strt<-c(Beta1=1, Beta2=2, Beta3=3)
trysol<-nlxb(formula=mymod, data=mydata, start=strt, trace=TRUE)
trysol
# or pshort(trysol)


Output is

residual sumsquares =  1.5412e-05  on  20 observations
    after  29    Jacobian and  43 function evaluations
name         coeff          SE   tstat       pval    gradient  JSingval
Beta1   0.00629212   5.997e-06    1049  2.425e-42   4.049e-08     721.8
Beta2   0.00867741   1.608e-05   539.7  1.963e-37  -2.715e-08     56.05
Beta3   0.00801948   8.809e-05   91.03  2.664e-24   1.497e-08     10.81

J Nash


On 2023-08-19 16:19, Paul Bernal wrote:
Dear friends,
Hope you are all doing well and having a great weekend.  I have data that
was collected on specific gravity and spectrophotometer analysis for 26
mixtures of NG (nitroglycerine), TA (triacetin), and 2 NDPA (2 -
nitrodiphenylamine).
In the dataset, x1 = %NG,  x2 = %TA, and x3 = %2 NDPA.
The response variable is the specific gravity, and the rest of the
variables are the predictors.
This is the dataset:
dput(mod14data_random)
structure(list(Mixture = c(17, 14, 5, 1, 11, 2, 16, 7, 19, 23,
20, 6, 13, 21, 3, 18, 15, 26, 8, 22), x1 = c(69.98, 72.5, 77.6,
79.98, 74.98, 80.06, 69.98, 77.34, 69.99, 67.49, 67.51, 77.63,
72.5, 67.5, 80.1, 69.99, 72.49, 64.99, 75.02, 67.48), x2 = c(29,
25.48, 21.38, 19.85, 22, 18.91, 29.99, 19.65, 26.99, 29.49, 32.47,
20.35, 26.48, 31.47, 16.87, 27.99, 24.49, 31.99, 24.96, 30.5),
x3 = c(1, 2, 1, 0, 3, 1, 0, 2.99, 3, 3, 0, 2, 1, 1, 3, 2,
3, 3, 0, 2), y = c(1.4287, 1.4426, 1.4677, 1.4774, 1.4565,
1.4807, 1.4279, 1.4684, 1.4301, 1.4188, 1.4157, 1.4686, 1.4414,
1.4172, 1.4829, 1.4291, 1.4438, 1.4068, 1.4524, 1.4183)), row.names =
c(NA,
-20L), class = "data.frame")
The model is the following:
y = 1/(Beta1x1 + Beta2x2 + Beta3x3)
I need to determine starting (initial) values for the model parameters for
this nonlinear regression model, any ideas on how to accomplish this using
R?
Cheers,
Paul

Re: [R] Could not read time series data using read.zoo()

2023-08-03 Thread John Kane
One reason seems to be that you are saying sep = "," and there is no "," in the
file.
Also, you have only 3 columns of data but 4 variable names.
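
A sketch of one way around both problems, using the rows quoted below as inline text instead of the poster's file. It skips the header line (whose "Adj Close" contains a space), supplies three column names by hand, splits on whitespace, and lets fill = TRUE pad the first row, which has no lret value. The column name AdjClose is my choice. The read.zoo() step is guarded because it needs the zoo package.

```r
txt <- "Date Adj Close lret
02-01-1997 737.01
03-01-1997 748.03 1.48416235
06-01-1997 747.65 -0.050813009
07-01-1997 753.23 0.743567202
08-01-1997 748.41 -0.64196699
09-01-1997 754.85 0.856809786
10-01-1997 759.5 0.614126802"

# Default sep = "" splits on whitespace; skip the header and name columns
dat <- read.table(text = txt, skip = 1, header = FALSE, fill = TRUE,
                  col.names = c("Date", "AdjClose", "lret"))
dat$Date <- as.character(dat$Date)

# Check that every date parses with the day-month-year format
dates <- as.Date(dat$Date, format = "%d-%m-%Y")
str(dat)

# With zoo installed, the cleaned data frame can be passed to read.zoo()
if (requireNamespace("zoo", quietly = TRUE)) {
  z <- zoo::read.zoo(dat, index.column = 1, format = "%d-%m-%Y")
  print(head(z))
}
```

For the real file, the same read.table() arguments would apply with file = "1.csv" in place of text = txt.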

On Thu, 3 Aug 2023 at 10:53, Christofer Bogaso 
wrote:

> Hi,
>
> I have a CSV which contains data like below (only first few rows),
>
> Date Adj Close lret
> 02-01-1997 737.01
> 03-01-1997 748.03 1.48416235
> 06-01-1997 747.65 -0.050813009
> 07-01-1997 753.23 0.743567202
> 08-01-1997 748.41 -0.64196699
> 09-01-1997 754.85 0.856809786
> 10-01-1997 759.5 0.614126802
>
> However when I try to read this data using below code I get error,
>
> read.zoo("1.csv", sep = ',', format = '%d-%m-%Y')
>
> Error reads as,
>
> index has 4500 bad entries at data rows: 1 2 3 4 5 6 7 8 9.
>
> Could you please help to understand why I am getting this error?
>
> > sessionInfo()
>
> R version 4.2.2 (2022-10-31)
>
> Platform: x86_64-apple-darwin17.0 (64-bit)
>
> Running under: macOS Big Sur ... 10.16
>
>
> Matrix products: default
>
> BLAS:
>  /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
>
> LAPACK:
> /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
>
>
> locale:
>
> [1] C/UTF-8/C/C/C/C
>
>
> attached base packages:
>
> [1] stats graphics  grDevices utils datasets  methods   base
>
>
> other attached packages:
>
> [1] zoo_1.8-12
>
>
> loaded via a namespace (and not attached):
>
> [1] compiler_4.2.2  tools_4.2.2 grid_4.2.2  lattice_0.20-45
>


-- 
John Kane
Kingston ON Canada



Re: [R] Obtaining R-squared from All Possible Combinations of Linear Models Fitted

2023-07-17 Thread John C Frain
MuMIn is a package designed to select optimum models, mainly based on
information criteria.  R-squared is not a suitable criterion for this
purpose.  As far as I can see, it is not covered in this package.  (I presume
you already know that R-squared for the model with all possible regressors
is at least as great as R-squared for any subset of the regressors.)

If you want to calculate all these R-squareds, it should be easy to write a
small routine to estimate them.  I am very curious as to why you wish to do
this.
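
Such a routine might look like the sketch below, which uses only base R and the data frame posted in the original question. It fits every non-empty subset of the regressors and collects the R-squared of each fit. The object names (subsets, results) are mine. If I recall the MuMIn documentation correctly, dredge() also has an `extra` argument that can report R-squared, but that route is not shown or tested here.

```r
final_frame <- data.frame(
  y  = c(41.9, 44.5, 43.9, 30.9, 27.9, 38.9, 30.9, 28.9, 25.9, 31, 29.5,
         35.9, 37.5, 37.9),
  x1 = c(6.6969, 8.7951, 9.0384, 5.9592, 4.5429, 8.3607, 5.898, 5.6039,
         4.9176, 6.2712, 5.0208, 5.8282, 5.9894, 7.5422),
  x4 = c(1.488, 1.82, 1.5, 1.121, 1.175, 1.777, 1.24, 1.501, 0.998, 0.975,
         1.5, 1.225, 1.256, 1.69),
  x8 = c(22, 50, 23, 32, 40, 48, 51, 32, 42, 30, 62, 32, 40, 22),
  x2 = c(1.5, 1.5, 1, 1, 1, 1.5, 1, 1, 1, 1, 1, 1, 1, 1.5),
  x7 = c(3, 4, 3, 3, 3, 4, 3, 3, 4, 2, 4, 3, 3, 3)
)

regressors <- setdiff(names(final_frame), "y")

# Every non-empty subset of the regressors, smallest subsets first
subsets <- unlist(
  lapply(seq_along(regressors),
         function(k) combn(regressors, k, simplify = FALSE)),
  recursive = FALSE
)

# Fit each model and pull out its R-squared
r2 <- vapply(subsets, function(vars) {
  fit <- lm(reformulate(vars, response = "y"), data = final_frame)
  summary(fit)$r.squared
}, numeric(1))

results <- data.frame(
  model = vapply(subsets, paste, character(1), collapse = " + "),
  r.squared = r2
)
results[order(-results$r.squared), ]
```

With 5 regressors there are 31 models; the full model is the last row of results before sorting and has the largest R-squared, as noted above.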


John C Frain.
3 Aranleigh Park
Rathfarnham
Dublin 14
Ireland
www.tcd.ie/Economics/staff/frainj/home.html
https://jcfrain.wordpress.com/
https://jcfraincv19.wordpress.com/

mailto:fra...@tcd.ie
mailto:fra...@gmail.com


On Mon, 17 Jul 2023 at 18:25, Paul Bernal  wrote:

> Dear friends,
>
> I need to automatically fit all possible linear regression models (with all
> possible combinations of regressors), and found the MuMIn package, which
> has the dredge function.
>
> This is the dataset  I am working with:
> > dput(final_frame)
> structure(list(y = c(41.9, 44.5, 43.9, 30.9, 27.9, 38.9, 30.9,
> 28.9, 25.9, 31, 29.5, 35.9, 37.5, 37.9), x1 = c(6.6969, 8.7951,
> 9.0384, 5.9592, 4.5429, 8.3607, 5.898, 5.6039, 4.9176, 6.2712,
> 5.0208, 5.8282, 5.9894, 7.5422), x4 = c(1.488, 1.82, 1.5, 1.121,
> 1.175, 1.777, 1.24, 1.501, 0.998, 0.975, 1.5, 1.225, 1.256, 1.69
> ), x8 = c(22, 50, 23, 32, 40, 48, 51, 32, 42, 30, 62, 32, 40,
> 22), x2 = c(1.5, 1.5, 1, 1, 1, 1.5, 1, 1, 1, 1, 1, 1, 1, 1.5),
> x7 = c(3, 4, 3, 3, 3, 4, 3, 3, 4, 2, 4, 3, 3, 3)), class =
> "data.frame", row.names = c(NA,
> -14L))
>
> I started with the all regressor model, which I called globalmodel as
> follows:
> #Fitting Regression model with all possible combinations of regressors
> options(na.action = "na.fail") # change the default "na.omit" to prevent
> models
> globalmodel <- lm(y~., data=final_frame)
>
> Then, the following code provides the different coefficients (for
> regressors and the intercept) for each of the possible model combinations:
> combinations <- dredge(globalmodel)
> print(combinations)
>  I would like to retrieve  the R-squared generated by each combination, but
> have not been able to get it thus far.
>
> Any guidance on how to retrieve the R-squared from all linear model
> combinations would be greatly appreciated.
>
> Kind regards,
> Paul
>


Re: [R] ggplot: Can plot graphs with points, can't plot graph with points and line

2023-07-13 Thread John Kane
Hi John,

This should do what you want.  I've changed your data.frame name for my own
convenience to "dat1".

###===
library(ggplot2)

dat1  <- data.frame(
  Time    = c("Age.25", "Age.35", "Age.45", "Age.55"),
  Medians = c(128.25, 148.75, 158.5, 168.75)
)

# create segments data.frame: each row joins one point to the next
dat2  <- data.frame(x = dat1$Time[1:3], xend = dat1$Time[2:4],
                    y = dat1$Medians[1:3], yend = dat1$Medians[2:4])

p1  <- ggplot(dat1, aes(x = Time, y = Medians)) +
  geom_point()

# Option 1: draw each segment by hand
p1 + geom_segment(x = "Age.25", y = 128.25, xend = "Age.35", yend = 148.75) +
  geom_segment(x = "Age.35", y = 148.75, xend = "Age.45", yend = 158.5) +
  geom_segment(x = "Age.45", y = 158.5, xend = "Age.55", yend = 168.75)

# Option 2: draw all segments from the dat2 data.frame
# This solution shamelessly stolen from
# https://stackoverflow.com/questions/62536499/how-to-draw-multiple-line-segment-in-ggplot
p1 + geom_segment(
  data = dat2,
  mapping = aes(x = x, y = y, xend = xend, yend = yend),
  inherit.aes = FALSE
)
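
A shorter route, offered as a sketch and not as what the thread settled on: with a discrete x, geom_line() treats each x value as its own group, so there is nothing to connect; adding group = 1 to the aesthetics puts all points in one group. The data frame is also built directly with data.frame() here, because cbind() on a character and a numeric vector yields a character matrix, which turns the medians into text. The name via_matrix is mine.

```r
Time    <- c("Age.25", "Age.35", "Age.45", "Age.55")
Medians <- c(128.25, 148.75, 158.5, 168.75)

# cbind() coerces everything to character, so Median is no longer numeric
via_matrix <- data.frame(matrix(cbind(Time, Medians), nrow = 4, ncol = 2,
                                dimnames = list(NULL, c("Time", "Median"))))
class(via_matrix$Median)

# Building the data frame directly keeps Median numeric
themedians <- data.frame(Time = Time, Median = Medians)
class(themedians$Median)

# group = 1 tells geom_line() that all four points belong to one line
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(themedians, aes(x = Time, y = Median, group = 1)) +
    geom_point() +
    geom_line()
  print(p)
}
```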


On Thu, 13 Jul 2023 at 01:11, Jim Lemon  wrote:

> Hi John,
> I'm not sure how to do this with ggplot, but:
>
> Time<- c("Age.25","Age.35","Age.45","Age.55")
> Medians<-c(128.25,148.75,158.5,168.75)
> > is.character(Time)
> # [1] TRUE - thus it has no intrinsic numeric value to plot
> > is.numeric(Medians)
> # [1] TRUE
> # coerce Medians to factor and then plot against Time, but can't do
> point/line
> plot(as.factor(Time),Medians,type="p")
> # let R determine the x values (1:4) and omit the x-axis
> plot(Medians,type="b",xaxt="n")
> # add the x-axis with the "Time" labels
> axis(1,at=1:4,labels=Time)
>
>
> On Thu, Jul 13, 2023 at 11:18 AM Sorkin, John 
> wrote:
> >
> > I am trying to plot four points, and join the points with lines. I can
> plot the points, but I can't plot the points and the line.
> > I hope someone can help my with my ggplot code.
> >
> > # load ggplot2
> > if(!require(ggplot2)){install.packages("ggplot2")}
> > library(ggplot2)
> >
> > # Create data
> > Time   <- c("Age.25","Age.35","Age.45","Age.55")
> > Medians<-c(128.25,148.75,158.5,168.75)
> > themedians <- matrix(data=cbind(Time,Medians),nrow=4,ncol=2)
> > dimnames(themedians) <- list(NULL,c("Time","Median"))
> > # Convert to dataframe the data format used by ggplot
> > themedians <- data.frame(themedians)
> > themedians
> >
> > # This plot works
> > ggplot(themedians,aes(x=Time,y=Median))+
> >   geom_point()
> > # This plot does not work!
> > ggplot(themedians,aes(x=Time,y=Median))+
> >   geom_point()+
> >   geom_line()
> >
> > Thank you,
> > John
> >


-- 
John Kane
Kingston ON Canada



[R] ggplot: Can plot graphs with points, can't plot graph with points and line

2023-07-12 Thread Sorkin, John
I am trying to plot four points and join them with lines. I can plot the 
points, but I can't plot the points and the line.
I hope someone can help me with my ggplot code.

# load ggplot2
if(!require(ggplot2)){install.packages("ggplot2")}
library(ggplot2)

# Create data
Time   <- c("Age.25","Age.35","Age.45","Age.55")
Medians<-c(128.25,148.75,158.5,168.75)
themedians <- matrix(data=cbind(Time,Medians),nrow=4,ncol=2)
dimnames(themedians) <- list(NULL,c("Time","Median"))
# Convert to dataframe the data format used by ggplot
themedians <- data.frame(themedians)
themedians

# This plot works
ggplot(themedians,aes(x=Time,y=Median))+
  geom_point()
# This plot does not work!
ggplot(themedians,aes(x=Time,y=Median))+
  geom_point()+
  geom_line()

Thank you,
John



Re: [R] Getting an error calling MASS::boxcox in a function

2023-07-08 Thread John Fox

Hi Bert,

On 2023-07-08 3:42 p.m., Bert Gunter wrote:



Thanks John.

?boxcox says:

*
Arguments

object

a formula or fitted model object. Currently only lm and aov objects are handled.
*
I read that as saying that

boxcox(lm(z+1 ~ 1),...)

should run without error. But it didn't. And perhaps here's why:
BoxCoxLambda <- function(z){
b <- MASS:::boxcox.lm(lm(z+1 ~ 1), lambda = seq(-5, 5, length.out =
61), plotit = FALSE)
b$x[which.max(b$y)]# best lambda
}


lambdas <- apply(dd,2 , BoxCoxLambda)

Error in NextMethod() : 'NextMethod' called from an anonymous function

and, indeed, ?UseMethod says:
"NextMethod should not be called except in methods called by UseMethod
or from internal generics (see InternalGenerics). In particular it
will not work inside anonymous calling functions (e.g.,
get("print.ts")(AirPassengers))."

BUT 
BoxCoxLambda <- function(z){
   b <- MASS:::boxcox(z+1 ~ 1, lambda = seq(-5, 5, length.out = 61),
plotit = FALSE)
   b$x[which.max(b$y)]# best lambda
}


lambdas <- apply(dd,2 , BoxCoxLambda)
lambdas

[1] 0.167 0.167


As it turns out, it's the update() step in boxcox.lm() that fails, and 
the update takes place because $y is missing from the lm object, so the 
following works:


BoxCoxLambda <- function(z){
b <- boxcox(lm(z + 1 ~ 1, y=TRUE),
lambda = seq(-5, 5, length.out = 101),
plotit = FALSE)
b$x[which.max(b$y)]
}
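
For anyone wanting to run that version end to end, here is a self-contained sketch using the simulated matrix from earlier in the thread. It assumes the MASS package is installed, which is the case for standard R installations. The intermediate name fit is mine.

```r
library(MASS)

BoxCoxLambda <- function(z) {
  # y = TRUE stores the response in the fit, so boxcox.lm() has no need
  # to update() the model from inside the function
  fit <- lm(z + 1 ~ 1, y = TRUE)
  b <- boxcox(fit, lambda = seq(-5, 5, length.out = 101), plotit = FALSE)
  b$x[which.max(b$y)]   # lambda with the highest profile log-likelihood
}

# Simulated data as posted earlier in the thread
set.seed(12345)
dd <- matrix(rgamma(500 * 2, shape = 2, scale = 5), nrow = 500, ncol = 2)

lambdas <- apply(dd, 2, BoxCoxLambda)
lambdas
```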



The identical lambdas do not seem right to me; 


I think that's just an accident of the example (using the BoxCoxLambda() 
above):


> apply(dd, 2, BoxCoxLambda, simplify = TRUE)
[1] 0.2 0.2

> dd[, 2]  <- dd[, 2]^3
> apply(dd, 2, BoxCoxLambda, simplify = TRUE)
[1] 0.2 0.1

Best,
 John


nor do I understand why
boxcox.lm apparently throws the error while boxcox.formula does not
(it also calls NextMethod()) So I would welcome clarification to clear
my clogged (cerebral) sinuses. :-)


Best,
Bert


On Sat, Jul 8, 2023 at 11:25 AM John Fox  wrote:


Dear Ron and Bert,

First (and without considering why one would want to do this, e.g.,
adding a start of 1 to the data), the following works for me:

-- snip --

  > library(MASS)

  > BoxCoxLambda <- function(z){
+   b <- boxcox(z + 1 ~ 1,
+   lambda = seq(-5, 5, length.out = 101),
+   plotit = FALSE)
+   b$x[which.max(b$y)]
+ }

  > mrow <- 500
  > mcol <- 2
  > set.seed(12345)
  > dd <- matrix(rgamma(mrow*mcol, shape = 2, scale = 5), nrow = mrow,
+   ncol = mcol)

  > dd1 <- dd[, 1] # 1st column of dd
  > res <- boxcox(lm(dd1 + 1 ~ 1), lambda = seq(-5, 5, length.out = 101),
+   plotit = FALSE)
  > res$x[which.max(res$y)]
[1] 0.2

  > apply(dd, 2, BoxCoxLambda, simplify = TRUE)
[1] 0.2 0.2

-- snip --

One could also use the powerTransform() function in the car package,
which in this context transforms towards *multi*normality:

-- snip --

  > library(car)
Loading required package: carData

  > powerTransform(dd + 1)
Estimated transformation parameters
       Y1        Y2
0.1740200 0.2089925

I hope this helps,
   John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2023-07-08 12:47 p.m., Bert Gunter wrote:



No, I'm afraid I'm wrong. Something went wrong with my R session and gave
me incorrect answers. After restarting, I continued to get the same error
as you did with my supposed "fix." So just ignore what I said and sorry for
the noise.

-- Bert

On Sat, Jul 8, 2023 at 8:28 AM Bert Gunter  wrote:


Try this for your function:

BoxCoxLambda <- function(z){
 y <- z
 b <- boxcox(y + 1 ~ 1,lambda = seq(-5, 5, length.out = 61), plotit =
FALSE)
 b$x[which.max(b$y)]# best lambda
}

***I think*** (corrections and clarification strongly welcomed!) that `~`
(the formula function) is looking for 'z' in the GlobalEnv, the caller of
apply(), and not finding it. It finds 'y' here explicitly in the
BoxCoxLambda environment.

Cheers,
Bert



On Sat, Jul 8, 2023 at 4:28 AM Ron Crump via R-help 
wrote:


Hi,

Firstly, apologies as I have posted this on community.rstudio.com too.

I want to optimise a Box-Cox transformation on columns of a matrix (ie, a
unique lambda for each column). So I wrote a function that includes the
call to MASS::boxcox in order that it can be applied to each column easily.
Except that I'm getting an error when calling the function. If I just
extract a column of the matrix 

Re: [R] Getting an error calling MASS::boxcox in a function

2023-07-08 Thread John Fox

Dear Ron and Bert,

First (and without considering why one would want to do this, e.g., 
adding a start of 1 to the data), the following works for me:


-- snip --

> library(MASS)

> BoxCoxLambda <- function(z){
+   b <- boxcox(z + 1 ~ 1,
+   lambda = seq(-5, 5, length.out = 101),
+   plotit = FALSE)
+   b$x[which.max(b$y)]
+ }

> mrow <- 500
> mcol <- 2
> set.seed(12345)
> dd <- matrix(rgamma(mrow*mcol, shape = 2, scale = 5), nrow = mrow, ncol =
+mcol)

> dd1 <- dd[, 1] # 1st column of dd
> res <- boxcox(lm(dd1 + 1 ~ 1), lambda = seq(-5, 5, length.out = 101),
+   plotit = FALSE)
> res$x[which.max(res$y)]
[1] 0.2

> apply(dd, 2, BoxCoxLambda, simplify = TRUE)
[1] 0.2 0.2

-- snip --
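
A possible explanation for why this version succeeds where the original failed, offered as a sketch rather than a definitive diagnosis: when boxcox() is handed a fitted lm object that lacks the stored response, MASS appears to refit it via update(), and that refit evaluates the formula somewhere the function argument z is not visible. Passing the formula directly, or fitting with lm(..., y = TRUE, qr = TRUE) so that no refit is needed, sidesteps the lookup. The function names lambda_formula and lambda_lm are illustrative and do not come from the thread.

```r
library(MASS)

# Variant 1 (as in the reply above): pass the formula itself to boxcox().
lambda_formula <- function(z) {
  b <- boxcox(z + 1 ~ 1, lambda = seq(-5, 5, length.out = 101), plotit = FALSE)
  b$x[which.max(b$y)]  # lambda with the largest profile log-likelihood
}

# Variant 2 (assumption): keep lm(), but store the response and the QR
# decomposition so that boxcox() has no need to refit the model.
lambda_lm <- function(z) {
  fit <- lm(z + 1 ~ 1, y = TRUE, qr = TRUE)
  b <- boxcox(fit, lambda = seq(-5, 5, length.out = 101), plotit = FALSE)
  b$x[which.max(b$y)]
}

set.seed(12345)
dd <- matrix(rgamma(500 * 2, shape = 2, scale = 5), nrow = 500, ncol = 2)

# One lambda per column, computed both ways
lambdas_formula <- apply(dd, 2, lambda_formula)
lambdas_lm <- apply(dd, 2, lambda_lm)
```

Both variants use the same lambda grid, so they should agree column by column.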

One could also use the powerTransform() function in the car package, 
which in this context transforms towards *multi*normality:


-- snip --

> library(car)
Loading required package: carData

> powerTransform(dd + 1)
Estimated transformation parameters
       Y1        Y2
0.1740200 0.2089925

I hope this helps,
 John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2023-07-08 12:47 p.m., Bert Gunter wrote:



No, I'm afraid I'm wrong. Something went wrong with my R session and gave
me incorrect answers. After restarting, I continued to get the same error
as you did with my supposed "fix." So just ignore what I said and sorry for
the noise.

-- Bert

On Sat, Jul 8, 2023 at 8:28 AM Bert Gunter  wrote:


Try this for your function:

BoxCoxLambda <- function(z){
y <- z
b <- boxcox(y + 1 ~ 1,lambda = seq(-5, 5, length.out = 61), plotit =
FALSE)
b$x[which.max(b$y)]# best lambda
}

***I think*** (corrections and clarification strongly welcomed!) that `~`
(the formula function) is looking for 'z' in the GlobalEnv, the caller of
apply(), and not finding it. It finds 'y' here explicitly in the
BoxCoxLambda environment.

Cheers,
Bert



On Sat, Jul 8, 2023 at 4:28 AM Ron Crump via R-help 
wrote:


Hi,

Firstly, apologies as I have posted this on community.rstudio.com too.

I want to optimise a Box-Cox transformation on columns of a matrix (ie, a
unique lambda for each column). So I wrote a function that includes the
call to MASS::boxcox in order that it can be applied to each column easily.
Except that I'm getting an error when calling the function. If I just
extract a column of the matrix and run the code not in the function, it
works. If I call the function either with an extracted column (ie dd1 in
the reprex below) or in a call to apply I get an error (see the reprex
below).

I'm sure I'm doing something silly, but I can't see what it is. Any help
appreciated.

library(MASS)

# Find optimised Lambda for Box-Cox transformation
BoxCoxLambda <- function(z){
 b <- boxcox(lm(z+1 ~ 1), lambda = seq(-5, 5, length.out = 61), plotit
= FALSE)
 b$x[which.max(b$y)]# best lambda
}

mrow <- 500
mcol <- 2
set.seed(12345)
dd <- matrix(rgamma(mrow*mcol, shape = 2, scale = 5), nrow = mrow, ncol =
mcol)

# Try it not using the BoxCoxLambda function:
dd1 <- dd[,1] # 1st column of dd
bb <- boxcox(lm(dd1+1 ~ 1), lambda = seq(-5, 5, length.out = 101), plotit
= FALSE)
print(paste0("1st column's lambda is ", bb$x[which.max(bb$y)]))
#> [1] "1st column's lambda is 0.2"

# Calculate lambda for each column of dd
lambdas <- apply(dd, 2, BoxCoxLambda, simplify = TRUE)
#> Error in eval(predvars, data, env): object 'z' not found

Created on 2023-07-08 with reprex v2.0.2

Thanks for your time and help.

Ron
 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.







Re: [R] Create a variable lenght string that can be used in a dimnames statement

2023-07-03 Thread Sorkin, John
My life is complete.
I have inspired a fortune!
John


From: Rolf Turner 
Sent: Monday, July 3, 2023 6:34 PM
To: Bert Gunter
Cc: Sorkin, John; r-help@r-project.org (r-help@r-project.org); Achim Zeileis
Subject: Re: [R]  Create a variable lenght string that can be used in a 
dimnames statement


On Mon, 3 Jul 2023 13:40:41 -0700
Bert Gunter  wrote:

> I am not going to try to sort out your confusion, as others have
> already tried and failed.



Fortune nomination!!!

cheers,

Rolf Turner

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Stats. Dep't. (secretaries) phone:
 +64-9-373-7599 ext. 89622
Home phone: +64-9-480-4619



[R] Create a variable lenght string that can be used in a dimnames statement

2023-07-03 Thread Sorkin, John
Colleagues,

I am sending this email again with a better description of my problem and the 
area where I need help.

I need help creating a string of variables that will be accepted by the 
dimnames function. The string needs to start with the dimnames j and k followed 
by a series of dimnames xxx1, . . . ., xxx2, . . ., xxxn. I create xxx1, xxx2 
(not  going to xxxn to shorten the code below) as a string using a for loop and 
the paste function. I then use a paste function, zzz <- paste("j","k",string) 
to create the full set of dimnames, j, k, xxx1, xxx2 as string. I create the 
matrix myvalues in the usual way and attempt to assign dim names to the matrix 
using the following dimnames statement,
dimnames(myvalues)<-list(NULL,c(zzz))
The dimnames statement leads to the following error, 
 Error in dimnames(x) <- dn : 
  length of 'dimnames' [2] not equal to array extent
A colnames statement,
colnames(myvalues)<-as.character(zzz)
produces the same error. 

Can someone tell me how to create a string that can be used in the dimnames 
statement?

Thank you (and please accept my apologies for double posting).

John

# create variable names xxx1 and xxx2.
string=""
for (j in 1:2){
  name <- paste("xxx",j,sep="")
  string <- paste(string,name)
  print(string)
}
# Creation of xxx1 and xxx2 works
string

# Create matrix
myvalues <- matrix(nrow=2,ncol=4)
head(myvalues,1)
# Add "j" and "k" to the string of column names
zzz <- paste("j","k",string)
zzz
# assign column names, j, k, xxx1, xxx2 to the matrix
# create column names, j, k, xxx1, xxx2.
dimnames(myvalues)<-list(NULL,c(zzz))
colnames(myvalues) <- string
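
For reference, a minimal sketch of one way to get four separate names. paste() with several arguments and the default sep glues everything into a single string of length 1, while the matrix needs a character vector with one element per column. Building the vector with c() and paste0() is an assumption about what was intended, not code from the thread; zzz_single is a hypothetical name used only for contrast.

```r
# A single string: length 1, so it cannot label 4 columns
zzz_single <- paste("j", "k", "xxx1", "xxx2")
length(zzz_single)

# A character vector with one name per column
zzz <- c("j", "k", paste0("xxx", 1:2))
length(zzz)

# With a vector of the right length the assignment goes through
myvalues <- matrix(nrow = 2, ncol = 4)
dimnames(myvalues) <- list(NULL, zzz)
colnames(myvalues)
```

The same vector can be extended to xxxn by changing 1:2 to 1:n.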


Re: [R] Create matrix with column names wiht the same prefix xxxx and that end in 1, 2

2023-07-03 Thread Sorkin, John
Jeff,
Again my thanks for your guidance.
I replaced dimnames(myvalues)<-list(NULL,c(zzz))
with
colnames(myvalues)<-zzz
and get the same error,
Error in dimnames(x) <- dn :
  length of 'dimnames' [2] not equal to array extent
It appears that I am creating the string zzz in a manner that is not compatible 
with either
dimnames(myvalues)<-list(NULL,c(zzz))
or
colnames(myvalues)<-zzz

I think I need to modify the way I create the string zzz.

# create variable names xxx1 and xxx2.
string=""
for (j in 1:2){
  name <- paste("xxx",j,sep="")
  string <- paste(string,name)
  print(string)
}
# Creation of xxx1 and xxx2 works
string

# Create matrix
myvalues <- matrix(nrow=2,ncol=4)
head(myvalues,1)
# Add "j" and "k" to the string of column names
zzz <- paste("j","k",string)
zzz
# assign column names, j, k, xxx1, xxx2 to the matrix
# create column names, j, k, xxx1, xxx2.
dimnames(myvalues)<-list(NULL,c(zzz))
colnames(myvalues)<-zzz
____
From: Jeff Newmiller 
Sent: Monday, July 3, 2023 2:45 PM
To: Sorkin, John
Cc: r-help@r-project.org
Subject: Re: [R]  Create matrix with column names wiht the same prefix  and 
that end in 1, 2

I really think you should read that help page.  colnames() accesses the second 
element of dimnames() directly.
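
A small sketch of that relationship, using a hypothetical 2 x 4 matrix m that is not from the thread: after setting colnames(), the same values appear as the second element of dimnames(), so both forms of assignment fail in the same way when the name vector has the wrong length.

```r
m <- matrix(0, nrow = 2, ncol = 4)
colnames(m) <- c("a", "b", "c", "d")

# colnames(m) and dimnames(m)[[2]] are the same vector
same_names <- identical(colnames(m), dimnames(m)[[2]])

# A name vector of the wrong length is rejected by colnames<- as well;
# the error message is captured instead of stopping the script
wrong_length <- tryCatch(
  {
    colnames(m) <- "a b c d"
    "no error"
  },
  error = function(e) conditionMessage(e)
)
```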

On July 3, 2023 11:39:37 AM PDT, "Sorkin, John"  
wrote:
>Jeff,
>Thank you for your reply.
>I should have said with dim names not column names. I want the matrix to have 
>dim names, no row names, dim names j, k, xxx1, xxx2.
>
>John
>
>John David Sorkin M.D., Ph.D.
>Professor of Medicine
>Chief, Biostatistics and Informatics
>University of Maryland School of Medicine Division of Gerontology and 
>Geriatric Medicine
>Baltimore VA Medical Center
>10 North Greene Street
>GRECC (BT/18/GR)
>Baltimore, MD 21201-1524
>(Phone) 410-605-7119
>(Fax) 410-605-7913 (Please call phone number above prior to 
>faxing)
>
>On Jul 3, 2023, at 2:11 PM, Jeff Newmiller  wrote:
>
>?colnames
>
>On July 3, 2023 11:00:32 AM PDT, "Sorkin, John"  
>wrote:
>I am trying to create an array, myvalues, having 2 rows and 4 columns, where 
>the column names are j,k,xxx1,xxx2. The code below fails, with the following 
>error, "Error in dimnames(myvalues) <- list(NULL, zzz) :
>length of 'dimnames' [2] not equal to array extent"
>
>Please help me get the code to work.
>
>Thank you,
>John
>
># create variable names xxx1 and xxx2.
>string=""
>for (j in 1:2){
>name <- paste("xxx",j,sep="")
>string <- paste(string,name)
>print(string)
>}
># Creation of xxx1 and xxx2 works
>string
>
># Create matrix
>myvalues <- matrix(nrow=2,ncol=4)
>head(myvalues,1)
># Add "j" and "k" to the string of column names
>zzz <- paste("j","k",string)
>zzz
># assign column names, j, k, xxx1, xxx2 to the matrix
># create column names, j, k, xxx1, xxx2.
>dimnames(myvalues)<-list(NULL,zzz)
>
>
>
>--
>Sent from my phone. Please excuse my brevity.
>

--
Sent from my phone. Please excuse my brevity.


Re: [R] Create matrix with column names wiht the same prefix xxxx and that end in 1, 2

2023-07-03 Thread Sorkin, John
Jeff,
Thank you for your reply.
I should have said with dim names not column names. I want the matrix to have 
dim names, no row names, dim names j, k, xxx1, xxx2.

John

John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric 
Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to 
faxing)

On Jul 3, 2023, at 2:11 PM, Jeff Newmiller  wrote:

?colnames

On July 3, 2023 11:00:32 AM PDT, "Sorkin, John"  
wrote:
I am trying to create an array, myvalues, having 2 rows and 4 columns, where 
the column names are j,k,xxx1,xxx2. The code below fails, with the following 
error, "Error in dimnames(myvalues) <- list(NULL, zzz) :
length of 'dimnames' [2] not equal to array extent"

Please help me get the code to work.

Thank you,
John

# create variable names xxx1 and xxx2.
string=""
for (j in 1:2){
name <- paste("xxx",j,sep="")
string <- paste(string,name)
print(string)
}
# Creation of xxx1 and xxx2 works
string

# Create matrix
myvalues <- matrix(nrow=2,ncol=4)
head(myvalues,1)
# Add "j" and "k" to the string of column names
zzz <- paste("j","k",string)
zzz
# assign column names, j, k, xxx1, xxx2 to the matrix
# create column names, j, k, xxx1, xxx2.
dimnames(myvalues)<-list(NULL,zzz)



--
Sent from my phone. Please excuse my brevity.




[R] Create matrix with column names wiht the same prefix xxxx and that end in 1, 2

2023-07-03 Thread Sorkin, John
I am trying to create an array, myvalues, having 2 rows and 4 columns, where 
the column names are j,k,xxx1,xxx2. The code below fails, with the following 
error, "Error in dimnames(myvalues) <- list(NULL, zzz) : 
  length of 'dimnames' [2] not equal to array extent" 

Please help me get the code to work.

Thank you,
John

# create variable names xxx1 and xxx2.
string=""
for (j in 1:2){
  name <- paste("xxx",j,sep="")
  string <- paste(string,name)
  print(string)
}
# Creation of xxx1 and xxx2 works
string

# Create matrix
myvalues <- matrix(nrow=2,ncol=4)
head(myvalues,1)
# Add "j" and "k" to the string of column names
zzz <- paste("j","k",string)
zzz
# assign column names, j, k, xxx1, xxx2 to the matrix
# create column names, j, k, xxx1, xxx2.
dimnames(myvalues)<-list(NULL,zzz)




Re: [R] Plotting factors in graph panel

2023-06-29 Thread John Kane
>  header=TRUE,stringsAsFactors=FALSE)
> > at_df<-at_df[at_df$Income!="No_Answer",which(names(at_df)!="Bank_NA")]
> > png("MF_Bank.png",height=600)
> > par(mfrow=c(2,1))
> > matplot(at_df[,c("MF_None","MF_Equity","MF_Debt","MF_Hybrid")],
> >  type="l",col=1:4,lty=1:4,lwd=3,
> >  main="Percentages by Income and MF type",
> > xlab="Income",ylab="Percentage of group",xaxt="n")
> > axis(1,at=1:5,labels=at_df$Income)
> > legend(3,24,c("MF_None","MF_Equity","MF_Debt","MF_Hybrid"),
> >  lty=1:4,lwd=3,col=1:4)
> > matplot(at_df[,c("Bank_None","Bank_Current","Bank_Savings")],
> >  type="l",col=1:3,lty=1:4,lwd=3,
> >  main="Percentages by Income and Bank type",
> > xlab="Income",ylab="Percentage of group",xaxt="n")
> > axis(1,at=1:5,labels=at_df$Income)
> > legend(3,54,c("Bank_None","Bank_Current","Bank_Savings"),
> >  lty=1:4,lwd=3,col=1:3)
> > dev.off()
> >
> > Jim
> >
> > On Wed, Jun 28, 2023 at 6:33 PM Anupam Tyagi 
> wrote:
> > >
> > > Hello,
> > >
> > > I want to plot the following kind of data (percentage of respondents
> > from a
> > > survey) that varies by Income into many small *line* graphs in a
> > > panel of graphs. I want to omit "No Answer" categories. I want to
> > > see how each one of the categories (percentages), "None", " Equity",
> > > etc. varies by
> > Income.
> > > How can I do this? How to organize the data well and how to plot? I
> > thought
> > > Lattice may be a good package to plot this, but I don't know for
> > > sure. I prefer to do this in Base-R if possible, but I am open to
> > > ggplot. Any
> > ideas
> > > will be helpful.
> > >
> > > Income
> > >                 $10    $25    $40    $75  > $75  No Answer
> > > MF                1      2      3      4      5      9
> > > None       1   3.05   2.29   2.24   1.71   1.30   2.83
> > > Equity     2  29.76  28.79  29.51  28.90  31.67  36.77
> > > Debt       3  31.18  32.64  34.31  35.65  37.59  33.15
> > > Hybrid     4  36.00  36.27  33.94  33.74  29.44  27.25
> > > Bank AC
> > > None       1  46.54  54.01   59.1  62.17  67.67  60.87
> > > Current    2  24.75   24.4     25  24.61  24.02  21.09
> > > Savings    3   25.4   18.7     29  11.48  7.103  13.46
> > > No Answer  9  3.307  2.891   13.4  1.746  1.208  4.577
> > >
> > > Thanks.
> > > --
> > > Anupam.
> > >
> >
>
>
> --
> Anupam.
>
>


-- 
John Kane
Kingston ON Canada



Re: [R] R does not run under latest RStudio

2023-04-06 Thread Sorkin, John
I have also had difficulty running R in RStudio. Has anyone else had problems?
It will be a shame if we need to abandon RStudio. It is a very good IDE.
John

John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric 
Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to 
faxing)

On Apr 6, 2023, at 5:30 PM, David Winsemius  wrote:


On 4/6/23 03:49, Steven Yen wrote:
The RStudio list generally does not respond to free version users. I was hoping 
someone on this (R) list would be kind enough to help me.


I don't think that is true. It is perhaps true that you cannot get personalized 
help from employed staff, but you can certainly submit to the Q&A forum.


--

David


Steven from iPhone

On Apr 6, 2023, at 6:22 PM, Uwe Ligges  wrote:

No, but you need to ask on an RStudio mailing list.
This one is about R.

Best,
Uwe Ligges




On 06.04.2023 11:28, Steven T. Yen wrote:
I updated to latest RStudio (RStudio-2023.03.0-386.exe) but
R would not run. Error message:
Error Starting R
The R session failed to start.
RSTUDIO VERSION
RStudio 2023.03.0+386 "Cherry Blossom " (3c53477a, 2023-03-09) for Windows
[No error available]
I also tried RStudio 2022.12.0+353 --- same problem.
I then tried another older version of RStudio (not sure version
as I changed file name by accident) and R ran.
Any clues? Please help. Thanks.

Re: [R] R does not run under latest RStudio

2023-04-06 Thread John Dougherty via R-help
On Thu, 6 Apr 2023 17:28:32 +0800
"Steven T. Yen"  wrote:

> I updated to latest RStudio (RStudio-2023.03.0-386.exe) but
> R would not run. Error message:
> 
> Error Starting R
> The R session failed to start.
> 
> RSTUDIO VERSION
> RStudio 2023.03.0+386 "Cherry Blossom " (3c53477a, 2023-03-09) for
> Windows [No error available]
> 
> I also tried RStudio 2022.12.0+353 --- same problem.
> 
> I then tried another older version of RStudio (not sure version
> as I changed file name by accident) and R ran.
> 
> Any clues? Please help. Thanks.
> 
> 

Just to be thorough, what version of R are you running?  RStudio is its
own project, and they have shifted their emphasis regarding R
somewhat.  The web site now states that the organization - now
called Posit - is not de-emphasizing R so much as extending to embrace
Python.  The current version of RStudio requires R 3.3.0 or later.

JWDougherty



Re: [R] R does not run under latest RStudio

2023-04-06 Thread John C Frain
Does R run from a command prompt?  If so, the problem is likely due to your
Rstudio setup.  If R does not run from a command prompt, any error messages
might give some idea of the problem.  I can run R and Rstudio in Windows
11, Windows 10 and the current version of Linux Mint.

On Thu 6 Apr 2023, 11:31 Uwe Ligges, 
wrote:

> No, but you need to ask on an RStudio mailing list.
> This one is about R.
>
> Best,
> Uwe Ligges
>
>
>
>
> On 06.04.2023 11:28, Steven T. Yen wrote:
> > I updated to latest RStudio (RStudio-2023.03.0-386.exe) but
> > R would not run. Error message:
> >
> > Error Starting R
> > The R session failed to start.
> >
> > RSTUDIO VERSION
> > RStudio 2023.03.0+386 "Cherry Blossom " (3c53477a, 2023-03-09) for
> Windows
> > [No error available]
> >
> > I also tried RStudio 2022.12.0+353 --- same problem.
> >
> > I then tried another older version of RStudio (not sure version
> > as I changed file name by accident) and R ran.
> >
> > Any clues? Please help. Thanks.
> >
>



[R] Simple Stacking of Two Columns

2023-04-03 Thread Sparks, John
Hi R-Helpers,

Sorry to bother you, but I have a simple task that I can't figure out how to do.

For example, I have some names in two columns

NamesWide<-data.frame(Name1=c("Tom","Dick"),Name2=c("Larry","Curly"))

and I simply want to get a single column
NamesLong<-data.frame(Names=c("Tom","Dick","Larry","Curly"))
> NamesLong
  Names
1   Tom
2  Dick
3 Larry
4 Curly


Stack produces an error
NamesLong<-stack(NamesWide$Name1,NamesWide$Names2)
Error in if (drop) { : argument is of length zero

So does bind_rows
> NamesLong<-dplyr::bind_rows(NamesWide$Name1,NamesWide$Name2)
Error in `dplyr::bind_rows()`:
! Argument 1 must be a data frame or a named atomic vector.
Run `rlang::last_error()` to see where the error occurred.

I tried making separate dataframes to get around the error in bind_rows but it 
puts the data in two different columns
Name1<-data.frame(c("Tom","Dick"))
Name2<-data.frame(c("Larry","Curly"))
NamesLong<-dplyr::bind_rows(Name1,Name2)
> NamesLong
  c..TomDick.. c..LarryCurly..
1          Tom            <NA>
2         Dick            <NA>
3         <NA>           Larry
4         <NA>           Curly

gather makes no change to the data
NamesLong<-gather(NamesWide,Name1,Name2)
> NamesLong
  Name1 Name2
1   Tom Larry
2  Dick Curly


Please help me solve what should be a very simple problem.

Thanks,
John Sparks
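
One way this might be done in base R, sketched under the assumption that the goal is simply the Name1 values followed by the Name2 values: unlist() concatenates the columns of a data frame in order, and stack() expects the whole data frame rather than two separate vectors. The object name NamesStacked is illustrative and not from the post.

```r
NamesWide <- data.frame(Name1 = c("Tom", "Dick"),
                        Name2 = c("Larry", "Curly"),
                        stringsAsFactors = FALSE)

# unlist() concatenates the columns in order; use.names = FALSE drops
# the automatically generated element names
NamesLong <- data.frame(Names = unlist(NamesWide, use.names = FALSE),
                        stringsAsFactors = FALSE)

# stack() takes the data frame itself and also records, in column 'ind',
# which original column each value came from
NamesStacked <- stack(NamesWide)
```

NamesStacked keeps the source column in ind, which can be dropped if only the names are wanted.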







Re: [R] How to test the difference between paired correlations?

2023-03-23 Thread John C Frain
   1. estimate r
   2. do the z transformation - z = atanh(r) is a simple function of r and
   is approximately normally distributed, with standard error
   1/sqrt(n - 3).
   3. use the normal distribution tables to decide on the significance of z
   or of differences between two z's.

I don't see the need for packages.
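
The three steps can be sketched in a few lines of base R. This is only an illustration under a stated assumption: the two correlations come from independent samples of sizes n1 and n2. Correlations computed on the same observations (the paired case raised in the question) are dependent, and this simple standard error does not account for that. The function name fisher_z_diff and the example values are hypothetical.

```r
# Fisher z test for the difference between two correlations estimated
# from independent samples of sizes n1 and n2
fisher_z_diff <- function(r1, n1, r2, n2) {
  z1 <- atanh(r1)                          # Fisher r-to-z transform
  z2 <- atanh(r2)
  se <- sqrt(1 / (n1 - 3) + 1 / (n2 - 3))  # SE of the difference z1 - z2
  z <- (z1 - z2) / se
  p <- 2 * pnorm(-abs(z))                  # two-sided p-value
  list(z = z, p.value = p)
}

# Made-up inputs, for illustration only
res <- fisher_z_diff(r1 = 0.60, n1 = 80, r2 = 0.35, n2 = 90)
```

The result is a list with the test statistic z and the two-sided p-value.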
John C Frain
3 Aranleigh Park
Rathfarnham
Dublin 14
Ireland
www.tcd.ie/Economics/staff/frainj/home.html
https://jcfrain.wordpress.com/
https://jcfraincv19.wordpress.com/

mailto:fra...@tcd.ie
mailto:fra...@gmail.com


On Thu, 23 Mar 2023 at 09:30, Luigi Marongiu 
wrote:

> Thank you, but this now sounds more difficult: what would be the point
> in having these ready-made functions if I have to do it manually?
> Anyway, How would I implement the last part?
>
> On Thu, Mar 23, 2023 at 1:23 AM Ebert,Timothy Aaron 
> wrote:
> >
> > If you are open to other options:
> > The null hypothesis is that there is no difference.
> >If I have two equations y=x and y=z and there is no difference then
> it would not matter if an observation was from x or z.
> >Randomize the x and z observations. For each randomization calculate
> a correlation for y=x and for y=z.
> >At each iteration calculate the absolute value of the difference in
> the correlations.
> >Generate a frequency distribution from 100,000+ randomizations.
> >Find the observed difference in the frequency from random
> distributions.
> >What proportion of observations are as large or larger than the
> observed. This is your p-value.
> >
> > Tim
> >
> > -Original Message-
> > From: R-help  On Behalf Of Luigi Marongiu
> > Sent: Wednesday, March 22, 2023 5:12 PM
> > To: r-help 
> > Subject: [R] How to test the difference between paired correlations?
> >
> > [External Email]
> >
> > Hello,
> > I have three numerical variables and I would like to test if their
> correlation is significantly different.
> > I have seen that there is a package that "Test the difference between
> two (paired or unpaired) correlations".
> > [https://www.personality-project.org/r/html/paired.r.html]
> > However, there is the need to convert the correlations to "z scores
> using the Fisher r-z transform". I have seen that there is another package
> that does that [https://search.r-project.org/CRAN/refmans/DescTools/html/FisherZ.html].
> > Yet, I do not understand how to process the data. Shall I pass the raw
> data or the correlations directly?
> >
> > I have made the following working example:
> > ```
> > # define data
> > v1 <- c(62.480,  59.492,  74.060,  88.519,  91.417,  53.907,  64.202,
> 62.426,
> > 54.406,  88.117)
> > v2 <- c(56.814, 42.005, 56.074, 65.990, 81.572, 53.855, 50.335, 63.537,
> 41.713,
> > 78.265)
> > v3 <- c(54.170,  64.224,  57.569,  85.089, 104.056,  48.713,  61.239,
> 60.290,
> > 67.308,  71.179)
> > # visual exploration
> > par(mfrow=c(2, 1))
> > plot(v2~v1, ylim=c(min(c(v1,v2,v3)), max(c(v1,v2,v3))),
> >  xlim=c(min(c(v1,v2,v3)), max(c(v1,v2,v3))),
> >  main="V1 vs V2")
> > abline(lm(v2~v1))
> > plot(v3~v1, ylim=c(min(c(v1,v2,v3)), max(c(v1,v2,v3))),
> >  xlim=c(min(c(v1,v2,v3)), max(c(v1,v2,v3))),
> >  main="V1 vs V3")
> > abline(lm(v3~v1))
> > ## test differences in correlation
> > # convert raw data into z-scores
> > library(psych)
> > library(DescTools)
> > FisherZ(v1) # I cannot convert the raw data into z scores (same for the
> other variables):
> > > [1] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN Warning message:
> > > In log((1 + rho)/(1 - rho)) : NaNs produced
> > # convert correlations into z scores
> > # (the correlation score of 0.79 has been converted into 1.08; is this
> correct?)
> > FisherZ(lm(v2~v1)$coefficients[2])
> > >  v1
> > > 1.081667
>

Re: [R] loess plotting problem

2023-03-23 Thread John Fox

Dear ,

On 2023-03-23 11:08 a.m., Anupam Tyagi wrote:

Thanks, John.

However, loess.smooth() is producing a very different curve compared to 
the one that results from applying predict() on a loess(). I am guessing 
they are using different defaults. Correct?


No need to guess. Just look at the help pages ?loess and ?loess.smooth. 
If you don't like the default for loess.smooth(), just specify the 
arguments you want.
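
As a rough illustration of the point about defaults: according to the help pages, loess() defaults to span = 0.75, degree = 2 and family = "gaussian", while loess.smooth() defaults to span = 2/3, degree = 1 and family = "symmetric". The sketch below uses the built-in cars data as a stand-in for the poster's data, which was not supplied, and asks loess() for the loess.smooth() settings so that the two curves should come out close to each other.

```r
if (!interactive()) pdf(NULL)  # avoid writing a plot file when run as a script

plot(speed ~ dist, data = cars)

# loess.smooth() with its own defaults (span = 2/3, degree = 1, symmetric).
with(cars, lines(loess.smooth(dist, speed), col = "blue"))

# loess() with the same settings, evaluated on a sorted grid of x values.
m <- loess(speed ~ dist, data = cars,
           span = 2/3, degree = 1, family = "symmetric")
grid <- data.frame(dist = seq(min(cars$dist), max(cars$dist), length.out = 50))
fit <- predict(m, newdata = grid)
lines(grid$dist, fit, col = "red", lty = 2)
```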


Best,
 John




On Thu, 23 Mar 2023 at 20:20, John Fox <mailto:j...@mcmaster.ca>> wrote:


Dear Anupam Tyagi,

You didn't include your data, so it's not possible to see exactly what
happened, but I think that you misunderstand the object that loess()
returns. It returns a "loess" object with several components, including
the original data in x and y. So if you pass the object to lines(), you'll
simply connect the points, and if x isn't sorted, the points won't be in
order. Try, e.g.,

plot(speed ~ dist, data=cars)
m <- loess(speed ~ dist, data=cars)
names(m)
lines(m)

You'd do better to use loess.smooth(), which is intended for adding a
loess regression to a scatterplot; for example,

plot(speed ~ dist, data=cars)
with(cars, lines(loess.smooth(dist, speed)))

Other points: You don't have to load the stats package, which is
available by default when you start R. It's best to avoid attach(), the
use of which can cause confusion.

I hope this helps,
   John

-- 
* preferred email: john.david@proton.me

<mailto:john.david@proton.me>
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/ <https://www.john-fox.ca/>

On 2023-03-23 10:18 a.m., Anupam Tyagi wrote:
 > For some reason the following code is not plotting as I want it to. I want
 > to plot a "loess" line plotted over a scatter plot. I get a jumble, with
 > lines connecting all the points. I had a similar problem with "lowess". I
 > solved that by dropping "NA" rows from the data columns. Please help.
 >
 > library(stats)
 > attach(gini_pci_wdi_narm)
 > plot(ny_gnp_pcap_pp_kd, si_pov_gini)
 > lines(loess(si_pov_gini ~ ny_gnp_pcap_pp_kd, gini_pci_wdi_narm))
 > detach(gini_pci_wdi_narm)
 >



--
Anupam.





Re: [R] loess plotting problem

2023-03-23 Thread John Fox

Dear Anupam Tyagi,

You didn't include your data, so it's not possible to see exactly what 
happened, but I think that you misunderstand the object that loess() 
returns. It returns a "loess" object with several components, including 
the original data in x and y. So if you pass the object to lines(), you'll 
simply connect the points, and if x isn't sorted, the points won't be in 
order. Try, e.g.,


plot(speed ~ dist, data=cars)
m <- loess(speed ~ dist, data=cars)
names(m)
lines(m)

You'd do better to use loess.smooth(), which is intended for adding a 
loess regression to a scatterplot; for example,


plot(speed ~ dist, data=cars)
with(cars, lines(loess.smooth(dist, speed)))

Other points: You don't have to load the stats package which is 
available by default when you start R. It's best to avoid attach(), the 
use of which can cause confusion.
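
If the fitted values from loess() itself are wanted, one way to avoid the jumble without attach() is to sort by the predictor before drawing. This is a minimal sketch; the built-in cars data stands in for gini_pci_wdi_narm, which was not supplied.

```r
if (!interactive()) pdf(NULL)  # avoid writing a plot file when run as a script

dat <- cars                          # stand-in for the poster's data frame
m <- loess(speed ~ dist, data = dat)

# Order the observations by x so that lines() draws left to right.
ord <- order(dat$dist)
plot(speed ~ dist, data = dat)
lines(dat$dist[ord], predict(m)[ord])
```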


I hope this helps,
 John

--
* preferred email: john.david@proton.me
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2023-03-23 10:18 a.m., Anupam Tyagi wrote:

For some reason the following code is not plotting as I want it to. I want
to plot a "loess" line plotted over a scatter plot. I get a jumble, with
lines connecting all the points. I had a similar problem with "lowess". I
solved that by dropping "NA" rows from the data columns. Please help.

library(stats)
attach(gini_pci_wdi_narm)
plot(ny_gnp_pcap_pp_kd, si_pov_gini)
lines(loess(si_pov_gini ~ ny_gnp_pcap_pp_kd, gini_pci_wdi_narm))
detach(gini_pci_wdi_narm)





[R] Error: 'format_glimpse' is not an exported object from 'namespace:pillar'

2023-03-21 Thread Sorkin, John
I am receiving the following error message. I don't understand what it means, 
and I don't know how to fix it. I am running my code in RStudio. I do not know 
if the error comes from R or RStudio. Please see session data below.
Thank you,
John


version data:
platform       x86_64-w64-mingw32
arch           x86_64
os             mingw32
system         x86_64, mingw32
status
major          3
minor          6.1
year           2019
month          07
day            05
svn rev        76782
language       R
version.string R version 3.6.1 (2019-07-05)
nickname       Action of the Toes

RStudio.Version()
$mode
[1] "desktop"

$version
[1] ‘2023.3.0.386’

$long_version
[1] "2023.03.0+386"

$release_name
[1] "Cherry Blossom"
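
A message of this form generally means that some code asked for pillar::format_glimpse but the installed pillar does not export a function of that name. One guess, not confirmed anywhere in this thread, is that the installed pillar is older than another package expects. The sketch below only inspects the installed package and changes nothing.

```r
# Check whether pillar is installed and whether it exports format_glimpse.
if (requireNamespace("pillar", quietly = TRUE)) {
  pillar_version <- as.character(utils::packageVersion("pillar"))
  has_glimpse <- "format_glimpse" %in% getNamespaceExports("pillar")
} else {
  pillar_version <- NA_character_
  has_glimpse <- FALSE
}

cat("pillar version:", pillar_version, "\n")
cat("exports format_glimpse:", has_glimpse, "\n")
```

If the second line reports FALSE, updating pillar in a fresh R session is one thing to try, although with R 3.6.1 a current version may not be installable.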


Re: [R] Good Will Legal Question

2023-03-21 Thread John Fox

Dear Timothy,

On 2023-03-21 1:38 p.m., Ebert,Timothy Aaron wrote:

My guess: It is clear from the link that they can use the R logo for commercial purposes. The issue 
is what to do about the "appropriate credit" and "link to the license." How 
would I do that on a hoodie? Would they need a web address or something?


That's a good question, and one that I missed -- the implicit focus is 
on using the logo, e.g., in software.


With the caveat that I'm not speaking for the R Foundation, I think that 
it would be sufficient to provide credit and a link to the license on 
the webpage that sells the hoodie. FWIW, I (and I expect you) have seen 
many t-shirts, etc., with R logos, some from companies, and I even have 
a few. I doubt that anyone will care.


Best,
 John



-Original Message-----
From: R-help  On Behalf Of John Fox
Sent: Tuesday, March 21, 2023 1:19 PM
To: Coding Hoodies 
Cc: r-help@r-project.org
Subject: Re: [R] Good Will Legal Question

[External Email]

Dear Arid Sweeting,

R-help is probably not the place to ask this question, although perhaps since 
you're seeking moral advice, people might want to say something. I would 
normally expect to see a query like this addressed to the R website webmasters, 
of which I'm one -- with the caveat that the R Foundation doesn't give legal 
advice.

Just to be sure, you say that you read the rules for use of the R logo, so I assume that you've 
seen <https://www.r-project.org/logo/>, which seems entirely clear to me. I think that it's safe 
to say that if the R Foundation wanted to limit commercial use of the R logo, it wouldn't have 
released it under the CC-BY-SA 4.0 license. I'm not sure what moral issues concern you.

I hope this helps,
   John

John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/

On 2023-03-21 6:18 a.m., Coding Hoodies wrote:

Hi R Team!,

We are opening a new start up soon, codinghoodies.com, we want to make coders 
feel stylish.

Out of goodwill I wanted to ask you formally if I can have permission to use 
the standard R logo on the front of hoodies to sell? I have read your rules but 
wanted to ask as I feel a moral right to email you asking to show support and 
respect for the R project.

If it makes it easier I could send a picture of the hoodie with the logo 
on to you to see if this is acceptable.

Arid Sweeting





Re: [R] DOUBT

2023-03-21 Thread John Fox

Dear Nandiniraj,

Please cc r-help in your emails so that others can see what happened 
with your problem.


You don't provide enough information to know what exactly is the source 
of your problem  -- you're more likely to get effective help if you 
provide a minimal reproducible example of the problem -- but it's a good 
guess that the variable (HHsize or perhaps some other variable) isn't in 
the newdata data frame.
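
One quick way to test that guess is to compare the variables in the formula with the column names of the data frame. In the sketch below the data frame is a made-up stand-in, since the real newdata was not shown, and the formula is shortened for illustration.

```r
# Hypothetical stand-in for the poster's data; the column is still "HH size".
newdata <- data.frame(adoption = factor(c("a", "b", "c")),
                      age = c(30, 45, 52),
                      `HH size` = c(4, 6, 3),
                      check.names = FALSE)

# Formula written with the space removed, as described in the reply.
f <- adoption ~ age + HHsize

# Variables used in the formula that are not columns of the data frame.
missing_vars <- setdiff(all.vars(f), names(newdata))
print(missing_vars)
```

Any name printed here would have to be added to the data or corrected in the formula before the model can be fitted.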


Best,
 John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2023-03-21 1:24 p.m., Nandini raj wrote:

I removed the space, but it is still showing an error, i.e. variable not found.

Nandiniraj

On Tue, Mar 21, 2023, 10:36 PM John Fox <mailto:j...@mcmaster.ca>> wrote:


Dear Nandini raj,

You have a space in the variable name "HH size".

    I hope this helps,
   John

John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/
<https://socialsciences.mcmaster.ca/jfox/>

On 2023-03-20 1:16 p.m., Nandini raj wrote:
 > Respected sir/madam
 > can you please suggest what is an unexpected symbol in the below code for
 > running a multinomial logistic regression
 >
 > model <- multinom(adoption ~ age + education + HH size + landholding +
 > Farmincome + nonfarmincome + creditaccesibility + LHI, data=newdata)
 >
 >       [[alternative HTML version deleted]]
 >





Re: [R] Good Will Legal Question

2023-03-21 Thread John Fox

Dear Arid Sweeting,

R-help is probably not the place to ask this question, although perhaps 
since you're seeking moral advice, people might want to say something. I 
would normally expect to see a query like this addressed to the R 
website webmasters, of which I'm one -- with the caveat that the R 
Foundation doesn't give legal advice.


Just to be sure, you say that you read the rules for use of the R logo, 
so I assume that you've seen <https://www.r-project.org/logo/>, which 
seems entirely clear to me. I think that it's safe to say that if the R 
Foundation wanted to limit commercial use of the R logo, it wouldn't 
have released it under the CC-BY-SA 4.0 license. I'm not sure what moral 
issues concern you.


I hope this helps,
 John

John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/

On 2023-03-21 6:18 a.m., Coding Hoodies wrote:

Hi R Team!,
  
We are opening a new start up soon, codinghoodies.com, we want to make coders feel stylish.
  
Out of goodwill I wanted to ask you formally if I can have permission to use the standard R logo on the front of hoodies to sell? I have read your rules but wanted to ask as I feel a moral right to email you asking to show support and respect for the R project.
  
If it makes it easier I could send a picture of the hoodie with the logo on to you to see if this is acceptable.
  
Arid Sweeting
  
  
  




Re: [R] DOUBT

2023-03-21 Thread John Fox

Dear Nandini raj,

You have a space in the variable name "HH size".
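
If the column in the data frame really is called "HH size", there are two common ways to make the formula parse: quote the name with backticks, or rename the column to a syntactic name. The sketch below uses made-up data, since the poster's newdata was not supplied, and uses model.frame() instead of multinom() so that it runs with base R alone; the same formulas could be passed to multinom().

```r
# Hypothetical stand-in for the poster's data.
newdata <- data.frame(adoption = factor(c("yes", "no", "yes", "no")),
                      age = c(30, 45, 52, 38),
                      `HH size` = c(4, 6, 3, 5),
                      check.names = FALSE)

# Option 1: keep the column name and quote it with backticks in the formula.
mf1 <- model.frame(adoption ~ age + `HH size`, data = newdata)

# Option 2: rename the column; make.names() turns "HH size" into "HH.size".
names(newdata) <- make.names(names(newdata))
mf2 <- model.frame(adoption ~ age + HH.size, data = newdata)

print(names(newdata))
```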

I hope this helps,
 John

John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/

On 2023-03-20 1:16 p.m., Nandini raj wrote:

Respected sir/madam
can you please suggest what is an unexpected symbol in the below code for
running a multinomial logistic regression

model <- multinom(adoption ~ age + education + HH size + landholding +
Farmincome + nonfarmincome + creditaccesibility + LHI, data=newdata)

[[alternative HTML version deleted]]



Re: [R] Trying to learn how to write an "advanced" function

2023-03-16 Thread Sorkin, John
Although I owe thanks to Rasmus and Ivan, I still do not know how to write an 
"advanced" function. 

My most recent try (after looking at the material Rasmus and Ivan sent) still 
does not work. I am trying to run the lm function on two different formulae:
1) y~x, 
2) y~x+z
Any corrections would be appreciated!

Thank you,
John


doit <- function(x){
  ds <- deparse(substitute(x))
  cat("1\n")
  print(ds)
  eval(lm(quote(ds)),parent.frame())
}

# define data that will be used in regression
y <- 1:10
x <- y+rnorm(10)
z <- c(rep(1,5),rep(2,5))
# Show what x, y  and z look like
rbind(x,y,z)

# run formula y~x
JD <- doit(y~x)
JD

# run formula y~x+z
JD2 <- doit(y~x+z)
JD2
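
One possible correction is sketched below; it is not the only way to do this. In the version above, deparse(substitute(x)) turns the formula into a character string, and quote(ds) then yields the symbol ds, not the formula, so lm() never receives y ~ x. Because a formula is already an unevaluated object, it can be spliced into a call to lm() directly. The argument name formula is a choice made for this sketch.

```r
doit <- function(formula) {
  # Build the call lm(<formula>) with the formula spliced in ...
  cl <- bquote(lm(.(formula)))
  # ... and evaluate it in the caller's frame.
  eval(cl, parent.frame())
}

# Define data that will be used in the regression.
set.seed(1)
y <- 1:10
x <- y + rnorm(10)
z <- c(rep(1, 5), rep(2, 5))

# Run formula y ~ x
JD <- doit(y ~ x)
print(coef(JD))

# Run formula y ~ x + z
JD2 <- doit(y ~ x + z)
print(coef(JD2))
```

For this simple case, function(formula) lm(formula) would also work; the bquote() form mainly keeps the printed call readable.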




From: R-help  on behalf of Rasmus Liland 

Sent: Thursday, March 16, 2023 8:42 AM
To: r-help
Subject: Re: [R] Trying to learn how to write an "advanced" function

On 2023-03-16 12:11 +, Sorkin, John wrote:
> (1) can someone point me to an
> explanation of match.call or match
> that can be understood by the
> uninitiated?

Dear John,

the man page ?match tells us that match
matches the first vector against the
second, and returns a vector of indices
the same length as the first, e.g.

> match(c("formula", "data", "subset", "weights", "na.action", 
"offset"), c("Maryland", "formula", "data", "subset", "weights", "na.action", 
"offset", "Sorkin", "subset"), 0L)
[1] 2 3 4 5 6 7

perhaps a bad answer ...
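
To complement the match() example, here is a small sketch of what match.call() returns and how the two are combined in the opening lines of lm. The function f and the argument extra are made up for illustration; mydata does not need to exist because match.call() does not evaluate the arguments.

```r
f <- function(formula, data, subset, ...) {
  # The call as the user wrote it, with every argument matched to its name.
  cl <- match.call()
  # The same call, but with everything passed through ... kept in one element.
  mf <- match.call(expand.dots = FALSE)
  # Positions of the arguments of interest within the call (0 if absent).
  m <- match(c("formula", "data", "subset"), names(mf), 0L)
  # Keep the function name (element 1) plus the arguments that were supplied.
  list(call = cl, positions = m, kept = mf[c(1L, m)])
}

res <- f(y ~ x, mydata, extra = 1)
print(res$call)
print(res$positions)
print(res$kept)
```

Here subset was not supplied, so its position is 0 and it is dropped when the call is subset; lm uses the same idea to pass only the relevant arguments on to model.frame.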

> (2) can someone point me to a document
> that will help me learn how to write
> an "advanced" function?

Perhaps the background here is looking
at the lm function as a basis for
writing something more advanced, then
the exercise becomes looking at
dput(lm), understanding every line by
looking up all the functions you do not
understand in the man pages e.g. ?match.
Remember, you can search for things
inside R by using a double question mark,
??match, finding versions of match
existing inside other installed
packages, e.g.  raster::match and
posterior::match. Perhaps this exercise
becomes writing one's own version of lm
inside one's own package?

Best,
Rasmus



[R] Trying to learn how to write an "advanced" function

2023-03-16 Thread Sorkin, John
I am trying to understand how to write an "advanced" function. To do so, I am 
examining the lm function, a portion of which is pasted below. I am unable to 
understand what match.call or match does, and several other parts of lm, even 
when I read the help page for match.call or match. 
(1) can someone point me to an explanation of match.call or match that can be 
understood by the uninitiated? 
(2) can someone point me to a document that will help me learn how to write an 
"advanced" function?

Thank you,
John

> lm
function (formula, data, subset, weights, na.action, method = "qr", 
    model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, 
    contrasts = NULL, offset, ...) 
{
    ret.x <- x
    ret.y <- y
    cl <- match.call()
    mf <- match.call(expand.dots = FALSE)
    m <- match(c("formula", "data", "subset", "weights", "na.action", 
        "offset"), names(mf), 0L)


[R] Trying to learn how to write a function

2023-03-16 Thread Sorkin, John
I am trying to understand how to write an "advanced" function. To do this, I am 
examining the code of lm, a small part of the lm code is below. N

> lm
function (formula, data, subset, weights, na.action, method = "qr", 
    model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, 
    contrasts = NULL, offset, ...) 
{
    ret.x <- x
    ret.y <- y
    cl <- match.call()
    mf <- match.call(expand.dots = FALSE)
    m <- match(c("formula", "data", "subset", "weights", "na.action", 
        "offset"), names(mf), 0L)



Re: [R] tcl tk: set the position button

2023-03-13 Thread John Fox

Dear Rodrigo,

Try tkwm.geometry(win1, "-0+0"), which should position win1 at the top 
right.


I hope this helps,
 John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/

On 2023-03-12 8:41 p.m., Rodrigo Badilla wrote:

Hi all,
I am using tcltk2 library to show buttons and messages. Everything
work fine but I would like set the tk2button to the right of my screen, by 
default it display at the left of my screen.
my script example:
library(tcltk2)
win1 <- tktoplevel()
butOK <- tk2button(win1, text = "TEST", width = 77)
tkgrid(butOK)
Thanks in advance
Regards
Rodrigo




Re: [R] Shaded area

2023-03-03 Thread John Kane
As Petr says, the list is very cautious about what types of files it
allows. A handy way to supply some sample data is the dput() function. Just
do dput(mydata), where *mydata* is your data, then copy the output and paste
it here. In the case of a large dataset, something like
dput(head(mydata, 100)) should supply the data we need.
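
In the meantime, here is a rough base-graphics sketch of the kind of plot described in the original question below. The attached file did not reach the list, so the data are invented: only the dates and the sign pattern follow the description, and the values 0.5 and -0.5 are placeholders.

```r
if (!interactive()) pdf(NULL)  # avoid writing a plot file when run as a script

# Made-up stand-in for the "der" series (positive on 8-11 and 16-20 January).
dates <- seq(as.Date("2017-01-01"), as.Date("2017-01-25"), by = "day")
positive <- (dates >= as.Date("2017-01-08") & dates <= as.Date("2017-01-11")) |
            (dates >= as.Date("2017-01-16") & dates <= as.Date("2017-01-20"))
mydata <- data.frame(date = dates, der = ifelse(positive, 0.5, -0.5))

# This is the kind of output that can be pasted into a message to the list.
dput(head(mydata, 3))

# Find the first and last day of each run of positive values.
runs   <- rle(mydata$der > 0)
ends   <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1
xleft  <- mydata$date[starts[runs$values]]
xright <- mydata$date[ends[runs$values]]

# Empty plot first, then shaded rectangles, vertical lines, and the series.
plot(mydata$date, mydata$der, type = "n", xlab = "Date", ylab = "der")
usr <- par("usr")
rect(as.numeric(xleft), usr[3], as.numeric(xright), usr[4],
     col = "grey85", border = NA)
abline(v = as.numeric(c(xleft, xright)), lty = 2)
lines(mydata$date, mydata$der)
```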

On Wed, 1 Mar 2023 at 09:58, PIKAL Petr  wrote:

> Hallo
>
> Excel attachments are not allowed here, but shading an area has been answered many
> times elsewhere. Search for something like "shading area r" in Google.
>
> See eg.
> https://www.geeksforgeeks.org/how-to-shade-a-graph-in-r/
>
> Cheers Petr
>
> -Original Message-
> From: R-help  On Behalf Of George Brida
> Sent: Wednesday, March 1, 2023 3:21 PM
> To: r-help@r-project.org
> Subject: [R] Shaded area
>
> Dear R users,
>
> I have an xlsx file (attached to this mail) that shows the values of a
> "der" series observed on a daily basis from January 1, 2017 to January 25,
> 2017. This series is strictly positive during two periods: from January 8,
> 2017 to January 11, 2017 and from January 16, 2017 to January 20, 2017. I
> would like to plot the series with two shaded areas corresponding to the
> positivity of the series. Specifically, I would like to draw 4 vertical
> lines intersecting the x-axis in the 4 dates mentioned above and shade the
> two areas of positivity. Thanks for your help.


-- 
John Kane
Kingston ON Canada

[[alternative HTML version deleted]]



Re: [R] Generic Function read?

2023-02-28 Thread John Kane
Have a look at the {rio} package.
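
On the dispatch question in the quoted message, the following is a minimal sketch of how S3 dispatch on a classed file path could look. The class name csv_path and the methods defined here are made up for illustration. A separate class name is used so that the method is not confused with the existing read.csv function, and the class attribute is removed before the path is handed on.

```r
# A generic that dispatches on the class of 'file'.
read <- function(file, ...) UseMethod("read")

# Hypothetical method for paths tagged with class "csv_path".
read.csv_path <- function(file, ...) utils::read.csv(unclass(file), ...)

# Fallback for anything without a registered method.
read.default <- function(file, ...) {
  stop("no reader defined for this kind of file")
}

# Small demonstration with a temporary file.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:3, b = c("x", "y", "z")), tmp, row.names = FALSE)

path <- structure(tmp, class = c("csv_path", "character"))
d <- read(path)
print(d)

unlink(tmp)
```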

On Tue, 28 Feb 2023 at 15:00, Leonard Mada via R-help 
wrote:

> Dear R-Users,
>
> I noticed that *read* is not a generic function. Although it could
> benefit from the functionality available for generic functions:
>
> read = function(file, ...) UseMethod("read")
>
> methods(read)
>   # [1] read.csv     read.csv2    read.dcf     read.delim   read.delim2  read.DIF     read.fortran
>   # [8] read.ftable  read.fwf     read.socket  read.table
>
> The users would still need to call the full function name. But it seems
> useful to be able to find rapidly what formats can be read; including
> with other packages (e.g. for Excel, SAS, ... - although most packages
> do not adhere to the generic naming convention, but maybe they will
> change in the future).
>
> Note:
> This should be possible (even though impractical), but actually does NOT
> work:
> read = function(file, ...) UseMethod("read")
> file = "file.csv"
> class(file) = c("csv", class(file));
> read(file)
>
> Should it not work?
>
> Sincerely,
>
> Leonard
>
>


-- 
John Kane
Kingston ON Canada

[[alternative HTML version deleted]]



Re: [R] MFA variables graph, filtered by separate.analyses

2023-02-21 Thread John Fox

Dear gavin,

I think that it's likely that Jim meant the hetcor() function in the 
polycor package.


Best,
 John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/

On 2023-02-21 5:42 p.m., gavin duley wrote:

Hi Jim,

On Tue, 21 Feb 2023 at 22:17, Jim Lemon  wrote:

I can't work through this right now, but I would start by looking at
the 'hetcor' package to get the correlations, or if they are already
in the return object, build a plot from these.


Thanks for the suggestion. I'll read up on the 'hetcor' package.

Thanks,
gavin,





Re: [R] Bug in R-Help Archives?

2023-01-27 Thread Sorkin, John
My apologies,
I did not mean to be part of the discussion. If there is such a thing as a 
pocket email (similar to a pocket dial) the email would be classified as a 
pocket email.
John


From: R-help  on behalf of Rui Barradas 

Sent: Friday, January 27, 2023 10:15 AM
To: Ivan Krylov
Cc: R-help Mailing List
Subject: Re: [R] Bug in R-Help Archives?

At 07:36 on 27/01/2023, Ivan Krylov wrote:
> On Fri, 27 Jan 2023 13:01:39 +0530
> Deepayan Sarkar  wrote:
>
>>  From looking at the headers in John Sorkin's mail, my guess is that he
>> just replied to the other thread rather than starting a fresh email,
>> and in his attempts to hide that, was outsmarted by Outlook.
>
> That's 100% correct. The starting "Pipe operator" e-mail has
> In-Reply-To: <047e01d91ed5$577e42a0$067ac7e0$@yahoo.com>, and the
> message with this Message-ID is the one from Mukesh Ghanshyamdas
> Lekhrajani with the subject "Re: [R] R Certification" that's
> immediately above the message by John Sorkin.
>
Thanks, I was searching the archives for something else, stumbled on
that and forgot to look at the headers.
Good news there's nothing wrong with R-Help.

Rui Barradas



Re: [R] return value of {....}

2023-01-15 Thread Sorkin, John
Avi,

Please do not mistake my posting as being a BASHING of R. I greatly admire R 
and the progress it has made from its roots in S. I thank the many people who 
contribute to the development and growth of R. 

Just because a language allows a given syntax does not mean (1) that the 
language is bad or (2) that the syntax should be used (except on rare 
occasions). There may well be a few occasions when using a global variable in a 
function makes sense, and in that instance the global variable should be used 
and the usage should be documented in the comments that are part of the source 
code. Please note that I stated "A general recommendation is to AVOID use of a 
global variable in a function"; I used the words "recommendation is to AVOID". 
I did not, and would not, forbid the use of a global variable in a function. 

English has the word "or", which is not clearly defined; is it an exclusive or 
or an inclusive or? I don't bash English because of this semantic ambiguity. 
When it is essential to understand whether the "or" I use in a sentence is 
exclusive or inclusive, I make certain my meaning is clear, e.g. "You can use 
bleach or ammonia to clean the stain, but NEVER use ammonia and bleach together 
as the combination produces a deadly gas." I try to follow this philosophy when 
I program. I don't use global variables in a function unless there is an 
overwhelming reason to do so. When I do, I indicate in my comments that a 
global variable has been used. In the same vein, I rarely use call by reference 
in my programs (when this is allowed by a programming language); I try to use 
call by value whenever possible, as call by reference can be fraught. On the 
other hand, when working with extremely large data objects (especially in the 
old days when I had perhaps 20k of memory rather than 100 gig), I have used 
call by reference to save storage. Despite this, I know that call by reference 
is not recommended, just as using a global variable in a function is not 
recommended, but both can, and should, be used when needed.

John 

From: R-help  on behalf of avi.e.gr...@gmail.com 

Sent: Sunday, January 15, 2023 10:53 PM
Cc: 'R help Mailing list'
Subject: Re: [R] return value of {}

Again, John, we are comparing different designs in languages that are often
decades old and partially retrofitted selectively over the years.

Is it poor form to use global variables? Many think so. Discussions have
been had on how to use variables hidden in various ways that are not global,
such as within a package.

But note R still has global assignment operators like <<- and its partner
->> that explicitly can even create a global variable that did not exist
before the function began and that persists for that session. This is
perhaps a special case of the assign() function which can do the same for
any designated environment.
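
A minimal sketch of the two mechanisms just mentioned, using only base R. The names bump, store and set_in are made up for illustration and are not from the thread; the sketch assumes it is run at top level in a fresh session.

```r
# Start from a clean slate so the example is repeatable.
if (exists("counter", envir = globalenv(), inherits = FALSE)) {
  rm("counter", envir = globalenv())
}

bump <- function() {
  # `<<-` searches the enclosing environments for `counter`; finding none,
  # it creates the variable in the global environment.
  counter <<- 1
  invisible(NULL)
}
bump()

# assign() does the same kind of thing for an explicitly named environment.
store <- new.env()
set_in <- function(env) {
  assign("counter", 10, envir = env)
  invisible(NULL)
}
set_in(store)

# get() with an explicit envir says exactly which `counter` is wanted.
global_counter <- get("counter", envir = globalenv())  # created by `<<-`
store_counter  <- get("counter", envir = store)        # created by assign()
print(c(global = global_counter, store = store_counter))
```

Naming the environment in assign() and get() keeps the side effect visible at the call site, which is arguably easier to audit than a bare `<<-`.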

Although it may sometimes be better form to avoid things like this, it can
also be worse form when you want to do something decentralized with no
control over passing additional arguments to all kinds of functions.

Some languages try to finesse their way past this by creating concepts like
closures that hold values and can even change the values without them being
globally visible. Some may use singleton objects or variables that are part
of a class rather than a single object (which is again a singleton.)

So is the way R allows really a bad thing, especially if rarely used?

All I know is MANY languages use scoping including functions that declare a
variable then create an inner function or many and return the inner
function(s) to the outside where the one getting it can later use that
function and access the variable and even use it as a way to communicate
with the multiple functions it got that were created in that incubator.
Nifty stuff but arguably not always as easy to comprehend!
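
The "incubator" pattern described above can be sketched in base R as follows. This is only an illustration; make_counter and its two inner functions are hypothetical names, not anything from the thread.

```r
make_counter <- function() {
  count <- 0  # lives in the environment created by this call

  list(
    # `<<-` finds `count` in the enclosing (incubator) environment and
    # updates it there, not in the global environment.
    increment = function() {
      count <<- count + 1
      invisible(count)
    },
    current = function() count
  )
}

a <- make_counter()
b <- make_counter()  # a second, independent incubator

a$increment()
a$increment()
b$increment()

a_count <- a$current()
b_count <- b$current()
print(c(a = a_count, b = b_count))

# In a fresh session no global `count` variable has been created.
print(exists("count", envir = globalenv(), inherits = FALSE))
```

The two inner functions returned by one call share the same hidden variable, which is how they "communicate" without anything being globally visible.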

This forum is not intended for BASHING any language, certainly not R. There
are many other languages to choose from and every one of them will have some
things others consider flaws. How many opted out of say a ++ operator as in
y = x++ for fairly good reasons and later added something like the Walrus
operator so you can now write y = (x := x + 1) as a way to do the same thing
and other things besides?

But to address your point, about a variable outside a function as defined in
a set of environments to search that includes a global context, I want to
note that it is just a set of maskings and your variable "x" can appear in
EVERY environment above you and you can get in real trouble if the order the
environments are put in place changes in some way. The arguably safer way
to get a specific value of x would be to not ask for it directly
but as get("x", envir=...) and specify the specific environment that ideally
is in existence

Re: [R] return value of {....}

2023-01-15 Thread Sorkin, John
Richard,
I sent my prior email too quickly:

A slight addition to your code shows an important aspect of R, local vs. global 
variables:

x <- 137
f <- function () {
   a <- x
   x <- 42
   b <- x
   list(a=a, b=b)
   }
 f()
print(x)

When run the program produces the following:

> x <- 137
> f <- function () {
+    a <- x
+    x <- 42
+    b <- x
+    list(a=a, b=b)
+    }
>  f()
$a
[1] 137

$b
[1] 42

> print(x)
[1] 137

The first x, in a <- x, refers to an x variable that is GLOBAL. It is known both 
inside and outside the function.
The second x, in x <- 42, defines an x that is LOCAL to the function; it is not 
known to the program that called the function. The LOCAL value of x is used in 
the expression b <- x. As can be seen from the print(x) statement, the LOCAL 
value of x is NOT known by the program that calls the function. The scope of a 
variable (i.e. local vs. global) can be a source of subtle 
programming errors. A general recommendation is to AVOID use of a global 
variable in a function, i.e. don't use a variable in a function that is not 
passed as a parameter to the function (as was done in the function above in the 
statement a <- x). If you need to use a variable in a function that is known by 
the program that calls the function, pass the variable as an argument to the 
function, e.g. 

Use this code:

# Set values needed by function
y <- 2
b <- 30

myfunction <- function(a,b){
  cat("a=",a,"b=",b,"\n")
  y <- a
  y2 <- y+b
  cat("y=",y,"y2=",y2,"\n")
}
# Call the function and pass all needed values to the function
myfunction(y,b)
 
Don't use the following code that depends on a global value that is known to 
the function, but not passed as a parameter to the function:

y <- 2
myNGfunction <- function(a){
  cat("a=",a,"b=",b,"\n")
  y <- a
  y2 <- y+b
  cat("y=",y,"y2=",y2,"\n")
}
# b is a global variable and will be known to the function, 
# but should be passed as a parameter as in the example above.
b <- 100
myNGfunction(y)

John



Re: [R] return value of {....}

2023-01-15 Thread Sorkin, John
Richard,
A slight addition to your code shows an important aspect of R, local vs. global 
variables:

x <- 137
f <- function () {
   a <- x
   x <- 42
   b <- x
   list(a=a, b=b)
   }
 f()
print(x)


From: R-help  on behalf of Richard O'Keefe 

Sent: Sunday, January 15, 2023 6:39 PM
To: Valentin Petzel
Cc: R help Mailing list
Subject: Re: [R] return value of {}

I wonder if the real confusion is not R's scope rules?
(begin ...) is not Lisp, it's Scheme (a major Lisp dialect),
and in Scheme, (begin (define x ...) (define y ...) ...)
declares variables x and y that are local to the (begin ...)
form, just like Algol 68.  That's weirdness 1.  Javascript
had a similar weirdness, which the ECMAScript process eventually
addressed.  But the real weirdness in R is not just that the
existence of variables is indifferent to the presence of curly
braces, it's that it's *dynamic*.  In
f <- function (...) {
   ... use x ...
   x <- ...
   ... use x ...
}
the two occurrences of "use x" refer to DIFFERENT variables.
The first occurrence refers to the x that exists outside the
function.  It has to: the local variable does not exist yet.
The assignment *creates* the variable, so the second
occurrence of "use x" refers to the inner variable.
Here's an actual example.
> x <- 137
> f <- function () {
+ a <- x
+ x <- 42
+ b <- x
+ list(a=a, b=b)
+ }
> f()
$a
[1] 137
$b
[1] 42

Many years ago I set out to write a compiler for R, and this was
the issue that finally sank my attempt.  It's not whether the
occurrence of "use x" is *lexically* before the creation of x.
It's when the assignment is *executed* that makes the difference.
Different paths of execution through a function may result in it
arriving at its return point with different sets of local variables.
R is the only language I routinely use that does this.

So rule 1: whether an identifier in an R function refers to an
outer variable or a local variable depends on whether an assignment
creating that local variable has been executed yet.
And rule 2: the scope of a local variable is the whole function.

If the following transcript not only makes sense to you, but is
exactly what you expect, congratulations, you understand local
variables in R.

> x <- 0
> g <- function () {
+ n <- 10
+ r <- numeric(n)
+ for (i in 1:n) {
+ if (i == 6) x <- 100
+ r[i] <- x + i
+ }
+ r
+ }
> g()
 [1]   1   2   3   4   5 106 107 108 109 110
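
A small sketch of rule 1, assuming nothing beyond base R: whether x is local depends on whether the assignment was actually executed on this particular call. The function h and its argument make_local are hypothetical names used only for illustration.

```r
x <- "outer"

h <- function(make_local) {
  # The assignment only runs on one path, so the local x may or may not
  # exist by the time we reach the next line.
  if (make_local) x <- "local"

  list(
    value    = x,  # the local x if it exists, otherwise the outer x
    # environment() is this call's own frame; inherits = FALSE stops the
    # lookup from walking out to the enclosing environments.
    is_local = exists("x", envir = environment(), inherits = FALSE)
  )
}

without_assignment <- h(FALSE)
with_assignment    <- h(TRUE)
str(without_assignment)
str(with_assignment)
```

The same source line, `value = x`, resolves to two different variables depending on the path taken, which is the behaviour that is hard to determine statically.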


On Fri, 13 Jan 2023 at 23:28, Valentin Petzel  wrote:

> Hello Akshay,
>
> R is quite inspired by LISP, where this is a common thing. It is not in
> fact that {...} returns something; rather, any expression evaluates to
> some value, and for a compound statement that is the last evaluated
> expression.
>
> {...} might be seen as similar to LISP's (begin ...).
>
> Now this is a very different thing compared to {...} in something like C,
> even if it looks or behaves similarly. But in R {...} is in fact an
> expression and thus has to evaluate to some value. This also comes with some
> nice benefits.
>
> You do not need to use {...} for anything that is a single statement. But
> you can in each possible place use {...} to turn multiple statements into
> one.
>
> Now think about a statement like this
>
> f <- function(n) {
> x <- runif(n)
> x**2
> }
>
> Then we can do
>
> y <- f(10)
>
> Now, your suggested way would look like this:
>
> f <- function(n) {
> x <- runif(n)
> y <- x**2
> }
>
> And we'd need to do something like:
>
> f(10)
> y <- somehow_get_last_env_of_f$y
>
> So having a compound statement evaluate to a value clearly has a benefit.
>
> Best Regards,
> Valentin
>
> 09.01.2023 18:05:58 akshay kulkarni :
>
> > Dear Valentin,
> >   But why should {} "return" a value? It
> > could just as well evaluate all the expressions and store the resulting
> > objects in whatever environment the interpreter chooses, and then it would
> > be left to the user to manipulate any object he chooses. Don't you think
> > returning the last, or any value, is redundant? We are living in the
> > 21st century world, and the R-core team might, I suppose, have a definite
> > reason for "returning" the last value. Any comments?
> >
> > Thanking you,
> > Yours sincerely,
> > AKSHAY M KULKARNI
> >
> > 
> > *From:* Valentin Petzel 
> > *Sent:* Monday, January 9, 2023 9:18 PM
> > *To:* akshay kulkarni 
> > *Cc:* R help Mailing list 
> > *Subject:* Re: [R] return value of {}
> >
> > Hello Akshai,
> >
> > I think you are confusing {...} with local({...}). This one will
> > evaluate the expression in a separate environment, returning the last
> > expression.
> >
> > {...} simply evaluates multiple expressions as one and returns the
> > result of the last line, but it still evaluates each expression.
> >
> > Assignment returns the assigned value, so we can chain assignments like
> > this
> >
> > a <- 1 + (b <- 2)
> >
> > convenien

Re: [R] Removing variables from data frame with a wile card

2023-01-15 Thread Sorkin, John
I am new to this thread. At the risk of presenting something that has been 
shown before, below I demonstrate how a column in a data frame can be dropped 
using a wild card, i.e. a column whose name starts with "th", using nothing more 
than base R functions and base R syntax. While additions to R such as tidyverse 
can be very helpful, many things that they do can be accomplished simply using 
base R.

# Create data frame with three columns
one <- rep(1,10)
one
two <- rep(2,10)
two
three <- rep(3,10)
three
mydata <- data.frame(one=one, two=two, three=three)
cat("Data frame with three columns\n")
mydata

# Drop the column whose name starts with th, i.e. column three
# Find the location of the column ("^th" anchors the match at the start of the name)
ColumnToDelete <- grep("^th", colnames(mydata))
cat("The column to be dropped is the column called three, which is column",
    ColumnToDelete, "\n")
ColumnToDelete

# Drop the column whose name starts with "th"
newdata2 <- mydata[, -ColumnToDelete]
cat("Data frame after dropping column whose name is three\n")
newdata2

I hope this helps.
John
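
One caveat with dropping columns by negative position is worth a sketch: if the pattern matches nothing, grep() returns integer(0), and indexing with -integer(0) selects no columns at all rather than all of them. The helper drop_matching below is a hypothetical name; it uses a logical mask from grepl(), which behaves sensibly in that case.

```r
mydata <- data.frame(one = rep(1, 3), two = rep(2, 3), three = rep(3, 3))

# Keep every column whose name does NOT match the pattern.
# drop = FALSE keeps the result a data frame even if one column remains.
drop_matching <- function(df, pattern) {
  keep <- !grepl(pattern, colnames(df))
  df[, keep, drop = FALSE]
}

dropped   <- drop_matching(mydata, "^th")  # removes column three
untouched <- drop_matching(mydata, "^zz")  # no match, nothing removed

# The pitfall with positions: an empty match removes everything.
empty_idx <- grep("^zz", colnames(mydata))     # integer(0)
pitfall   <- mydata[, -empty_idx, drop = FALSE]

print(colnames(dropped))
print(ncol(untouched))
print(ncol(pitfall))
```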



From: R-help  on behalf of Valentin Petzel 

Sent: Saturday, January 14, 2023 1:21 PM
To: avi.e.gr...@gmail.com
Cc: 'R-help Mailing List'
Subject: Re: [R] Removing variables from data frame with a wile card

Hello Avi,

while something like d$something <- ... may seem like you're directly modifying 
the data, it does not actually do so. Most R objects try to be immutable, that 
is, the object may not change after creation. This guarantees that if you have 
a binding for an object, the object won't change sneakily.

There is one data structure that is in fact mutable: environments. For 
example compare

L <- list()
local({L$a <- 3})
L$a

with

E <- new.env()
local({E$a <- 3})
E$a

The latter will in fact work, as the same Environment is modified, while in the 
first one a modified copy of the list is made.

Under the hood we have a parser trick: If R sees something like

f(a) <- ...

it will look for a function `f<-` and call

a <- `f<-`(a, ...)

(this also happens for example when you do names(x) <- ...)

So in fact in our case this is equivalent to creating a copy with removed 
columns and rebinding the symbol in the current environment to the result.

The data.table package breaks with this convention and uses C based routines 
that allow changing of data without copying the object. Doing

d[, (cols_to_remove) := NULL]

will actually change the data.

Regards,
Valentin
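
The replacement-function mechanism described above can be tried directly in base R. This is a sketch with a made-up replacement function, `second<-`, that is not part of R or of any package; it only illustrates that the two spellings give the same result and that other bindings to the old value are unaffected.

```r
# A user-defined replacement function: sets the second element of x.
# The last argument of a replacement function must be called `value`.
`second<-` <- function(x, value) {
  x[2] <- value
  x
}

v <- c(10, 20, 30)
original <- v        # a second binding to the same values

second(v) <- 99      # sugar for: v <- `second<-`(v, value = 99)

w <- c(10, 20, 30)
w <- `second<-`(w, value = 99)   # the explicit form

print(identical(v, w))
print(original)      # unchanged: v was rebound to a modified copy

# Environments, by contrast, have reference semantics.
e1 <- new.env()
e2 <- e1             # both names refer to the same environment
e2$a <- 3
print(e1$a)
```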


Re: [R] Removing variables from data frame with a wile card

2023-01-14 Thread John Kane
You rang sir?

library(tidyverse)
xx = 1:10
yr1 = yr2 = yr3 = rnorm(10)
dat1 <- data.frame(xx , yr1, yr2, yr3)

dat1  %>%  select(!starts_with("yr"))

or for something a bit more exotic as I have been trying to learn a bit
about the "data.table" package

library(data.table)

xx = 1:10
yr1 = yr2 = yr3 = rnorm(10)

dat2 <- data.table(xx , yr1, yr2, yr3)

dat2[, !names(dat2) %like% "yr", with=FALSE ]



On Sat, 14 Jan 2023 at 12:28,  wrote:

> Steven,
>
> Just want to add a few things to what people wrote.
>
> In base R, the methods mentioned will let you make a copy of your original
> DF that is missing the items you are selecting that match your pattern.
>
> That is fine.
>
> For some purposes, you want to keep the original data.frame and remove a
> column within it. You can do that in several ways but the simplest is
> something where you set the column to NULL as in:
>
> mydata$NAME <- NULL
>
> Using the mydata["NAME"] notation can do that for you by using a loop or
> functional programming method that does that with all components of your
> grep.
>
> R does have optimizations that make this less useful as a partial copy of
> a data.frame retains common parts till things change.
>
> For those who like to use the tidyverse, it comes with lots of tools that
> let you select columns that start with or end with or contain some pattern
> and I find that way easier.
>
>
>
> -Original Message-
> From: R-help  On Behalf Of Steven Yen
> Sent: Saturday, January 14, 2023 7:49 AM
> To: Andrew Simmons 
> Cc: R-help Mailing List 
> Subject: Re: [R] Removing variables from data frame with a wile card
>
> Thanks to all. Very helpful.
>
> Steven from iPhone
>
> > On Jan 14, 2023, at 3:08 PM, Andrew Simmons  wrote:
> >
> > You'll want to use grep() or grepl(). By default, grep() uses
> > extended regular expressions to find matches, but you can also use
> > perl regular expressions and globbing (after converting to a regular
> expression).
> > For example:
> >
> > grepl("^yr", colnames(mydata))
> >
> > will tell you which 'colnames' start with "yr". If you'd rather
> > use globbing:
> >
> > grepl(glob2rx("yr*"), colnames(mydata))
> >
> > Then you might write something like this to remove the columns starting
> with yr:
> >
> > mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE]
> >
> >> On Sat, Jan 14, 2023 at 1:56 AM Steven T. Yen  wrote:
> >>
> >> I have a data frame containing variables "yr3",...,"yr28".
> >>
> >> How do I remove them with a wild card, something similar to "del yr*"
> >> in Windows/DOS? Thank you.
> >>
> >>> colnames(mydata)
> >>   [1] "year"   "weight" "confeduc"   "confothr" "college"
> >>   [6] ...
> >>  [41] "yr3""yr4""yr5""yr6" "yr7"
> >>  [46] "yr8""yr9""yr10"   "yr11" "yr12"
> >>  [51] "yr13"   "yr14"   "yr15"   "yr16" "yr17"
> >>  [56] "yr18"   "yr19"   "yr20"   "yr21" "yr22"
> >>  [61] "yr23"   "yr24"   "yr25"   "yr26" "yr27"
> >>  [66] "yr28"...
> >>


-- 
John Kane
Kingston ON Canada



Re: [R] Pipe operator

2023-01-03 Thread Sorkin, John
Jeff,
Thank you for contributing important information to this thread. 


From: Jeff Newmiller 
Sent: Tuesday, January 3, 2023 2:07 PM
To: r-help@r-project.org; Sorkin, John; Ebert,Timothy Aaron; 'R-help Mailing 
List'
Subject: Re: [R] Pipe operator

The other responses here have been very good, but I felt it necessary to point 
out that the concept of a pipe originated around when you started programming 
[1] (text based). It did take a while for it to migrate into programming 
languages such as OCaml, but PowerShell makes extensive use of (object-based) 
pipes.

Re memory use: not so much. Variables are small... it is the data they point to 
that is large, and it is not possible to analyze data without storing it 
somewhere. But when the variables are numerous they can interfere with our 
ability to understand the program... using pipes lets us focus on results 
obtained after several steps so fewer intermediate values clutter the variable 
space.

Re speed: the magrittr pipe (%>%) is much slower than the built-in pipe at 
coordinating the transfer of data from left to right, but that is not usually 
significant compared to the computation speed on the actual data in the 
functions.
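
To make the point about intermediate values concrete, here is a small sketch comparing three spellings of the same computation. It uses only the built-in |> pipe and so assumes R 4.1 or later; the variable names are illustrative.

```r
values <- c(1, 4, 9, 16)

# 1. Step by step: every stage leaves a named variable behind.
roots <- sqrt(values)
total <- sum(roots)
result_steps <- round(total, 1)

# 2. Nested calls: no intermediates, but it reads inside-out.
result_nested <- round(sum(sqrt(values)), 1)

# 3. Piped: no intermediates, and it reads left to right.
#    The left-hand side becomes the first argument of each call.
result_piped <- values |> sqrt() |> sum() |> round(1)

print(c(result_steps, result_nested, result_piped))
```

All three produce the same value; the difference is only in how many names end up in the workspace and in how the steps read.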

 [1] https://en.m.wikipedia.org/wiki/Pipeline_(Unix)


Re: [R] Pipe operator

2023-01-03 Thread Sorkin, John
Tim,

Thank you for your reply. I did not know about the |> operator. Do both %>% and 
|> work in base R?

You suggested that the pipe operator can produce code with fewer variables. May 
I ask you to send a short example in which the pipe operator saves variables. 
Does said saving of variables speed up processing or result in less memory 
usage?

Thank you,
John


From: Ebert,Timothy Aaron 
Sent: Tuesday, January 3, 2023 12:07 PM
To: Sorkin, John; 'R-help Mailing List'
Subject: RE: Pipe operator

The pipe shortens code and results in fewer variables because you do not have 
to save intermediate steps. Once you get used to the idea it is useful. Note 
that there is also the |> pipe that is part of base R. As far as I know it does 
the same thing as %>%, or at my level of programming I have not encountered a 
difference.

Tim
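
For what it is worth, one visible difference can be sketched without loading any package (R 4.1 or later assumed): the built-in |> needs an actual function call on its right-hand side, whereas a bare function name, as in the `c(1:10) %>% mean` example from the original question, is accepted by the magrittr pipe. The variable names below are illustrative.

```r
# With the built-in pipe the right-hand side must be a call, so mean()
# rather than mean.
piped <- 1:10 |> mean()

# A bare function name is rejected when the code is parsed. parse() is used
# here so the failure can be caught instead of stopping the script.
bare_name_ok <- tryCatch(
  {
    parse(text = "1:10 |> mean")
    TRUE
  },
  error = function(e) FALSE
)

# An anonymous function works if it is wrapped in parentheses and called.
squares <- 1:3 |> (\(v) v^2)()

print(piped)
print(bare_name_ok)
print(squares)
```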

-Original Message-
From: R-help  On Behalf Of Sorkin, John
Sent: Tuesday, January 3, 2023 11:49 AM
To: 'R-help Mailing List' 
Subject: [R] Pipe operator

[External Email]

I am trying to understand the reason for existence of the pipe operator, %>%, 
and when one should use it. It is my understanding that the operator sends the 
file to the left of the operator to the function immediately to the right of 
the operator:

c(1:10) %>% mean results in a value of 5.5 which is exactly the same as the 
result one obtains using the mean function directly, viz. mean(c(1:10)). What 
is the reason for having two syntactically different but semantically identical 
ways to call a function? Is one more efficient than the other? Does one use 
less memory than the other?

P.S. Please forgive what might seem to be a question with an obvious answer. I 
am a programmer dinosaur. I have been programming for more than 50 years. When 
I started programming in the 1960s the only pipe one spoke about was a bong.

John



[R] Pipe operator

2023-01-03 Thread Sorkin, John
I am trying to understand the reason for existence of the pipe operator, %>%, 
and when one should use it. It is my understanding that the operator sends the 
file to the left of the operator to the function immediately to the right of 
the operator:

c(1:10) %>% mean results in a value of 5.5 which is exactly the same as the 
result one obtains using the mean function directly, viz. mean(c(1:10)). What 
is the reason for having two syntactically different but semantically identical 
ways to call a function? Is one more efficient than the other? Does one use 
less memory than the other? 

P.S. Please forgive what might seem to be a question with an obvious answer. I 
am a programmer dinosaur. I have been programming for more than 50 years. When 
I started programming in the 1960s the only pipe one spoke about was a bong.  

John



Re: [R] R Certification

2023-01-02 Thread John Kane
Hi Mukesh,

Have a look at the blurb that prints at the start-up of R.

"R is free software and comes with ABSOLUTELY NO WARRANTY."

This is a hint that the R-Project is unlikely to be issuing certificates.



On Mon, 2 Jan 2023 at 08:18, Mukesh Ghanshyamdas Lekhrajani via R-help <
r-help@r-project.org> wrote:

> Thanks Petr !
>
> I will look at other training bodies as Coursera, or few others... but I
> was just wondering if there could be a  certificate from the "originators"
> itself, I mean an "R" certificate from "r-project" itself and that would
> carry more importance than external / unauthorized certificate bodies.
>
> But, if you suggest there is no such certification provided by
> "r-project", then the only option for me is to search else where like -
> Coursera or few others.
>
> I now have got my answers, but later the day - if ever "r-project" comes
> up with "R Language" certifications, do keep me informed.
>
>
> Thanks, Mukesh
> 9819285174.
>
>
> -Original Message-
> From: PIKAL Petr 
> Sent: Monday, January 2, 2023 6:13 PM
> To: mukesh.lekhraj...@yahoo.com; R-help Mailing List  >
> Subject: RE: [R] R Certification
>
> Hallo Mukesh
>
> R project is not Microsoft or Oracle AFAIK. But if you need some
> certificate you could take courses on Coursera, they are offering
> certificates.
>
> Cheers
> Petr
>
> > -Original Message-
> > From: R-help  On Behalf Of Mukesh
> > Ghanshyamdas Lekhrajani via R-help
> > Sent: Monday, January 2, 2023 1:04 PM
> > To: 'Jeff Newmiller' ; 'Mukesh Ghanshyamdas
> > Lekhrajani via R-help' ; r-help@r-project.org
> > Subject: Re: [R] R Certification
> >
> > Hello Jeff !
> >
> > Yes, you are right.. and that’s why I am asking this question - just like
> > other governing bodies that issue certification on their respective
> > technologies, does "r-project.org" also have a learning path ? and then a
> > certification.
> >
> > Say - Microsoft issues certificate for C#, .Net, etc..
> > Then, Oracle issues certificates for Java, DB etc..
> >
> > These are authentic governing bodies for learning and issuing certificates
> >
> > On exactly similar lines -  "r-project.org" would also be having some
> > learning path and then let "r-project" take the proctored exam and issue a
> > certificate...
> >
> > I am not looking at any external institute for certifying me on "R" - but,
> > the governing body itself..
> >
> > So, the question again is - "does r-project provide a learning path and
> > issue certificate after taking exams"
> >
> > Thanks, Mukesh
> > 9819285174
> >
> >
> >
> > -Original Message-
> > From: Jeff Newmiller 
> > Sent: Monday, January 2, 2023 2:26 PM
> > To: mukesh.lekhraj...@yahoo.com; Mukesh Ghanshyamdas Lekhrajani via R-
> > help ; r-help@r-project.org
> > Subject: Re: [R] R Certification
> >
> > I think this request is like saying "I want a unicorn." There are many
> > organizations that will enter your name into a certificate form for a fee,
> > possibly with some credibility... but if they put "r-project.org" down as
> > the name of the organization granting this "certificate" then you are
> > probably getting fooled.
> >
> > On December 30, 2022 8:33:09 AM PST, Mukesh Ghanshyamdas Lekhrajani via
> > R-help  wrote:
> > >Hello R Support Team,
> > >
> > >
> > >
> > >I want to do R certification, could you help me with the list of
> > >certificates with their prices so it helps me to register.
> > >
> > >
> > >
> > >I want to do the certification directly from the governing body
> > >"r-project.org" and not from any 3rd party.
> > >
> > >
> > >
> > >Please help.
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >Mukesh
> > >
> > >+91 9819285174
> > >
> > >

Re: [R] Reg: Help in assigning colors to factor variable in ggplot2

2022-12-26 Thread John Kane
Here is a rough guess at what you may want with a bit of mock data and
using ggplot2.
## ==========
library(ggplot2)
library(RColorBrewer)

## Mock data: 20 random points, each with one of four outcomes
dat1 <- data.frame(
  aa = sample(1:10, 20, replace = TRUE),
  bb = sample(21:30, 20, replace = TRUE),
  outcome = sample(c("died", "home", "other hospital", "secondary care/rehab"),
                   20, replace = TRUE)
)

ggplot(dat1, aes(aa, bb, colour = outcome)) +
  geom_point() +
  scale_colour_brewer(palette = "Dark2") +
  labs(
    x = "Maximum body temperature",
    y = "Maximum heart rate",
    colour = "outcome",
    title = "500 ICU patients"
  )
## ==========
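
For the base-graphics route in the original post, one thing that may be worth checking is the colour string "#4DAF4A8": it has seven hex digits, and R accepts only 3, 4, 6 or 8. The sketch below tests each string with col2rgb() and then maps outcomes to colours through a named vector; the six-digit "#4DAF4A" and the sample outcome vector are assumptions made for illustration, not taken from the poster's data.

```r
# Colour strings as they appear in the original post
cols <- c("#E41A1C", "#377EB8", "#4DAF4A8", "#984EA3")

# TRUE where R can parse the colour, FALSE where col2rgb() raises an error
is_valid <- vapply(cols, function(cl) {
  tryCatch({
    grDevices::col2rgb(cl)
    TRUE
  }, error = function(e) FALSE)
}, logical(1))

# A named lookup vector avoids filling colsOutcome level by level.
# "#4DAF4A" is an assumed correction of the seven-digit string.
outcome_cols <- c(
  "died"                 = "#E41A1C",
  "home"                 = "#377EB8",
  "other hospital"       = "#4DAF4A",
  "secondary care/rehab" = "#984EA3"
)

# Hypothetical outcome column standing in for ICUData$outcome
outcome <- c("home", "died", "other hospital", "home")
cols_for_points <- unname(outcome_cols[outcome])
```

With a valid colour vector of this kind, cols_for_points can be passed as col = in plot() and outcome_cols as col = in legend().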

On Mon, 26 Dec 2022 at 09:46, John Kane  wrote:

> I suspect you may be mixing *plot()* commands with *ggplot()* commands and
> they are likely incompatible.
>
> Could you supply some sample data and any error messages that you are
> getting?   A handy way to supply some sample data is the dput() function.
> In the case of a large dataset something like dput(head(mydata, 100))
> should supply the data we need.
>
> On Mon, 26 Dec 2022 at 09:06, Upananda Pani 
> wrote:
>
>> Dear All,
>>
>> I am trying to plot a scatter plot between  temperature and heart rate and
>> additionally marking the outcome of the patients by colors. I am using the
>> standard package Use the standard function plot as well as the functions
>> of package "ggplot2" (Wickham (2009)). Save the plots in pdf files.
>>
>> I am getting an error to plot when assigning colsOutcome to the
>> scatterplot. I am doing it wrongly. Please advise me.
>> ```{r}
>> library(ggplot2)
>> library(RColorBrewer)
>> library(ggsci)
>> ICUData <- read.csv(file = "ICUData.csv")
>> ```
>> ```{r}
>> ## Generate empty vector
>> colsOutcome <- character(nrow(ICUData))
>> ## Fill with colors
>> colsOutcome[ICUData$outcome == "died"] <- "#E41A1C"
>> colsOutcome[ICUData$outcome == "home"] <- "#377EB8"
>> colsOutcome[ICUData$outcome == "other hospital"] <- "#4DAF4A8"
>> colsOutcome[ICUData$outcome == "secondary care/rehab"] <- "#984EA3"
>> ```
>>
>> ```{r}
>> plot(x = ICUData$temperature, y = ICUData$heart.rate, pch = 19,
>>  xlab = "Maximum body temperature", ylab = "Maximum heart rate",
>>  main = "500 ICU patients", col = colsOutcome, xlim = c(33,43))
>> legend(x = "topleft", legend = c("died", "home", "other hospital",
>> "secondary care/rehab"), pch = 19,
>>col = c("#E41A1C", "#377EB8", "#4DAF4A8", "#984EA3"))
>> ```
>>
>
>
> --
> John Kane
> Kingston ON Canada
>


-- 
John Kane
Kingston ON Canada



Re: [R] Reg: Help in assigning colors to factor variable in ggplot2

2022-12-26 Thread John Kane
I suspect you may be mixing *plot()* commands with *ggplot()* commands and
they are likely incompatible.

Could you supply some sample data and any error messages that you are
getting?   A handy way to supply some sample data is the dput() function.
In the case of a large dataset something like dput(head(mydata, 100))
should supply the data we need.

On Mon, 26 Dec 2022 at 09:06, Upananda Pani  wrote:

> Dear All,
>
> I am trying to plot a scatter plot between  temperature and heart rate and
> additionally marking the outcome of the patients by colors. I am using the
> standard package Use the standard function plot as well as the functions of
> package "ggplot2" (Wickham (2009)). Save the plots in pdf files.
>
> I am getting an error to plot when assigning colsOutcome to the
> scatterplot. I am doing it wrongly. Please advise me.
> ```{r}
> library(ggplot2)
> library(RColorBrewer)
> library(ggsci)
> ICUData <- read.csv(file = "ICUData.csv")
> ```
> ```{r}
> ## Generate empty vector
> colsOutcome <- character(nrow(ICUData))
> ## Fill with colors
> colsOutcome[ICUData$outcome == "died"] <- "#E41A1C"
> colsOutcome[ICUData$outcome == "home"] <- "#377EB8"
> colsOutcome[ICUData$outcome == "other hospital"] <- "#4DAF4A8"
> colsOutcome[ICUData$outcome == "secondary care/rehab"] <- "#984EA3"
> ```
>
> ```{r}
> plot(x = ICUData$temperature, y = ICUData$heart.rate, pch = 19,
>  xlab = "Maximum body temperature", ylab = "Maximum heart rate",
>  main = "500 ICU patients", col = colsOutcome, xlim = c(33,43))
> legend(x = "topleft", legend = c("died", "home", "other hospital",
> "secondary care/rehab"), pch = 19,
>col = c("#E41A1C", "#377EB8", "#4DAF4A8", "#984EA3"))
> ```
>


-- 
John Kane
Kingston ON Canada



Re: [R] Amazing AI

2022-12-19 Thread John Kane
Does not Medians <- apply(numeric_data, 1, median) give us the row medians?
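
A quick way to check this on a toy matrix (numeric_data below is a made-up example, not the data from the thread): MARGIN = 1 applies the function over rows and MARGIN = 2 over columns.

```r
# Hypothetical 2 x 3 matrix standing in for numeric_data
numeric_data <- matrix(c(1, 5, 3,
                         2, 8, 4), nrow = 2, byrow = TRUE)

# MARGIN = 1: one median per row
row_medians <- apply(numeric_data, 1, median)

# MARGIN = 2: one median per column, for comparison
col_medians <- apply(numeric_data, 2, median)
```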

On Mon, 19 Dec 2022 at 05:52, Milan Glacier  wrote:

> On 12/18/22 19:01, Boris Steipe wrote:
> >Technically not a help question. But crucial to be aware of, especially
> for those of us in academia, or otherwise teaching R. I am not aware of a
> suitable alternate forum. If this does not interest you, please simply
> ignore - I already know that this may be somewhat OT.
> >
> >Thanks.
> >--
> >
> >You very likely have heard of ChatGPT, the conversation interface on top
> of the GPT-3 large language model and that it can generate code. I thought
> it doesn't do R - I was wrong. Here is a little experiment:
> >Note that the strategy is quite different (e.g using %in%, not
> duplicated() ), the interpretation of "last variable" is technically
> correct but not what I had in mind (ChatGPT got that right though).
> >
> >
> >Changing my prompts slightly resulted it going for a dplyr solution
> instead, complete with %>% idioms etc ... again, syntactically correct but
> not giving me the fully correct results.
> >
> >--
> >
> >Bottom line: The AI's ability to translate natural language instructions
> into code is astounding. Errors the AI makes are subtle and probably not
> easy to fix if you don't already know what you are doing. But the way that
> this can be "confidently incorrect" and plausible makes it nearly
> impossible to detect unless you actually run the code (you may have noticed
> that when you read the code).
> >
> >Will our students use it? Absolutely.
> >
> >Will they successfully cheat with it? That depends on the assignment. We
> probably need to _encourage_ them to use it rather than sanction - but
> require them to attribute the AI, document prompts, and identify their own,
> additional contributions.
> >
> >Will it help them learn? When you are aware of the issues, it may be
> quite useful. It may be especially useful to teach them to specify their
> code carefully and completely, and to ask questions in the right way. Test
> cases are crucial.
> >
> >How will it affect what we do as instructors? I don't know. Really.
> >
> >And the future? I am not pleased to extrapolate to a job market in which
> they compete with knowledge workers who work 24/7 without benefits,
> vacation pay, or even a salary. They'll need to rethink the value of their
> investment in an academic education. We'll need to rethink what we do to
> provide value above and beyond what AI's can do. (Nb. all of the arguments
> I hear about why humans will always be better etc. are easily debunked, but
> that's even more OT :-)
> >
> >
> >
> >If you have thoughts to share how your institution is thinking about
> academic integrity in this situation, or creative ideas how to integrate
> this into teaching, I'd love to hear from you.
>
> *NEVER* let the AI mislead the students! ChatGPT gives you seemingly
> sound but actually *wrong* code!
>
> ChatGPT never understands the formal abstraction behind the code; it
> just understands the shallow text pattern (and the syntax rules) in the
> code. And it often gives you code that looks correct but produces the
> wrong output. If it is used for code completion, then it is okay
> (just like GitHub Copilot), since the coder needs to modify the code
> after getting the completion. But using ChatGPT for students to query
> information or write code is error-prone!
>
>


-- 
John Kane
Kingston ON Canada



[R] Plot a line using ggplot2

2022-12-08 Thread Sorkin, John
Colleagues,

I am trying to plot a simple line using ggplot2. I get the axes, but I don't 
get the line. Please let me know what error I am making.
Thank you,
John

# Define x and y values
PointEstx <- Estx+1.96*SE
PointEsty  <- 1

row2 <- cbind(PointEstx,PointEsty)
linedata<- data_frame(rbind(row1,row2))
linedata
# make sure we have a data frame
class(linedata)

#plot the data
ggplot(linedata,aes(x=PointEstx, y=PointEsty), geom_line())
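
One likely cause, sketched under assumptions: in the call above geom_line() is passed as a third argument inside ggplot() rather than added as a layer with +, so no line layer is ever attached. The values of Estx and SE and the first row of the data are not shown in the post, so the numbers below are hypothetical placeholders. data.frame() is used because dplyr's data_frame() is deprecated.

```r
library(ggplot2)

# Hypothetical point estimate and standard error (not given in the post)
Estx <- 2.0
SE   <- 0.5

# Two points: assumed lower and upper 95% limits at the same height
linedata <- data.frame(
  PointEstx = c(Estx - 1.96 * SE, Estx + 1.96 * SE),
  PointEsty = c(1, 1)
)

# Layers are added with `+` after the ggplot() call closes
p <- ggplot(linedata, aes(x = PointEstx, y = PointEsty)) +
  geom_line()

print(p)
```

geom_line() needs at least two rows to draw a segment, so a data frame holding a single point would still show only the axes.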







Re: [R] rio: list of extensions for supported formats

2022-11-05 Thread John Kane
Cat was being helpful.

On Sat, 5 Nov 2022 at 15:39, John Kane  wrote:

> o idea but there is a list here
> https://thomasleeper.com/rio/articles/rio.html
>
> On Sat, 5 Nov 2022 at 04:04, Sigbert Klinke 
> wrote:
>
>> Hi,
>>
>> is there a function in the package rio to get the file extensions listed
>> in the vignette under supported formats?
>>
>> Thanks Sigbert
>>
>>
>
>
> --
> John Kane
> Kingston ON Canada
>


-- 
John Kane
Kingston ON Canada



Re: [R] rio: list of extensions for supported formats

2022-11-05 Thread John Kane
o idea but there is a list here
https://thomasleeper.com/rio/articles/rio.html

On Sat, 5 Nov 2022 at 04:04, Sigbert Klinke 
wrote:

> Hi,
>
> is there a function in the package rio to get the file extensions listed
> in the vignette under supported formats?
>
> Thanks Sigbert
>
>


-- 
John Kane
Kingston ON Canada



[R] Single pdf of all R vignettes request

2022-10-30 Thread Sun, John
Dear All,

I am writing to ask whether there exists a single pdf of all the vignettes from 
R packages.
This would be good resource. 

Best regards,
John 



Re: [R] startup loading issue

2022-10-25 Thread John Dougherty via R-help
On Tue, 25 Oct 2022 08:33:10 -0500
ken eagle  wrote:

> I thought I was loading a ~300M binary (bigwig) file into another
> application . . .

Is the other application R dependent, written in R, or does it call R
capabilities? If it doesn't, the issue might be with the "other
application" rather than R.

JWDougherty



Re: [R] unexpected 'else' in " else"

2022-10-21 Thread John Fox

Dear Jinsong,

When you enter these code lines at the R command prompt, the interpreter 
evaluates an expression when it's syntactically complete, which occurs 
before it sees the else clause. The interpreter can't read your mind and 
know that an else clause will be entered on the next line. When the code 
lines are in a function, the function body is enclosed in braces and so 
the interpreter sees the else clause.


As I believe was already pointed out, you can similarly use braces at 
the command prompt to signal incompleteness of an expression, as in


> {if (FALSE) print(1)
+ else print(2)}
[1] 2

I hope this helps,
 John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://socialsciences.mcmaster.ca/jfox/
On 2022-10-21 8:06 a.m., Jinsong Zhao wrote:

Thanks a lot!

I know the first and third ways to correct the error. The second way 
seems to explain why the code is correct in the function 
stats::weighted.residuals.


On 2022/10/21 17:36, Andrew Simmons wrote:
The error comes from the expression not being wrapped with braces. You 
could change it to


if (is.matrix(r)) {
    r[w != 0, , drop = FALSE]
} else r[w != 0]

or

{
    if (is.matrix(r))
        r[w != 0, , drop = FALSE]
    else r[w != 0]
}

or

if (is.matrix(r)) r[w != 0, , drop = FALSE] else r[w != 0]


On Fri., Oct. 21, 2022, 05:29 Jinsong Zhao,  wrote:

    Hi there,

    The following code would cause R error:

     > w <- 1:5
     > r <- 1:5
     >         if (is.matrix(r))
    +             r[w != 0, , drop = FALSE]
     >         else r[w != 0]
    Error: unexpected 'else' in "        else"

    However, the code:
             if (is.matrix(r))
                 r[w != 0, , drop = FALSE]
             else r[w != 0]
    is extracted from stats::weighted.residuals.

    My question is why the code in the function does not cause error?

    Best,
    Jinsong








Re: [R] prcomp - arbitrary direction of the returned principal components

2022-10-13 Thread John C Nash
This reminds me of a situation in 1975 where a large computer service bureau
had contracted to migrate scientific software from a Univac 1108 to an IBM
System 360. They spent 3 weeks trying to get the IBM to give the same
eigenvectors on a problem as the Univac. There were at least 2 eigenvalues
that were equal. They were trying to fix something that was not broken. Their
desperation was enough to offer me a very large fee to "fix" things. However,
I had a nice job, so told them to go away and read a couple of books on the
real-symmetric eigenvalue problem or singular value decomposition, though the
latter was just becoming known outside of numerical linear algebra.

I suspect the OP should go back to basics with principal components and not
try to fiddle with the output. It is likely that the "loadings" (I'm never
sure of the nomenclature -- I use the matrix setup) can be rotated, but you
can't just rotate one vector of a set on its own.

Amazing how these old issues linger for decades. Or maybe linear algebra is
not on the curriculum.
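
For the narrower question of sign only (not rotation), a sketch under stated assumptions: an eigenvector is determined only up to sign, so negating one component's scores together with its column of loadings gives an equally valid decomposition. The simulated series and the reference measure stress below are invented for illustration; the idea is to orient PC1 by its correlation with an external reference instead of by the sign of a loading.

```r
set.seed(1)
n <- 200

# Hypothetical reference series and two variables that move with / against it
stress <- cumsum(rnorm(n))
var1   <-  stress + rnorm(n, sd = 0.5)
var2   <- -stress + rnorm(n, sd = 0.5)
X      <- cbind(var1, var2)

pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Orient PC1 so that it correlates positively with the reference.
# Scores and loadings are negated together, so the fit is unchanged.
if (cor(pca$x[, 1], stress) < 0) {
  pca$x[, 1]        <- -pca$x[, 1]
  pca$rotation[, 1] <- -pca$rotation[, 1]
}

# The scores are still the scaled data times the loadings
reconstructed <- scale(X) %*% pca$rotation
```

With two standardized variables the loadings are always plus or minus 1/sqrt(2), so the magnitude rule cannot pick a direction, which is why an outside reference is used here.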

John Nash



On Thu, 2022-10-13 at 19:35 +0530, Ashim Kapoor wrote:
> Dear All,
> 
> Many thanks for your replies.
> 
> My PC1 loading turns out to be :
> 
> 1/sqrt(2) , -1/sqrt(2)
> 
> In simple words : I had 2 variables and I ran prcomp on them. I got my
> PC1 as :  .7071068 var1 - .7071068 var2
> 
> PC2 turned out to be the same as PC1 with a PLUS replacing the minus,
> ie. .7071068 var1 + .7071068 var2
> 
> But forget PC2 for the time being.
> 
> Now my question is : I am not able to use the rule that : choose the
> variable with a bigger magnitude of loading and multiply PC1 by -1 if
> needed (to flip the PC1 since any vector x and it's flipped version -x
>  are the same vector but with opposite direction) if the variable with
> bigger magnitude is of negative sign.
> 
> I have an alternative measure of stress which is trending UP and has 2
> peaks during 2 recessions and I can see that PC1 is trending DOWN and
> has 2 TROUGHS during the same recessions. That's how I wish to FLIP
> PC1 with a negative sign.
> 
> The data is not mine and I am not at liberty to share it. I can
> construct an artificial example but I would need time to do that.
> 
> That's what's happening.
> 
> Best Regards and
> Many thanks.
> Ashim
> 
> 
> 
> 
> 
> 
> 
> 
> 
> On Thu, Oct 13, 2022 at 5:38 PM Ebert,Timothy Aaron  wrote:
> > 
> > I still do not understand. However, the general approach would be to
> > identify a specific value to test. If the test is TRUE then do "this"
> > otherwise do nothing. Once the test condition is properly identified, the
> > coding easily follows.
> >
> >  abs() is the same as
> > if x<0 then x = -x   (non-R code, just idea)
> > The R code might look something more like
> > for (number in 1:ncol(x)){
> >    if (x[3,2] < 0) {
> >      x[number, number] = -x[number, number] #only change the diagonal
> >    }
> > }
> >
> > Depending on what values need to be changed you may need a nested for loop
> > to go through all values of x[number1, number2].
> >
> > Your words: " I can forcefully use a NEGATIVE sign to FLIP the index when
> > it is LOW." Where it appeared that "low" was defined as values that are
> > negative. You still will have low values (close to zero) and high values
> > (far from zero).
> >
> > You could make the condition some other value:
> >
> > if x< -4 then x = -x
> >
> > If you just want to rotate about zero then
> > x = -x
> > In this case the positive values will become negative and the negative
> > values positive.
> > Add an if test to selectively rotate based on the value of a single test
> > element in x (as in x[3,2]).
> >
> > In debugging or trouble shooting setting seed is useful. For actual data
> > analysis you should not set seed, or possibly better yet use
> > set.seed(NULL).
> > 
> > Tim
> > 
> > 
> > 
> > -Original Message-
> > From: Ashim Kapoor 
> > Sent: Thursday, October 13, 2022 12:28 AM
> > To: Ebert,Timothy Aaron 
> > Cc: R Help 
> > Subject: Re: [R] prcomp - arbitrary direction of the returned principal 
> > components
> > 
> > [External Email]
> > 
> > Dear Aaron,
> > 
> > Many thanks for your reply.
> > 
> > Please allow me to illustrate my query a bit.
> > 
> > I take some data, throw it to prcomp and extract the x data frame from 
> > prcomp.
> > 

Re: [R] How long does it take to learn the R programming language?

2022-09-28 Thread John Kane
+ 1

On Wed, 28 Sept 2022 at 17:36, Jim Lemon  wrote:

> Given some of the questions that are posted to this list, I am not
> sure that there is an upper bound to the estimate.
>
> Jim
>
>


-- 
John Kane
Kingston ON Canada


