Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Luigi Marongiu
Got it, thank you! On Tue, 10 Aug 2021, 00:12 David Winsemius, wrote: > > On 8/9/21 12:22 PM, Luigi Marongiu wrote: > > Thank you! it worked fine! The only pitfall is that `NA` became > > `<NA>`. This is essentially the same thing anyway... > > > It's not "essentially the same thing". It IS the same

Re: [R] Calculation of Age heaping

2021-08-09 Thread Jim Lemon
Here is my hasty attempt last night checked in the light of morning. It seems to return the correct extreme values and contains an example. Jim On Mon, Aug 9, 2021 at 10:50 PM Md. Moyazzem Hossain wrote: > > Dear Jim, > > Thank you very much for your kind help. > > Take care. > > Md > > On Mon,

Re: [R] No "doc" directory in my installation of R.

2021-08-09 Thread Dirk Eddelbuettel
Rolf, Sorry for only briefly chiming in, and late, but I don't usually follow r-help that much these days. I am writing this from an Ubuntu machine running R as well as RStudio from pre-made binary .deb packages. R comes via apt from CRAN (using Michael's binaries), RStudio from them via helper

Re: [R] No "doc" directory in my installation of R.

2021-08-09 Thread Rolf Turner
I thought that I should let everyone know that I have, in some sense at least, resolved my problem with 'no "doc" directory' and Rstudio. I got a useful reply off-list from Duncan Murdoch (thanks Duncan) to the effect that Rstudio requires its own purpose-specific binaries. I was always under t

Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread David Winsemius
On 8/9/21 12:22 PM, Luigi Marongiu wrote: Thank you! it worked fine! The only pitfall is that `NA` became `<NA>`. This is essentially the same thing anyway... It's not "essentially the same thing". It IS the same thing. The print function displays those '<>' characters flanking NA's when the cl
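A minimal sketch of the point above, using a made-up data frame (the column names `num` and `chr` are illustrative, not from the thread). It suggests that `print` shows a missing value as `NA` in a numeric column and as `<NA>` in a character or factor column, while `is.na()` treats both the same way.

```r
# Hypothetical example data; the column names are illustrative only.
df <- data.frame(num = c(1, NA), chr = c("a", NA))

print(df)  # the non-numeric column displays its missing value as <NA>

# Both entries are genuine missing values, whatever print() shows.
is.na(df$num[2])
is.na(df$chr[2])

# A literal string "NA" is ordinary text and is not missing.
is.na("NA")
```

The angle brackets are only a display convention, so no cleanup of the column should be needed.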

Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Luigi Marongiu
Thank you! it worked fine! The only pitfall is that `NA` became `<NA>`. This is essentially the same thing anyway... On Mon, Aug 9, 2021 at 5:18 PM Ivan Krylov wrote: > > Thanks for providing a reproducible example! > > On Mon, 9 Aug 2021 15:33:53 +0200 > Luigi Marongiu wrote: > > > df[df[['vect[2]'

Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Ivan Krylov
Thanks for providing a reproducible example! On Mon, 9 Aug 2021 15:33:53 +0200 Luigi Marongiu wrote: > df[df[['vect[2]']] == 2, 'vect[2]'] <- "No" Please don't quote R expressions that you want to evaluate. 'vect[2]' is just a string, like 'hello world' or 'I want to create a new column named "
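The advice above can be sketched as follows. This is only an illustration under assumptions: the data frame and the helper variables `col` and `hit` are made up here, and wrapping the comparison in `which()` is one way to skip the `NA` rows, not necessarily what the (truncated) reply recommended.

```r
# Hypothetical data loosely modelled on the thread's example.
df <- data.frame(a = 1:5, b = c(1, 2, NA, 2, NA))
vect <- names(df)

# Unquoted: vect[2] is evaluated and yields the column name "b".
col <- vect[2]

# which() drops the NA comparisons, so only rows where b == 2 are selected.
hit <- which(df[[col]] == 2)
df[hit, col] <- "No"

# Quoted, 'vect[2]' would just be a string naming a column that does not exist.
"vect[2]" %in% names(df)
```

Because "No" is text, the whole column is converted to character, which is why the remaining missing values then print as `<NA>`.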

Re: [R] Sample size Determination to Compare Three Independent Proportions

2021-08-09 Thread Marc Schwartz via R-help
Hi, You are going to need to provide more information than what you have below and I may be mis-interpreting what you have provided. Presuming you are designing a prospective, three-group, randomized allocation study, there is typically an a priori specification of the ratios of the sample s

Re: [R] Sanity check in loading large dataframe

2021-08-09 Thread Bert Gunter
FWIW: Yes, thanks for noting that. My own preference is to always propagate NA's and manually decide how to deal with them, but others may disagree. Best, Bert Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley

[R] Sample size Determination to Compare Three Independent Proportions

2021-08-09 Thread AbouEl-Makarim Aboueissa
Dear All: good morning *Re:* Sample Size Determination to Compare Three Independent Proportions *Situation:* Three Binary variables (Yes, No) Three independent populations with fixed sizes (*say:* N1 = 1500, N2 = 900, N3 = 1350). Power = 0.80 How to choose the sample sizes to compare th

Re: [R] Apply gsub to dataframe to modify row values

2021-08-09 Thread Luigi Marongiu
Thank you, it works! On Mon, Aug 9, 2021 at 3:26 PM Andrew Simmons wrote: > > Hello, > > > There are two convenient ways to access a column in a data.frame using `$` > and `[[`. Using `df` from your first email, we would do something like > > df <- data.frame(VAR = 1:3, VAL = c("value is blue",

Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Luigi Marongiu
You are right, vect will contain the names of the columns of the real dataframe but the actual simulation of the real case is more like this: ``` > df = data.frame(A = 1:5, B = c(1, 2, NA, 2, NA), C = c("value is blue", > "Value is red", "empty", " value is blue", " Value is green"), D = 9:13, E

Re: [R] Apply gsub to dataframe to modify row values

2021-08-09 Thread Andrew Simmons
Hello, There are two convenient ways to access a column in a data.frame using `$` and `[[`. Using `df` from your first email, we would do something like df <- data.frame(VAR = 1:3, VAL = c("value is blue", "Value is red", "empty")) df$VAL df[["VAL"]] The two convenient ways to update / / replac
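A short sketch of the two access forms described above, reusing the `df` from the message. The variable `col` is a hypothetical addition to show that `[[` also accepts a column name held in a variable, and the `toupper()` replacement is only a stand-in for whatever update is wanted.

```r
df <- data.frame(VAR = 1:3, VAL = c("value is blue", "Value is red", "empty"))

# `$` and `[[` return the same column vector.
identical(df$VAL, df[["VAL"]])

# `[[` also works when the column name is stored in a variable (hypothetical `col`).
col <- "VAL"
df[[col]]

# Single brackets keep the data frame structure instead of returning the vector.
is.data.frame(df["VAL"])

# Either form can be the target of an assignment to replace the column.
df[[col]] <- toupper(df[[col]])
```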

Re: [R] Apply gsub to dataframe to modify row values

2021-08-09 Thread Luigi Marongiu
I wanted to remove possible white spaces before or after the string. Actually, it worked, I used `gsub("[:blank:]*val[:blank:]*", "", df$VAL, ignore.case=TRUE)`. I don't know why in the example there were extra columns -- they did not come out in the real case. Thank you, I think the case is closed
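For reference, a hedged sketch of the whitespace handling discussed above. POSIX classes such as `[:blank:]` are only recognised inside a bracket expression, so the usual spelling is `[[:blank:]]`. The vector `x` reuses strings from the thread; the variable name `cleaned` and the exact pattern are assumptions for illustration, not the poster's final code.

```r
# Strings taken from the thread's example; `x` and `cleaned` are illustrative names.
x <- c("value is blue", "Value is red", "empty",
       " value is blue", " Value is green")

# Optional leading blanks, the phrase itself, then any blanks that follow it.
cleaned <- gsub("^[[:blank:]]*value is[[:blank:]]*", "", x, ignore.case = TRUE)

# trimws() removes any whitespace still left at either end.
cleaned <- trimws(cleaned)
cleaned
```

Assigning the result back with `df$VAL <- cleaned` rather than `df[df$VAL] <- ...` avoids creating extra columns.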

Re: [R] Calculation of Age heaping

2021-08-09 Thread Md. Moyazzem Hossain
Dear Jim, Thank you very much for your kind help. Take care. Md On Mon, Aug 9, 2021 at 1:17 PM Jim Lemon wrote: > And if you really don't like programming: > > whipple_index<-function(x,td=c(0,5)) { > wi<-rep(NA,11) > names(wi)<-c(paste0("wi",0:9),"O/all") > for(i in 0:9) { > ttd<-which(

Re: [R] Calculation of Age heaping

2021-08-09 Thread Jim Lemon
And if you really don't like programming: whipple_index<-function(x,td=c(0,5)) { wi<-rep(NA,11) names(wi)<-c(paste0("wi",0:9),"O/all") for(i in 0:9) { ttd<-which((x %% 10) %in% i) wi[i+1]<-length(ttd) * 100/length(x) } ttd<-which((x %% 10) %in% td) wi[11]<-length(ttd) * 100/(length(x)/le

Re: [R] Calculation of Age heaping

2021-08-09 Thread Richard O'Keefe
According to Wikipedia, this is the definition of Whipple's index: "The index score is obtained by summing the number of persons in the age range 23 and 62 inclusive, who report ages ending in 0 and 5, dividing that sum by the total population between ages 23 and 62 years inclusive, and multiplyin
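The quoted definition can be sketched in R roughly as below. This is an illustration, not the code posted in the thread: the function name `whipple` and its arguments are made up, and the final factor of 500 (100 × 5) is an assumption taken from the standard definition, since the quote above is cut off before stating it.

```r
# Hypothetical helper; name and arguments are illustrative only.
whipple <- function(age, lo = 23, hi = 62) {
  # Keep only non-missing ages inside the inclusive range.
  a <- age[!is.na(age) & age >= lo & age <= hi]
  # Count ages ending in 0 or 5, i.e. multiples of 5.
  heaped <- sum(a %% 5 == 0)
  # ASSUMPTION: standard scaling of 100 * 5; the quoted text is truncated here.
  heaped / length(a) * 100 * 5
}

# One person at every age from 23 to 62: no heaping.
whipple(23:62)

# Everyone reports an age ending in 0 or 5: maximal heaping.
whipple(c(25, 30, 40))
```

Under this scaling a value near 100 would indicate no preference for ages ending in 0 or 5, and 500 would mean every reported age ends in one of those digits.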

Re: [R] Calculation of Age heaping

2021-08-09 Thread Md. Moyazzem Hossain
Dear Greg, Thank you very much for your suggestion. I will try it and follow your advice. Actually, I want to find out the index for each digit like 0, 1, ..., 9. Thanks in advance. Take care. Md On Mon, Aug 9, 2021 at 12:05 PM Greg Minshall wrote: > Md, > > if this is what you are looking

Re: [R] Apply gsub to dataframe to modify row values

2021-08-09 Thread Jim Lemon
Hi Luigi, You want to get rid of certain strings in the "VAL" column. You are assigning to: df[df$VAL] Error in `[.data.frame`(df, df$VAL) : undefined columns selected when I think you should be assigning to: df$VAL What do you want to remove other than "[V|v]alue is" ? Jim On Mon, Aug 9, 20

Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Ivan Krylov
On Mon, 9 Aug 2021 13:16:02 +0200 Luigi Marongiu wrote: > df = data.frame(VAR = ..., VAL = ...) > vect = letters[1:5] What is the relation between vect and the column names of the data frame? Is it your intention to choose rows or columns using `vect`? > df[df[['vect[2]']] == 2, 'vect[2]'] '..

Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Luigi Marongiu
Thank you but I think I got it wrong: ``` > df = data.frame(VAR = letters[1:5], VAL = c(1, 2, NA, 2, NA)); df VAR VAL 1 a 1 2 b 2 3 c NA 4 d 2 5 e NA > vect = letters[1:5] > df[df[['vect[2]']] == 2, 'vect[2]'] <- "No"; df VAR VAL vect[2] 1 a 1 2 b 2 3 c NA

Re: [R] Calculation of Age heaping

2021-08-09 Thread Greg Minshall
Md, if this is what you are looking for: https://en.wikipedia.org/wiki/Whipple%27s_index then, the article says the algorithm is The index score is obtained by summing the number of persons in the age range 23 and 62 inclusive, who report ages ending in 0 and 5, dividing that sum b

Re: [R] Apply gsub to dataframe to modify row values

2021-08-09 Thread Luigi Marongiu
Sorry, silly question, gsub works already with regex. But still, if I add `[[:blank:]]` I still don't get rid of all instances. And I keep getting extra columns ``` > df[df$VAL] = gsub("[[:blank:]Value is]", "", df$VAL, ignore.case=TRUE) > df[df$VAL] = gsub("[[:blank:]Value is]", "", df$VAL

Re: [R] Apply gsub to dataframe to modify row values

2021-08-09 Thread Luigi Marongiu
Thank you, that is much appreciated. But on the real data, the substitution works only on few instances. Is there a way to introduce regex into this? Cheers Luigi On Mon, Aug 9, 2021 at 11:01 AM Jim Lemon wrote: > > Hi Luigi, > Ah, now I see: > > df$VAL<-gsub("Value is","",df$VAL,ignore.case=TRU

Re: [R] Calculation of Age heaping

2021-08-09 Thread Md. Moyazzem Hossain
Dear Avi Gross, Thank you very much for your email. Actually, I have a little knowledge of R programming. I have a dataset of ages ranging from 10 to 90. Now, I want to find out the Whipple’s index for age heaping among individuals for each digit like 0,1,...,9. I have searched in google I got t

Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Ivan Krylov
On Mon, 9 Aug 2021 10:26:03 +0200 Luigi Marongiu wrote: > vect = names(df) > sub_df[vect[1]] > df$column[df$column == value] <- new.value Let's see, an equivalent expression without the $ syntax is `df[['column']][df[['column']] == value] <- new.value`. Slightly shorter, matrix-like syntax woul
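A minimal sketch of the `[[` form applied to column names held in a vector, as the original question asked. The data and the loop variable `nm` are made up for illustration; the values 2 and 0 stand in for `value` and `new.value`.

```r
# Hypothetical data; A and B are illustrative column names.
df <- data.frame(A = c(1, 2, 2), B = c(2, 3, 2))
vect <- names(df)

# Walk over the stored names and replace the value 2 with 0 in each column.
for (nm in vect) {
  df[[nm]][df[[nm]] == 2] <- 0
}
df
```

Since the replacement happens in place, this approach should not require copying the data frame into a new object.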

Re: [R] Apply gsub to dataframe to modify row values

2021-08-09 Thread Jim Lemon
Hi Luigi, Ah, now I see: df$VAL<-gsub("Value is","",df$VAL,ignore.case=TRUE) df VAR VAL 1 1 blue 2 2 red 3 3 empty Jim On Mon, Aug 9, 2021 at 6:43 PM Luigi Marongiu wrote: > > Hello, > I have a dataframe where I would like to change the string of certain > rows, essentially I am lo

Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Luigi Marongiu
Thank you very much, but that would make even more work due to the duplication... On Mon, Aug 9, 2021 at 10:53 AM Jim Lemon wrote: > > Hi Luigi, > It looks to me as though you will have to copy the data frame or store > the output in a new data frame. > > Jim > > On Mon, Aug 9, 2021 at 6:26 PM Lu

Re: [R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Jim Lemon
Hi Luigi, It looks to me as though you will have to copy the data frame or store the output in a new data frame. Jim On Mon, Aug 9, 2021 at 6:26 PM Luigi Marongiu wrote: > > Hello, > I would like to recursively select the columns of a dataframe by > storing the names of the dataframe in a vector

[R] Apply gsub to dataframe to modify row values

2021-08-09 Thread Luigi Marongiu
Hello, I have a dataframe where I would like to change the string of certain rows, essentially I am looking to remove some useless text from the variables. I tried with: ``` > df = data.frame(VAR = 1:3, VAL = c("value is blue", "Value is red", "empty")) > df[df$VAL] = gsub("value is ", "", df$VAL,

[R] substitute column data frame based on name stored in variable in r

2021-08-09 Thread Luigi Marongiu
Hello, I would like to recursively select the columns of a dataframe by storing the names of the dataframe in a vector and extracting one element of the vector at a time. This I can do with, for instance: ``` vect = names(df) sub_df[vect[1]] ``` The problem is that I would like also to change the v