Re: [R] Writing data onto xlsx file without cell formatting
On 9/26/2016 2:56 PM, Christofer Bogaso wrote:
> Hi again,
>
> I have been following the above suggestion to export data from R to an xlsx
> file using XLConnect. However, recently I have been facing a Java memory
> allocation problem with a large dataset (looks like a known issue with
> this package) and therefore decided to move to the "xlsx" package.
>
> Now I am facing the same problem of losing my existing formatting
> when I use the xlsx package for data export. Can someone give me a
> pointer on how I can preserve the cell formatting after exporting a
> data.frame to an existing xlsx file using the "xlsx" package?
>
> Thanks for your time.
>
> On Mon, Jul 11, 2016 at 10:43 AM, Ismail SEZEN wrote:
>> I think this is what you are looking for:
>>
>> http://stackoverflow.com/questions/11228942/write-from-r-into-template-in-excel-while-preserving-formatting
>>
>> On 11 Jul 2016, at 03:43, Christofer Bogaso wrote:
>>> Hi again,
>>>
>>> I am trying to write a data frame to an existing Excel file (xlsx)
>>> from row 5 and column 6 of the 1st Sheet. I was going through a
>>> previous instruction which is available here:
>>>
>>> http://stackoverflow.com/questions/32632137/using-write-xlsx-in-r-how-to-write-in-a-specific-row-or-column-in-excel-file
>>>
>>> However, the trouble is that it is modifying/removing the formatting of
>>> all the affected cells. I have predefined formatting for the cells where
>>> the data is to be pasted, and I don't want to modify or remove that
>>> formatting.
>>>
>>> Any idea if I need to pass some additional argument?
>>>
>>> Appreciate your valuable feedback.
>>>
>>> Thanks,

It would help the list to help you if you gave a reproducible example. In the absence of that, at least show the actual code you are using to write to the Excel (.xlsx) sheet. But maybe reading about the "create" argument on page 13 of this linked document will help:

https://cran.r-project.org/web/packages/xlsx/xlsx.pdf

Dan

--
Daniel Nordlund
Port Townsend, WA USA

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] Error in gam() object 'scat' not found
I received an error message while trying to use family=scat in the GAM package. The models were working fine yesterday. The problem is not with my data, since the gaussian family works fine:

mod=gam(RT~s(a) + s(b), data=dat, family=gaussian)
mod=gam(RT~s(a) + s(b), data=dat, family=scat)

Might this problem be unrelated to GAM specifically and instead be caused by my R configuration? I have removed the gam package and re-installed it several times to no avail.

Thank you for any assistance,
Karl
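One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: scat is a family defined in the mgcv package, so "object 'scat' not found" is what R reports when mgcv is not attached in the session. The sketch below lists which attached packages currently provide gam() and scat, then fits the model from the post on simulated data. The data frame dat here is a made-up stand-in, not the poster's data.

```r
# Report which attached packages (if any) currently provide gam() and scat
providers <- list(gam = find("gam"), scat = find("scat"))
print(providers)

# mgcv ships with most R installations; the guard keeps the sketch safe without it
if (requireNamespace("mgcv", quietly = TRUE)) {
  library(mgcv)
  set.seed(1)
  # Simulated stand-in for the poster's data; column names follow the post
  dat <- data.frame(a = runif(200), b = runif(200))
  dat$RT <- sin(2 * pi * dat$a) + dat$b^2 + 0.3 * rt(200, df = 4)
  # Call mgcv's gam() explicitly so another package's gam() cannot mask it
  mod <- mgcv::gam(RT ~ s(a) + s(b), data = dat, family = scat)
  print(class(mod))
}
```

If find("scat") returns character(0) in the failing session, attaching mgcv before fitting would be the first thing to try.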
Re: [R] Writing data onto xlsx file without cell formatting
openxlsx is not solving my problem either; it is corrupting my xlsx file.

I have a large data.frame which I want to export to an existing xlsx file without changing the formatting of that file. With XLConnect there is an option, setStyleAction(wb,XLC$"STYLE_ACTION.NONE"), which does exactly that. I am looking for a similar line of code for the xlsx package which will let me save my data.frame in my existing file without altering its formatting.

Thanks,

On Tue, Sep 27, 2016 at 3:39 AM, jim holtman wrote:
> I use the "openxlsx" package to handle spreadsheets.
>
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
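For the xlsx package, one approach that may preserve formatting is to fetch the cells that already exist in the template and overwrite only their values, instead of creating new cells. The following is a minimal sketch under stated assumptions, not a tested recipe for the poster's file: write_values_keep_style and "template.xlsx" are hypothetical names, the target cells are assumed to exist already in the workbook (cells that do not exist are not returned by getCells() and are therefore skipped), and column headers are not written.

```r
# Hypothetical helper: overwrite the values of cells that already exist,
# leaving their styles untouched. Requires the xlsx package (and Java).
write_values_keep_style <- function(file, df, start_row, start_col,
                                    sheet_index = 1) {
  wb    <- xlsx::loadWorkbook(file)
  sheet <- xlsx::getSheets(wb)[[sheet_index]]

  row_idx <- seq(start_row, length.out = nrow(df))
  col_idx <- seq(start_col, length.out = ncol(df))

  rows  <- xlsx::getRows(sheet, rowIndex = row_idx)
  cells <- xlsx::getCells(rows, colIndex = col_idx)  # list named "row.col"

  for (key in names(cells)) {
    rc    <- as.integer(strsplit(key, ".", fixed = TRUE)[[1]])
    value <- df[[rc[2] - start_col + 1]][rc[1] - start_row + 1]
    if (is.factor(value)) value <- as.character(value)
    xlsx::setCellValue(cells[[key]], value)  # changes the value only
  }

  xlsx::saveWorkbook(wb, file)
  invisible(length(cells))  # number of cells that were overwritten
}

# Usage (not run; the path and data frame are placeholders):
# write_values_keep_style("template.xlsx", my_df, start_row = 5, start_col = 6)
```

Because each cell is set one at a time, this is likely to be slow for a very large data.frame, and it does not address the Java memory limits mentioned earlier in the thread.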
[R] ggplot grouped barchart based on marginal proportions
I am trying to create a grouped barplot that uses marginal (row) proportions rather than cell proportions, and I can't figure out how to change

y = (..count..)/sum(..count..)

in ggplot to do this. As an example, take the mtcars dataset and two categorical variables, cyl and am (purely for the sake of the example, cyl is the response and am the explanatory variable). Can anyone help me to do this?

data(mtcars)

# Get Proportions
mtcars_xtab <- table(mtcars$cyl,mtcars$am)
mtcars_xtab
margin.table(mtcars_xtab, 1) # A frequencies (summed over B)
margin.table(mtcars_xtab, 2) # B frequencies (summed over A)
prop.table(mtcars_xtab) # cell percentages - THIS IS WHAT'S USED IN THE PLOT
prop.table(mtcars_xtab, 1) # row percentages - THESE ARE WHAT I WANT TO USE IN THE PLOT

# Make Plot
mtcars$cyl <- as.factor(mtcars$cyl)
mtcars$am <- as.factor(mtcars$am)
ggplot(mtcars, aes(x=am, fill=cyl)) +
  geom_bar(aes(y = (..count..)/sum(..count..)), position = "dodge") +
  scale_fill_brewer(palette="Set2")

Thank you.
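One way to get the row proportions into the plot is to compute them first with prop.table() and hand ggplot the resulting data frame, rather than deriving them from ..count.. inside geom_bar(). This is only a sketch of that idea; the names xtab and row_prop are made up for the illustration, and the plotting part runs only if ggplot2 is installed.

```r
# Row proportions: within each cyl level the proportions sum to 1 across am
xtab <- table(cyl = mtcars$cyl, am = mtcars$am)
row_prop <- as.data.frame(prop.table(xtab, 1), responseName = "prop")
print(row_prop)

# Plot the precomputed proportions (guarded, since ggplot2 is not part of base R)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(row_prop, aes(x = am, y = prop, fill = cyl)) +
    geom_bar(stat = "identity", position = "dodge") +
    scale_fill_brewer(palette = "Set2")
  print(p)
}
```

Using stat = "identity" tells geom_bar() to draw the supplied prop values as bar heights instead of counting rows, so the choice of margin lives entirely in the prop.table() call.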
Re: [R] src/Makevars ignored ?
You failed to read the Posting Guide, which would have told you which mailing list to post this question to. (Hint: not this one.)
--
Sent from my phone. Please excuse my brevity.

On September 26, 2016 4:46:06 AM PDT, Eric Deveaud wrote:
>Hello,
>
>as far as I understand the generic R library compilation mechanism,
>compilation of C/C++ sources is controlled
>
>1) at system level by the content of RHOME/etc/Makeconf
>2) at user level by the content of ~/.R/Makevars
>3) at package level by the content of src/Makevars
>
>The problem I have is that src/Makevars is ignored.
>
>See the following example:
>
>R is compiled and uses the following CC and CFLAGS definitions:
>
>bigmess:epactsR/src > R CMD config CC
>gcc -std=gnu99
>bigmess:epactsR/src > R CMD config CFLAGS
>-Wall -g
>
>so building C sources leads to the following:
>
>bigmess:epactsR/src > R CMD SHLIB index.c
>gcc -std=gnu99 -I/local/gensoft2/adm/lib64/R/include -DNDEBUG
>-I/usr/local/include -fpic -Wall -g -c index.c -o index.o
>
>Normal: it uses the definitions from RHOME/etc/Makeconf.
>
>When I set up a ~/.R/Makevars that overrides the CC and CFLAGS definitions:
>
>bigmess:epactsR/src > cat ~/.R/Makevars
>CC=gcc
>CFLAGS=-O3
>bigmess:epactsR/src > R CMD SHLIB index.c
>gcc -I/local/gensoft2/adm/lib64/R/include -DNDEBUG
>-I/usr/local/include -fpic -O3 -c index.c -o index.o
>gcc -std=gnu99 -shared -L/usr/local/lib64 -o index.so index.o
>
>OK, CC and CFLAGS are honored and set according to ~/.R/Makevars.
>
>But when I try to use src/Makevars, it is ignored:
>
>bigmess:epactsR/src > cat ~/.R/Makevars
>cat: /home/edeveaud/.R/Makevars: No such file or directory
>bigmess:epactsR/src > cat ./Makevars
>CC = gcc
>CFLAGS=-O3
>bigmess:epactsR/src > R CMD SHLIB index.c
>gcc -std=gnu99 -I/local/gensoft2/adm/lib64/R/include -DNDEBUG
>-I/usr/local/include -fpic -Wall -g -c index.c -o index.o
>
>What have I missed, or is there something wrong?
>
>PS: I tested the same behaviour with various versions of R, from R/2.15
>to R/3.3
>
>best regards
>
>Eric
Re: [R] Writing data onto xlsx file without cell formatting
I use the "openxlsx" package to handle spreadsheets.

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Mon, Sep 26, 2016 at 5:56 PM, Christofer Bogaso <bogaso.christo...@gmail.com> wrote:
> Hi again,
>
> I have been following the above suggestion to export data from R to an xlsx
> file using XLConnect. However, recently I have been facing a Java memory
> allocation problem with a large dataset (looks like a known issue with
> this package) and therefore decided to move to the "xlsx" package.
>
> Now I am facing the same problem of losing my existing formatting
> when I use the xlsx package for data export. Can someone give me a
> pointer on how I can preserve the cell formatting after exporting a
> data.frame to an existing xlsx file using the "xlsx" package?
>
> Thanks for your time.
Re: [R] Writing data onto xlsx file without cell formatting
Hi again,

I have been following the above suggestion to export data from R to an xlsx file using XLConnect. However, recently I have been facing a Java memory allocation problem with a large dataset (looks like a known issue with this package) and therefore decided to move to the "xlsx" package.

Now I am facing the same problem of losing my existing formatting when I use the xlsx package for data export. Can someone give me a pointer on how I can preserve the cell formatting after exporting a data.frame to an existing xlsx file using the "xlsx" package?

Thanks for your time.

On Mon, Jul 11, 2016 at 10:43 AM, Ismail SEZEN wrote:
> I think this is what you are looking for:
>
> http://stackoverflow.com/questions/11228942/write-from-r-into-template-in-excel-while-preserving-formatting
>
> On 11 Jul 2016, at 03:43, Christofer Bogaso wrote:
>> Hi again,
>>
>> I am trying to write a data frame to an existing Excel file (xlsx)
>> from row 5 and column 6 of the 1st Sheet. I was going through a
>> previous instruction which is available here:
>>
>> http://stackoverflow.com/questions/32632137/using-write-xlsx-in-r-how-to-write-in-a-specific-row-or-column-in-excel-file
>>
>> However, the trouble is that it is modifying/removing the formatting of
>> all the affected cells. I have predefined formatting for the cells where
>> the data is to be pasted, and I don't want to modify or remove that
>> formatting.
>>
>> Any idea if I need to pass some additional argument?
>>
>> Appreciate your valuable feedback.
>>
>> Thanks,
Re: [R] Using lapply in R data table
... and just for fun, here's an alternative in which mapply() is used to vectorize switch(); again, whether you like it may be just a matter of taste, although I suspect it might be less efficient than ifelse(), which is already vectorized:

DT <- within(DT, exposure <- {
   mapply(function(x,fac)switch(as.character(fac),
                                a = 1,
                                b = difftime(as.Date("2007-01-01"), x, units="days")/365.25,
                                c = .5 ),
          x = fini,
          fac = cut(fini,as.Date(c("2000-01-01","2006-01-01","2006-06-30","2006-12-21")),
                    labels= letters[1:3]) )}
)

> DT
  id       fini group  exposure
1  2 2005-04-20     A 1.000
2  2 2005-04-20     A 1.000
3  2 2005-04-20     A 1.000
4  5 2006-02-19     B 0.8651608
5  5 2006-06-29     B 0.5092402
6  7 2006-10-08     A 0.500
7  7 2006-10-08     A 0.500

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
Re: [R] Using lapply in R data table
Ista:

Aha -- now I see the point. My bad. You are right. I was careless.

However, cut() with ifelse() might simplify the code a bit and/or make it more readable. To be clear, this is just a matter of taste; e.g. using your data and a data frame instead of a data table:

> DT <- within(DT,
     exposure <- {
        f <- cut(fini,as.Date(c("2000-01-01","2006-01-01","2006-06-30","2006-12-21")),
                 labels= letters[1:3])
        ifelse(f == "a", 1,
               ifelse( f == "c", .5,
                       difftime(as.Date("2007-01-01"), fini, units="days")/365.25))
     }
  )

> DT
  id       fini group  exposure f
1  2 2005-04-20     A 1.000     a
2  2 2005-04-20     A 1.000     a
3  2 2005-04-20     A 1.000     a
4  5 2006-02-19     B 0.8651608 b
5  5 2006-06-29     B 0.5092402 b
6  7 2006-10-08     A 0.500     c
7  7 2006-10-08     A 0.500     c

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
Re: [R] 32 and 64 bit R
On 26/09/2016 6:29 AM, Mike meyer wrote:
> Hello,
>
> I have both 32 and 64 bit versions of R installed.
> What happens if I open a workspace saved from 64 bit R in the 32 bit
> version or conversely?
> I am fairly careless but never noticed any problems.

No problems will arise because of the different word size.

You will possibly see problems if you have the two versions set up to use different libraries; occasionally a workspace will fail to load if it needs a package that is not installed. Normally on Windows the same library can be used for both 32 and 64 bit R, but you could always choose to break that.

Duncan Murdoch
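To see which build a given session is and which package libraries it searches, something like the following can be run in each installation and the results compared. It is a small sketch using base R only; pointer_bytes is just an illustrative variable name.

```r
# Pointer size in bytes: 4 on a 32-bit build, 8 on a 64-bit build
pointer_bytes <- .Machine$sizeof.pointer
cat("Architecture:", R.version$arch, "\n")
cat("Word size   :", 8 * pointer_bytes, "bit\n")

# Package libraries searched by this session; if the 32-bit and 64-bit
# sessions print different paths, a workspace may need a package that is
# installed for one build but not the other
print(.libPaths())
```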
Re: [R] Using lapply in R data table
On Mon, Sep 26, 2016 at 2:48 PM, Bert Gunter wrote:
> I thought that that was a typo from the OP, as it disagrees with his
> example. But the labels are arbitrary, so in fact cut() will do it
> whichever way he meant.

I don't see how cut will do it, at least not conveniently. Consider this slightly altered example:

library(data.table)
DT <- data.table(
  id = rep(c(2, 5, 7), c(3, 2, 2)),
  fini = rep(as.Date(c('2005-04-20',
                       '2006-02-19',
                       '2006-06-29',
                       '2006-10-08')),
             c(3, 1, 1, 2)),
  group = rep(c("A", "B", "A"), c(3, 2, 2)) )

DT[, exposure := vector(mode = "numeric", length = .N)]
DT[fini < as.Date("2006-01-01"), exposure := 1]
DT[fini >= as.Date("2006-01-01") & fini <= as.Date("2006-06-30"),
   exposure := difftime(as.Date("2007-01-01"), fini, units="days")/365.25]
DT[fini >= as.Date("2006-07-01"), exposure := 0.5]

DT

##    id       fini group  exposure
## 1:  2 2005-04-20     A 1.000
## 2:  2 2005-04-20     A 1.000
## 3:  2 2005-04-20     A 1.000
## 4:  5 2006-02-19     B 0.8651608
## 5:  5 2006-06-29     B 0.5092402
## 6:  7 2006-10-08     A 0.500
## 7:  7 2006-10-08     A 0.500

Best,
Ista

> -- Bert
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> On Mon, Sep 26, 2016 at 11:37 AM, Ista Zahn wrote:
>> On Mon, Sep 26, 2016 at 1:59 PM, Bert Gunter wrote:
>>> This seems like a job for cut() .
>>
>> I thought that at first too, but the middle group shouldn't be .87 but rather
>>
>> "exposure" = "2007-01-01" - "fini"
>>
>> so, I think cut alone won't do it.
>>
>> Best,
>> Ista
>>
>>> (I made DT a data frame to avoid loading the data table package. But I
>>> assume it would work with a data table too. Check this, though!)
>>>
>>> > DT <- within(DT, exposure <-
>>>     cut(fini,as.Date(c("2000-01-01","2006-01-01","2006-06-30","2006-12-21")),
>>>         labels= c(1,.87,.5)))
>>> > DT
>>>   id       fini group exposure
>>> 1  2 2005-04-20     A        1
>>> 2  2 2005-04-20     A        1
>>> 3  2 2005-04-20     A        1
>>> 4  5 2006-02-19     B     0.87
>>> 5  5 2006-02-19     B     0.87
>>> 6  7 2006-10-08     A      0.5
>>> 7  7 2006-10-08     A      0.5
>>>
>>> (but note that exposure is a factor, not numeric)
>>>
>>> Cheers,
>>> Bert
>>>
>>> Bert Gunter
>>>
>>> "The trouble with having an open mind is that people keep coming along
>>> and sticking things into it."
>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>>
>>> On Mon, Sep 26, 2016 at 10:05 AM, Ista Zahn wrote:
>>>> Hi Frank,
>>>>
>>>> lapply(DT) iterates over each column. That doesn't seem to be what
>>>> you want.
>>>>
>>>> There are probably better ways, but here is one approach.
>>>>
>>>> DT[, exposure := vector(mode = "numeric", length = .N)]
>>>> DT[fini < as.Date("2006-01-01"), exposure := 1]
>>>> DT[fini >= as.Date("2006-01-01") & fini <= as.Date("2006-06-30"),
>>>>    exposure := difftime(as.Date("2007-01-01"), fini, units="days")/365.25]
>>>> DT[fini >= as.Date("2006-07-01"), exposure := 0.5]
>>>>
>>>> Best,
>>>> Ista
>>>>
>>>> On Mon, Sep 26, 2016 at 11:28 AM, Frank S. wrote:
>>>>> Dear all,
>>>>>
>>>>> I have a R data table like this:
>>>>>
>>>>> DT <- data.table(
>>>>>   id = rep(c(2, 5, 7), c(3, 2, 2)),
>>>>>   fini = rep(as.Date(c('2005-04-20', '2006-02-19', '2006-10-08')), c(3, 2, 2)),
>>>>>   group = rep(c("A", "B", "A"), c(3, 2, 2)) )
>>>>>
>>>>> I want to construct a new variable "exposure" defined as follows:
>>>>>
>>>>> 1) If "fini" earlier than 2006-01-01 --> "exposure" = 1
>>>>> 2) If "fini" in [2006-01-01, 2006-06-30] --> "exposure" = "2007-01-01" - "fini"
>>>>> 3) If "fini" in [2006-07-01, 2006-12-31] --> "exposure" = 0.5
>>>>>
>>>>> So the desired output would be the following data table:
>>>>>
>>>>>    id       fini exposure group
>>>>> 1:  2 2005-04-20     1.00     A
>>>>> 2:  2 2005-04-20     1.00     A
>>>>> 3:  2 2005-04-20     1.00     A
>>>>> 4:  5 2006-02-19     0.87     B
>>>>> 5:  5 2006-02-19     0.87     B
>>>>> 6:  7 2006-10-08     0.50     A
>>>>> 7:  7 2006-10-08     0.50     A
>>>>>
>>>>> I have tried:
>>>>>
>>>>> DT <- DT[ , list(id, fini, exposure = 0, group)]
>>>>> DT.new <- lapply(DT, function(exposure){
>>>>>   exposure[fini < as.Date("2006-01-01")] <- 1 # 1st case
>>>>>   exposure[fini >= as.Date("2006-01-01") & fini <= as.Date("2006-06-30")] <-
>>>>>     difftime(as.Date("2007-01-01"), fini, units="days")/365.25 # 2nd case
>>>>>   exposure[fini >= as.Date("2006-07-01") & fini <= as.Date("2006-12-31")] <- 0.5 # 3rd case
>>>>>   exposure # return value
>>>>> })
>>>>>
>>>>> But I get an error message.
>>>>>
>>>>> Thanks for any help!!
>>>>>
>>>>> Frank S.
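A note on the cut() variants in this thread: for Date input, cut() builds intervals that are closed on the left and open on the right by default, so break points of 2006-06-30 and 2006-12-21 do not line up exactly with the three cases in Frank's question (2006-06-30 itself would fall in the last group, and dates after 2006-12-21 would become NA). The sketch below is an assumption about what was intended, not a correction anyone posted: it moves the breaks to 2006-07-01 and 2007-01-01 and uses plain vectors rather than a data table. The lower break of 2000-01-01 is kept from the posted code, so earlier dates would still be NA.

```r
# Test dates, including the boundary date 2006-06-30
fini <- as.Date(c("2005-04-20", "2006-02-19", "2006-06-30", "2006-10-08"))

# Left-closed intervals: [2000-01-01, 2006-01-01), [2006-01-01, 2006-07-01),
# [2006-07-01, 2007-01-01)
breaks <- as.Date(c("2000-01-01", "2006-01-01", "2006-07-01", "2007-01-01"))
f <- cut(fini, breaks, labels = c("a", "b", "c"))

# Years remaining until 2007-01-01, as a plain numeric vector
years_left <- as.numeric(difftime(as.Date("2007-01-01"), fini,
                                  units = "days")) / 365.25

exposure <- ifelse(f == "a", 1, ifelse(f == "c", 0.5, years_left))
print(data.frame(fini, f, exposure))
```

Converting the difftime to numeric before it reaches ifelse() keeps the result an ordinary numeric vector regardless of which branch supplies each element.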
Re: [R] curve() doesn't seem to use the whole range of x? And Error: longer object length is not a multiple of shorter object length
I think you are going to have to be more specific than "having some trouble". Your plot used lka as the x-axis.

FWIW note that

lm(ruotsi.pist ~ mies + koulu + clka + koulu*clka, data=dta)

is the same as

lm(ruotsi.pist ~ mies + koulu*clka, data=dta)

--
Sent from my phone. Please excuse my brevity.

On September 26, 2016 9:41:57 AM PDT, Matti Viljamaa wrote:
>> On 26 Sep 2016, at 19:41, Matti Viljamaa wrote:
>>
>> Thank you.
>>
>> However, I’m having some trouble converting your code to use clka, because the model I was using was:
>>
>> fit2 <- lm(ruotsi.pist ~ mies + koulu + clka + koulu*clka, data=dta)
>
> I mean, not to use clka to replace lka. But to use the above fit2, rather than your fit2.
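The formula equivalence mentioned in this reply can be checked directly. This is only a sketch on stand-in data: it assumes the built-in mtcars data set, with vs, am and wt playing the roles of mies, koulu and clka; those substitutions are not from the thread.

```r
# Stand-in data (assumption): mtcars ships with R.
d <- transform(mtcars, am = factor(am))

fit_long  <- lm(mpg ~ vs + am + wt + am*wt, data = d)
fit_short <- lm(mpg ~ vs + am*wt, data = d)

# b*c expands to b + c + b:c, so both formulas yield the same model terms.
same_terms <- identical(attr(terms(fit_long), "term.labels"),
                        attr(terms(fit_short), "term.labels"))
same_coefs <- isTRUE(all.equal(coef(fit_long), coef(fit_short)))
print(c(same_terms = same_terms, same_coefs = same_coefs))
```

Spelling out the main effects is harmless, but the shorter form makes it harder to forget one of them when the formula is edited later.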
Re: [R] curve() doesn't seem to use the whole range of x? And Error: longer object length is not a multiple of shorter object length
If your goal is to visualize the predicted curve from an lm fit (or other model fit) then you may want to look at the Predict.Plot and TkPredict functions from the TeachingDemos package.

On Sun, Sep 25, 2016 at 7:01 AM, Matti Viljamaa wrote:
> I’m trying to plot regression lines using curve()
>
> The way I do it is:
>
> bs <- coef(fit2)
>
> and then for example:
>
> curve(bs["(Intercept)"]+bs["mies"]*0+bs["kouluB"]+bs["lka"]*x+bs["kouluB:clka"]*clka,
>       from=min(lka), to=max(lka), add=TRUE, col='red')
>
> This above code runs into error:
>
> Error in curve(bs["(Intercept)"] + bs["mies"] * 0 + bs["kouluB"] + bs["lka"] * :
>   'expr' did not evaluate to an object of length 'n'
> In addition: Warning message:
> In bs["(Intercept)"] + bs["mies"] * 0 + bs["kouluB"] + bs["lka"] * :
>   longer object length is not a multiple of shorter object length
>
> Which I’ve investigated might be related to the lengths of the different objects being multiplied or summed.
> Taking length(g$x) or length(g$y) of
>
> g <- curve(bs["(Intercept)"]+bs["mies"]*0+bs["kouluB"]+bs["lka"]*x,
>            from=min(lka), to=max(lka), add=TRUE, col='red')
>
> returns 101.
>
> However length(lka) is 375. But perhaps these being different is not the problem?
>
> I however do see that the whole range of lka is not plotted, for some reason. So how can I be sure that it passes through all x-values in lka? And i.e. that the lengths of objects inside curve() are correct?
>
> What can I do?

--
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com
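The 101 versus 375 mismatch in the quoted question comes from how curve() evaluates its expression: it builds its own grid of n points (101 by default) between from and to, and binds x to that grid. A small sketch with made-up numbers; x_data and b are hypothetical stand-ins for lka and the fitted coefficients:

```r
pdf(NULL)  # null graphics device, so the sketch also runs non-interactively

x_data <- seq(10, 20, length.out = 375)  # hypothetical predictor with 375 values
b <- c(intercept = 1, slope = 2)         # hypothetical coefficients

plot(x_data, b["intercept"] + b["slope"] * x_data)

# curve() evaluates the expression on its own grid of n = 101 points.
g <- curve(b["intercept"] + b["slope"] * x,
           from = min(x_data), to = max(x_data), add = TRUE)
grid_len <- length(g$x)

# Adding a 375-long data vector to the 101-point grid recycles the shorter
# one: the result has length 375, which no longer matches the grid.
mixed_len <- suppressWarnings(length(g$x + x_data))

invisible(dev.off())
```

So the expression given to curve() should depend only on x (plus length-one constants); any full-length data vector inside it breaks the length check.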
Re: [R] Using lapply in R data table
I thought that that was a typo from the OP, as it disagrees with his example. But the labels are arbitrary, so in fact cut() will do it whichever way he meant.

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Mon, Sep 26, 2016 at 11:37 AM, Ista Zahn wrote:
> On Mon, Sep 26, 2016 at 1:59 PM, Bert Gunter wrote:
>> This seems like a job for cut() .
>
> I thought that at first too, but the middle group shouldn't be .87 but rather
>
> "exposure" = "2007-01-01" - "fini"
>
> so, I think cut alone won't do it.
>
> Best,
> Ista
Re: [R] Using lapply in R data table
On Mon, Sep 26, 2016 at 1:59 PM, Bert Gunter wrote:
> This seems like a job for cut() .

I thought that at first too, but the middle group shouldn't be .87 but rather

"exposure" = "2007-01-01" - "fini"

so, I think cut alone won't do it.

Best,
Ista
Re: [R] Using lapply in R data table
This seems like a job for cut() .

(I made DT a data frame to avoid loading the data table package. But I assume it would work with a data table too. Check this, though!)

> DT <- within(DT, exposure <-
>     cut(fini, as.Date(c("2000-01-01","2006-01-01","2006-06-30","2006-12-21")),
>         labels = c(1,.87,.5)))

> DT
  id       fini group exposure
1  2 2005-04-20     A        1
2  2 2005-04-20     A        1
3  2 2005-04-20     A        1
4  5 2006-02-19     B     0.87
5  5 2006-02-19     B     0.87
6  7 2006-10-08     A      0.5
7  7 2006-10-08     A      0.5

(but note that exposure is a factor, not numeric)

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
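The caveat that cut() returns a factor matters if exposure is used in arithmetic afterwards. A short sketch of the conversion, using the data and breaks from the thread but with a plain data frame (an assumption made here so that no package is needed):

```r
DT <- data.frame(
  id    = rep(c(2, 5, 7), c(3, 2, 2)),
  fini  = rep(as.Date(c("2005-04-20", "2006-02-19", "2006-10-08")), c(3, 2, 2)),
  group = rep(c("A", "B", "A"), c(3, 2, 2))
)

breaks <- as.Date(c("2000-01-01", "2006-01-01", "2006-06-30", "2006-12-21"))
exposure_f <- cut(DT$fini, breaks, labels = c(1, 0.87, 0.5))

# as.numeric() on a factor returns the internal level codes, not the labels.
codes <- as.numeric(exposure_f)

# Going through character recovers the numbers that the labels spell out.
values <- as.numeric(as.character(exposure_f))
```

Here codes holds the band index (1, 2 or 3) for each row, while values holds the labelled exposures. The labels are fixed per band, so a per-row value such as "2007-01-01" minus fini still has to be computed separately.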
Re: [R] Using lapply in R data table
Hi Frank,

lapply(DT) iterates over each column. That doesn't seem to be what you want.

There are probably better ways, but here is one approach.

DT[, exposure := vector(mode = "numeric", length = .N)]
DT[fini < as.Date("2006-01-01"), exposure := 1]
DT[fini >= as.Date("2006-01-01") & fini <= as.Date("2006-06-30"),
   exposure := difftime(as.Date("2007-01-01"), fini, units="days")/365.25]
DT[fini >= as.Date("2006-07-01"), exposure := 0.5]

Best,
Ista
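The two points in this reply (lapply() visiting columns, and assigning by condition instead) can be reproduced without data.table. The sketch below is a base-R analogue on a plain data frame, not the data.table code itself; the rounding to two decimals is an assumption made only to match the desired output shown in the question.

```r
DT <- data.frame(
  id    = rep(c(2, 5, 7), c(3, 2, 2)),
  fini  = rep(as.Date(c("2005-04-20", "2006-02-19", "2006-10-08")), c(3, 2, 2)),
  group = rep(c("A", "B", "A"), c(3, 2, 2))
)

# lapply() over a data frame calls the function once per column, not per row.
per_column <- lapply(DT, class)

fini   <- DT$fini
middle <- fini >= as.Date("2006-01-01") & fini <= as.Date("2006-06-30")
late   <- fini >= as.Date("2006-07-01")

exposure <- rep(1, nrow(DT))             # default: fini before 2006-01-01
# The right-hand side is subset with the same condition as the left-hand side,
# so both sides have the same length.
exposure[middle] <- as.numeric(
  difftime(as.Date("2007-01-01"), fini[middle], units = "days")) / 365.25
exposure[late] <- 0.5

DT$exposure <- round(exposure, 2)
```

In the data.table version the condition in `i` already restricts `fini` inside `j`, which is why no explicit subsetting of the right-hand side appears there.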
Re: [R] curve() doesn't seem to use the whole range of x? And Error: longer object length is not a multiple of shorter object length
> On 26 Sep 2016, at 19:41, Matti Viljamaa wrote:
>
> Thank you.
>
> However, I’m having some trouble converting your code to use clka, because the model I was using was:
>
> fit2 <- lm(ruotsi.pist ~ mies + koulu + clka + koulu*clka, data=dta)

I mean, not to use clka to replace lka. But to use the above fit2, rather than your fit2.
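One way to keep a centred predictor such as clka in the model and still draw the fitted line against the uncentred x-axis is to do the centring inside the function handed to curve(). A minimal sketch, assuming the built-in mtcars data as a stand-in (wt for lka, cwt for clka, am for koulu, vs for mies); none of these names come from the thread's data set.

```r
pdf(NULL)  # null graphics device, so the sketch also runs non-interactively

d <- mtcars
d$am  <- factor(d$am)
wt_mean <- mean(d$wt)
d$cwt <- d$wt - wt_mean          # centred predictor, analogous to clka

fit <- lm(mpg ~ vs + am * cwt, data = d)

# Function of the *uncentred* x: centre it the same way as the training data,
# then let predict() apply the fitted coefficients.
mpg_by_wt <- function(wt) {
  newd <- data.frame(vs  = 0,
                     am  = factor("1", levels = levels(d$am)),
                     cwt = wt - wt_mean)
  unname(predict(fit, newdata = newd))
}

sub <- subset(d, am == "1")
plot(sub$wt, sub$mpg, xlab = "wt", ylab = "mpg")
g <- curve(mpg_by_wt, from = min(d$wt), to = max(d$wt), add = TRUE, col = "red")

invisible(dev.off())
```

The important detail is that newdata must contain the variable the model was fitted on (cwt here), computed with the mean of the original data rather than the mean of the plotting grid.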
Re: [R] curve() doesn't seem to use the whole range of x? And Error: longer object length is not a multiple of shorter object length
Thank you.

However, I’m having some trouble converting your code to use clka, because the model I was using was:

fit2 <- lm(ruotsi.pist ~ mies + koulu + clka + koulu*clka, data=dta)

> On 25 Sep 2016, at 21:23, Jeff Newmiller wrote:
>
> This illustrates why you need to post a reproducible example. You have a number of confounding factors in your code.
>
> First, "data" is a commonly-used function... avoid using it for variable names.
>
> Second, using the attach function this way leads to confusion... best to forget this function until you start building packages.
>
> Third, clka is linearly dependent on lka, so having them both in the regression is not possible. In this case lm has chosen to ignore clka so that bs["clka"] is NA.
>
> Fourth, curve expects you to give it a function, and instead you have given it a vector.
>
> Fifth, you are plotting versus lka, but attempting to vary clka in the curve call.
>
> There are a number of directions you could go with this to get a working output... below is my version.
>
> dta <- read.table( "http://users.jyu.fi/~slahola/files/glm1_datoja/yoruotsi.txt", header=TRUE )
> fit2 <- lm( ruotsi.pist ~ mies + koulu*lka, data=dta )
> bs <- coef( fit2 )
> rpBylka <- function( lka ) {
>   kouluB <- factor( "B", levels = levels( dta$koulu ) )
>   newdta <- expand.grid( mies=0, koulu=kouluB, lka=lka )
>   predict( fit2, newdata = newdta )
> }
> dtaKouluB <- subset( dta, koulu == "B" )
> varitB <- dtaKouluB$mies
> varitB[ varitB == 0 ] <- 2
> plot( dtaKouluB$lka
>     , dtaKouluB$ruotsi.pist
>     , col=varitB
>     , pch=16
>     , xlab='lka'
>     , ylab='ruotsi.pist'
>     , main='Lukio B'
>     )
> curve( rpBylka, from = min( dta$lka ), max( dta$lka ), add=TRUE, col="red" )
>
> On Sun, 25 Sep 2016, Matti Viljamaa wrote:
>
>>> On 25 Sep 2016, at 19:37, Matti Viljamaa wrote:
>>>
>>> Okay here's a pretty short code to reproduce it:
>>>
>>> data <- read.table("http://users.jyu.fi/~slahola/files/glm1_datoja/yoruotsi.txt", header=TRUE)
>>
>> data$clka <- I(data$lka - mean(data$lka))
>>
>>> attach(data)
>>>
>>> fit2 <- lm(ruotsi.pist ~ mies + koulu + lka + koulu*clka)
>>>
>>> bs <- coef(fit2)
>>>
>>> varitB <- c(data[koulu == 'B',]$mies)
>>> varitB[varitB == 0] = 2
>>> plot(data[data$koulu == 'B',]$lka, data[koulu == 'B',]$ruotsi.pist,
>>>      col=varitB, pch=16, xlab='', ylab='', main='Lukio B')
>>>
>>> curve(bs["(Intercept)"]+bs["mies"]*0+bs["kouluB"]+bs["lka"]*x+bs["kouluB:clka"]*clka,
>>>       from=min(lka), to=max(lka), add=TRUE, col='red')
>>>
>>>> On 25 Sep 2016, at 19:24, Jeff Newmiller wrote:
>>>>
>>>> Go directly to C. Do not pass go, do not collect $200.
>>>>
>>>> You think curve does something, but you are missing what it actually does. Since you don't seem to be learning from reading ?curve or from our responses, you need to give us an example you can learn from.
>>>> --
>>>> Sent from my phone. Please excuse my brevity.
>>>>
>>>> On September 25, 2016 9:04:09 AM PDT, mviljamaa wrote:
>>>>> On 2016-09-25 18:52, Jeff Newmiller wrote:
>>>>>> You seem to be confused about what curve is doing vs. what you are doing.
>>>>>
>>>>> But my x-range in curve()'s parameters from and to should be the entire lka vector, since they are from=min(lka) and to=max(lka). Then why does this not span the entire of lka? Because of duplicate entries or what?
>>>>>
>>>>> It seems like I cannot use curve(), since my x-axis must be exactly lka for the function to plot the y value for every lka value.
>>>>>
>>>>>> A) Compute the points you want to plot and put them into 2 vectors. Then figure out how to plot those vectors. Then (perhaps) consider putting that all into one line of code again.
>>>>>>
>>>>>> B) The predict function is the preferred way to compute points. It may be educational for you to do the computations by hand at first, but in the long run using predict will help you avoid problems getting the equations right in multiple places in your script.
>>>>>>
>>>>>> C) Learn what makes an example reproducible (e.g. [1] or [2]), and ask your questions with reproducible code and data so we can give you concrete responses.
>>>>>>
>>>>>> [1] http://adv-r.had.co.nz/Reproducibility.html
>>>>>> [2] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
>>>>>> --
>>>>>> Sent from my phone. Please excuse my brevity.
>>>>>>
>>>>>> On September 25, 2016 8:36:49 AM PDT, mviljamaa wrote:
>>>>>>> On 2016-09-25 18:30, Duncan Murdoch wrote:
>>>>>>>> On 25/09/2016 9:10 AM, Matti Viljamaa wrote:
>>>>>>>>> Writing:
>>>>>>>>>
>>>>>>>>> bs["(Intercept)"]+bs["mies"]*0+bs["kouluB"]+bs["lka"]*lka+bs["kouluB:clka"]*clka
>>>>>>>>>
>>>>>>>>> i.e. without that being inside curve produces a vector of length 375.
>>>>>>>>>
>>>>>>>>> So now it seem
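The third point in the quoted reply, that a predictor and its centred copy cannot both be estimated, is easy to see on a small stand-in model. The sketch assumes the built-in mtcars data, with wt and cwt standing in for lka and clka.

```r
d <- mtcars
d$cwt <- d$wt - mean(d$wt)   # differs from wt only by a constant

# With an intercept in the model, wt and cwt are exactly collinear, so lm()
# keeps the first of the two and reports NA for the aliased one.
fit <- lm(mpg ~ wt + cwt, data = d)
cf  <- coef(fit)
print(cf)

cwt_is_aliased <- is.na(cf[["cwt"]])
```

Any hand-written prediction formula that multiplies by the aliased coefficient then yields NA throughout, which is one more reason to let predict() do the arithmetic.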
[R] Problem in "cannot allocate vector of size"
Hi R-Users,

I am running raster-to-point code in R, but I get the error message "cannot allocate vector of size 1.7 Gb". One of my friends ran the same code and it works on his computer. I am using Windows 7 64-bit with 16 GB of RAM. When I check memory size and limit in RStudio, I get:

memory.size() : [1] 11205.57
memory.limit() : [1] 16341

I have already searched on Google for a solution, but I could not fix it. Could anyone help me solve this problem?

Thanks,
Sun
[R] [R-pkgs] package Rdice released
The package "Rdice" has just been released on CRAN. It contains a collection of functions to simulate dice rolls and the like. In particular, experiments and exercises can be performed looking at combinations and permutations of values in dice rolls and coin flips, together with the corresponding frequencies of occurrence. When applying each function, the user has to input the number of times (rolls, flips) to toss the dice. Moreover, the package provides functions to generate non-transitive sets of dice (like Efron's) and to check whether a given set of dice is non-transitive with given probability. A vignette with examples and use cases is provided.

Best regards,
Gennaro

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages
[R] src/Makevars ignored ?
Hello,

as far as I understand the generic compilation mechanism for R packages, compilation of C/C++ sources is controlled

1) at system level by the contents of RHOME/etc/Makeconf
2) at user level by the contents of ~/.R/Makevars
3) at package level by the contents of src/Makevars

The problem I have is that src/Makevars is ignored. See the following example.

R is compiled with, and uses, the following CC and CFLAGS definitions:

    bigmess:epactsR/src > R CMD config CC
    gcc -std=gnu99
    bigmess:epactsR/src > R CMD config CFLAGS
    -Wall -g

so building C sources leads to the following:

    bigmess:epactsR/src > R CMD SHLIB index.c
    gcc -std=gnu99 -I/local/gensoft2/adm/lib64/R/include -DNDEBUG -I/usr/local/include -fpic -Wall -g -c index.c -o index.o

This is normal: it uses the definitions from RHOME/etc/Makeconf.

When I set up a ~/.R/Makevars that overrides the CC and CFLAGS definitions:

    bigmess:epactsR/src > cat ~/.R/Makevars
    CC=gcc
    CFLAGS=-O3
    bigmess:epactsR/src > R CMD SHLIB index.c
    gcc -I/local/gensoft2/adm/lib64/R/include -DNDEBUG -I/usr/local/include -fpic -O3 -c index.c -o index.o
    gcc -std=gnu99 -shared -L/usr/local/lib64 -o index.so index.o

OK, CC and CFLAGS are honored and set according to ~/.R/Makevars.

But when I try to use src/Makevars, it is ignored:

    bigmess:epactsR/src > cat ~/.R/Makevars
    cat: /home/edeveaud/.R/Makevars: No such file or directory
    bigmess:epactsR/src > cat ./Makevars
    CC = gcc
    CFLAGS=-O3
    bigmess:epactsR/src > R CMD SHLIB index.c
    gcc -std=gnu99 -I/local/gensoft2/adm/lib64/R/include -DNDEBUG -I/usr/local/include -fpic -Wall -g -c index.c -o index.o

What have I missed, or is there something wrong?

PS: I tested the same behaviour with various versions of R from R/2.15 to R/3.3.

Best regards,
Eric
[R] 32 and 64 bit R
Hello,

I have both 32 and 64 bit versions of R installed. What happens if I open a workspace saved from 64 bit R in the 32 bit version, or conversely? I am fairly careless but have never noticed any problems.
[R] Using lapply in R data table
Dear all,

I have an R data table like this:

DT <- data.table(
  id = rep(c(2, 5, 7), c(3, 2, 2)),
  fini = rep(as.Date(c('2005-04-20', '2006-02-19', '2006-10-08')), c(3, 2, 2)),
  group = rep(c("A", "B", "A"), c(3, 2, 2)) )

I want to construct a new variable "exposure" defined as follows:

1) If "fini" earlier than 2006-01-01 --> "exposure" = 1
2) If "fini" in [2006-01-01, 2006-06-30] --> "exposure" = "2007-01-01" - "fini"
3) If "fini" in [2006-07-01, 2006-12-31] --> "exposure" = 0.5

So the desired output would be the following data table:

   id       fini exposure group
1:  2 2005-04-20     1.00     A
2:  2 2005-04-20     1.00     A
3:  2 2005-04-20     1.00     A
4:  5 2006-02-19     0.87     B
5:  5 2006-02-19     0.87     B
6:  7 2006-10-08     0.50     A
7:  7 2006-10-08     0.50     A

I have tried:

DT <- DT[ , list(id, fini, exposure = 0, group)]
DT.new <- lapply(DT, function(exposure){
  exposure[fini < as.Date("2006-01-01")] <- 1  # 1st case
  exposure[fini >= as.Date("2006-01-01") & fini <= as.Date("2006-06-30")] <- difftime(as.Date("2007-01-01"), fini, units="days")/365.25  # 2nd case
  exposure[fini >= as.Date("2006-07-01") & fini <= as.Date("2006-12-31")] <- 0.5  # 3rd case
  exposure  # return value
})

But I get an error message.

Thanks for any help!!

Frank S.