Re: [R] Assigning categorical values to dates
Thank you all so much for your time and your help! I am truly grateful for
the suggested solutions, but more importantly, for the lessons!

Nate Parsons

On Thu, Jul 22, 2021 at 4:13 AM Eric Berger wrote:

> While the base R solution using 'factor' appears to win based on elegance,
> chapeau to the creativity of the other suggestions.
> For those who are not aware, R 4.1.0 introduced two features: (1) native
> pipe |> and (2) new shorter syntax for anonymous functions.
> Erich's suggestion used the native pipe and Rui went with the spirit and
> added an anonymous function using the new syntax.
>
> Everyone has their preferred coding style. I tend to prefer fewer lines of
> code (if there is no cost in understanding).
> I think the new anonymous function syntax helps in this regard and I see
> no reason to use piping if not necessary.
> So here is a modified, one-line version of Rui's last suggestion (sans the
> amazing observation about handling interactions).
>
> mutate(date_df, cycle = (\(ranks) match(dates, ranks))(sort(unique(dates))))
>
> Eric
>
> On Thu, Jul 22, 2021 at 11:11 AM Uwe Ligges <
> lig...@statistik.tu-dortmund.de> wrote:
>
>> For a data.frame d, I'd simply do
>>
>> d$cycle <- factor(d$dates, labels=1:3)
>>
>> but I have no idea about tibbles.
>>
>> Best,
>> Uwe Ligges

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
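For readers unpacking Eric's one-liner, here is a minimal base R sketch of the same idea without dplyr. It assumes R 4.1.0 or later for the `\(x)` shorthand; the vector `dates` mirrors the thread, while `cycle_lambda` and `cycle_function` are names made up for this illustration.

```r
# Sample dates from the original question (ISO-formatted character strings).
dates <- c(rep("2021-07-04", 2), rep("2021-07-25", 3), rep("2021-07-18", 4))

# R >= 4.1.0 shorthand: \(ranks) is the same as function(ranks).
# The anonymous function is defined and called immediately with the
# sorted unique dates as its argument.
cycle_lambda <- (\(ranks) match(dates, ranks))(sort(unique(dates)))

# Equivalent spelling with the classic function keyword.
cycle_function <- (function(ranks) match(dates, ranks))(sort(unique(dates)))

print(cycle_lambda)
```

Both spellings give each row the rank of its date among the distinct dates, so nothing has to be typed in by hand when the set of dates changes.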
Re: [R] Assigning categorical values to dates
I had no idea that ‘cur_group_id()’ existed!?!! Will definitely try that. Thank you!!!

--
Nathan Parsons, B.SC, M.Sc, G.C.
Ph.D. Candidate, Dept. of Sociology, Portland State University
Adjunct Professor, Dept. of Sociology, Washington State University
Graduate Advocate, American Association of University Professors (OR)
Recent work (https://www.researchgate.net/profile/Nathan_Parsons3/publications)
Schedule an appointment (https://calendly.com/nate-parsons)

> On Wednesday, Jul 21, 2021 at 11:54 PM, Rui Barradas <ruipbarra...@sapo.pt> wrote:
> Hello,
>
> Here are 3 solutions, one of them the coercion to factor one.
> Since you are using tibbles, I assume you also want a dplyr solution.
>
> library(dplyr)
>
> df1 <- tibble(dates = c(rep("2021-07-04", 2),
>                         rep("2021-07-25", 3),
>                         rep("2021-07-18", 4)))
>
> # base R
> as.integer(factor(df1$dates))
> match(df1$dates, unique(sort(df1$dates)))
>
> # dplyr
> df1 %>% group_by(dates) %>% mutate(cycle = cur_group_id())
>
> My favorite is by far the 1st but that's a matter of opinion.
>
> Hope this helps,
>
> Rui Barradas
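One caveat on the solutions Rui lists: sorting the strings works here because the dates are written as year-month-day. A small sketch, assuming a hypothetical month/day/year export (the vector `raw` and its format are invented for illustration), shows why converting with `as.Date()` first is the safer habit.

```r
# Hypothetical dates in month/day/year format; text order differs from
# chronological order because one date falls in the previous year.
raw <- c("07/25/2021", "12/05/2020", "07/04/2021", "07/25/2021")

# Parse to Date so that sorting is chronological.
d <- as.Date(raw, format = "%m/%d/%Y")
cycle_chrono <- match(d, sort(unique(d)))

# Ranking the raw strings instead sorts them as text.
cycle_text <- as.integer(factor(raw))

print(cycle_chrono)
print(cycle_text)
```

With ISO-formatted strings such as "2021-07-04" the two results coincide, which is why the thread's solutions work as posted.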
Re: [R] [EXT] Re: Assigning categorical values to dates
@Tom Okay, yeah. That might actually be an elegant solution. I will mess around with it. Thank you - I’m not in the habit of using factors and am not super familiar with how they automatically sort themselves.

@Andrew Yes. Each month is a different 30,000 row file upon which this task must be performed.

@Bert If you’re not interested in being helpful, why comment? Am I interrupting your clubhouse time? I’m legitimately stumped by this one and reaching out in earnest. “You’ve been told how to do it” Seriously? We all have different backgrounds and knowledge levels with the entire atlas of the wonderful world of R and I neither need nor want your opinion on my corner of it. Don’t be a Hooke. I’m not here to impress or inspire confidence in you - I’m here with a question that has had me spinning my wheels for the better part of a day and need fresh perspectives. Your response certainly inspires no confidence in me as to the nature of your character or your knowledge on the topic.

Best regards all,

--
Nathan Parsons, B.SC, M.Sc, G.C.
Ph.D. Candidate, Dept. of Sociology, Portland State University
Adjunct Professor, Dept. of Sociology, Washington State University
Graduate Advocate, American Association of University Professors (OR)
Recent work (https://www.researchgate.net/profile/Nathan_Parsons3/publications)
Schedule an appointment (https://calendly.com/nate-parsons)

> On Wednesday, Jul 21, 2021 at 9:12 PM, Andrew Robinson <a...@unimelb.edu.au> wrote:
> I wonder if you mean that you want the levels of the factor to reset within
> each month? That is not obvious from your example, but implied by your
> question.
>
> Andrew
>
> --
> Andrew Robinson
> Director, CEBRA and Professor of Biosecurity,
> School/s of BioSciences and Mathematics & Statistics
> University of Melbourne, VIC 3010 Australia
> Tel: (+61) 0403 138 955
> Email: a...@unimelb.edu.au
> Website: https://researchers.ms.unimelb.edu.au/~apro@unimelb/
>
> I acknowledge the Traditional Owners of the land I inhabit, and pay my
> respects to their Elders.
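If Andrew's reading is the intended one, i.e. the numbering should restart within each month, a base R sketch along these lines might do it. The sample dates are invented, and `ave()` is one of several ways to apply a function per group; this is an illustration, not something proposed in the thread.

```r
# Invented sample spanning two months.
dates <- c("2021-07-04", "2021-07-18", "2021-08-01",
           "2021-08-15", "2021-08-15", "2021-07-18")

d <- as.Date(dates)
month <- format(d, "%Y-%m")  # grouping key such as "2021-07"

# Within each month, rank every date among that month's distinct dates.
cycle <- as.integer(
  ave(as.numeric(d), month,
      FUN = function(x) match(x, sort(unique(x))))
)

print(data.frame(dates, month, cycle))
```

When each monthly file is processed separately, as Nate describes, the grouping step is unnecessary and the plain `match()` or `factor()` approach is enough.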
Re: [R] Assigning categorical values to dates
I am not averse to a factor-based solution, but I would still have to manually enter that factor each month, correct? If possible, I’d just like to point R at that column and have it do the work.

--
Nathan Parsons, B.SC, M.Sc, G.C.
Ph.D. Candidate, Dept. of Sociology, Portland State University
Adjunct Professor, Dept. of Sociology, Washington State University
Graduate Advocate, American Association of University Professors (OR)
Recent work (https://www.researchgate.net/profile/Nathan_Parsons3/publications)
Schedule an appointment (https://calendly.com/nate-parsons)

> On Wednesday, Jul 21, 2021 at 8:30 PM, Tom Woolman <twool...@ontargettek.com> wrote:
>
> Couldn't you convert the date columns to character type data in a data
> frame, and then convert those strings to factors in a 2nd step?
>
> The only downside I think to treating dates as factor levels is that
> you might have an awful lot of factors if you have a large enough
> dataset.
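On the question of entering the factor by hand each month: `factor()` derives its levels from whatever values are present, so nothing needs to be typed in. A minimal sketch, where `add_cycle()` and the two sample data frames are hypothetical names invented for this example:

```r
# Hypothetical helper: adds a 'cycle' column whose values are the rank of
# each date among the distinct dates found in the column.
add_cycle <- function(df, col = "dates") {
  # factor() sorts the unique values into levels; as.integer() returns
  # the level index for every row.
  df$cycle <- as.integer(factor(df[[col]]))
  df
}

# A month with three distinct dates and a month with only one.
july <- data.frame(dates = c(rep("2021-07-04", 2), rep("2021-07-25", 3),
                             rep("2021-07-18", 4)))
august <- data.frame(dates = rep("2021-08-08", 3))

july <- add_cycle(july)
august <- add_cycle(august)
print(july)
```

The same function call handles a month with one date or with four, because the levels are recomputed from the data on every call. This relies on the dates being ISO-formatted strings (or Date objects) so that sorted order is chronological.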
[R] Assigning categorical values to dates
Hi all,

If I have a tibble as follows:

tibble(dates = c(rep("2021-07-04", 2), rep("2021-07-25", 3),
                 rep("2021-07-18", 4)))

how in the world do I add a column that evaluates each of those dates and
assigns it a categorical value such that

dates       cycle

2021-07-04  1
2021-07-04  1
2021-07-25  3
2021-07-25  3
2021-07-25  3
2021-07-18  2
2021-07-18  2
2021-07-18  2
2021-07-18  2

Not to further complicate matters, but some months I may only have one
date, and some months I will have 4 dates - so that's not a fixed quantity.
We've literally been doing this by hand at my job and I'd like to automate
it.

Thanks in advance!

Nate Parsons
[R] R versions > 4.0.2 fail to start in Windows 64-bit
I have a 64-bit Windows machine and I've installed R versions 4.0.0 through 4.0.5 and only versions 4.0.2 and below will successfully start. This morning I installed 4.0.5, and when I start R, the R-gui starts for less than a second and then disappears.

I tried to open R using RStudio and I got the following messages: `The R session failed to start.` and `The R session process exited with code -1073740791`

The log for this error is:

```
ERROR system error 10053 (An established connection was aborted by the software in your host machine) [request-uri: /events/get_events];
OCCURRED AT void __thiscall rstudio::session::HttpConnectionImpl::sendResponse(const class rstudio::core::http::Response &) src/cpp/session/http/SessionWin32HttpConnectionListener.cpp:113;
LOGGED FROM: void __thiscall rstudio::session::HttpConnectionImpl::sendResponse(const class rstudio::core::http::Response &) src/cpp/session/http/SessionWin32HttpConnectionListener.cpp:118
```

I'm not sure what is going on, but this same thing happens with R 4.0.3 and R 4.0.4, but not with R 4.0.2. I would like to use the most recent version of R if possible, but for now I am stuck with 4.0.2.
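One small step that may help when searching for this symptom: the negative exit code is a signed 32-bit view of a Windows status code, and the hexadecimal form is what most references list. The arithmetic can be sketched in R; `to_status_hex()` is a hypothetical helper written for this note, not part of R or RStudio.

```r
# Hypothetical helper: convert a signed 32-bit exit code to the unsigned
# hexadecimal form used in Windows status code listings.
to_status_hex <- function(code) {
  # Negative values wrap around 2^32 when read as unsigned.
  u <- if (code < 0) code + 2^32 else code
  # Format the high and low 16 bits separately, because the unsigned
  # value does not fit in R's 32-bit integer type.
  sprintf("0x%04X%04X", u %/% 65536, u %% 65536)
}

print(to_status_hex(-1073740791))
```

This gives 0xC0000409, which Windows headers list as STATUS_STACK_BUFFER_OVERRUN. That narrows the symptom to a crash inside the R process itself, but it does not by itself say which component is responsible.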
[R] R: estimating genotyping error rate
Hello,

I have SNP data from genotyping. I would like to estimate the error rate between replicated samples using R. How can I proceed?

Thanks
Meriam
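One simple way to approach this, offered only as a sketch: treat the proportion of discordant calls between two replicates of the same sample as the error estimate. The vectors below are invented, and the 0/1/2 coding with `NA` for missing calls is an assumption about how the data are stored.

```r
# Invented genotype calls for two replicates of one sample
# (0/1/2 coding, NA = missing call).
rep1 <- c(0, 1, 2, NA, 2, 0, 1, 1)
rep2 <- c(0, 1, 1, 2, 2, 0, NA, 1)

# Compare only the SNPs that were called in both replicates.
both_called <- !is.na(rep1) & !is.na(rep2)
n_compared <- sum(both_called)

# Discordance rate: share of compared SNPs whose calls differ.
error_rate <- mean(rep1[both_called] != rep2[both_called])

cat("SNPs compared:", n_compared, "\n")
cat("Discordance rate:", round(error_rate, 3), "\n")
```

With many replicate pairs the same calculation can be repeated per pair and averaged. Whether discordance between replicates is an adequate definition of genotyping error depends on the platform and the study design.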
Re: [R] R help: fviz_nbclust’ is not available (for R version 3.5.2)
Thanks for your valuable clarifications.
I tried all the steps again but the problem remains.
In fact, "fviz_nbclust" is a function inside the package "factoextra".
I run each step very carefully but the problem remains... It doesn't make sense because I have installed factoextra.
This warning appears: could not find function "fviz_nbclust"

On Wed, Jan 16, 2019 at 1:22 PM Jeff Newmiller wrote:
>
> Concept 1: You don't install functions... you install packages that have
> functions in them. There is a function fviz_nbclust in factoextra.
>
> Concept 2: Once a package is installed, you do NOT have to install it again,
> e.g. every time you want to do that analysis. Making the installation part of
> your script is not advised.
>
> Concept 3: Typically we do use the library function with a package name at
> the beginning of every session where we want to use functions from that
> package. However, that is optional... you could also just invoke the function
> directly using factoextra::fviz_nbclust(...blahblah...). Having the library
> function shortens this and if the package is not installed it provides a
> clear error message that can be a reminder to the user to install the package.
>
> Execute your code line by line and solve the first error you encounter by
> examining the error message and reviewing what that line of code is designed
> to do.
>
> --
> Sent from my phone. Please excuse my brevity.

--
Meriam Nefzaoui
MSc. in Plant Breeding and Genetics
Universidade Federal Rural de Pernambuco (UFRPE) - Recife, Brazil
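Jeff's three concepts can be turned into a quick diagnostic. The sketch below uses a hypothetical helper, `has_function()`, to check whether a package is installed and whether it actually contains a given function; the base package `stats` stands in for `factoextra` so the example runs on any installation.

```r
# Hypothetical helper: TRUE if package 'pkg' is installed and defines 'fun'.
has_function <- function(pkg, fun) {
  # requireNamespace() loads the package without attaching it and returns
  # FALSE (instead of raising an error) when it is not installed.
  requireNamespace(pkg, quietly = TRUE) &&
    exists(fun, envir = asNamespace(pkg), inherits = FALSE)
}

print(has_function("stats", "kmeans"))        # a function that lives in stats
print(has_function("stats", "fviz_nbclust"))  # not a stats function

# Applied to the thread's case (left commented out because it needs
# factoextra and the poster's df4 object):
# if (has_function("factoextra", "fviz_nbclust")) {
#   factoextra::fviz_nbclust(df4, kmeans, method = "wss")
# }
```

If the check returns FALSE for `factoextra`, the earlier `install.packages()` call probably did not complete, and its console output is the place to look. If it returns TRUE, a "could not find function" message usually means `library(factoextra)` was not run in the current session.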
[R] R help: fviz_nbclust’ is not available (for R version 3.5.2)
Hello,
I'm struggling to install a function called "fviz_nbclust".

My code is the following:

pkgs <- c("factoextra", "NbClust")
install.packages(pkgs)
library(factoextra)
library(NbClust)
# Standardize the data
load("df4.rda")
library(FunCluster)

install.packages("fviz_nbclust")
#fviz_nbclust(df4, FUNcluster, method = c("silhouette", "wss", "gap_stat"))

Installing package into ‘C:/Users/DELL/Documents/R/win-library/3.5’
(as ‘lib’ is unspecified)
Warning in install.packages :
  package ‘fviz_nbclust’ is not available (for R version 3.5.2)

Best,
Meriam
Re: [R] Fwd: Overlapping legend in a circular dendrogram
Yes I know. Sorry if I reposted this but it's simply because I've received an email mentioning that the file was too big that's why I modified my question and reposted it.
I don't want to oblige anyone to respond. I really thought the issue was my file (too big so nobody received it).

Thanks for your understanding,
Best
Myriam

On Fri, Jan 11, 2019 at 3:03 PM Bert Gunter wrote:
>
> This is the 3rd time you've posted this. Please stop re-posting!
>
> Your question is specialized and involved, and you have failed to provide a
> reproducible example/data. We are not obliged to respond.
>
> You may do better contacting the maintainer, found by ?maintainer, as
> recommended by the posting guide for specialized queries such as this.
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

--
Meriam Nefzaoui
MSc. in Plant Breeding and Genetics
Universidade Federal Rural de Pernambuco (UFRPE) - Recife, Brazil
[R] Fwd: Overlapping legend in a circular dendrogram
Hi,

I'm facing some issues when generating a circular dendrogram.
The labels on the left, which are my countries, are overlapping with the
circular dendrogram (middle). The same happens with the labels (regions)
located on the right.
I run the following code and I'd like to know what should be changed
in my code in order to avoid that.

load("hc1.rda")
library(cluster)
library(ape)
library(dendextend)
library(circlize)
library(RColorBrewer)

labels = hc1$labels
n = length(labels)
dend = as.dendrogram(hc1)
markcountry=as.data.frame(markcountry1)
#Country colors
groupCodes=as.character(as.factor(markcountry[,2]))
colorCodes=rainbow(length(unique(groupCodes))) #c("blue","red")
names(colorCodes)=unique(groupCodes)
labels_colors(dend) <- colorCodes[groupCodes][order.dendrogram(dend)]

#Region colors
groupCodesR=as.character(as.factor(markcountry[,3]))
colorCodesR=rainbow(length(unique(groupCodesR))) #c("blue","red")
names(colorCodesR)=unique(groupCodesR)

circos.par(cell.padding = c(0, 0, 0, 0))
circos.initialize(factors = "foo", xlim = c(1, n)) # only one sector
max_height = attr(dend, "height") # maximum height of the trees

#Region graphics
circos.trackPlotRegion(ylim = c(0, 1.5), panel.fun = function(x, y) {
  circos.rect(1:361-0.5, rep(0.5, 361), 1:361-0.1, rep(0.8,361),
              col = colorCodesR[groupCodesR][order.dendrogram(dend)],
              border = NA)
}, bg.border = NA)

#labels graphics
circos.trackPlotRegion(ylim = c(0, 0.5), bg.border = NA,
                       panel.fun = function(x, y) {
  circos.text(1:361-0.5, rep(0.5,361), labels(dend), adj = c(0, 0.5),
              facing = "clockwise", niceFacing = TRUE,
              col = labels_colors(dend), cex = 0.45)
})
dend = color_branches(dend, k = 6, col = 1:6)

#Dendrogram graphics
circos.trackPlotRegion(ylim = c(0, max_height), bg.border = NA,
                       track.height = 0.4, panel.fun = function(x, y) {
  circos.dendrogram(dend, max_height = 0.55)
})
legend("left",names(colorCodes),col=colorCodes,text.col=colorCodes,bty="n",pch=15,cex=0.8)
legend("right",names(colorCodesR),col=colorCodesR,text.col=colorCodesR,bty="n",pch=15,cex=0.35)

Cheers,
Myriam
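One thing that may be worth trying for the overlap is to reserve wide side margins and push the legends out of the plot region with a negative `inset`. The sketch below uses only base graphics, with a plain circle standing in for the dendrogram; the colour vectors, margin sizes, and inset values are invented and would need tuning for the real figure. It has not been tested with circlize, which also documents a `canvas.xlim` setting in `circos.par()` that might serve a similar purpose.

```r
# Draw to a null PDF device so the sketch runs without opening a window.
pdf(NULL)

# Wide left/right margins; xpd = NA allows drawing outside the plot region.
old_par <- par(mar = c(1, 10, 1, 10), xpd = NA)

plot(0, 0, type = "n", axes = FALSE, xlab = "", ylab = "",
     xlim = c(-1, 1), ylim = c(-1, 1), asp = 1)
symbols(0, 0, circles = 1, inches = FALSE, add = TRUE)  # stand-in for the dendrogram
usr <- par("usr")  # plot region limits in user coordinates

# Invented colour keys mirroring the thread's colorCodes / colorCodesR.
country_cols <- c(CountryA = "red", CountryB = "blue")
region_cols <- c(RegionX = "darkgreen", RegionY = "orange")

# A negative inset moves each legend outward, into the margin.
left_box <- legend("left", names(country_cols), col = country_cols,
                   text.col = country_cols, bty = "n", pch = 15,
                   cex = 0.8, inset = c(-0.3, 0))
right_box <- legend("right", names(region_cols), col = region_cols,
                    text.col = region_cols, bty = "n", pch = 15,
                    cex = 0.8, inset = c(-0.3, 0))

par(old_par)
invisible(dev.off())
```

The value returned by `legend()` contains the legend rectangle, which makes it possible to check numerically whether a legend still reaches into the plot region before fine-tuning the inset.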
[R] Overlapping legend in a circular dendrogram
Dear all,

I run the following code and I get this graphic (image attached).
What should I change in my code in order to adjust the overlapping objects?

load("hc1.rda")
library(cluster)
library(ape)
library(dendextend)
library(circlize)
library(RColorBrewer)

labels = hc1$labels
n = length(labels)
dend = as.dendrogram(hc1)
markcountry=as.data.frame(markcountry1)
#Country colors
groupCodes=as.character(as.factor(markcountry[,2]))
colorCodes=rainbow(length(unique(groupCodes))) #c("blue","red")
names(colorCodes)=unique(groupCodes)
labels_colors(dend) <- colorCodes[groupCodes][order.dendrogram(dend)]

#Region colors
groupCodesR=as.character(as.factor(markcountry[,3]))
colorCodesR=rainbow(length(unique(groupCodesR))) #c("blue","red")
names(colorCodesR)=unique(groupCodesR)

circos.par(cell.padding = c(0, 0, 0, 0))
circos.initialize(factors = "foo", xlim = c(1, n)) # only one sector
max_height = attr(dend, "height") # maximum height of the trees

#Region graphics
circos.trackPlotRegion(ylim = c(0, 1.5), panel.fun = function(x, y) {
  circos.rect(1:361-0.5, rep(0.5, 361), 1:361-0.1, rep(0.8,361),
              col = colorCodesR[groupCodesR][order.dendrogram(dend)],
              border = NA)
}, bg.border = NA)

#labels graphics
circos.trackPlotRegion(ylim = c(0, 0.5), bg.border = NA,
                       panel.fun = function(x, y) {
  circos.text(1:361-0.5, rep(0.5,361), labels(dend), adj = c(0, 0.5),
              facing = "clockwise", niceFacing = TRUE,
              col = labels_colors(dend), cex = 0.45)
})
dend = color_branches(dend, k = 6, col = 1:6)

#Dendrogram graphics
circos.trackPlotRegion(ylim = c(0, max_height), bg.border = NA,
                       track.height = 0.4, panel.fun = function(x, y) {
  circos.dendrogram(dend, max_height = 0.55)
})
legend("left",names(colorCodes),col=colorCodes,text.col=colorCodes,bty="n",pch=15,cex=0.8)
legend("right",names(colorCodesR),col=colorCodesR,text.col=colorCodesR,bty="n",pch=15,cex=0.35)

Thanks,
Meriam
[R] R help: circular dendrogram
Dear all,

I generated a circular dendrogram with R (see attached).
I have a total of 360 landraces. What I want to do next is generate a different color for each cluster and also generate colors to show the country/region.
I don't know if it's also possible to put a code number (associated with each landrace) in front of each ramification. I want to have an explicit dendrogram.
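A possible starting point for the per-cluster colours, sketched in base R only: cut the tree into k clusters with `cutree()` and map each leaf, in dendrogram order, to a colour. The data, the landrace codes `L1` to `L10`, the value of k, and the palette are all invented for illustration; packages such as dendextend and circlize can then consume a colour vector like `label_cols`.

```r
set.seed(1)

# Invented data: 10 landraces with short code labels, 4 traits each.
x <- matrix(rnorm(40), nrow = 10,
            dimnames = list(paste0("L", 1:10), NULL))
hc <- hclust(dist(x))

# Cut the tree into k clusters; the result is named by landrace code.
k <- 3
cluster <- cutree(hc, k = k)

# One colour per cluster, looked up for each leaf in plotting order.
palette_k <- c("red", "blue", "darkgreen")
leaf_order <- hc$labels[hc$order]
label_cols <- palette_k[cluster[leaf_order]]

print(data.frame(code = leaf_order,
                 cluster = unname(cluster[leaf_order]),
                 colour = label_cols))
```

Because the leaf labels are the landrace codes themselves, the same lookup gives the code number to print next to each branch. Country or region colours can be built the same way from a lookup vector named by landrace code.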
Re: [R] Warning message: NAs introduced by coercion
Yes, sorry. I attached the file once again.
Well, still getting the same warning.

> class(genod) <- "numeric"
Warning message:
In class(genod) <- "numeric" : NAs introduced by coercion
> class(genod)
[1] "matrix"

Then, I run the following code and it gives this:

> filn <-"simTunesian.gds"
> snpgdsCreateGeno(filn, genmat = genod,
+                  sample.id = sample.id, snp.id = snp.id,
+                  snp.chromosome = snp.chromosome,
+                  snp.position = snp.position,
+                  snp.allele = snp.allele, snpfirstdim=TRUE)
> # calculate similarity matrix
> # Open the GDS file
> (genofile <- snpgdsOpen(filn))
File: C:\Users\DELL\Documents\TEST\simTunesian.gds (1.4M)
+    [  ] *
|--+ sample.id   { Str8 363 ZIP_ra(42.5%), 755B }
|--+ snp.id   { Int32 15752 ZIP_ra(35.1%), 21.6K }
|--+ snp.position   { Int32 15752 ZIP_ra(34.7%), 21.3K }
|--+ snp.chromosome   { Float64 15752 ZIP_ra(0.18%), 230B }
|--+ snp.allele   { Str8 15752 ZIP_ra(0.16%), 108B }
\--+ genotype   { Bit2 15752x363, 1.4M } *
> ibs <- snpgdsIBS(genofile, remove.monosnp = FALSE, num.thread=1)
Identity-By-State (IBS) analysis on genotypes:
Excluding 0 SNP on non-autosomes
Working space: 363 samples, 15,752 SNPs
    using 1 (CPU) core
IBS:    the sum of all selected genotypes (0,1,2) = 3658952
Tue Jan 08 15:38:00 2019    (internal increment: 42880)
[==] 100%, completed in 0s
Tue Jan 08 15:38:00 2019    Done.
> # maximum similarity value
> max(ibs$ibs)
[1] NaN
> # minimum similarity value
> min(ibs$ibs)
[1] NaN

As you can see, I can't continue my analysis (heat map plot, clustering with hclust) because values are NaN.

On Tue, Jan 8, 2019 at 2:01 PM David L Carlson wrote:
>
> Your attached file is not a .csv file since the fields are not separated by
> commas (just rename the mydata.csv to mydata.txt).
>
> The command "genod2 <- as.matrix(genod)" created a character matrix from the
> data frame genod. When you try to force genod2 to numeric, the marker column
> becomes NAs which is probably not what you want.
>
> The error message is because you passed genod (a data frame) to the
> snpgdsCreateGeno() function not genod2 (the matrix you created from genod).
>
> David L. Carlson
> Department of Anthropology
> Texas A&M University
Re: [R] Warning message: NAs introduced by coercion
Here's a portion of what my data looks like (text file format attached). When running in R, it gives me this: > df4 <- read.csv(file = "mydata.csv", header = TRUE) > require(SNPRelate) > library(gdsfmt) > myd <- df4 > myd <- df4 > names(myd)[-1] [1] "marker" "X88""X9" "X17""X25" > myd[,1] [1] 3 4 5 6 8 10 # the data must be 0,1,2 with 3 as missing so you have r > sample.id <- names(myd)[-1] > snp.id <- myd[,1] > snp.position <- 1:length(snp.id) # not needed for ibs > snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs > snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs # genotype data must have - in 3 > genod <- myd[,-1] > genod[is.na(genod)] <- 3 > genod[genod=="0"] <- 0 > genod[genod=="1"] <- 2 > genod2 <- as.matrix(genod) > head(genod2) marker X88 X9 X17 X25 [1,] "100023173|F|0-47:G>A-47:G>A" "0""3""3" "3" [2,] "1043336|F|0-7:A>G-7:A>G" "2""0""3" "0" [3,] "1212218|F|0-49:A>G-49:A>G" "0""0""0" "0" [4,] "1019554|F|0-14:T>C-14:T>C" "0" "0""3" "0" [5,] "100024550|F|0-16:G>A-16:G>A" "3""3""3" "3" [6,] "1106702|F|0-8:C>A-8:C>A" "0" "0" "0" "0" > class(genod2) <- "numeric" Warning message: In class(genod2) <- "numeric" : NAs introduced by coercion > head(genod2) marker X88 X9 X17 X25 [1,] NA 0 3 3 3 [2,] NA 2 0 3 0 [3,] NA 0 0 0 0 [4,] NA 0 0 3 0 [5,] NA 3 3 3 3 [6,] NA 0 0 0 0 > class(genod2) <- "numeric" > class(genod2) [1] "matrix" # read data > filn <-"simTunesian.gds" > snpgdsCreateGeno(filn, genmat = genod, + sample.id = sample.id, snp.id = snp.id, + snp.chromosome = snp.chromosome, + snp.position = snp.position, + snp.allele = snp.allele, snpfirstdim=TRUE) Error in snpgdsCreateGeno(filn, genmat = genod, sample.id = sample.id, : is.matrix(genmat) is not TRUE Can't find a solution to my problem...my guess is that the problem comes from converting the column 'marker' factor to numerical. 
Best, Meriam On Tue, Jan 8, 2019 at 11:28 AM Michael Dewey wrote: > > Dear Meriam > > Your csv file did not come through as attachments are stripped unless of > certain types and you post is very hard to read since you are posting in > HTML. Try renaming the file to .txt and set your mailer to send > plain text then people may be able to help you better. > > Michael > > On 08/01/2019 15:35, N Meriam wrote: > > I see... > > Here's a portion of what my data looks like (csv file attached). > > I run again and here are the results: > > > > df4 <- read.csv(file = "mydata.csv", header = TRUE) > > > >> require(SNPRelate)> library(gdsfmt)> myd <- df4> myd <- df4> > >> names(myd)[-1][1] "marker" "X88""X9" "X17""X25" > > > >> myd[,1][1] 3 4 5 6 8 10 > > > > > >> # the data must be 0,1,2 with 3 as missing so you have r> sample.id <- > >> names(myd)[-1]> snp.id <- myd[,1]> snp.position <- 1:length(snp.id) # not > >> needed for ibs> snp.chromosome <- rep(1, each=length(snp.id)) # not needed > >> for ibs> snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs> # > >> genotype data must have - in 3> genod <- myd[,-1]> genod[is.na(genod)] <- > >> 3> genod[genod=="0"] <- 0> genod[genod=="1"] <- 2 > > > >> genod2 <- as.matrix(genod)> head(genod2) marker > >> X88 X9 X17 X25 > > [1,] "100023173|F|0-47:G>A-47:G>A" "0" "3" "3" "3" > > [2,] "1043336|F|0-7:A>G-7:A>G" "2" "0" "3" "0" > > [3,] "1212218|F|0-49:A>G-49:A>G" "0" "0" "0" "0" > > [4,] "
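A minimal sketch of one way past the `is.matrix(genmat) is not TRUE` error, assuming (from the `names(myd)[-1]` output above) that the first two columns of the file are a row index and the `marker` label, and that genotype calls start at the third column. The toy data frame and the name `genmat` are invented for illustration; only the column names come from the post.

```r
# Toy stand-in for the posted data (values are made up; column names follow the post).
myd <- data.frame(
  X      = c(3, 4, 5),
  marker = c("100023173|F|0-47:G>A-47:G>A",
             "1043336|F|0-7:A>G-7:A>G",
             "1212218|F|0-49:A>G-49:A>G"),
  X88    = c(0, 1, 0),
  X9     = c(NA, 0, 0),
  stringsAsFactors = FALSE
)

# Assumption: columns 1 and 2 are not genotypes, so drop both.
sample.id <- names(myd)[-(1:2)]
genod     <- myd[, -(1:2)]

genod[is.na(genod)] <- 3   # 3 = missing, as in the original script
genod[genod == 1]   <- 2   # recode as in the original script

# With only numeric columns left, as.matrix() returns a numeric matrix,
# so no class<- coercion (and no NA warning) is needed.
genmat <- as.matrix(genod)
is.matrix(genmat)
is.numeric(genmat)
```

The matrix (`genmat` here), not the data frame `genod`, is what would then go into the `genmat =` argument of `snpgdsCreateGeno()`.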
Re: [R] Warning message: NAs introduced by coercion
I see... Here's a portion of what my data looks like (csv file attached). I run again and here are the results: df4 <- read.csv(file = "mydata.csv", header = TRUE) > require(SNPRelate)> library(gdsfmt)> myd <- df4> myd <- df4> > names(myd)[-1][1] "marker" "X88""X9" "X17""X25" > myd[,1][1] 3 4 5 6 8 10 > # the data must be 0,1,2 with 3 as missing so you have r> sample.id <- > names(myd)[-1]> snp.id <- myd[,1]> snp.position <- 1:length(snp.id) # not > needed for ibs> snp.chromosome <- rep(1, each=length(snp.id)) # not needed > for ibs> snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs> # > genotype data must have - in 3> genod <- myd[,-1]> genod[is.na(genod)] <- 3> > genod[genod=="0"] <- 0> genod[genod=="1"] <- 2 > genod2 <- as.matrix(genod)> head(genod2) marker > X88 X9 X17 X25 [1,] "100023173|F|0-47:G>A-47:G>A" "0" "3" "3" "3" [2,] "1043336|F|0-7:A>G-7:A>G" "2" "0" "3" "0" [3,] "1212218|F|0-49:A>G-49:A>G" "0" "0" "0" "0" [4,] "1019554|F|0-14:T>C-14:T>C" "0" "0" "3" "0" [5,] "100024550|F|0-16:G>A-16:G>A" "3" "3" "3" "3" [6,] "1106702|F|0-8:C>A-8:C>A" "0" "0" "0" "0" > class(genod2) <- "numeric"Warning message:In class(genod2) <- "numeric" : NAs > introduced by coercion> head(genod2) marker X88 X9 X17 X25 [1,] NA 0 3 3 3 [2,] NA 2 0 3 0 [3,] NA 0 0 0 0 [4,] NA 0 0 3 0 [5,] NA 3 3 3 3 [6,] NA 0 0 0 0 > class(genod2) <- "numeric"> class(genod2)[1] "matrix" > # read data > filn <-"simTunesian.gds"> snpgdsCreateGeno(filn, genmat = > genod,+ sample.id = sample.id, snp.id = snp.id,+ > snp.chromosome = snp.chromosome,+ snp.position = > snp.position,+ snp.allele = snp.allele, > snpfirstdim=TRUE)Error in snpgdsCreateGeno(filn, genmat = genod, sample.id = > sample.id, : is.matrix(genmat) is not TRUE Thanks, Meriam On Tue, Jan 8, 2019 at 9:02 AM PIKAL Petr wrote: > Hi > > see in line > > > -Original Message- > > From: R-help On Behalf Of N Meriam > > Sent: Tuesday, January 8, 2019 3:08 PM > > To: r-help@r-project.org > > Subject: [R] Warning message: NAs introduced 
by coercion > > > > Dear all, > > > > I have a .csv file called df4. (15752 obs. of 264 variables). > > I apply this code but couldn't continue further other analyses, a warning > > message keeps coming up. Then, I want to determine max and min > > similarity values, > > heat map plot, cluster...etc > > > > > require(SNPRelate) > > > library(gdsfmt) > > > myd <- read.csv(file = "df4.csv", header = TRUE) > > > names(myd)[-1] > > myd[,1] > > > myd[1:10, 1:10] > > # the data must be 0,1,2 with 3 as missing so you have r > > > sample.id <- names(myd)[-1] > > > snp.id <- myd[,1] > > > snp.position <- 1:length(snp.id) # not needed for ibs > > > snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs > > > snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs > > # genotype data must have - in 3 > > > genod <- myd[,-1] > > > genod[is.na(genod)] <- 3 > > > genod[genod=="0"] <- 0 > > > genod[genod=="1"] <- 2 > > > genod[1:10,1:10] > > > genod <- as.matrix(genod) > > matrix can have only one type of data so you probaly changed it to > character by such construction. > > > > class(genod) <- "numeric" > > This tries to change all "numeric" values to numbers but if it cannot it > sets it to NA. > > something like > > > head(iris) > Sepal.Length Sepal.Width Petal.Length Petal.Width Species > 1 5.1 3.5 1.4 0.2 setosa > 2 4.9 3.0 1.4 0.2 setosa > 3 4.7 3.2 1.3 0.2 setosa > 4 4.6 3.1 1.5 0.2 setosa > 5 5.0 3.6 1.4 0.2 setosa > 6 5.4 3.9 1.7 0.4 setosa
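The point about a matrix holding only one type can be reproduced on a tiny example. This is a sketch with invented data (`df`, `m`, `m_num` are illustrative names, not objects from the thread).

```r
# A data frame with one character column and one numeric column.
df <- data.frame(marker = c("m1", "m2"), X88 = c(0, 2),
                 stringsAsFactors = FALSE)

# as.matrix() must pick a single type, so everything becomes character.
m <- as.matrix(df)
typeof(m)

# Forcing the character matrix to numeric turns the non-numeric strings into NA
# (the warning is suppressed here only to keep the example quiet).
m_num <- m
suppressWarnings(class(m_num) <- "numeric")
m_num
```

The `marker` column comes back as all `NA`, which is the source of the "NAs introduced by coercion" warning.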
[R] Warning message: NAs introduced by coercion
Dear all,

I have a .csv file called df4 (15752 obs. of 264 variables). I apply the code below but cannot continue with further analyses because a warning message keeps coming up. After that, I want to determine the max and min similarity values, draw a heat map, cluster, etc.

> require(SNPRelate)
> library(gdsfmt)
> myd <- read.csv(file = "df4.csv", header = TRUE)
> names(myd)[-1]
> myd[,1]
> myd[1:10, 1:10]
# the data must be 0,1,2 with 3 as missing so you have r
> sample.id <- names(myd)[-1]
> snp.id <- myd[,1]
> snp.position <- 1:length(snp.id) # not needed for ibs
> snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
> snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
# genotype data must have - in 3
> genod <- myd[,-1]
> genod[is.na(genod)] <- 3
> genod[genod=="0"] <- 0
> genod[genod=="1"] <- 2
> genod[1:10,1:10]
> genod <- as.matrix(genod)
> class(genod) <- "numeric"
Warning message:
In class(genod) <- "numeric" : NAs introduced by coercion

Maybe I could illustrate with more details so I can be more specific? Please let me know. I would appreciate your help.

Thanks,
Meriam
Re: [R] mysterious rounding digits output
On Thu, May 31, 2018 at 03:30:42PM +1000, Jim Lemon wrote:
> Because there are no values in column ddd less than 1.

Whoa! Thank you for pointing that out.
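A small sketch of why that matters, using two values each from columns ddd and aaa of the posted data. Each column is formatted on its own, and with `digits = 3` R uses just enough decimal places that the element needing the most still shows 3 significant digits. The vector names are only for this example.

```r
# Two values from column ddd: everything is >= 1, so 2 decimals give 3 significant digits.
ddd <- c(5.77538400186667, 1.20789851993367)
format(ddd, digits = 3)   # "5.78" "1.21"

# Two values from column aaa: 0.2369 needs 3 decimals to show 3 significant digits,
# so the whole column is printed with 3 decimals.
aaa <- c(1.39633732316667, 0.23689627723)
format(aaa, digits = 3)   # "1.396" "0.237"
```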
[R] mysterious rounding digits output
R version 3.5.0 (2018-04-23) -- "Joy in Playing"
Platform: x86_64-pc-linux-gnu (64-bit)

options(digits=3)

itemInfo <- structure(list(
  "aaa" = c(1.39633732316667, 1.32598263816667, 1.1165832407,
    1.23651072616667, 1.0536867998, 1.0310073738, 0.9630728395,
    0.7483865045, 0.62008664617, 0.5411017985, 0.49639760783,
    0.45952804467, 0.42787704783, 0.40208597967, 0.28316118123,
    0.23689627723),
  "bbb" = c(6.22533860696667, 5.229736804, 4.94816041772833,
    4.17020503255333, 4.00453781427167, 3.56058007398333, 3.0125202404,
    2.2378235873, 2.14863910661167, 1.90460903044777, 1.62001089796667,
    1.56341257968151, 1.23618558850667, 1.10086688908262,
    0.661981500639833, 0.47397754310745),
  "ccc" = c(0.5165911355, 0.46220470767, NA, 0.21963592433,
    0.44186378083, 0.36150286583, NA, 0.59613794667, NA, 0.22698477157,
    NA, 0.36092266158, 0.2145347068, 0.28775624948, NA, NA),
  "ddd" = c(5.77538400186667, 5.115877113, NA, 4.71294520316667,
    4.25952652129833, 3.68879921863167, NA, 2.01942456211145, NA,
    2.02032557108, NA, 1.3818108759571, 1.80436759778167,
    1.20789851993367, NA, NA),
  "eee" = c(2.4972534717, -2.67340172316667, NA, 5.6419520633,
    2.0763355523, 2.548949539, NA, 0.465537272243167, NA,
    2.34255027516667, NA, 0.5400824922975, 2.1935000655, 0.89000797687,
    NA, NA)),
  row.names = c("skill", "predict", "waiting", "complex", "novelty",
    "creative", "evaluated", "body", "control", "stakes", "spont",
    "chatter", "present", "reward", "feedback", "goal"),
  class = "data.frame")

itemInfo # examine column ddd

When I try this, column ddd has 1 fewer digits than expected. See the attached screenshot. Why don't all the columns have the same number of digits displayed?

--
Joshua N. Pritikin, Ph.D.
Virginia Institute for Psychiatric and Behavioral Genetics
Virginia Commonwealth University
PO Box 980126
800 E Leigh St, Biotech One, Suite 1-133
Richmond, VA 23219
http://exuberant-island.surge.sh
Re: [R] if/else help
Thanks very much for your detailed reply to my post. Very helpful/useful tool(s) you've provided me.

Best wishes,
B.

From: William Dunlap [mailto:wdun...@tibco.com]
Sent: Wednesday, September 21, 2016 10:48 AM
To: Crombie, Burnette N
Cc: r-help@r-project.org
Subject: Re: [R] if/else help

If you write your code as functions you can avoid the nasty 'if(exists("x"))x<-...' business by writing default values for arguments to your function. They will be computed only when they are used. E.g.,

analyzeData <- function(a=0, b=0, c=0, d="x", r4 = data.frame(a, b, c, d)) {
    summary(r4)
}

> analyzeData(c=101:102)
       a           b           c            d
 Min.   :0   Min.   :0   Min.   :101.0   x:2
 1st Qu.:0   1st Qu.:0   1st Qu.:101.2
 Median :0   Median :0   Median :101.5
 Mean   :0   Mean   :0   Mean   :101.5
 3rd Qu.:0   3rd Qu.:0   3rd Qu.:101.8
 Max.   :0   Max.   :0   Max.   :102.0
> analyzeData(r4=data.frame(a=10:11,b=20:21,c=30:31,d=c("x","y")))
       a               b               c           d
 Min.   :10.00   Min.   :20.00   Min.   :30.00   x:1
 1st Qu.:10.25   1st Qu.:20.25   1st Qu.:30.25   y:1
 Median :10.50   Median :20.50   Median :30.50
 Mean   :10.50   Mean   :20.50   Mean   :30.50
 3rd Qu.:10.75   3rd Qu.:20.75   3rd Qu.:30.75
 Max.   :11.00   Max.   :21.00   Max.   :31.00

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Tue, Sep 20, 2016 at 12:31 PM, Crombie, Burnette N <bcrom...@utk.edu> wrote:

If a data.frame (r4) does not exist in my R environment, I would like to create it before I move on to the next step in my script. How do I make that happen? Here is what I want to do from a code perspective:

if (exists(r4))
{
is.data.frame(get(r4))
}
else
{
a <- 0, b <- 0, c <- 0, d <- "x", r4 <- data.frame(cbind(a,b,c,d))
}

Thanks for your help,
B
Re: [R] if/else help
Thank you for your time, Don. Exactly what I was looking for - a one-liner. Feedback from others on this post has been good to expand my knowledge, though. I'm too old for homework but have just started using R if/else, loops, and functions and trying to get the hang of them. Best wishes - B -Original Message- From: MacQueen, Don [mailto:macque...@llnl.gov] Sent: Wednesday, September 21, 2016 11:26 AM To: Crombie, Burnette N ; r-help@r-project.org Subject: Re: [R] if/else help Hopefully this is not a homework question. The other responses are fine, but I would suggest the simplest way to do exactly what you ask is if (!exists('r4')) r4 <- data.frame(a=0, b=0, c=0, d='x') The exists() function requires a character string for its first argument, i.e., the name of the object, not the object itself (check the help page for exists). Using "get" to get it doesn't make sense if it already exists. -Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062 On 9/20/16, 12:31 PM, "R-help on behalf of Crombie, Burnette N" wrote: >If a data.frame (r4) does not exist in my R environment, I would like >to create it before I move on to the next step in my script. How do I >make that happen? Here is what I want to do from a code perspective: > >if (exists(r4)) >{ >is.data.frame(get(r4)) >} >else >{ >a <- 0, b <- 0, c <- 0, d <- "x", r4 <- data.frame(cbind(a,b,c,d)) } > >Thanks for your help, >B > > [[alternative HTML version deleted]] > >__ >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. 
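A minimal sketch of the one-liner idea wrapped so it can be tried without touching the global workspace. The helper name `ensure_r4` and the environment argument are hypothetical additions for illustration; the thread itself only uses the plain `if (!exists('r4')) ...` line.

```r
# Hypothetical helper (not from the thread): create r4 in 'env' only if it is missing.
ensure_r4 <- function(env = globalenv()) {
  # exists() takes the NAME of the object as a character string.
  if (!exists("r4", envir = env, inherits = FALSE)) {
    # data.frame(a = 0, ...) keeps a, b, c numeric; wrapping the values in cbind()
    # first would coerce them all to character because d is a string.
    assign("r4", data.frame(a = 0, b = 0, c = 0, d = "x"), envir = env)
  }
  invisible(get("r4", envir = env))
}

e <- new.env()
ensure_r4(e)                 # creates r4 in e
e$r4$a <- 99                 # modify it
ensure_r4(e)                 # second call leaves the existing r4 alone
e$r4
```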
[R] if/else help
If a data.frame (r4) does not exist in my R environment, I would like to create it before I move on to the next step in my script. How do I make that happen? Here is what I want to do from a code perspective: if (exists(r4)) { is.data.frame(get(r4)) } else { a <- 0, b <- 0, c <- 0, d <- "x", r4 <- data.frame(cbind(a,b,c,d)) } Thanks for your help, B [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Extract baseline from prop.odds function in timereg package
Hi everyone! I am using the prop.odds() function in the timereg package. I am trying to extract the estimated baseline value, G(t), described in the package documentation. Does anyone know how this baseline value can be extracted from the output? Thanks in advance for your help! Lauren [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to round only one df row & how to keep 3rd sigdif if zero
Thanks for taking the time to share your thoughts, PP. I always extensively google & search before resorting to R forum. In my real dataset, not in the example I created for the forum, I had tried converting the matrix to a dataframe but it retained the unwanted format. And, these tables are being used in a report generated with the rtf package, so I have to get the format right for outside the console. Because of another unrelated issue, though, I had to use a different approach to creating the dataframe with counts/rates added, so the issue was circumvented. Cheers.

-Original Message-
From: PIKAL Petr [mailto:petr.pi...@precheza.cz]
Sent: Thursday, June 18, 2015 10:56 AM
To: Crombie, Burnette N; r-help@r-project.org
Subject: RE: [R] How to round only one df row & how to keep 3rd sigdif if zero

Hi

You need to distinguish between an object and printing an object on console. When you print an object you can use several options for formatting: ?sprintf, ?formatC

> formatC(t(a), digits=1, format="f")
      [,1]   [,2]   [,3]
count "1.0"  "2.0"  "3.0"
rate  "16.7" "33.3" "50.0"

Also when you transpose "a" the result is not data frame but matrix.

> str(t(a))
 num [1:2, 1:3] 1 16.7 2 33.3 3 50
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:2] "count" "rate"
  ..$ : NULL
> str(a)
'data.frame': 3 obs. of 2 variables:
 $ count: num 1 2 3
 $ rate : num 16.7 33.3 50

If you used google or other internet search options you would get plenty of results yourself. Try "formatting numbers R".

Cheers
Petr

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of bcrombie
> Sent: Thursday, June 18, 2015 3:09 PM
> To: r-help@r-project.org
> Subject: [R] How to round only one df row & how to keep 3rd sigdif if zero
>
> # How do I round only one row of a dataframe?
> # After transposing a dataframe of counts & rates, all values took on the most
> # of signif digits in the dataset (rates), but I want counts to remain only one digit.
> # Also, how can I keep 3 significant digits in R when the 3rd is a zero?
> count <- c(1, 2, 3)
> rate <- c(16.7, 33.3, 50.0)
> a <- data.frame(count,rate)
> a
> #   count rate
> # 1     1 16.7
> # 2     2 33.3
> # 3     3 50.0
> a <- t(a)
> a
> #       [,1] [,2] [,3]
> # count  1.0  2.0    3
> # rate  16.7 33.3   50
>
> --
> View this message in context: http://r.789695.n4.nabble.com/How-to-round-only-one-df-row-how-to-keep-3rd-sigdif-if-zero-tp4708819.html
> Sent from the R help mailing list archive at Nabble.com.
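For the "different format per row" part of the question, one possible sketch is to format each row separately with `formatC()` and bind the resulting strings. This is an illustration built on the example data from the thread, not the approach the poster finally used; the name `out` is invented here.

```r
count <- c(1, 2, 3)
rate  <- c(16.7, 33.3, 50.0)
a <- data.frame(count, rate)

# Format each variable on its own, then stack the character vectors as rows.
# The result is a character matrix, so it is for display/reporting only.
out <- rbind(
  count = formatC(a$count, format = "f", digits = 0),  # no decimals for counts
  rate  = formatC(a$rate,  format = "f", digits = 1)   # keeps the trailing zero in 50.0
)
print(out, quote = FALSE)
```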
[R] including internal data in a package
CRAN check is issuing a complaint, Found the following calls to data() loading into the global environment: File ‘OpenMx/R/MxAlgebra.R’: data(omxSymbolTable) See section ‘Good practice’ in ‘?data’. I tried placing an rda file in the package's R/ directory, but now I get a new CRAN check complaint, Subdirectory 'R' contains invalid file names: ‘omxSymbolTable.rda’ Furthermore, I can't figure out how to load this file. I found this 2013 post, http://r.789695.n4.nabble.com/Good-practice-for-data-for-R-packages-td4660313.html "The objects will be available in your NAMESPACE." -- I don't understand. Can somebody clarify? Thanks. -- Joshua N. Pritikin Department of Psychology University of Virginia 485 McCormick Rd, Gilmer Hall Room 102 Charlottesville, VA 22904 http://people.virginia.edu/~jnp3bc __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
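One commonly documented route for data that only the package's own code needs is the file `R/sysdata.rda`, described in "Writing R Extensions". Objects saved there are loaded into the package namespace, so package code can refer to them by name without calling `data()`. The sketch below only shows how such a file could be written; the directory name and the table contents are placeholders, not the real OpenMx layout or data.

```r
# Hypothetical package root in a temp dir (placeholder, not the real OpenMx tree).
pkg_root <- file.path(tempdir(), "OpenMxSketch")
dir.create(file.path(pkg_root, "R"), recursive = TRUE, showWarnings = FALSE)

# Placeholder content standing in for the real symbol table.
omxSymbolTable <- data.frame(name = c("+", "-"), code = 1:2,
                             stringsAsFactors = FALSE)

# Internal data must be in a file called exactly 'sysdata.rda' inside R/.
sysdata_file <- file.path(pkg_root, "R", "sysdata.rda")
save(omxSymbolTable, file = sysdata_file)

# Sanity check: the file restores an object with the expected name.
env <- new.env()
load(sysdata_file, envir = env)
ls(env)
```

With the file named this way, the "invalid file names" complaint about `omxSymbolTable.rda` in `R/` should no longer apply, and the `data(omxSymbolTable)` call can be dropped from `MxAlgebra.R`.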
Re: [R] stupid thing with the correlation
Well, yes, I do agree with you. And thanks a lot, I found the FAQ 7.31. very useful. Best,N. On Friday, January 30, 2015 12:44 PM, Duncan Murdoch wrote: On 30/01/2015 5:22 AM, n omranian via R-help wrote: > Hi All, > I'm getting actually nuts. I don't understand the following lines in R: > Here is the data, all rows are exactly the same! >> ord_data[pid,] > c0m2 c0m4 c0m8 c0m16 c0m24 c0m48 c0p2 c0p4 >c0p8 c0p16 c0p24 c0p48 c24m2 c24m4 c24m8 c24m16 > 13336382 5.632195 6.442133 5.818143 5.7683 5.862075 6.533181 5.807341 > 6.709774 5.664199 5.54022 5.385181 6.531977 5.29061 5.776121 6.136176 6.34699 > 13465320 5.632195 6.442133 5.818143 5.7683 5.862075 6.533181 5.807341 > 6.709774 5.664199 5.54022 5.385181 6.531977 5.29061 5.776121 6.136176 6.34699 > 13467455 5.632195 6.442133 5.818143 5.7683 5.862075 6.533181 5.807341 > 6.709774 5.664199 5.54022 5.385181 6.531977 5.29061 5.776121 6.136176 6.34699 > 13518680 5.632195 6.442133 5.818143 5.7683 5.862075 6.533181 5.807341 > 6.709774 5.664199 5.54022 5.385181 6.531977 5.29061 5.776121 6.136176 6.34699 > c24m24 c24m48 c24p2 c24p4 c24p8 c24p16 c24p24 c24p48 > 13336382 5.489161 6.043948 5.807802 5.761756 5.833559 5.293438 5.92068 > 5.759683 > 13465320 5.489161 6.043948 5.807802 5.761756 5.833559 5.293438 5.92068 > 5.759683 > 13467455 5.489161 6.043948 5.807802 5.761756 5.833559 5.293438 5.92068 > 5.759683 > 13518680 5.489161 6.043948 5.807802 5.761756 5.833559 5.293438 5.92068 > 5.759683 > > Now I found correlation: >> pcor <- cor(t((ord_data[pid,]))) >> pcor > 13336382 13465320 13467455 13518680 > 13336382 1 1 1 1 > 13465320 1 1 1 1 > 13467455 1 1 1 1 > 13518680 1 1 1 1 > But, then I get this funny result !!! > all(pcor[,1]==1) > [1] FALSE >> pcor[2,2]==1 > [1] TRUE >> pcor[3,2]==1 > [1] FALSE > Could anybody please comment on this?Many thanks. > [[alternative HTML version deleted]] Please just send your messages to r-help, not all those other addresses, and please don't reply to an existing thread. 
Your answer is given in FAQ 7.31. Duncan Murdoch
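FAQ 7.31 is about floating-point representation: values that print as 1 need not be exactly 1, so `==` can return FALSE. A short sketch of comparing with a tolerance instead, using a made-up matrix with identical rows in place of `ord_data`.

```r
# Classic illustration: the printed value looks like 2, the stored value is not exactly 2.
x <- sqrt(2)^2
print(x)
isTRUE(all.equal(x, 2))      # comparison with a tolerance

# Made-up stand-in for the posted data: four identical rows.
row <- c(5.632195, 6.442133, 5.818143, 5.7683, 5.862075, 6.533181)
m <- rbind(row, row, row, row)
pcor <- cor(t(m))

# Entries may differ from 1 in the last bits, so test "close to 1"
# rather than "exactly 1".
all(abs(pcor - 1) < 1e-8)
isTRUE(all.equal(unname(pcor[, 1]), rep(1, 4)))
```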
[R] stupid thing with the correlation
Hi All, I'm getting actually nuts. I don't understand the following lines in R: Here is the data, all rows are exactly the same! > ord_data[pid,] c0m2 c0m4 c0m8 c0m16 c0m24 c0m48 c0p2 c0p4 c0p8 c0p16 c0p24 c0p48 c24m2 c24m4 c24m8 c24m16 13336382 5.632195 6.442133 5.818143 5.7683 5.862075 6.533181 5.807341 6.709774 5.664199 5.54022 5.385181 6.531977 5.29061 5.776121 6.136176 6.34699 13465320 5.632195 6.442133 5.818143 5.7683 5.862075 6.533181 5.807341 6.709774 5.664199 5.54022 5.385181 6.531977 5.29061 5.776121 6.136176 6.34699 13467455 5.632195 6.442133 5.818143 5.7683 5.862075 6.533181 5.807341 6.709774 5.664199 5.54022 5.385181 6.531977 5.29061 5.776121 6.136176 6.34699 13518680 5.632195 6.442133 5.818143 5.7683 5.862075 6.533181 5.807341 6.709774 5.664199 5.54022 5.385181 6.531977 5.29061 5.776121 6.136176 6.34699 c24m24 c24m48 c24p2 c24p4 c24p8 c24p16 c24p24 c24p48 13336382 5.489161 6.043948 5.807802 5.761756 5.833559 5.293438 5.92068 5.759683 13465320 5.489161 6.043948 5.807802 5.761756 5.833559 5.293438 5.92068 5.759683 13467455 5.489161 6.043948 5.807802 5.761756 5.833559 5.293438 5.92068 5.759683 13518680 5.489161 6.043948 5.807802 5.761756 5.833559 5.293438 5.92068 5.759683 Now I found correlation: > pcor <- cor(t((ord_data[pid,]))) > pcor 13336382 13465320 13467455 13518680 13336382 1 1 1 1 13465320 1 1 1 1 13467455 1 1 1 1 13518680 1 1 1 1 But, then I get this funny result !!! all(pcor[,1]==1) [1] FALSE > pcor[2,2]==1 [1] TRUE > pcor[3,2]==1 [1] FALSE Could anybody please comment on this?Many thanks. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] multiple imputed files
Hi, I think you want the {mitools} package. http://cran.r-project.org/web/packages/mitools/mitools.pdf. Anthony Damico's site, asdfree.com, has a lot of good code examples using various government datasets. Nate On Mon, Jan 26, 2015 at 5:23 AM, hnlki wrote: > Dear, > > My dataset consists out of 5 imputed files (that I did not imputed myself). > Is was wondering what is the best way to analyse them in R. I am aware that > packages to perform multiple imputation (like Mice & Amelia) exist, but > they > are used to perform MI. As my data is already imputed, I would like to know > how I can split it and how I should obtain pooled regression results. If I > can use the existing MI packages, how should I define my imputation > variable? > > Kind regards, > > > > > -- > View this message in context: > http://r.789695.n4.nabble.com/multiple-imputed-files-tp4702289.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
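A rough sketch of the mechanics, assuming the five imputed data sets arrive stacked in one data frame with a column identifying the imputation (called `imp` here, a made-up name). The data are simulated only to make the code runnable and are not real imputations. Pooling is done by hand with Rubin's rules in base R; if mitools is installed, the last lines run the same models through `imputationList()` and `MIcombine()`.

```r
set.seed(1)
# Simulated stand-in for a stacked file of 5 completed data sets.
stacked <- do.call(rbind, lapply(1:5, function(i) {
  x <- rnorm(50)
  data.frame(imp = i, x = x, y = 1 + 2 * x + rnorm(50))
}))

# Split by the imputation indicator and fit the same model to each piece.
imps <- split(stacked, stacked$imp)
fits <- lapply(imps, function(d) lm(y ~ x, data = d))

# Rubin's rules: pooled estimate = mean of estimates;
# total variance = within + (1 + 1/m) * between.
est  <- sapply(fits, coef)                          # coefficients x imputations
wvar <- sapply(fits, function(f) diag(vcov(f)))     # squared standard errors
m    <- length(fits)
qbar <- rowMeans(est)
W    <- rowMeans(wvar)
B    <- apply(est, 1, var)
pooled <- cbind(estimate = qbar, se = sqrt(W + (1 + 1 / m) * B))
pooled

# Optional: the same analysis via mitools, if it is available.
if (requireNamespace("mitools", quietly = TRUE)) {
  il <- mitools::imputationList(imps)
  mi_fits <- with(il, lm(y ~ x))
  print(mitools::MIcombine(mi_fits))
}
```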
Re: [R] Make 2nd col of 2-col df into header row of same df then adjust col1 data display
That is the solution I had tried first (yes, it's nice!), but it doesn't provide the other PViol.Type's that aren't necessarily in my dataset. That's where my problem is. I'm closer to the cure, though, and think I've thought of a solution that I'll try as soon as I have time. I'll update everyone then. -- BNC

-Original Message-
From: John Kane [mailto:jrkrid...@inbox.com]
Sent: Friday, December 19, 2014 8:44 AM
To: Sven E. Templer; Chel Hee Lee
Cc: R Help List; Crombie, Burnette N
Subject: Re: [R] Make 2nd col of 2-col df into header row of same df then adjust col1 data display

Very pretty. I could have saved myself about 1/2 hour of mucking about if I had thought of "length".

John Kane
Kingston ON Canada

> -Original Message-
> From: sven.temp...@gmail.com
> Sent: Fri, 19 Dec 2014 10:13:55 +0100
> To: chl...@mail.usask.ca
> Subject: Re: [R] Make 2nd col of 2-col df into header row of same df
> then adjust col1 data display
>
> Another solution:
>
> CaseID <- c("1015285", "1005317", "1012281", "1015285", "1015285",
>             "1007183", "1008833", "1015315", "1015322", "1015285")
> Primary.Viol.Type <- c("AS.Age", "HS.Hours", "HS.Hours", "HS.Hours",
>                        "RK.Records_CL", "OT.Overtime", "OT.Overtime", "OT.Overtime",
>                        "V.Poster_Other", "V.Poster_Other")
>
> library(reshape2)
> dcast(data.frame(CaseID, Primary.Viol.Type), CaseID~Primary.Viol.Type,
>       length)
>
> # result:
>
> Using Primary.Viol.Type as value column: use value.var to override.
>    CaseID AS.Age HS.Hours OT.Overtime RK.Records_CL V.Poster_Other
> 1 1005317      0        1           0             0              0
> 2 1007183      0        0           1             0              0
> 3 1008833      0        0           1             0              0
> 4 1012281      0        1           0             0              0
> 5 1015285      1        1           0             1              1
> 6 1015315      0        0           1             0              0
> 7 1015322      0        0           0             0              1
>
> best, s.
>
> On 19 December 2014 at 06:35, Chel Hee Lee wrote:
>> Please take a look at my code again. The error message says that
>> object 'Primary.Viol.Type' not found. Have you ever created the object
>> 'Primary.Viol.Type'? It will be working if you replace
>> 'Primary.Viol.Type'
>> by 'PViol.Type.Per.Case.Original$Primary.Viol.Type' where 'factor()'
>> is used. I hope this helps.
>>
>> Chel Hee Lee
>>
>> On 12/18/2014 08:57 PM, Crombie, Burnette N wrote:
>>>
>>> Chel, your solution is fantastic on the dataset I submitted in my
>>> question but it is not working when I import my real dataset into R.
>>> Do I need to vectorize the columns in my real dataset after
>>> importing? I tried a few things (###) but not making progress:
>>>
>>> MERGE_PViol.Detail.Per.Case <-
>>>   read.csv("~/FOIA_FLSA/MERGE_PViol.Detail.Per.Case_for_rtf10.csv",
>>>            stringsAsFactors=TRUE)
>>>
>>> ### select only certain columns
>>> PViol.Type.Per.Case.Original <-
>>>   MERGE_PViol.Detail.Per.Case[,c("CaseID",
>>>                                  "Primary.Viol.Type")]
>>>
>>> ###
>>> write.csv(PViol.Type.Per.Case,file="PViol.Type.Per.Case.Select.csv")
>>> ### PViol.Type.Per.Case.Original <-
>>> read.csv("~/FOIA_FLSA/PViol.Type.Per.Case.Select.csv")
>>> ### PViol.Type.Per.Case.Original$X <- NULL
>>> ###PViol.Type.Per.Case.Original[] <-
>>> lapply(PViol.Type.Per.Case.Original,
>>> as.character)
>>>
>>> PViol.Type <- c("CaseID",
>>>                 "BW.BackWages",
>>>                 "LD.Liquid_Damages",
>>>                 "MW.Minimum_Wage",
>>>                 "OT.Overtime",
>>>                 "RK.Records_FLSA",
>>>                 "V.Poster_Other",
>>>                 "AS.Age",
>>>                 "BW.WHMIS_BackWages",
>>>                 "HS.Hours",
>>>                 "OA.HazOccupationAg",
>>>                 "ON.HazOccupationNonAg",
>>>                 "R3.Reg3AgeOccupation",
>>>                 "RK.Records_CL", &
Re: [R] Make 2nd col of 2-col df into header row of same df then adjust col1 data display
I want to achieve a table that looks like a grid of 1's for all cases in a survey. I'm an R beginner and don't have a clue how to do all the things you just suggested. I really appreciate the time you took to explain all of those options, though. -- BNC

-Original Message-
From: Boris Steipe [mailto:boris.ste...@utoronto.ca]
Sent: Thursday, December 18, 2014 5:29 AM
To: Crombie, Burnette N
Cc: r-help@r-project.org
Subject: Re: [R] Make 2nd col of 2-col df into header row of same df then adjust col1 data display

What you are describing sounds like a very spreadsheet-y thing.

- The information is already IN your dataframe, and easy to get out by subsetting. Depending on your usecase, that may actually be the "best".
- If the number of CaseIDs is large, I would use a hash of lists (if the data is sparse), or hash of named vectors if it's not sparse. Lookup is O(1) so that may be the best. (Cf package hash, and explanations there).
- If it must be the spreadsheet-y thing, you could make a matrix with rownames and colnames taken from unique() of your respective dataframe. Instead of 1 and NA I probably would use TRUE/FALSE.
- If it takes less time to wait for the results than to look up how apply() works, you can write a simple loop to populate your matrix. Otherwise apply() is much faster.
- You could even use a loop to build the datastructure, checking for every cbind() whether the value in column 1 already exists in the table - but that's terrible and would make a kitten die somewhere on every iteration.

All of these are possible, and you haven't told us enough about what you want to achieve to figure out what the "best" is. If you choose one of the options and need help with the code, let us know.

Cheers,
B.

On Dec 17, 2014, at 10:15 PM, bcrombie wrote:
> # I have a dataframe that contains 2 columns:
> CaseID <- c('1015285',
>             '1005317',
>             '1012281',
>             '1015285',
>             '1015285',
>             '1007183',
>             '1008833',
>             '1015315',
>             '1015322',
>             '1015285')
>
> Primary.Viol.Type <- c('AS.Age',
>                        'HS.Hours',
>                        'HS.Hours',
>                        'HS.Hours',
>                        'RK.Records_CL',
>                        'OT.Overtime',
>                        'OT.Overtime',
>                        'OT.Overtime',
>                        'V.Poster_Other',
>                        'V.Poster_Other')
>
> PViol.Type.Per.Case.Original <- data.frame(CaseID,Primary.Viol.Type)
>
> # CaseID's can be repeated because there can be up to 14
> Primary.Viol.Type's per CaseID.
>
> # I want to transform this dataframe into one that has 15 columns,
> where the first column is CaseID, and the rest are the 14 primary
> viol. types. The CaseID column will contain a list of the unique
> CaseID's (no replicates) and for each of their rows, there will be a
> "1" under a column corresponding to a primary violation type recorded
> for that CaseID. So, technically, there could be zero to 14 "1's" in a
> CaseID's row.
>
> # For example, the row for CaseID '1015285' above would have a "1"
> under "AS.Age", "HS.Hours", "RK.Records_CL", and "V.Poster_Other", but have
> "NA" under the rest of the columns.
>
> PViol.Type <- c("CaseID",
>                 "BW.BackWages",
>                 "LD.Liquid_Damages",
>                 "MW.Minimum_Wage",
>                 "OT.Overtime",
>                 "RK.Records_FLSA",
>                 "V.Poster_Other",
>                 "AS.Age",
>                 "BW.WHMIS_BackWages",
>                 "HS.Hours",
>                 "OA.HazOccupationAg",
>                 "ON.HazOccupationNonAg",
>                 "R3.Reg3AgeOccupation",
>                 "RK.Records_CL",
>                 "V.Other")
>
> PViol.Type.Columns <- t(data.frame(PViol.Type))
>
> # What is the best way to do this in R?
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Make-2nd-col-of-2-col-df-into-header-row-of-same-df-then-adjust-col1-data-display-tp4700878.html
> Sent from the R help mailing list archive at Nabble.com.
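One way to get the full 15-column grid, including violation types that never occur in the data, is to declare all types as factor levels before tabulating. The following is a base-R sketch of that idea using the data from the question. The names `all_types`, `counts`, `grid` and `wide` are illustrative, and this is not necessarily the solution the posters arrived at.

```r
CaseID <- c("1015285", "1005317", "1012281", "1015285", "1015285",
            "1007183", "1008833", "1015315", "1015322", "1015285")
Primary.Viol.Type <- c("AS.Age", "HS.Hours", "HS.Hours", "HS.Hours",
                       "RK.Records_CL", "OT.Overtime", "OT.Overtime",
                       "OT.Overtime", "V.Poster_Other", "V.Poster_Other")

# All 14 violation types from the post, whether or not they occur in the data.
all_types <- c("BW.BackWages", "LD.Liquid_Damages", "MW.Minimum_Wage",
               "OT.Overtime", "RK.Records_FLSA", "V.Poster_Other", "AS.Age",
               "BW.WHMIS_BackWages", "HS.Hours", "OA.HazOccupationAg",
               "ON.HazOccupationNonAg", "R3.Reg3AgeOccupation",
               "RK.Records_CL", "V.Other")

# Declaring the levels up front keeps a column for every type.
counts <- table(CaseID, factor(Primary.Viol.Type, levels = all_types))

# Turn the counts into the 1/NA grid described in the question.
grid <- ifelse(unclass(counts) > 0, 1, NA)
wide <- data.frame(CaseID = rownames(grid), grid,
                   row.names = NULL, check.names = FALSE)
print(wide)
```

If TRUE/FALSE is preferred over 1/NA, as suggested in the reply above, `unclass(counts) > 0` can be used directly in place of `grid`.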
[R] Missing Data Imputation for Complex Survey Data
Dear all,

I've got a bit of a challenge on my hands. I've got survey data produced by a government agency for which I want to use the person-weights in my analyses. This is best accomplished by specifying weights in {survey} and then calculating descriptive statistics/models through functions in that package. However, there is also missingness in this data that I'd like to handle with imputation via {mi}. To properly use imputed datasets in regression, they need to be pooled using the lm.mi function in {mi}. However, I can't figure out how to carry out a regression on data that is properly weighted that has also had its missing values imputed, because both packages use their own mutually incompatible data objects.

Does anyone have any thoughts on this? I've done a lot of reading and I'm not really seeing anything on point.

Thanks in advance!
Re: [R] Question about searchTwitter{twitteR}
You are only able to search Twitter history for a short period of time. gnip.com and similar companies offer historical tweets for sale.

cn

On Sunday, September 7, 2014 9:21:34 AM UTC-5, Axel Urbiz wrote:
>
> Hello,
>
> The function searchTwitter() with the arguments supplied as below would
> give me a different number of results on different days I run this code.
> Maybe it is my lack of understanding about what the date arguments are
> supposed to do in this function, but I would think I should be getting the
> same tweets?
>
> tweets <- searchTwitter('my text search',
>                         n = 1000,
>                         since = '2013-09-01',
>                         until = '2014-08-31')
>
> Thanks,
> Axel.
Re: [R] EOF error reading csv file
Hi All,

Thanks for the suggestions. The problem was one particular name that has a single ' in it. I renamed it and was able to load the file.

Best regards,
S.N.V. Krishna

-Original Message-
From: David L Carlson [mailto:dcarl...@tamu.edu]
Sent: Monday, June 23, 2014 12:29 AM
To: Chitra Baniya; S N V Krishna; r-help@r-project.org
Subject: RE: [R] EOF error reading csv file

The error message "EOF within quoted string" is telling you that you have an unbalanced " or ' in the .csv file.

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Chitra Baniya
Sent: Saturday, June 21, 2014 7:15 PM
To: kris...@primps.com.sg; r-help@r-project.org
Subject: [R] EOF error reading csv file

Can someone go through the same and suggest what I am missing out.
> cftc = read.table("cftcdata_ncn.csv", sep = ',', header = TRUE, fill = TRUE)
Warning message: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : EOF within quoted string

Hi, I guess you have also tried the function read.csv instead of read.table.

Thanks
Chitra Bahadur Baniya, PhD
Associate Professor
Central Department of Botany
Tribhuvan University
Kirtipur
Kathmandu, Nepal
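Renaming the offending entry works, but the file can also be read unchanged by telling read.table() that only the double quote is a quoting character. The sketch below assumes a cut-down, comma-separated version of the two rows from the original post. The column names and the third row are made up for illustration.

```r
# Made-up miniature of the CSV; the second data row contains the apostrophe
# ("BUYER'S") that caused the warning.
csv_text <- c(
  "market,date,open_interest",
  "TRANSCONTINENTAL GAS - ZONE 6 (NY) (BASIS),2/25/2014,81271",
  "WAHA HUB - WEST TEXAS DELIVERED/BUYER'S INDEX,2/25/2014,14232",
  "SOME OTHER MARKET,2/25/2014,100"
)

# read.table() treats both " and ' as quote characters by default, so a lone
# apostrophe opens a quoted string that never closes. Restricting quote to
# the double quote avoids that.
cftc <- read.table(text = csv_text, sep = ",", header = TRUE,
                   quote = "\"", stringsAsFactors = FALSE)
print(cftc)
```

read.csv() uses the double quote as its only quote character by default, which is presumably why it was suggested in the thread.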
[R] EOF error reading csv file
Hi, I am confronted with this error while trying to read csv file into R session. Though it is warning message, I noticed that the whole file was not read properly. After having gone through the whole file, unable to identify error in file. I am copying the last 2 rows in original csv file after which the reading was not proper. (cannot enclose file because of big size) TRANSCONTINENTAL GAS - ZONE 6 (NY) (BASIS) - ICE FUTURES ENERGY DIV 2/25/2014 81271 51032 37508 14592 31154 1710 0 1490 1296 943 0 341 69914 73499 11357 7772 WAHA HUB - WEST TEXAS DELIVERED/BUYER'S INDEX - ICE FUTURES ENERGY DIV 2/25/2014 14232 13331 11786 0 0 0 0 615 0 280 0 0 13611 12401 621 1831 Can someone go through the same and suggest what I am missing out. > cftc = read.table("cftcdata_ncn.csv", sep = ',', header = TRUE, fill = TRUE) Warning message: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : EOF within quoted string > sessionInfo() R version 3.1.0 (2014-04-10) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base > Many thanks for the help. Best regards, Krishna [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] stop a function
Here is my code... I tried to remove many parts and keep the part which is related to evalWithTimeout. I have two similar while(1) loops in my original code. I only include one here.

tnx a lot
NO

while (perm <= 1000) {
  setTimeLimit(cpu = Inf, elapsed = Inf)
  while (1) {
    data <- main_data[,sample(1:ncol(main_data),size=10,replace=F)]
    dd <- while(i<= dd) {
      y <- time[i,]
      fit <- NULL
      tryCatch(fit <- {evalWithTimeout({penalized(y,x,lambda1=lambda1[i],lambda2=lambda2[i],fusedl=a,standardize=T,trace=F);}, timeout=360)},
               TimeoutException = function(ex) cat("Timeout. Skipping.\n"))
      setTimeLimit(cpu = Inf, elapsed = Inf)
      if (is.null(fit)==T) break
      i<-i+1
    }
    if (i>dd) break
  }
  perm<-perm+1
}

On Wednesday, May 14, 2014 9:50 AM, "ONKELINX, Thierry" wrote:

Yes. Give us a minimal and reproducible example of your code and don't post in HTML. See fortunes::fortune(244)

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey
Re: [R] stop a function
Hi,

Another problem has arisen. I got this error:

Error in match(x, table, nomatch = 0L) : reached CPU time limit

I googled it but found nothing that helps me get rid of this error. Any comments, help or hints?

Thanks a lot,
NO

On Tuesday, May 13, 2014 2:36 PM, "ONKELINX, Thierry" wrote:

Have a look at evalWithTimeout() from the R.utils package

Best regards,
ir. Thierry Onkelinx
Re: [R] stop a function
It's a great function :) Thank you so much.

Best,
NO

On Tuesday, May 13, 2014 2:36 PM, "ONKELINX, Thierry" wrote:

Have a look at evalWithTimeout() from the R.utils package

Best regards,
ir. Thierry Onkelinx
[R] stop a function
Hi all,

If I use a function in R which takes some parameters as input, how can I stop this function inside a while loop and try another parameter when the function takes a long time or does not converge?

Actually, I'm using the "penalized" function in a loop for some fixed (pre-calculated) lambdas. For some of them the function converges, but for others it does not. I have decided to proceed as follows: if a call takes longer than 3 minutes, stop the "penalized" function and try the other lambdas. I need to do this in the loop, since the loop is part of a big program and I can't manually stop and begin again.

Looking forward to your reply.

NO
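The "give up after a time limit and move on to the next parameter" pattern can be sketched in base R with setTimeLimit() and tryCatch(). This is an illustration only, not the R.utils function recommended in the thread. `run_with_limit`, `slow_fit` and `fast_fit` are hypothetical names, and the sleeping function merely stands in for a slow penalized() call.

```r
# Hypothetical helper: run fun() but give up after `seconds` of elapsed time.
# Any error (including the time-limit error) is reported and turned into
# NULL, so a failed or non-converging fit is skipped in the same way.
run_with_limit <- function(fun, seconds) {
  setTimeLimit(elapsed = seconds, transient = TRUE)
  # Always lift the limit again so it cannot fire later in unrelated code.
  on.exit(setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE))
  tryCatch(fun(), error = function(e) {
    message("Skipping: ", conditionMessage(e))
    NULL
  })
}

# Stand-ins for a slow and a quick model fit.
slow_fit <- function() { for (i in 1:60) Sys.sleep(0.05); "slow fit done" }
fast_fit <- function() "fast fit done"

results <- lapply(list(slow = slow_fit, fast = fast_fit),
                  run_with_limit, seconds = 0.5)
str(results)
```

Resetting the limit in on.exit() is the important design choice: a limit that stays active after the guarded call can surface later as a "reached CPU time limit" or "reached elapsed time limit" error in code that has nothing to do with the fit.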
Re: [R] count and sum simultaneously in R pivot table
The script works nicely, Arun. You were right, I pasted code from email instead of Rhelp and didn't reformat properly in R. I appreciate your time with this!

-Original Message-
It seems like part of the next line is also being run (in the end 'colnames'). For e.g.

res2 <- within(as.data.frame(res1),`Count of Case ID` <- dcast(FLSAdata_melt, ViolationDesc + ReasonDesc ~ variable, length, margins=TRUE)[,3])[,c(4,1:3)] colnames
#Error: unexpected symbol in "res2 <- within(as.data.frame(res1),`Count of Case ID` <-
#dcast(FLSAdata_melt, ViolationDesc + ReasonDesc ~ variable, length, margins=TRUE)#[,3])[,c(4,1:3)] colnames"
Re: [R] count and sum simultaneously in R pivot table
A.K., thanks for your reply. I'm getting an error at res2:

Error: unexpected symbol in "res2 <- within(as.data.frame(res1),`Count of Case ID` <- dcast(FLSAdata_melt, ViolationDesc + ReasonDesc ~ variable, length, margins=TRUE)[,3])[,c(4,1:3)] colnames"

I've tried a couple of modifications, but obviously don't know what I'm doing because I haven't fixed it. If anything comes to you in the meantime, please advise. Thanks.

-Original Message-
From: arun [mailto:smartpink...@yahoo.com]
Sent: Tuesday, February 18, 2014 2:28 AM
To: r-help@r-project.org
Cc: Crombie, Burnette N
Subject: Re: [R] count and sum simultaneously in R pivot table

Hi,
Check if this works:

library(reshape2)
res1 <- acast(FLSAdata_melt, ViolationDesc + ReasonDesc ~ variable, sum, margins=TRUE)[,-4]
res2 <- within(as.data.frame(res1),`Count of Case ID` <- dcast(FLSAdata_melt, ViolationDesc + ReasonDesc ~ variable, length, margins=TRUE)[,3])[,c(4,1:3)]
colnames(res2)[2:4] <- paste("Sum of",colnames(res2)[2:4])
rownames(res2)[length(rownames(res2))] <- "Grand Total"
indx <- grepl("all",rownames(res2))
ord1 <- unlist(tapply(seq_len(nrow(res2)),list(cumsum(c(TRUE,diff(indx)<0))),FUN=function(x) c(tail(x,1),head(x,-1)) ),use.names=FALSE)
res3 <- res2[ord1,]
rownames(res3) <- gsub("\\_\\(all\\)","",rownames(res3))

A.K.
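For readers who only need a count and a sum per group, without the margins and "Grand Total" row built above, a base-R alternative is sketched below. It swaps in aggregate() for reshape2. The data frame and the `BackWages` column are made up, since the structure of FLSAdata_melt is not shown in the thread.

```r
# Made-up stand-in data: one row per case.
flsa <- data.frame(
  ViolationDesc = c("Overtime", "Overtime", "Minimum wage",
                    "Minimum wage", "Minimum wage"),
  ReasonDesc    = c("A", "A", "B", "B", "C"),
  BackWages     = c(100, 250, 40, 60, 10)
)

# Return count and sum together from one aggregation function.
pivot <- aggregate(BackWages ~ ViolationDesc + ReasonDesc, data = flsa,
                   FUN = function(v) c(count = length(v), sum = sum(v)))

# aggregate() stores the two results in a matrix column; flatten it into
# ordinary columns named BackWages.count and BackWages.sum.
pivot <- do.call(data.frame, pivot)
print(pivot)
```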
Re: [R] with() and within() functions inside lapply() not seeing outside of its environment?
Hi,

> I wouldn't call it a bug, but it's a documented limitation, if you know
> how to read it. As documented, the expression is evaluated with the
> caller's environment as the parent environment. But here the caller is
> some code in lapply, not your function f. x is not found there.

Thanks! That explains it.

> I think this modification works, and is maybe the simplest way to get it
> to work:
>
> f <- function(x){
>   y <- list(y1=list())
>   mywithin <- function(...) within(...)
>   y <- lapply(y, mywithin, {z<-x})
>   y
> }
>
> The idea here is that the calling frame of f is the environment of
> mywithin(), so x is found there.

It works.

Best regards,
Pavel
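The fix can be tried side by side with a common alternative. `f_thread` below is the modification quoted above under a new name. `f_anon` is an assumption of this note rather than something proposed in the thread: it wraps within() in an anonymous function, so that within() is called from a frame whose enclosing environment is the body of the outer function.

```r
# The modification from the thread: within() is called from mywithin(),
# which was defined inside f_thread and therefore can see x.
f_thread <- function(x) {
  y <- list(y1 = list())
  mywithin <- function(...) within(...)
  lapply(y, mywithin, {z <- x})
}

# Alternative sketch: an anonymous function that calls within() itself.
f_anon <- function(x) {
  y <- list(y1 = list())
  lapply(y, function(el) within(el, z <- x))
}

str(f_thread(1))
str(f_anon(1))
```

Both variants rely on the same thing: the frame that calls within() must be able to reach x through its enclosing environments, which is not the case when lapply() calls within() directly.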
[R] with() and within() functions inside lapply() not seeing outside of its environment?
Hi,

I have a list of sublists, and I want to add and/or remove elements in each sublist in accordance with a code snippet. I had thought that an elegant way to do that is using a combination of lapply() and within(). However, the code in the within() call doesn't seem to be able to see objects outside of it. For (a simplified) example,

f <- function(x){
  y <- list(y1=list())
  y <- lapply(y, within, {z<-x})
  y
}
f(1)

My understanding is that what should happen is that lapply() would execute

within(y[["y1"]], {z<-x}),

with 1 substituted for x, within() would notice that z has been assigned 1, returning list(z=1), which then gets put into a list as element named "y1", so the function should ultimately return

list(y1=list(z=1))

What I get instead (on R 3.0.2 and current trunk, both on Linux) is

Error in eval(expr, envir, enclos) : object 'x' not found

Am I doing something wrong, or is this a bug?

Thanks in advance,
Pavel

P.S. If I "hard-code" the value for x, i.e.,

f <- function(){
  y <- list(y1=list())
  y <- lapply(y, within, {z<-1})
  y
}
f()

it returns list(y1=list(z=1)) as expected.

P.P.S. with() has the same problem:

f <- function(x){
  y <- list(y1=list())
  w <- lapply(y, with, x)
  w
}
f(1)

produces the exact same error as within().
Re: [R] condense repetitive code for read.csv and rename.vars
Thanks very much for your contribution, Siraaj. I appreciate you taking the time to help me learn loops, etc. BNC

-Original Message-
From: Siraaj Khandkar [mailto:sir...@khandkar.net]
Sent: Wednesday, August 14, 2013 9:08 PM
To: Crombie, Burnette N
Cc: r-help@r-project.org
Subject: Re: [R] condense repetitive code for read.csv and rename.vars

On 08/14/2013 03:43 PM, bcrombie wrote:
> Is there a more concise way to write the following code?
>
> library(gdata)
> mydataOUTPUTrtfA <- read.csv("mergedStatstA.csv")
> save(mydataOUTPUTrtfA, file="mydataOUTPUTrtfA.RData")
> mydataOUTPUTrtfA <- rename.vars(mydataOUTPUTrtfA, from="X",
>                                 to="Statistics.Calculated", info=FALSE)
>
> mydataOUTPUTrtfB <- read.csv("mergedStatstB.csv")
> save(mydataOUTPUTrtfB, file="mydataOUTPUTrtfB.RData")
> mydataOUTPUTrtfB <- rename.vars(mydataOUTPUTrtfB, from="X",
>                                 to="Statistics.Calculated", info=FALSE)
>
> mydataOUTPUTrtfC <- read.csv("mergedStatstC.csv")
> save(mydataOUTPUTrtfC, file="mydataOUTPUTrtfC.RData")
> mydataOUTPUTrtfC <- rename.vars(mydataOUTPUTrtfC, from="X",
>                                 to="Statistics.Calculated", info=FALSE)
>
> I will have a series of mydataOUTPUTrtf files spanning a large portion
> of the alphabet, so to speak:
> e.g. mydataOUTPUTrtfA to mydataOUTPUTrtfG --- thanks for your help
>

alphabet <- c("FOO", "BAR", "BAZ")
for (a in alphabet) {
    filename <- paste(c("basename", a, ".csv"), collapse="")
    data <- read.csv(filename)
    data <- rename.vars( data
                       , from="X"
                       , to="Statistics.Calculated"
                       , info=FALSE
                       )
    # do some other stuff with data
}

You should be able to pick it up from here. In case you need an actual alphabet, it is already predefined:

> LETTERS
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q"
[18] "R" "S" "T" "U" "V" "W" "X" "Y" "Z"
> letters
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q"
[18] "r" "s" "t" "u" "v" "w" "x" "y" "z"
>
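The loop above can also be written so that every data frame is kept, in a named list, rather than being overwritten on each pass. The sketch below is base R only: it renames the column with names() instead of gdata's rename.vars(), and it first writes small stand-in files to a temporary directory so that it runs on its own. `csv_dir`, `suffixes` and `read_one` are illustrative names.

```r
# Create small stand-in CSV files so the sketch is self-contained.
csv_dir  <- tempdir()
suffixes <- LETTERS[1:3]
for (s in suffixes) {
  write.csv(data.frame(X = c("mean", "sd"), value = c(1, 2)),
            file.path(csv_dir, paste0("mergedStatst", s, ".csv")),
            row.names = FALSE)
}

# Read one file and rename column X, as in the original repeated code.
read_one <- function(s) {
  d <- read.csv(file.path(csv_dir, paste0("mergedStatst", s, ".csv")))
  names(d)[names(d) == "X"] <- "Statistics.Calculated"
  d
}

# One list element per file, named after its letter.
mydataOUTPUTrtf <- setNames(lapply(suffixes, read_one), suffixes)
str(mydataOUTPUTrtf)
```

An individual table is then available as `mydataOUTPUTrtf[["A"]]`, and covering more of the alphabet only means changing `suffixes`.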
Re: [R] problem loading large xlsx file into r
Thanks Jim, I tried XLConnect but faced with same error. > options(java.parameters = '-Xmx5g') > library(XLConnect) Loading required package: rJava XLConnect 0.2-5 by Mirai Solutions GmbH http://www.mirai-solutions.com , http://miraisolutions.wordpress.com > cftc = > readWorksheetFromFile("d:\\Krishna\\Research\\CFTC_COT\\cftcdata.xlsx", sheet > = 'Sheet1') Error: OutOfMemoryError (Java): Java heap space What is the maximum file size to load into R? is there a better way to load large excel files to R? Many thanks for the help. Regards, Krishna -Original Message- From: Jim Holtman [mailto:jholt...@gmail.com] Sent: Monday, July 22, 2013 5:10 PM To: S N V Krishna Cc: r-help@r-project.org Subject: Re: [R] problem loading large xlsx file into r try the "XLConnect" package and if possible change the "xlsx" to "xls" format for better performance. Sent from my iPad On Jul 22, 2013, at 1:24, S N V Krishna wrote: > Hi, > > I am facing trouble when trying to read large xlsx file into R. please find > the code and error below. The file I was trying to read has 36,500 rows X 188 > col, ~ 37 MB size. > >> options( java.parameters = "-Xmx4g" ) > >> library(xlsx) > Loading required package: xlsxjars > Loading required package: rJava > >> cftc = read.xlsx("d:\\Krishna\\Research\\CFTC_COT\\cftcdata.xlsx", 1) > Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, : > > >> sessionInfo() > R version 3.0.1 (2013-05-16) > Platform: x86_64-w64-mingw32/x64 (64-bit) > > locale: > [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United > States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C > [5] LC_TIME=English_United States.1252 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] xlsx_0.5.1 xlsxjars_0.5.0 rJava_0.9-5 > > Many thanks for the help and guidance. 
>
> Regards,
>
> Krishna
[R] problem loading large xlsx file into r
Hi, I am facing trouble when trying to read large xlsx file into R. please find the code and error below. The file I was trying to read has 36,500 rows X 188 col, ~ 37 MB size. > options( java.parameters = "-Xmx4g" ) > library(xlsx) Loading required package: xlsxjars Loading required package: rJava > cftc = read.xlsx("d:\\Krishna\\Research\\CFTC_COT\\cftcdata.xlsx", 1) Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, : java.lang.OutOfMemoryError: Java heap space > sessionInfo() R version 3.0.1 (2013-05-16) Platform: x86_64-w64-mingw32/x64 (64-bit) locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] xlsx_0.5.1 xlsxjars_0.5.0 rJava_0.9-5 Many thanks for the help and guidance. Regards, Krishna [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] create new matrix from user-defined function
Oh, also thanks for the speed comparisons. Missed that in my first read-through. Very interesting and informative. BNC -Original Message- From: Crombie, Burnette N Sent: Thursday, July 11, 2013 4:40 PM To: 'arun' Cc: R help Subject: RE: [R] create new matrix from user-defined function You understood me perfectly, and I agree is it easier to index using numbers than names. I'm just afraid if my dataset gets too big I'll mess up which index numbers I'm supposed to be using. "data.table()" looks very useful and a good way to approach the issue. Thanks. I really appreciate your (everyone's) help. BNC __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help with text patterns in strings
Thanks, Arun. I will study this as soon as possible. I really appreciate your time and R mentoring. Try this: res1<-sapply(vec3,function(x) length(vec2New[grep(x,vec2New)]) ) dat1<-data.frame(res1,Name=names(vec3)) dat1$Name<-factor(dat1$Name,levels=c("early","mid","late","wknd")) with(dat1,tapply(res1,list(Name),FUN=sum)) #early mid late wknd # 0 1 4 6 #or sapply(split(res1,names(vec3)),sum) #early late mid wknd # 0 4 1 6 A.K. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] vcovHC and arima() output
Dear all, how can I use vcovHC() to get robust/corrected standard errors from an arima() output? I ran an arima model with AR(1) and got the estimate, se, zvalue and p-value using coeftest(arima.output). However, I cannot use vcovHC(arima.output) to get corrected standard errors. It seems vcovHC works only with lm and plm objects? Is there another way I can get robust/corrected standard errors, or am I missing something? Thank you! Nicole -- I got this error: coeftest(arima.res.total,vcovHC) Error in terms.default(object) : no terms component nor attribute I also tried this: coeftest(arima.res.total, vcovHC=vcovHC(arima.res.total, method="arellano")) I do get an output table, but the standard errors do not change at all from the original coeftest() table, so I'm not sure it did the job. - R version 2.15.2 (2012-10-26) Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) locale: [1] C/en_US.UTF-8/C/C/C/C attached base packages: [1] grid splines stats graphics grDevices utils datasets methods base other attached packages: [1] ellipse_0.3-7 corrgram_1.4 seriation_1.0-10 colorspace_1.2-0 gclus_1.3.1 TSP_1.0-7 cluster_1.14.3 car_2.0-15 nnet_7.3-5 [10] tseries_0.10-30 pcse_1.8 arm_1.6-04 foreign_0.8-51 abind_1.4-0 R2WinBUGS_2.1-18 coda_0.16-1 lme4_0.99-0 Matrix_1.0-10 [19] lattice_0.20-10 gplots_2.11.0 KernSmooth_2.23-8 caTools_1.14 gdata_2.12.0 gtools_2.7.0 Hmisc_3.10-1 survival_2.37-2 simcf_0.2.8 [28] lmtest_0.9-30 plm_1.3-1 sandwich_2.2-9 zoo_1.7-9 MASS_7.3-22 Formula_1.1-0 nlme_3.1-106 bdsmatrix_1.3 loaded via a namespace (and not attached): [1] bitops_1.0-4.2 quadprog_1.5-4 stats4_2.15.2 tools_2.15.2 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Heteroscedasticity Plots
To detect heteroscedasticity for a multiple linear OLS regression (no time dependencies): What if the residuals vs. fitted values plot shows well behaved residuals (cloud) - but the some of the x versus residuals plots are a megaphone? Also, it seems that textbooks and internet tutorials in R do not agree what is the best plot for detecting heteroscedasticity. What do you use? I found so far: - Y vs X - Res vs X - Res vs Fitted Y - Partial regression plot and lots of standardized/studentized/partial plots. Thank you very much in advance! Nicole - Nicole Janz, PhD Cand. Lecturer at Social Sciences Research Methods Centre 2012/13 University of Cambridge Department of Politics and International Studies www.nicolejanz.de | nj...@cam.ac.uk | Mobile: +44 (0) 7905 70 1 69 4 Skype: nicole.janz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] looking to hire person to convert R to SAS
Hi,

I am looking to hire someone to convert a small bit of R code into SAS code. The R code is about two Word pages long and uses Snell's law to convert Likert scales. If you are willing to look at this, or could point me to someone who would, it would be very much appreciated.

Thanks in advance!

-will
Re: [R] lanyrd site for useR! 2012
I signed up. I'm doing a talk on real time text classification using node.js and R Cory On Saturday, April 28, 2012 2:20:08 PM UTC-5, Barry Rowlingson wrote: > > There's now a page on lanyrd ("the social conference directory") for > useR! 2012 in Nashville: > > http://lanyrd.com/2012/useR/ > > its basically a site for making social mini-networks for conferences, > so people can post up links, share photos, list talks, etc. > > If people going sign up then it'll look a lot better than UseR! 2011, > where only three people seemed to have attended - but that's more than > went to the JSM in Florida last year (one person). > > It's no replacement for the official website: > http://biostat.mc.vanderbilt.edu/wiki/Main/UseR-2012 - you might just > find it a handy place to keep track of all the conferences you attend, > or find new ones. Frank Harrell has given his blessing to the lanyrd > site (or at least he didnt tell me to tear it down and cease and > desist using the name "useR! 2012" - unlike the people running the > London 01ympic Game$...) > > Barry > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Study design question; MLB; pay and performance.
Dear List Members,

I am in the process of designing a study examining pay and performance in Major League Baseball across several seasons, and before I get too deeply into it, I'd like to ask whether the group members think that performance across seasons is independent, or if it needs to be treated like a time-series variable so that lack of independence can be controlled. Any ideas or considerations that need to be taken into account would be appreciated.

Regards,
Nick Miceli
[R] socket connection in while(TRUE) loop the best way?
I'm accessing R via a socket connection. I set up a connection using socketConnection and then use readLines inside of a while(TRUE) loop to listen for activity. Is that the best way of doing this sort of activity? It works, that's not the issue, I am just wondering if there's a better way.

Thanks
[R] Random sample from truncated distributions
Hi,

How can I draw a random sample from a truncated distribution (especially lognormal)? I found the functions for truncated normal but not for many other distributions.

Thanks
Nikhil
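One common approach, sketched here under stated assumptions rather than taken from the thread, is inverse-CDF sampling: draw uniforms between the CDF values at the two truncation points and push them through the quantile function. The helper name rtrunc_lnorm and the bounds 0.5 and 3 are made up for illustration; plnorm, qlnorm and runif are base R (stats).

```r
# Hypothetical helper: sample from a lognormal truncated to [lower, upper]
# using the inverse-CDF method.
rtrunc_lnorm <- function(n, lower, upper, meanlog = 0, sdlog = 1) {
  stopifnot(lower < upper)
  p_lo <- plnorm(lower, meanlog, sdlog)   # CDF at the lower bound
  p_hi <- plnorm(upper, meanlog, sdlog)   # CDF at the upper bound
  u <- runif(n, min = p_lo, max = p_hi)   # uniforms restricted to that range
  qlnorm(u, meanlog, sdlog)               # map back through the quantile function
}

set.seed(1)                               # arbitrary seed, for reproducibility only
x <- rtrunc_lnorm(1000, lower = 0.5, upper = 3)
range(x)                                  # should lie within the bounds
```

The same pattern should carry over to any distribution that has p* and q* functions in R, although it can lose accuracy when the truncation interval sits far out in a tail.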
[R] logistic regression: default computed probability
Hello all,

Suppose in a logistic regression model, the binary outcome is coded as 0 or 1. In SAS, the default probability computed is for Y = 0 (the smaller of the two values). However, in SPSS the probability computed is for Y = 1 (the greater of the two values). Which one does R compute, the probability for the smaller or the greater value? I went through the documentation in a hurry but couldn't find a clue.

Thanks
Nikhil
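For what it is worth, glm() with family = binomial and a 0/1 response models the probability that Y = 1; with a factor response, the first level is treated as "failure" and every other level as "success" (see ?family). The small check below uses made-up data, so the numbers themselves mean nothing; it only illustrates the direction of the coding.

```r
# Made-up data: y tends to be 1 for larger x, with some overlap so the
# fit is well defined (no perfect separation).
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(0, 0, 1, 0, 1, 1, 0, 1)

fit <- glm(y ~ x, family = binomial)

coef(fit)["x"]                           # positive slope: P(Y = 1) rises with x
p1 <- predict(fit, type = "response")    # fitted P(Y = 1), not P(Y = 0)
mean(p1)                                 # matches mean(y) for a model with intercept
```

To model the other level instead, one option is to recode the response (for example 1 - y) or to relevel the factor before fitting.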
[R] Differences in SAS and R defaults
Hello all,

I am looking for theories and statistical analyses where the defaults employed in R and SAS are different. As a result, the outputs under the defaults should (at least slightly) differ for the same input. Could anyone kindly point out any such instances?

Thanks
Nikhil
Re: [R] Problem with Snowball & RWeka
The Java error when attempting to use the stemmers in the Snowball or tm packages on Windows machines is caused by Quicktime. See prior posts in this thread. The workaround is to uninstall Quicktime.

After much trial and error on machines spanning WinXP/2k/Vista/7, I finally verified this as follows:

1) Fresh installation of Windows/Java/R. Snowball package works perfectly.
2) Install Quicktime. Java errors produced when attempting to use Snowball package.
3) Uninstall Quicktime. Snowball package works perfectly again.

Many thanks to profs Ligges, Hornik, and Feinerer for their kind help in diagnosing this.
Re: [R] Error bars
You might try 'bargraph.CI' in R pkg 'sciplot'.

?bargraph.CI

HTH,
Savi

>>> Anna Harris 6/15/2011 1:00 PM >>>
Hi,

Can anyone help with plotting vertical error bars on a bar graph? I have tried following examples online and in the big R book and writing my own function but I have been unsuccessful and don’t really have an understanding of what it is I am doing. I have calculated my standard errors so basically just need to draw the bars on the graph but just don’t have a clue!!! I don’t even know what information people will need to help me.

The code for my graph is:

barplot(tapply(Sporangia,list(Host,Isolate),mean),xlab="Isolate",ylab="Mean Sporangia per Needle",col=c("grey39","grey64","grey89"),beside=T)
col=c("grey39","grey64","grey89")
legend("topright",inset=.05,title="Host",c("European Larch","Hybrid Larch","Japanese Larch"),fill=col,horiz=FALSE)

Any help would be greatly appreciated,
Anna
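As a base-graphics alternative to bargraph.CI, the error bars can also be drawn by hand: barplot() returns the bar midpoints, and arrows() with angle = 90 and code = 3 draws flat-capped vertical bars at those positions. The means and standard errors below are invented placeholders (the poster's Sporangia data are not available), and the object names means, ses and cols are only illustrative.

```r
# Placeholder means and standard errors: rows = hosts, columns = isolates.
means <- matrix(c(12, 15, 9, 20, 18, 14), nrow = 3,
                dimnames = list(c("European Larch", "Hybrid Larch", "Japanese Larch"),
                                c("Iso1", "Iso2")))
ses  <- matrix(c(1.5, 2.0, 1.2, 2.5, 1.8, 1.6), nrow = 3)
cols <- c("grey39", "grey64", "grey89")

# barplot() returns the x positions of the bars (a matrix when beside = TRUE).
bp <- barplot(means, beside = TRUE, col = cols,
              ylim = c(0, max(means + ses) * 1.1),   # leave room for the bars
              xlab = "Isolate", ylab = "Mean Sporangia per Needle")

# One vertical segment per bar, capped at both ends.
arrows(bp, means - ses, bp, means + ses, angle = 90, code = 3, length = 0.05)

legend("topright", inset = .05, title = "Host",
       legend = rownames(means), fill = cols)
```

With real data, means would come from the tapply() call in the question and ses from the same call with a standard-error function in place of mean.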
Re: [R] Problem with Snowball & RWeka
I too have this problem. Everything worked fine last year, but after updating R and packages I can no longer do word stemming. Unfortunately, I didn't save the old binaries, otherwise I would just revert back. Hoping someone finds a solution for R on Windows. Thanks! There is a potential solution for R on Mac OS from Kurt Hornik copied below, but I cannot get this to work on Windows. Here's the code I'm running: #1) Using package Snowball library(Snowball) source <- readLines(system.file("words", "porter","voc.txt",package = "Snowball")) result <- SnowballStemmer(source) #2) Using package tm library(tm) data("crude") stemDocument(crude[[1]]) In both instances I got a Java error "Could not initialize the GenericPropertiesCreator. This exception was produced: java.lang.NullPointerException". After receiving this error once in the session, no further error messages are generated. However, SnowballStemmer() and stemDocument() return the original unstemmed text. Possible Solution: For those on Mac OS, Kurt Hornik wrote... These issues seem to be specific to Mac OS X. Recent versions of Weka have added a package management system not unlike R's, to the effect that now when external packages (or the Snowball jar) is loaded their KnowledgeFlow GUI is started, which in turn requires AWT---and from what I understand, this does not work on Mac OS X. Short term, you should be able to Sys.setenv("NOAWT", "true"). More long term, the Weka maintainers have patched their upstream code so that it is possible to turn off the dynamic class discovery altogether, but I have not found the time to test this ... I realize this solution was for Mac OS, but not knowing anything about rJava I tried this on Windows anyways resulting in "Error in Sys.setenv("NOAWT", "true") : all arguments must be named" Here's my session info. 
R version 2.13.0 Patched (2011-04-21 r55576) Platform: i386-pc-mingw32/i386 (32-bit) (Windows Vista) locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices datasets utils methods base other attached packages: [1] Snowball_0.0-7 tm_0.5-6 rcom_2.2-3.1 rscproxy_1.3-1 loaded via a namespace (and not attached): [1] grid_2.13.0 rJava_0.9-0 (same error with multiple older versions) RWeka_0.4-7 RWekajars_3.7.3-1 [5] slam_0.1-22 tools_2.13.0 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] RGtk2: How to populate an GtkListStore data model?
hello all I am trying to learn how to use the RGtk2 package... so, my first problem is: I don't get the right way for populate my gtkListStore object! any help is welcome... because I am trying several day to mount the code... Thanks in advanced Cleber N. Borges --- # my testing code library(RGtk2) win <- gtkWindowNew() datamodel <- gtkListStoreNew('gchararray') treeview <- gtkTreeViewNew() renderer <- gtkCellRendererText() col_0 <- gtkTreeViewColumnNewWithAttributes(title="TitleXXX", cell=renderer, "text"="Bar") nc_next <- gtkTreeViewInsertColumn(object=treeview, column=col_0, position=0) gtkTreeViewSetModel( treeview, datamodel ) win$add( treeview ) # is there an alternative function for this? # iter <- gtkTreeModelGetIterFirst( datamodel )[[2]] # this function don't give VALID iter # gtkListStoreIterIsValid( datamodel, iter ) result in FALSE iter <- gtkListStoreInsert( datamodel, position=0 )[[2]] gtkListStoreIterIsValid( datamodel, iter ) # the help of this function say to terminated in -1 value # but -1 crash the R-pckage (or the gtk)... gtkListStoreSet(object=datamodel, iter=iter, 0, "textFoo") # don't make any difference in the window... :-( R version 2.13.0 alpha (2011-03-27 r55091) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=Portuguese_Brazil.1252 [2] LC_CTYPE=Portuguese_Brazil.1252 [3] LC_MONETARY=Portuguese_Brazil.1252 [4] LC_NUMERIC=C [5] LC_TIME=Portuguese_Brazil.1252 attached base packages: [1] stats graphics grDevices utils datasets methods [7] base other attached packages: [1] RGtk2_2.20.8 loaded via a namespace (and not attached): [1] tools_2.13.0 > my gtk version == 2.16.2 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Matrix manipulation
Hi all!

I have a vector, let's say for example

int <- sample(1:20,10)

and I have a matrix M (m x n) whose first column is a "feature" column that most likely shares at least one of the int (interesting) numbers. I want to extract the rows where M[,1] equals one of the values in int.

I thought

rownames(int)<-int; rownames(M)<-M[,1]; M[rownames(int),]

would work, but it doesn't (I assume because I have rownames(int) that are not found in M[,1]). Neither does rownames(M)==rownames(int)...

Any help would be greatly appreciated!

Thank you!
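A minimal sketch of one way to do this with %in%, which tests each element of the first column for membership in int and so needs no row names at all. The matrix M here is a made-up stand-in, built so that exactly three of its rows match.

```r
set.seed(42)                        # arbitrary seed, for reproducibility only
int <- sample(1:20, 10)             # the "interesting" values

# Hypothetical matrix: the first column mixes three values taken from int
# with four values (101:104) that cannot occur in int.
M <- cbind(feature = c(int[1:3], 101:104), value = 1:7)

# Keep the rows whose first column is one of the interesting values;
# drop = FALSE keeps a matrix even if only one row matches.
hits <- M[M[, 1] %in% int, , drop = FALSE]
hits
```

Note that rownames(int) <- int cannot work as written, because a plain vector has no dimensions to attach row names to.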
[R] gtk, RGtk2 and error in callback: delete_event in main window
Hello All, I am trying to learn about the GUI in R (with GTK+Glade+RGtk2) and in my test I don't get sucess in to make an callback to destroy the application... When I try to define an function for "delet-event callback", I get the error message: (with mouse click in X window) *Error in function () : * * unused argument(s) (, )* So, somebody has a tips for me? Thanks in advanded Cleber > > # make a file GLADE for testing... > tmp <- textConnection(' + + + + + + + + + + + + ') > glade_file <- readLines( tmp ) > close( tmp ); rm( tmp ) > > sink( file='glade_file.txt') > cat( glade_file ) > sink() > > # call the binfings for GTK ( RGtk2_2.20.8 ) > library(RGtk2) *Warning message:* *In inDL(x, as.logical(local), as.logical(now), ...) :* * DLL attempted to change FPU control word from 8001f to 9001f* > > > GUI <- gtkBuilderNew() > res <- gtkBuilderAddFromFile( GUI, filename='glade_file.txt' ) > unlink( 'glade_file.txt' ) > > # callback from delete_event ( small X in the Window ) > window1_delete_event <- function() print('Work or dont work???') > > gtkBuilderConnectSignals( GUI ) > window_main <- gtkBuilderGetObject( GUI, 'window1') > gtkWidgetShowAll( window_main ) > > # > # > # > # with the mouse, click in X window to close!! > # > # > # > *Error in function () : * * unused argument(s) (, )* > > # > > sessionInfo() R version 2.12.2 (2011-02-25) Platform: i386-pc-mingw32/i386 (32-bit) locale: [1] LC_COLLATE=Portuguese_Brazil.1252 LC_CTYPE=Portuguese_Brazil.1252 [3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C [5] LC_TIME=Portuguese_Brazil.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] RGtk2_2.20.8 > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Need help with error
Thank you very much Peter. It works fine now Best, Savi >>> Peter Ehlers 3/22/2011 5:49 AM >>> On 2011-03-21 10:37, Savitri N Appana wrote: > Thank you for your suggestion Allan. I should have paid attention to > the posting instructions. > > > Pls find below the sample code from the ?splsda in the caret package. > Note: It used to work fine in R v2.8.1, but this error shows up now, > given that I've modified the code on how the splsda or predict.splsda > functions are called, i.e. as caret:::splsda and caret:::predict.splsda > b/c I am running R v2.12.1 now. > > > sample code below.. > library(caret) > > > data(mdrr) > set.seed(1) > inTrain<- sample(seq(along = mdrrClass), 450) > > nzv<- nearZeroVar(mdrrDescr) > filteredDescr<- mdrrDescr[, -nzv] > > > training<- filteredDescr[inTrain,] > test<- filteredDescr[-inTrain,] > trainMDRR<- mdrrClass[inTrain] > testMDRR<- mdrrClass[-inTrain] > > preProcValues<- preProcess(training) > > > trainDescr<- predict(preProcValues, training) > testDescr<- predict(preProcValues, test) > > > splsFit<- caret:::splsda(trainDescr, trainMDRR, >K = 5, eta = .9, >probMethod = "Bayes") >> splsFit### ERROR is HERE > Error in switch(classifier, logistic = { : EXPR must be a length 1 > vector This message comes from print.splsda() in the spls package. As the caret:::splsda help page indicates, caret's splsda() uses the spls:::spls() function. So, although your splsFit object has class "splsda", you should print it with print.spls(splsdaFit) Peter Ehlers __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Need help with error
Thank you for your suggestion Allan. I should have paid attention to the posting instructions. Pls find below the sample code from the ?splsda in the caret package. Note: It used to work fine in R v2.8.1, but this error shows up now, given that I've modified the code on how the splsda or predict.splsda functions are called, i.e. as caret:::splsda and caret:::predict.splsda b/c I am running R v2.12.1 now. sample code below.. library(caret) data(mdrr) set.seed(1) inTrain <- sample(seq(along = mdrrClass), 450) nzv <- nearZeroVar(mdrrDescr) filteredDescr <- mdrrDescr[, -nzv] training <- filteredDescr[inTrain,] test <- filteredDescr[-inTrain,] trainMDRR <- mdrrClass[inTrain] testMDRR <- mdrrClass[-inTrain] preProcValues <- preProcess(training) trainDescr <- predict(preProcValues, training) testDescr <- predict(preProcValues, test) splsFit <- caret:::splsda(trainDescr, trainMDRR, K = 5, eta = .9, probMethod = "Bayes") >splsFit### ERROR is HERE Error in switch(classifier, logistic = { : EXPR must be a length 1 vector confusionMatrix( caret:::predict.splsda(splsFit, testDescr), testMDRR) Again, thank you in advance for any explanation re the error. Best, Savi >>> Allan Engelhardt 03/19/11 4:24 AM >>> As it says at the bottom of every post: > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. Without an example that fails, it is hard to help. Allan On 18/03/11 16:26, Savitri N Appana wrote: > Hi R users, > > I am getting the following error when using the splsda function in R > v2.12.1: > > "Error in switch(classifier, logistic = { : EXPR must be a length 1 > vector" > > What does this mean and how do I fix this? > > Thank you in advance! 
>
> Best,
> Savi
[R] Need help with error
Hi R users,

I am getting the following error when using the splsda function in R v2.12.1:

"Error in switch(classifier, logistic = { : EXPR must be a length 1 vector"

What does this mean and how do I fix this?

Thank you in advance!

Best,
Savi
[R] sample() issue
> length(sample(25000, 25000*(1-.55)))
[1] 11249
> 25000*(1-.55)
[1] 11250
> length(sample(25000, 11250))
[1] 11250
> length(sample(25000, 25000*.45))
[1] 11250

So the question is, why do I get 11249 out of the first command and not 11250? I can't figure this one out.

Thanks

Cory
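A likely explanation, offered as a sketch rather than a definitive diagnosis: 0.55 has no exact binary representation, so 25000*(1-.55) evaluates to a number very slightly below 11250. It prints as 11250 at the default number of digits, but sample() truncates a non-integer size toward zero. Rounding the size explicitly sidesteps this. The names n_raw and n_ok are only illustrative.

```r
n_raw <- 25000 * (1 - 0.55)

print(n_raw, digits = 17)     # more digits reveal it is not exactly 11250
n_raw < 11250                 # the product falls just short of 11250
as.integer(n_raw)             # truncation toward zero, as sample() does with size

n_ok <- round(n_raw)          # round to the intended whole number first
length(sample(25000, n_ok))
```

This is the behaviour described in R FAQ 7.31 ("Why doesn't R think these numbers are equal?").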
[R] attributable cost estimation using aggregate data
Hello,

I am facing an unusual problem of using aggregate data in order to estimate the attributable cost of a disease, for different stages. My data set consists of mean and std estimates of the cost outcome corresponding to strata coming from cross-classification of a set of factors (age group, gender, comorbidity etc.), as well as the number of observations in those strata. Those estimates are separate for the controls and cases (of more than one disease level). Some strata have only controls or only cases, and some have only one observation, so no estimate for sd. So in most cases (except for the "atomic" strata) individual patient data are not available.

For example, the data set is something like

disease.level  stratum   cost.mean    cost.sd  n.cases
            2     STR1  156359.070         NA        1
            0     STR1    6298.799   6995.153       53
            0     STR2    9892.051  11378.500       38
            1     STR3   24264.470  35450.673       14
            0     STR4   10946.446  15472.971       81
            0     STR5   17095.066  20558.138       50
            2     STR5   44130.380         NA        1
            0     STR6   15979.599  17771.120       41

where disease level 0 indicates control.

I am interested in the estimation of the coefficients for the different disease levels. Since cost is usually very skewed to the right, gamma or log-normal is usually preferred to normal distribution. There is also known heteroscedasticity (higher mean = higher variance) and heterogeneity between strata.

I was thinking of applying some of the approaches for meta-analysis, and perhaps a random effects model, addressing those issues. I was referred to lme but I am not sure if it is appropriate (I have no experience with it), or if other methods (e.g. Bayesian hierarchical models with WinBUGS) would be preferred.

Any lead or suggestion would be highly appreciated.

Thanks,
Nicholas
[R] avoid a loop
Let's suppose I have userids and associated attributes... columns a and b

a <- c(1,1,1,2,2,3,3,3,3)
b <- c("a","b","c","a","d","a", "b", "e", "f")

so a unique list of a would be

id <- unique(a)

I want a matrix like this...

     [,1] [,2] [,3]
[1,]    3    1    2
[2,]    1    2    1
[3,]    2    1    4

Where element i,j is the number of items in b that id[i] and id[j] share...

So for example, in element [1,3] of the result matrix, I want to see 2. That is, id's 1 and 3 share two common elements in b, namely "a" and "b".

This is hard to articulate, so sorry for the terrible description here. The way I have solved it is to do a double loop, looping over every member of the id column and comparing it to every other member of id to see how many elements of b they share. This takes forever.

Thanks

cn
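One loop-free possibility, sketched here and not taken from the thread: build the id-by-attribute incidence matrix with table() and multiply it by its own transpose. Entry [i, j] of the product counts the attributes that id i and id j both have, and the diagonal holds each id's own attribute count. The names inc and shared are only illustrative.

```r
a <- c(1, 1, 1, 2, 2, 3, 3, 3, 3)
b <- c("a", "b", "c", "a", "d", "a", "b", "e", "f")

# Incidence matrix: one row per id, one column per attribute.
inc <- unclass(table(a, b))
inc[inc > 0] <- 1L            # guard against repeated (id, attribute) pairs

# Row-by-row overlap counts: shared[i, j] = number of common attributes.
shared <- inc %*% t(inc)
shared
```

For many ids and attributes a sparse representation (for example via the Matrix package) may be needed, but that is outside this small sketch.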
Re: [R] Empty data frame does not maintain column type
Thanks to all three of you for responding.

Brilliant answer there, David - using the options() function as exemplified to set stringsAsFactors=FALSE universally solved the issue.

Much appreciated, guys.

On 6 October 2010 18:27, David Winsemius wrote:
>
> On Oct 6, 2010, at 1:00 PM, N David Brown wrote:
>
>> Does anyone know why a data frame created with empty character columns
>> converts them to integer columns?
>
> Quick answer: it's the stringsAsFactors demon, but you have invoked that demon
> twice, once with data.frame and the second time with rbind. See if this
> helps:
>
> > zz <- factor(1:2)
> > typeof(zz)
> [1] "integer"   # it's the storage mode
>
> > df<-data.frame(a=character(0),b=character(0), stringsAsFactors=FALSE)
> > df<-rbind(df,c("a","a"))
> > typeof(df[1,1])
> [1] "integer"   # curses, foiled again!
>
> # I tried using stringsAsFactors=FALSE in the rbind call but got garbage:
>
> > df<-data.frame(a=character(0),b=character(0), stringsAsFactors=FALSE)
> > df<-rbind(df,c("a","a"), stringsAsFactors=FALSE)
> > df
>                  c..aFALSE.. c..aFALSE...1
> 1                a           a
> stringsAsFactors FALSE       FALSE
>
> # You can set the global stringsAsFactors option since it appears that your
> # rbind invocation called out the devil again via the rbind.data.frame
> # function.
> # So this is how you would prevent that behavior:
>
> > options(stringsAsFactors= FALSE)
> > df<-data.frame(a=character(0),b=character(0))
> > df<-rbind(df,c("a","a"))
> > str(df)
> 'data.frame': 1 obs. of 2 variables:
>  $ X.a.  : chr "a"
>  $ X.a..1: chr "a"
>
> --
> david
>
>>
>>> df<-data.frame(a=character(0),b=character(0))
>>> df<-rbind(df,c("a","a"))
>>> typeof(df[1,1])
>>
>> [1] "integer"
>>
>> AsIs doesn't help:
>>
>>> df<-data.frame(a=I(character(0)),b=I(character(0)))
>>> df<-rbind(df,I(c("a","a")))
>>> typeof(df[1,1])
>>
>> [1] "integer"
>>
>> Any suggestions on how to overcome this would be appreciated.
>>
>> Best wishes,
>>
>> David
>
> David Winsemius, MD
> West Hartford, CT
[R] Empty data frame does not maintain column type
Does anyone know why a data frame created with empty character columns converts them to integer columns?

> df<-data.frame(a=character(0),b=character(0))
> df<-rbind(df,c("a","a"))
> typeof(df[1,1])
[1] "integer"

AsIs doesn't help:

> df<-data.frame(a=I(character(0)),b=I(character(0)))
> df<-rbind(df,I(c("a","a")))
> typeof(df[1,1])
[1] "integer"

Any suggestions on how to overcome this would be appreciated.

Best wishes,

David
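A variant that does not depend on the global option, offered as a minimal sketch: append each new row as a one-row data frame with the same column names, and pass stringsAsFactors = FALSE explicitly in both calls. The name new_row is made up for this example. (From R 4.0.0 onwards FALSE is already the default for stringsAsFactors, so the explicit argument mainly matters on older versions.)

```r
# Start with an empty data frame whose columns are character
df <- data.frame(a = character(0), b = character(0),
                 stringsAsFactors = FALSE)

# Append a row as a one-row data frame with matching column names,
# again keeping strings as strings rather than factors
new_row <- data.frame(a = "a", b = "a", stringsAsFactors = FALSE)
df <- rbind(df, new_row)

typeof(df[1, 1])
str(df)
```

Binding a data frame rather than a bare character vector also keeps the column names a and b instead of generated ones such as X.a.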
Re: [R] C module causing rounding errors?
Sorry for not including enough information everyone. I have quite a bit of code, so I will just enter relevant pieces...

This is how I call C from R. The tstats are t-statistics (difference of means, divided by sqrt of S1+S2) from an unpermuted matrix. The C code is below...

dyn.load("testp.so")
obj<-.C("testp",ptests=as.array(permuted_ttests),as.integer(B),permuted1=as.array(permuted),matrix=as.array(Imatrix), as.integer(ncols), as.integer(nrows),as.integer(g),as.array(abs(tstats)),pvalues=as.array(ps))

To test, I decided not to permute my matrix and just send it the original matrix, Imatrix. Everything in C is a double, or if I use an integer, I cast it to a double (to divide and get the mean, etc). I then compare the values of the tstats that I sent into C and the tstats I calculate within C...

The following is my C code:

void testp(double *permuted_ttests,int *B,double *permuted,double *Imatrix,int *nc,int *nr,int *g,double *Tinitial,double *ps)
{

after which, using the variables above, I take the mean of certain elements (currently the unpermuted matrix, to test) via other functions using the same doubles and pointers, store them in C1 and C2, solve for an absolute T statistic, and print it (I erased the Ts earlier).

for (i=0; i<*nr; i++){
    xbardiff = C1[i][0]-C2[i][0];
    denom = sqrt(C1[i][2]+C2[i][2]);
    Ts[i]=fabs(xbardiff/denom);
    Rprintf("%f Ts\n",Ts[i]);
    if (fabs(Ts[i])>fabs(Tinitial[i])){ //summing of permuted_ttests
        counter[i]++;
        Rprintf("ts %f and tinitial %f \n", Ts[i],Tinitial[i]);
    }
etc...

The issue here is that for a few values, when printed, it appears that they were rounded up - causing my counter[i] to ++ in some cases.

Should I send more code?

Sorry and thank you very much,

On Fri, Jul 9, 2010 at 6:18 AM, Duncan Murdoch wrote:
> Joseph N. Paulson wrote:
>
>> Hi all!
>>
>> I am currently writing a C-module for a for loop in which I permute
>> columns in a matrix (bootstrapping) and I send a couple of variables
>> into C initially. All of it is working, except the initial values I send
>> to R are rounded/truncated (I believe rounded).
>>
>> I am using a 32 bit machine to compile, I am using (I believe) 32 bit R
>>
>> While debugging I print the values I am sending to C, and then I print the
>> same values using Rprintf and the number gets rounded to 10^-6, which is
>> actually causing some errors for me. Is there any way to correct/prevent
>> the error?
>>
>> sample output from R
>>
>> [1,] 1.000
>> [2,] 1.0256242
>> [3,] 1.1826277
>> [4,] -0.6937246
>> [5,] 1.3633604
>>
>> sample output from C
>> 1.00
>> 1.025624
>> 1.182628
>> 0.693725
>> 1.363360
>>
>
> It looks as though you are confusing the display of numbers with their
> internal values. R is printing 7 decimal places, C is printing 6. As far
> as we can tell, that's the only difference.
>
> Duncan Murdoch

--
- Joseph N. Paulson
[R] C module causing rounding errors?
Hi all!

I am currently writing a C-module for a for loop in which I permute columns in a matrix (bootstrapping) and I send a couple of variables into C initially. All of it is working, except the initial values I send to R are rounded/truncated (I believe rounded).

I am using a 32 bit machine to compile, I am using (I believe) 32 bit R

While debugging I print the values I am sending to C, and then I print the same values using Rprintf and the number gets rounded to 10^-6, which is actually causing some errors for me. Is there any way to correct/prevent the error?

sample output from R

[1,] 1.000
[2,] 1.0256242
[3,] 1.1826277
[4,] -0.6937246
[5,] 1.3633604

sample output from C
1.00
1.025624
1.182628
0.693725
1.363360
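The difference between displayed and stored values can be checked from R itself, since sprintf() follows the same "%f" convention as Rprintf() in C. This is only an illustration using four of the values from the post; it does not reproduce the C module.

```r
x <- c(1.0256242, 1.1826277, -0.6937246, 1.3633604)

# "%f" shows 6 digits after the decimal point, as in the C output
sprintf("%f", x)

# Asking for more digits shows that the stored values were not rounded
sprintf("%.15f", x)

# The 6-digit display differs from the stored value by less than 1e-6
abs(x - as.numeric(sprintf("%f", x)))
```

On the C side, a format such as "%.15f" in Rprintf() would show the extra digits in the same way, which makes it easier to see whether two statistics that print identically at 6 decimals really are equal.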
Re: [R] Idiomatic looping over list name, value pairs in R
Thank you. Your response was enlightening.

On Tue, May 4, 2010 at 11:38 PM, Duncan Murdoch wrote:
> On 04/05/2010 10:24 AM, Luis N wrote:
>>
>> Considering the python code:
>>
>> for k, v in d.items(): do_something(k); do_something_else(v)
>>
>> I have the following for R:
>>
>> for (i in c(1:length(d))) { do_something(names(d[i]));
>> do_something_else(d[[i]]) }
>>
>> This does not seem idiomatic. What is the best way of doing the
>> same with R?
>>
>
> You could do it as
>
> for (name in names(d)) {
>   do_something(name)
>   do_something_else(d[[name]])
> }
>
> or
>
> sapply(names(d), function(name) {
>   do_something(name)
>   do_something_else(d[[name]])
> })
>
> or
>
> do_both <- function(name) {
>   do_something(name)
>   do_something_else(d[[name]])
> }
> sapply(names(d), do_both)
>
> My choice would be the first version, but yours might differ.
>
> Duncan Murdoch
[R] Idiomatic looping over list name, value pairs in R
Considering the python code:

for k, v in d.items(): do_something(k); do_something_else(v)

I have the following for R:

for (i in c(1:length(d))) { do_something(names(d[i])); do_something_else(d[[i]]) }

This does not seem idiomatic. What is the best way of doing the same with R?

Thanks.

Luis
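Two further ways to walk over names and values together, as a small sketch. The list d and the bodies of do_something() and do_something_else() are made-up stand-ins, since the post does not say what they do.

```r
# Made-up named list standing in for the Python dict d
d <- list(alpha = 1, beta = "two", gamma = 3)

# Made-up stand-ins for the two functions in the question
do_something      <- function(k) toupper(k)
do_something_else <- function(v) format(v)

# seq_along() is safer than 1:length(d), which gives c(1, 0) for an empty list
for (i in seq_along(d)) {
  cat(do_something(names(d)[i]), "->", do_something_else(d[[i]]), "\n")
}

# Functional form: mapply() passes each name and its value in parallel
out <- mapply(function(k, v) paste(do_something(k), do_something_else(v)),
              names(d), d)
out
```

The mapply() form is the closest analogue of iterating over d.items(), because the function receives the key and the value as separate arguments.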
[R] installing R2.11 on RHEL5.3
I am trying to install R 2.11 on RHEL 5.3. The main code and base packages compile fine and get installed to /usr/local/R211 (which I set using --prefix). I would like to install a set of contrib packages (car, ggplot2, etc.) to the same location. Once R is fired up, I can run install.packages(), but the packages get installed in ~/R/...

Is there any way of installing these packages into the main R installation? There are some packages which all of us in the group use, and we would prefer not to repeat the installation for each user. (If sudo R CMD INSTALL is the way to do it, is there a way in which I can specify just the package names and have it figure out the dependencies without me having to download all the tarballs manually?)

Thanks.
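A possible approach, sketched under the assumption that R is started by a user who can write to the installation's own library: install.packages() takes a lib argument, and with dependencies = TRUE it fetches the required packages from the repository itself. The helper name install_site and the mirror URL are choices made for this example, not something from the post.

```r
# Hypothetical helper: install into the library that ships with this R
# build (.Library) instead of a per-user library under ~/R
install_site <- function(pkgs, lib = .Library,
                         repos = "https://cloud.r-project.org") {
  install.packages(pkgs, lib = lib, repos = repos, dependencies = TRUE)
}

# Where packages would end up, and the current search path for libraries
.Library
.libPaths()

# Not run here: start R with sufficient privileges, then
# install_site(c("car", "ggplot2"))
```

Packages installed this way are visible to every user of that R installation, because .Library is always part of .libPaths().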
Re: [R] write.csv size limit in R 2.11.0 -- crashes
The latest patched build 51822 fixed the write.csv problem:

http://cran.r-project.org/bin/windows/base/rpatched.html

On 4/26/2010 2:10 PM, Duncan Murdoch wrote:
> On 26/04/2010 4:25 PM, N Klepeis wrote:
>> Hi,
>>
>> I just installed R 2.11.0 Win32 and tried to use write.csv (or
>> write.table) to write a 121000x26 data frame. This crashes R.
>>
>> The dataframe was written OK in R 2.10.1.
>>
>> I tried up to 108000 rows and the file was written OK. But then going
>> to 109000 causes the crash.
>>
>> Anyone else see this? I'll gather some more info before reporting a bug.
>
> Please try R 2.11.0 patched. There was a problem in the date/time
> formatting routines which I fixed a few days ago in revision 51811 that
> could be involved here. (You may need to wait a few days: CRAN is
> offline, and the mirrors I checked hadn't updated to this version yet.)
> You could also try an R-devel build; it never had the bug.
>
> Duncan Murdoch
>
> P.S. Bug discussions are generally better in R-devel rather than R-help
[R] write.csv size limit in R 2.11.0 -- crashes
Hi, I just installed R 2.11.0 Win32 and tried to use write.csv (or write.table) to write a 121000x26 data frame. This crashes R. The dataframe was written OK in R 2.10.1. I tried up to 108000 rows and the file was written OK. But then going to 109000 causes the crash. Anyone else see this? I'll gather some more info before reporting a bug. --Neil
Re: [R] BRugs
Thanks for the reply Bob, but it still does not work, you see. I ran this model, just with the main effects, and it ran fine.

n=length(bi.bmi)
Lgen=2
Lrace=5
Lagegp=13
Lstra=15
Lpsu=2

bi.bmi.model=function(){
  # likelihood
  for(i in 1:n){
    bi.bmi[i]~ dbern(p[i])
    logit(p[i])<- a0 + a1[agegp[i]]+a2[gen[i]]+a3[race[i]] + g[stra[i]]+ u[psu[i],stra[i]]
  }
  # constraints for a1, a2, a3
  a1[1]<-0.0
  a2[1]<-0.0
  a3[1]<-0.0
  # priors
  a0~ dnorm(0.0, 1.0E-4)
  for(j in 2:Lagegp){a1[j]~ dnorm(0.0, 1.0E-4)}
  for(j in 2:Lgen){ a2[j]~ dnorm(0.0, 1.0E-4)}
  for(k in 2:Lrace){ a3[k]~ dnorm(0.0, 1.0E-4)}
  for(l in 1:Lstra){ g[l]~dunif(0, 100) }
  for( m in 1:Lpsu){
    for(l in 1:Lstra){
      u[m,l]~ dnorm(0.0, tau.u)
    }}
  tau.u<-pow(sigma.u, -2)
  sigma.u~ dunif(0.0,100)
}

library(BRugs)
writeModel(bi.bmi.model, con='bi.bmi.model.txt')
model.data=list( 'n','Lagegp', 'Lgen', 'Lrace', 'Lstra', 'Lpsu', 'bi.bmi','agegp', 'gen', 'race','stra', 'psu')
model.init=function(){
  list( sigma.u=runif(1), a0=rnorm(1), a1=c(NA, rep(0,12)), a2=c(NA, rep(0, 1)), a3=c(NA, rep(0, 4)), g=rep(0,Lstra), u=matrix(rep(0, 30), nrow=2))
}
model.parameters=c( 'a0', 'a1', 'a2', 'a3')
model.bugs=BRugsFit(modelFile='bi.bmi.model.txt', data=model.data, inits=model.init, numChains=1, para=model.parameters, nBurnin=50, nIter=100)

This is just with the main effects, and this does not give me any problems. I also ran the following model with an interaction term between gen and race, and it also ran fine.

  for (i in 1:n){
    bi.bmi[i]~ dbern(p[i])
    logit(p[i])<- a0 + a1[agegp[i]]+a2[gen[i]]+a3[race[i]] + a23[gen[i], race[i]] + gam[stra[i]]+ u[psu[i],stra[i]]
  }
  # constraints for a2, a3, a12 and a13
  a1[1]<-0.0
  a2[1]<-0.0
  a3[1]<-0.0
  a23[1,1]<-0.0 #gen x race
  for(j in 2:Lrace){ a23[1,j]<-0.0}
  for(k in 2:Lgen){ a23[k,1]<-0.0}
  # priors
  a0~ dnorm(0.0, 1.0E-4)
  for(i in 2:Lagegp){a1[i]~dnorm(0.0, 1.0E-4)}
  for(i in 2:Lgen){ a2[i]~ dnorm(0.0, 1.0E-4)}
  for(i in 2:Lrace){ a3[i]~ dnorm(0.0, 1.0E-4)}
  for(i in 2:Lgen){
    for(j in 2:Lrace){
      a23[i,j]~ dnorm(0.0, 1.0E-4)
    }}
  for(i in 1:Lstra){ gam[i]~dunif(0, 1000) }
  for( i in 1:Lpsu){
    for(j in 1:Lstra){
      u[i,j]~ dnorm(0.0, tau.u)
    }}
  tau.u<-pow(sigma.u, -2)
  sigma.u~ dunif(0.0,100)
}

So, the error happens only when I try to plug in an interaction with agegp. I still don't know how to correct it.

Thanks
[R] question on 'within' and 'parse' commands
Hi,

Why can't I pass an expression to `within' by way of textual input to the 'parse' function? e.g.,

> x <- data.frame(a=1:5,b=LETTERS[1:5])
> x
  a b
1 1 A
2 2 B
3 3 C
4 4 D
5 5 E
> within(x, parse(text="a<-a*10; b<-2:6"))
  a b
1 1 A
2 2 B
3 3 C
4 4 D
5 5 E
> within(x, parse(text="a<-a*10; b<-2:6")[[1]])
  a b
1 1 A
2 2 B
3 3 C
4 4 D
5 5 E

This would be very useful to allow for arbitrary evaluation of multi-line commands at runtime.

Of course, I can edit the 'within.data.frame' function as follows, but isn't there some way to make 'within' more generally like the 'eval' command?

alternative:

within.data.frame <- function (data, textCMD, ...)
{
    parent <- parent.frame()
    e <- evalq(environment(), data, parent)
    eval(parse(text=textCMD), e)   # used to be eval(substitute(expr), e)
    l <- as.list(e)
    l <- l[!sapply(l, is.null)]
    nD <- length(del <- setdiff(names(data), (nl <- names(l))))
    data[nl] <- l
    if (nD)
        data[del] <- if (nD == 1) NULL else vector("list", nD)
    data
}

--Neil
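A possible explanation and workaround, offered as a sketch rather than the definitive answer: within() evaluates the expression it is given, and evaluating parse(text = ...) merely returns an unevaluated expression object, so nothing is assigned. Wrapping the parsed text in eval() makes the assignments happen inside the environment that within() builds from the data frame. The variable names cmd and y are made up here.

```r
x <- data.frame(a = 1:5, b = LETTERS[1:5])

# Commands supplied as text at run time
cmd <- "a <- a * 10; b <- 2:6"

# eval() runs the parsed commands in the environment set up by within(),
# so the modified a and b are copied back into the data frame
y <- within(x, eval(parse(text = cmd)))
y
```

This leaves within.data.frame() untouched. The usual caution applies: evaluating arbitrary text is only as safe as the source of that text.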
[R] rcart - classification and regression trees (CART)
Hi,

I am trying to use CART to find an ideal cut-off value for a simple diagnostic test (i.e. when the test score is above x, diagnose the condition). When I put in the model

fit=rpart(outcome ~ predictor1(TB144), method="class", data=data8)

sometimes it gives me a tree with multiple nodes for the same predictor (see below for examples of trees with one or multiple nodes). Is there a way to tell it to make only 1 node? Or is it safe to assume that the cut-off value on the primary node is the ideal cut-off?

Thanks!
Katie

http://n4.nabble.com/file/n964970/smartDNA%2BCART%2B-%2BTB144n.jpg
http://n4.nabble.com/file/n964970/smartDNA%2BCART%2B-%2BTB122n.jpg
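One way to restrict the tree to a single split is the maxdepth setting of rpart.control(). The sketch below uses the kyphosis data that ships with rpart as a stand-in for data8, which is not available here, so the variable names differ from those in the post.

```r
library(rpart)   # recommended package, normally installed with R

# kyphosis stands in for the poster's data8; Start plays the role of TB144
fit <- rpart(Kyphosis ~ Start, data = kyphosis, method = "class",
             control = rpart.control(maxdepth = 1))

fit          # at most a root node and its two children
fit$splits   # the single cut-off chosen for Start, if a split was made
```

Whether that one cut-off is "ideal" depends on the criterion: rpart chooses it to reduce impurity, which need not coincide with, say, the threshold that maximises sensitivity plus specificity.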
Re: [R] rpart - classification and regression trees (CART)
Actually, that's the first thing I thought too, but they weren't listed in that order in my model statement (the model that I used is below):

fit=rpart(pres ~ TB144 + TB118 + TB129 + TB139 + TB114 + TB131 + TB122, method="class", data=data8)

Would the selection of the best split when improvement is the same have anything to do with the Gini Index? I read on another site that the best split is determined by the amount of homogeneity (or impurity as measured by the Gini Index) resulting from a split (more homogeneity is better). TB122 does have less variability (i.e. smaller standard deviation around the mean) than the others; could that be why it was chosen despite having the same "level of merit" as the other predictors?

Therneau, Terry M., Ph.D. wrote:
>
> When two variables have exactly the same figure of merit, they will be
> listed in the output in the same order in which they appeared in your
> model statement.
>    Terry Therneau
>
> -- begin inclusion ---
> I had a question regarding the rpart command in R. I used seven
> continuous predictor variables in the model and the variable called
> "TB122" was chosen for the first split. But in looking at the output,
> there are 4 variables that improve the predicted membership equally
> (TB122, TB139, TB144, and TB118) - output pasted below.
>
> Node number 1: 268 observations, complexity param=0.6
>   predicted class=0 expected loss=0.3
>     class counts: 197 71
>    probabilities: 0.735 0.265
>   left son=2 (188 obs) right son=3 (80 obs)
>   Primary splits:
>     TB122 < 80 to the left, improve=50, (0 missing)
>     TB139 < 90 to the left, improve=50, (0 missing)
>     TB144 < 90 to the left, improve=50, (0 missing)
>     TB118 < 90 to the left, improve=50, (0 missing)
>     TB129 < 100 to the left, improve=40, (0 missing)
>
> --- end inclusion ---
[R] rpart - classification and regression trees (CART)
Hi,

I had a question regarding the rpart command in R. I used seven continuous predictor variables in the model and the variable called "TB122" was chosen for the first split. But in looking at the output, there are 4 variables that improve the predicted membership equally (TB122, TB139, TB144, and TB118) - output pasted below.

Node number 1: 268 observations, complexity param=0.6
  predicted class=0 expected loss=0.3
    class counts: 197 71
   probabilities: 0.735 0.265
  left son=2 (188 obs) right son=3 (80 obs)
  Primary splits:
    TB122 < 80 to the left, improve=50, (0 missing)
    TB139 < 90 to the left, improve=50, (0 missing)
    TB144 < 90 to the left, improve=50, (0 missing)
    TB118 < 90 to the left, improve=50, (0 missing)
    TB129 < 100 to the left, improve=40, (0 missing)

I need to know what method R is using to select the best variable for the node. Somewhere I read that the best split = greatest improvement in predictive accuracy = maximum homogeneity of yes/no groups resulting from the split = reduction of impurity. I also read that the Gini index, Chi-square, or G-square can be used to evaluate the level of impurity.

For this function in R:

1) Why exactly did R pick TB122 over the other variables despite the fact that they all had the same level of improvement? Was TB122 chosen to be the first node because the groups "TB122<80" and "TB122>80" were the most homogeneous (i.e. had the least impurity)?

2) If R is using impurity to determine the best nodes, which method (the Gini index, Chi-square, or G-square) is R using?

Thanks!
Katie
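On question 2: for method = "class", rpart uses the Gini index unless told otherwise, and the parms argument accepts split = "information" as the alternative. A short sketch follows, again with the kyphosis data shipped with rpart standing in for data8; the object names fit_gini and fit_info are made up.

```r
library(rpart)

# Default splitting criterion for classification trees: Gini index
fit_gini <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
                  method = "class", parms = list(split = "gini"))

# Alternative criterion: information (entropy)
fit_info <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
                  method = "class", parms = list(split = "information"))

# The "improve" column holds the impurity reduction for each candidate split
head(fit_gini$splits)
head(fit_info$splits)
```

Note that the improve values in printed summaries are rounded, so candidates shown as tied (improve=50 in the output above) may in fact differ in digits that are not displayed; inspecting the splits matrix directly shows the unrounded values.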
[R] histbackback function
Hi,

I'm trying to recreate a sensitivity-specificity graph using the histbackback function. The only problem is that these graphs are typically drawn with vertical rather than horizontal bar plots (and the histbackback function only seems to work with the horiz=TRUE argument; using "horiz=FALSE" doesn't work). Does anyone know if:

1) there's a different graphing function that would accomplish this graph?
2) if there isn't, is there a way to rotate the graph 90 degrees clockwise?
3) with the histbackback function, is there a way to display percent instead of proportion on the x-axis?
4) with the histbackback function, is there a way to set the width of the bins (i.e. I want bars in increments of 10 or 20 instead of 50; "bin=10" didn't work)?
5) with the histbackback function, is there a way to display the y-axis values where the two histograms are back to back, rather than on the y-axis?

Below, I've pasted the graph that I'm trying to recreate, as well as the code and graph from my current (unsuccessful) attempt.

options(digits=1)
require(Hmisc)
out <- histbackback(split(data8$TB144, data8$PRES), probability=TRUE, main="Back to Back Histogram", ylab="TB144 1min rates")
barplot(-out$left, col="ivory2", horiz=TRUE, space=0, add=TRUE, axes=FALSE)
barplot(out$right, col="ivory4", horiz=TRUE, space=0, add=TRUE, axes=FALSE)
grid(nx=NULL, ny = NULL, col = "lightgray", lty = "dotted", lwd = par("lwd"), equilogs = TRUE)

http://n4.nabble.com/file/n954888/sens-spec%2Bgraph.jpg
http://n4.nabble.com/file/n954888/back-to-back%2Bhistogram%2B2.jpg

Thanks!
Katie
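A base-graphics alternative that addresses points 1, 3 and 4, given as a rough sketch: compute both histograms on common breaks of width 10, convert counts to percentages, and draw one group upwards and the other downwards with barplot(). The data are simulated because data8 is not available, and all object names are made up for this example.

```r
set.seed(1)
score <- c(rnorm(100, 60, 15), rnorm(100, 90, 15))  # simulated test scores
group <- rep(c(0, 1), each = 100)                    # simulated outcome

# Common breaks in steps of 10 that cover the whole range of scores
breaks <- seq(floor(min(score) / 10) * 10,
              ceiling(max(score) / 10) * 10, by = 10)

h0 <- hist(score[group == 0], breaks = breaks, plot = FALSE)
h1 <- hist(score[group == 1], breaks = breaks, plot = FALSE)

# Percent of each group falling in each bin
pct0 <- 100 * h0$counts / sum(h0$counts)
pct1 <- 100 * h1$counts / sum(h1$counts)
lim  <- max(pct0, pct1)

# Group 1 is drawn upwards, group 0 downwards, sharing the score axis
barplot(pct1, ylim = c(-lim, lim), space = 0, col = "ivory4",
        axes = FALSE, xlab = "Score", ylab = "Percent of group")
barplot(-pct0, space = 0, col = "ivory2", add = TRUE, axes = FALSE)

# Label both directions with positive percentages and show the bin edges
ticks <- pretty(c(-lim, lim))
axis(2, at = ticks, labels = abs(ticks))
axis(1, at = seq_along(breaks) - 1, labels = breaks)
```

With space = 0 each bar has width 1, so bin edges sit at 0, 1, 2, ..., which is what the axis(1, ...) call relies on.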
Re: [R] Which "apply" function to use?
Excellent - the "as.data.frame" trick was just what I needed!

Many thanks,
Nick

From: baptiste auguie [baptiste.aug...@googlemail.com]
Sent: 14 September 2009 17:48
To: Masca, N.
Cc: r-h...@stat.math.ethz.ch
Subject: Re: [R] Which "apply" function to use?

Hi,

try this,

rowMeans(as.data.frame(Coefs))
# or
apply(as.data.frame(Coefs), 1, mean)

HTH,
baptiste

2009/9/14 Masca, N. <nm...@leicester.ac.uk>

Dear All,

I have a problem which *should* be pretty straightforward to resolve - but I can't work out how!

I have a list of 3 coefficient estimates for 4 different datasets:

Coefs<-list(c(1,0.6,0.5),c(0.98,0.65,0.4),c(1.05,0.55,0.45),c(0.99,0.50,0.47))

All I want to do is take the sum (or mean) of each coefficient across the 4 datasets. I can do this using a "for" loop, but it gets very messy - as I need to do this several times I was hoping someone might have a better solution using one of the "apply" functions.

Any ideas?

Many thanks for any help you can provide.

Cheers,
Nick

--
Baptiste Auguié
School of Physics
University of Exeter
Stocker Road, Exeter, Devon, EX4 4QL, UK
http://newton.ex.ac.uk/research/emag
[R] Which "apply" function to use?
Dear All,

I have a problem which *should* be pretty straightforward to resolve - but I can't work out how!

I have a list of 3 coefficient estimates for 4 different datasets:

Coefs<-list(c(1,0.6,0.5),c(0.98,0.65,0.4),c(1.05,0.55,0.45),c(0.99,0.50,0.47))

All I want to do is take the sum (or mean) of each coefficient across the 4 datasets. I can do this using a "for" loop, but it gets very messy - as I need to do this several times I was hoping someone might have a better solution using one of the "apply" functions.

Any ideas?

Many thanks for any help you can provide.

Cheers,
Nick
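Another way to do this, shown as a small sketch with the list from the post: stack the vectors into a matrix with one row per dataset and use colSums() or colMeans(). The name m is made up for this example.

```r
Coefs <- list(c(1, 0.6, 0.5), c(0.98, 0.65, 0.4),
              c(1.05, 0.55, 0.45), c(0.99, 0.50, 0.47))

# One row per dataset, one column per coefficient
m <- do.call(rbind, Coefs)

colSums(m)    # sum of each coefficient across the 4 datasets
colMeans(m)   # mean of each coefficient across the 4 datasets

# The same mean without building a matrix: add the vectors elementwise
Reduce(`+`, Coefs) / length(Coefs)
```

The matrix form is convenient when other column-wise summaries (for example apply(m, 2, sd)) are needed as well.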
[R] binary digit
Does R have a package for working with binary numbers: arithmetic operations, conversion to/from decimal, etc.? I have not found one, but I think CRAN must contain it.
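Some of this is available in base R without an extra package, as the sketch below shows: strtoi() converts a binary string to an integer, intToBits() exposes the bits of an integer, and bitwAnd(), bitwOr() and bitwXor() do bitwise arithmetic (the bitw* functions exist in R 3.0.0 and later). The helper to_binary() is made up for this example.

```r
# Binary string -> integer
strtoi("1011", base = 2)

# Integer -> binary string of a given width (hypothetical helper)
to_binary <- function(x, width = 8) {
  bits <- as.integer(intToBits(x))[seq_len(width)]  # least significant first
  paste(rev(bits), collapse = "")
}
to_binary(11L)

# Bitwise operations on integers
bitwAnd(12L, 10L)
bitwOr(12L, 10L)
bitwXor(12L, 10L)
```

Ordinary arithmetic can then be done on the integers and the result converted back with to_binary().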
[R] configure encoding by default
I want to use Russian letters in my diagrams. I currently do it in this manner:

m <- "заголовок"
Encoding(m) <- "UTF-8"
plot(1,1,main=m)

But this is not convenient. How can I configure R to use UTF-8 for all strings, so that I can work without the Encoding function, as in

plot(1,1,main="заголовок")
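A small sketch of how to check what the session is doing, which may help narrow the problem down. If R runs in a UTF-8 locale, string literals typed in that session normally need no extra marking; l10n_info() reports whether that is the case. The \u escapes below spell the same word as in the post and are always marked as UTF-8, whatever the locale.

```r
# TRUE when the session runs in a UTF-8 locale
l10n_info()[["UTF-8"]]

# The word from the post written with \u escapes, so that this example
# does not depend on the encoding of the script file
m <- "\u0437\u0430\u0433\u043e\u043b\u043e\u0432\u043e\u043a"
Encoding(m)
nchar(m)

# plot(1, 1, main = m)   # not run here
```

If l10n_info() reports FALSE, starting R under a UTF-8 locale (how to do that depends on the operating system) is the usual way to avoid marking each string by hand.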
Re: [R] Problem in installation of "Rgraphviz" package
Hi there,

I could install "Rgraphviz" in R version 2.6. I could not install it in R version 2.7 or 2.8. Please try version 2.6.

ram basnet wrote:
>
> Dear R users,
>
> I am not so used to this R software. I have to use the package
> "Rgraphviz" but found some problem in the installation process. I downloaded
> this package and stored it in the R library but I am not getting this package
> in the R installation list.
> I searched on Google and used the following commands:
>
> ###
> source("http://bioconductor.org/biocLite.R")
> biocLite("Rgraphviz")
>
> set.seed(123)
> V <- letters[1:10]
> M <- 1:4
> g1 <- randomGraph(V, M, 0.2)
> library("graph")
> library("grid")
> library("Rgraphviz")
> ###
>
> I got the following error message:
>
> "
> Error in inDL(x, as.logical(local), as.logical(now), ...) :
>   unable to load shared library
> 'C:/PROGRA~1/R/R-27~1.0/library/Rgraphviz/libs/Rgraphviz.dll':
>   LoadLibrary failure: The specified module could not be found.
> Error : .onLoad failed in 'loadNamespace' for 'Rgraphviz'
> Error: package/namespace load failed for 'Rgraphviz'
> "
> Maybe some users can recognize this problem; I would be grateful for solutions.
>
> Thanks in advance.
>
> Sincerely,
>
> Ram Kumar Basnet
> Graduate student,
> Wageningen University,
> The Netherlands.
[R] how to sort data frame order by column?
I have a data frame, for example

> dat <- data.frame(a=rnorm(5),b=rnorm(5),c=rnorm(5))
           a            b          c
1 -0.1731141  0.002453991  0.1180976
2  1.2142024 -0.413897606  0.7617472
3 -0.9428484 -0.609312786  0.5132441
4  0.1343336  0.178208961  0.7509650
5 -0.1402286 -0.333476839 -0.4959459

How can I make dat2 from dat, where the rows of the source data frame are ordered by any given column?
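The usual base R idiom is to index the rows with order(). A minimal sketch follows; the seed and the helper name sort_by_col are made up for this example.

```r
set.seed(42)
dat <- data.frame(a = rnorm(5), b = rnorm(5), c = rnorm(5))

# Rows ordered by column a, ascending
dat2 <- dat[order(dat$a), ]

# Rows ordered by column b, descending
dat3 <- dat[order(dat$b, decreasing = TRUE), ]

# Hypothetical helper: order by a column given by name
sort_by_col <- function(df, col) df[order(df[[col]]), , drop = FALSE]
sort_by_col(dat, "c")
```

order() returns the row indices in sorted order, so the original row names are kept and show where each row came from.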
Re: [R] there are fontencoding problem in Sweave
2009/4/17 Peter Dalgaard
> Doesn't \usepackage[noae]{Sweave} do the trick? Sweave.sty has this
> conditionalized.

Yes, it is working. Thanks
[R] there are fontencoding problem in Sweave
I want to write an article in Russian using Sweave. For Cyrillic text LaTeX uses the T2A encoding

\usepackage[T2A]{fontenc}

But in Sweave.sty we find:

\RequirePackage[T1]{fontenc}

This is the source of a critical problem. For example, this Rnw file

$ cat estimation.Rnw
\documentclass[A4paper]{article}
\usepackage[T2A]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[russian,english]{babel}
\begin{document}
Для начала попытаемся апроксимировать результаты нормативной зависимости.
$$ \Delta T_k = 800(C_p + 0.07C_{Cu})F^{1/3} $$
\end{document}

will be translated to this estimation.tex

$ R CMD Sweave estimation.Rnw
$ cat estimation.tex
\documentclass[A4paper]{article}
\usepackage[T2A]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[russian,english]{babel}
\usepackage{Sweave}
\begin{document}
Для начала попытаемся апроксимировать результаты нормативной зависимости.
$$ \Delta T_k = 800(C_p + 0.07C_{Cu})F^{1/3} $$
\end{document}

and if I try to compile it to pdf, LaTeX fails for this reason:

$ pdflatex estimation.tex
This is pdfTeXk, Version 3.141592-1.40.3 (Web2C 7.5.6)
 %&-line parsing enabled.
entering extended mode
(./estimation.tex
LaTeX2e <2005/12/01>
Babel and hyphenation patterns for english, usenglishmax, dumylang, noh
yphenation, croatian, ukrainian, russian, bulgarian, czech, slovak, danish, dut
ch, finnish, basque, french, german, ngerman, ibycus, greek, monogreek, ancient
greek, hungarian, italian, latin, mongolian, norsk, icelandic, interlingua, tur
kish, coptic, romanian, welsh, serbian, slovenian, estonian, esperanto, upperso
rbian, indonesian, polish, portuguese, spanish, catalan, galician, swedish, loa
ded.
(/usr/share/texmf-texlive/tex/latex/base/article.cls Document Class: article 2005/09/16 v1.4f Standard LaTeX document class (/usr/share/texmf-texlive/tex/latex/base/size10.clo)) (/usr/share/texmf-texlive/tex/latex/base/fontenc.sty (/usr/share/texmf-texlive/tex/latex/cyrillic/t2aenc.def) (/usr/share/texmf-texlive/tex/latex/cyrillic/t2acmr.fd)) (/usr/share/texmf-texlive/tex/latex/base/inputenc.sty (/usr/share/texmf-texlive/tex/latex/base/utf8.def (/usr/share/texmf-texlive/tex/latex/base/t1enc.dfu) (/usr/share/texmf-texlive/tex/latex/base/ot1enc.dfu) (/usr/share/texmf-texlive/tex/latex/base/omsenc.dfu) (/usr/share/texmf-texlive/tex/latex/base/t2aenc.dfu))) (/usr/share/texmf-texlive/tex/generic/babel/babel.sty (/usr/share/texmf-texlive/tex/generic/babel/russianb.ldf (/usr/share/texmf-texlive/tex/generic/babel/babel.def)) (/usr/share/texmf-texlive/tex/generic/babel/english.ldf)) (/usr/share/texmf/tex/latex/R/Sweave.sty (/usr/share/texmf-texlive/tex/latex/base/ifthen.sty) (/usr/share/texmf-texlive/tex/latex/graphics/graphicx.sty (/usr/share/texmf-texlive/tex/latex/graphics/keyval.sty) (/usr/share/texmf-texlive/tex/latex/graphics/graphics.sty (/usr/share/texmf-texlive/tex/latex/graphics/trig.sty) (/etc/texmf/tex/latex/config/graphics.cfg) (/usr/share/texmf-texlive/tex/latex/pdftex-def/pdftex.def))) (/usr/share/texmf-texlive/tex/latex/fancyvrb/fancyvrb.sty Style option: `fancyvrb' v2.6, with DG/SPQR fixes <1998/07/17> (tvz) No file fancyvrb.cfg. 
)
(/usr/share/texmf/tex/latex/R/upquote.sty
(/usr/share/texmf-texlive/tex/latex/base/textcomp.sty
(/usr/share/texmf-texlive/tex/latex/base/ts1enc.def
(/usr/share/texmf-texlive/tex/latex/base/ts1enc.dfu
(/usr/share/texmf-texlive/tex/latex/base/fontenc.sty
(/usr/share/texmf-texlive/tex/latex/cyrillic/t2aenc.def))
(/usr/share/texmf-texlive/tex/latex/ae/ae.sty
(/usr/share/texmf-texlive/tex/latex/base/fontenc.sty
(/usr/share/texmf-texlive/tex/latex/base/t1enc.def)
(/usr/share/texmf-texlive/tex/latex/ae/t1aer.fd

LaTeX Warning: Unused global option(s):
    [A4paper].

(./estimation.aux) (/usr/share/texmf-texlive/tex/latex/base/ts1cmr.fd)

! LaTeX Error: Command \CYRD unavailable in encoding T1.

See the LaTeX manual or LaTeX Companion for explanation.
Type H for immediate help.
 ...

l.10 Д
      ля начала попытаемся апроксимировать...

?

! LaTeX Error: Command \CYRD unavailable in encoding T1.   --- this is the error message

I know how to work around this situation. For this I must edit Sweave.sty, changing T1 to T2A, and move the \usepackage{Sweave} line in the estimation.tex file to the top of the \usepackage block, thus

\usepackage{Sweave}
\usepackage[T2A]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[russian,english]{babel}

But this method feels like brute force. What is the correct, regular way to declare the necessary font encoding in Sweave?
Re: [R] Calculate Specificity and Sensitivity for a given threshold value
Hi Pierre-Jean,

Sensitivity (Se) and specificity (Sp) are calculated for the cutoffs stored in the x.values of the "performance" objects for Se and Sp.

For example, let's generate the performance objects for Se and Sp:

    sens <- performance(pred, "sens")
    spec <- performance(pred, "spec")

Now you have access to:

    [EMAIL PROTECTED]   # (or [EMAIL PROTECTED]), which is the list of cutoffs
    [EMAIL PROTECTED]   # for the corresponding Se
    [EMAIL PROTECTED]   # for the corresponding Sp

You can, for example, sum up this information in a table:

    (SeSp <- cbind([EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]))

You can also write a function that gives Se and Sp for a specific cutoff, but you will have to define what to do for cutoffs that are not stored in the list. For example, the following function uses the stored cutoff closest to the requested one and returns the corresponding Se and Sp (this is not always the best solution; you may want to define your own way to interpolate):

    se.sp <- function(cutoff, pred) {
      sens <- performance(pred, "sens")
      spec <- performance(pred, "spec")
      num.cutoff <- which.min(abs([EMAIL PROTECTED] - cutoff))
      return(list([EMAIL PROTECTED], [EMAIL PROTECTED],
                  [EMAIL PROTECTED] [[1]][num.cutoff]))
    }
    se.sp(.5, pred)

Hope this helps,
Nael

On Thu, Nov 13, 2008 at 5:59 PM, <[EMAIL PROTECTED]> wrote:

> Hi Frank,
>
> Thank you for your answer.
> In fact, I don't use this for clinical research practice.
> I am currently testing several scoring methods and I'd like
> to know which one is the most effective and which threshold
> value I should apply to discriminate positives and negatives.
> So, any idea for my problem ?
>
> Pierre-Jean
>
> -----Original Message-----
> From: Frank E Harrell Jr [mailto:[EMAIL PROTECTED]
> Sent: Thursday, November 13, 2008 5:00 PM
> To: Breton, Pierre-Jean-EXT R&D/FR
> Cc: r-help@r-project.org
> Subject: Re: [R] Calculate Specificity and Sensitivity for a given
> threshold value
>
> Kaliss wrote:
> > Hi list,
> >
> > I'm new to R and I'm currently using ROCR package.
> > Data in input look like this:
> >
> > DIAGNOSIS SCORE
> > 1 0.387945
> > 1 0.50405
> > 1 0.435667
> > 1 0.358057
> > 1 0.583512
> > 1 0.387945
> > 1 0.531795
> > 1 0.527148
> > 0 0.526397
> > 0 0.372935
> > 1 0.861097
> >
> > And I run the following simple code:
> > d <- read.table("inputFile", header=TRUE);
> > pred <- prediction(d$SCORE, d$DIAGNOSIS);
> > perf <- performance(pred, "tpr", "fpr");
> > plot(perf)
> >
> > So building the curve works easily.
> > My question is: can I have the specificity and the sensitivity for a
> > score threshold = 0.5 (for example)? How do I compute this ?
> >
> > Thank you in advance
>
> Beware of the utility/loss function you are implicitly assuming with
> this approach. It is quite oversimplified. In clinical practice the
> cost of a false positive or false negative (which comes from a cost
> function and the simple forward probability of a positive diagnosis,
> e.g., from a basic logistic regression model if you start with a cohort
> study) vary with the type of patient being diagnosed.
>
> Frank
>
> --
> Frank E Harrell Jr   Professor and Chair   School of Medicine
> Department of Biostatistics   Vanderbilt University
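
The slot accessors in the reply above appear in the archive only as [EMAIL PROTECTED], because the archive masks anything containing an "@". In ROCR, a performance object keeps its cutoffs in the x.values slot and the measure in the y.values slot, so the masked expressions were presumably of the form sens@x.values and sens@y.values; that reading is an assumption, not something the archive confirms. The sketch below first computes Se and Sp for one threshold in base R with a hypothetical helper, se_sp_at (not part of ROCR), assuming a case is called positive when its score is at or above the threshold. It then repeats the closest-stored-cutoff lookup through ROCR, but only if that package is installed.

```r
# Hypothetical helper (not part of ROCR): Se and Sp at a single threshold.
# Assumption: a case is called positive when score >= threshold.
se_sp_at <- function(score, diagnosis, threshold) {
  predicted <- score >= threshold
  actual <- diagnosis == 1
  c(sensitivity = sum(predicted & actual) / sum(actual),
    specificity = sum(!predicted & !actual) / sum(!actual))
}

# The eleven rows quoted in the original question
d <- data.frame(
  DIAGNOSIS = c(1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1),
  SCORE = c(0.387945, 0.50405, 0.435667, 0.358057, 0.583512, 0.387945,
            0.531795, 0.527148, 0.526397, 0.372935, 0.861097)
)
res <- se_sp_at(d$SCORE, d$DIAGNOSIS, 0.5)
print(res)

# Closest stored cutoff through ROCR, run only when ROCR is available.
# The slot names x.values / y.values are the assumed reading of the
# masked expressions in the reply.
if (requireNamespace("ROCR", quietly = TRUE)) {
  pred <- ROCR::prediction(d$SCORE, d$DIAGNOSIS)
  sens <- ROCR::performance(pred, "sens")
  spec <- ROCR::performance(pred, "spec")
  i <- which.min(abs(sens@x.values[[1]] - 0.5))
  print(c(cutoff = sens@x.values[[1]][i],
          sensitivity = sens@y.values[[1]][i],
          specificity = spec@y.values[[1]][i]))
}
```

On the quoted data, a threshold of 0.5 flags 5 of the 9 positive cases and correctly rejects 1 of the 2 negative cases. The ROCR branch may report slightly different numbers, since it snaps to the nearest stored cutoff rather than using 0.5 itself.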
Re: [R] detect repeated number in a vector
Can this be an answer?

    which(v %in% names(table(v)[table(v)>1]))
    [1] 2 5

Nael

On Wed, Oct 8, 2008 at 8:36 PM, liujb <[EMAIL PROTECTED]> wrote:

> Dear R users,
>
> I have this vector that consists numeric numbers. Is there a command that
> detects the repeated numbers in a vector and returns the index of the
> repeated numbers (or the actual numbers)? For example, v <- c(3,4,5,7,4).
> The command would return me index 2 and 5 (or the repeated number, 4).
>
> Thank you very much,
> Julia
> --
> View this message in context:
> http://www.nabble.com/detect-repeated-number-in-a-vector-tp19884768p19884768.html
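
The table() approach above works by comparing the numbers against character names. A possible alternative, offered only as a sketch, uses base R's duplicated() in both directions so that every occurrence of a repeated value is flagged, not just the second one onward. The variable names is_repeated, idx and vals are illustrative.

```r
v <- c(3, 4, 5, 7, 4)

# duplicated() alone marks only later occurrences; scanning from both
# ends marks every element whose value occurs more than once.
is_repeated <- duplicated(v) | duplicated(v, fromLast = TRUE)

idx <- which(is_repeated)          # positions of the repeated elements
vals <- unique(v[duplicated(v)])   # the repeated values themselves

print(idx)
print(vals)
```

For the example vector this gives positions 2 and 5 and the value 4, matching what the original poster asked for.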
Re: [R] vectorized sub, gsub, grep, etc.
Hi John,

Wouldn't you get the same with just

    mapply(sub, patt, repl, X)

?

Nael

On Tue, Oct 7, 2008 at 9:58 PM, Thaden, John J <[EMAIL PROTECTED]> wrote:

> R pattern-matching and replacement functions are
> vectorized: they can operate on vectors of targets.
> However, they can only use one pattern and replacement.
> Here is code to apply a different pattern and replacement
> for every target. My question: can it be done better?
>
> sub2 <- function(pattern, replacement, x) {
>     len <- length(x)
>     if (length(pattern) == 1)
>         pattern <- rep(pattern, len)
>     if (length(replacement) == 1)
>         replacement <- rep(replacement, len)
>     FUN <- function(i, ...) {
>         sub(pattern[i], replacement[i], x[i], fixed = TRUE)
>     }
>     idx <- 1:length(x)
>     sapply(idx, FUN)
> }
>
> #Example
> X <- c("ab", "cd", "ef")
> patt <- c("b", "cd", "a")
> repl <- c("B", "CD", "A")
> sub2(patt, repl, X)
>
> -John
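
One difference worth noting: sub2 calls sub() with fixed = TRUE, whereas the plain mapply() call treats each pattern as a regular expression and returns a vector named after the patterns. The sketch below, using the example data from the question, passes fixed = TRUE through MoreArgs and drops the names; whether that matches the intended behaviour in every case is an assumption.

```r
X <- c("ab", "cd", "ef")
patt <- c("b", "cd", "a")
repl <- c("B", "CD", "A")

# mapply() walks patt, repl and X in parallel; MoreArgs supplies
# arguments that stay the same for every call to sub().
out <- mapply(sub, patt, repl, X,
              MoreArgs = list(fixed = TRUE),
              USE.NAMES = FALSE)
print(out)
```

Like sub2, mapply() recycles a length-one pattern or replacement across all targets, so the explicit rep() calls are not needed.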
Re: [R] How can I easily rbind a list of data frames into one data frame?
Try

    do.call("rbind", nameofyourlist)

Nael

On Sat, Sep 27, 2008 at 8:51 AM, Matthew Pettis <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I have a list output from the 'lapply' function where the value of
> each element of a list is a data frame (each data frame in the list
> has the same column types). How can I rbind all of the list entry
> values into one data frame?
>
> Thanks,
> Matt
>
> --
> It is from the wellspring of our despair and the places that we are
> broken that we come to repair the world.
> -- Murray Waas
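
As a small illustration of the suggestion above, the sketch below builds a list of data frames with lapply() and stacks them with do.call(). The list name pieces and the columns id and value are made up for the example.

```r
# A list of data frames with identical columns, as lapply() would return
pieces <- lapply(1:3, function(i) data.frame(id = i, value = i^2))

# do.call() passes each list element to rbind() as a separate argument,
# equivalent to rbind(pieces[[1]], pieces[[2]], pieces[[3]])
combined <- do.call("rbind", pieces)
print(combined)
```

This works because rbind() accepts any number of data frames as arguments, and do.call() unpacks the list into that argument list.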
Re: [R] Return a list
The answers given previously let you easily extract results from your returned list, but if I understand correctly, this list is created only because you cannot return several values, whereas you need to keep the values of a, b, c, etc. Am I right?

Another solution would be to directly "send" the values you want to keep into the environment where they are needed. The following example supposes you need to keep "a" only in the calling environment from which your function was launched, and "b" in another one (e.g. .GlobalEnv).

Hope this may help.

Nael

    > # Here is a function such as yours:
    > test <- function(){
    +   a <- 1
    +   b <- 2
    +   return(list(a=a, b=b))
    + }
    >
    > result <- test()
    > (a <- result$a)
    [1] 1
    > (b <- result$b)
    [1] 2
    >
    > rm(a, b)
    >
    > # Now our variables will be automatically assigned into the chosen environment
    > test2 <- function(){
    +   a <- 1
    +   b <- 2
    +   assign("a", a, envir=parent.frame(n=1))
    +   assign("b", b, envir=.GlobalEnv)
    +   return(NULL)
    + }
    >
    > # Suppose test2 is launched by another function
    > test2.launcher <- function() {
    +   test2()
    +   print(paste("a exists inside test2.launcher:", exists("a")))
    +   print(paste("b exists inside test2.launcher:", exists("b")))
    +   return (NULL)
    + }
    > test2.launcher()
    [1] "a exists inside test2.launcher: TRUE"
    [1] "b exists inside test2.launcher: TRUE"
    NULL
    > exists("a")  # a was assigned only inside test2.launcher, so it does not exist here
    [1] FALSE
    > exists("b")  # b was assigned into .GlobalEnv, so it does
    [1] TRUE

On Fri, Sep 26, 2008 at 9:39 PM, Wacek Kusnierczyk <[EMAIL PROTECTED]> wrote:

> Mike Prager wrote:
> > "Stefan Fritsch" <[EMAIL PROTECTED]> wrote:
> >
> >> I have several output variables which I give back with the list command.
> >>
> >> test <- function {return(list(a,b,c,d,e,f,g,...))}
> >>
> >> After the usage of the function I want to assign the variables to the
> >> output variables.
> >>
> >> result <- test()
> >>
> >> a <- result$a
> >> b <- result$b
> >> c <- result$c
> >> d <- result$d
> >> ...
> >>
> >> is there a more elegant way to assign these variables, without writing
> >> them all down?
>
> arguably ugly and risky, but simple:
>
> for (name in names(result)) assign(name, result[[name]])
>
> (note, for this to work you actually need to name the components of the
> returned list: return(list(a=a,b=b,...)))
>
> vQ
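
Two further base R options may be worth considering here, shown as a sketch rather than a recommendation. list2env() copies every named component of a list into an environment in one call, which does the same job as the assign() loop quoted above, and with() evaluates an expression using the list's components without creating any variables at all. The environment name target is illustrative.

```r
test <- function() {
  list(a = 1, b = 2)   # components must be named for either approach
}
result <- test()

# Copy all components into a dedicated environment in one call.
# Passing envir = environment() instead would create them in the
# current environment, like the assign() loop does.
target <- new.env()
invisible(list2env(result, envir = target))
print(ls(target))

# Or skip the assignments and use the components directly.
total <- with(result, a + b)
print(total)
```

Keeping the values in the list, or in a dedicated environment, avoids silently overwriting existing variables that happen to share a name with one of the components.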
[R] Matrix _0.999375-14 "Note" under CRAN Check, Hmisc_3.4-3 has "Warning", Dpackage_1.0-5 has an "Error"
Sorry for re-sending this message; I first sent it 1) as a non-subscriber, then 2) from an unsubscribed e-mail address.

As context, I am a newbie, but I am preparing for a moderately deep dive into new areas of analysis while becoming familiar with R at the same time. I have looked at the dependencies and imports for the analytics related to the Bayesian and Econometrics Views in R, and have found that the Matrix package referenced above has a Note under check, Hmisc has a Warning, and Dpackage has an Error.

These look like fairly core packages for some interesting applications, but I do not know the typical approach for finding out whether any work is planned on these, or whether it is a do-it-yourself follow-up after seeing such output from the checks. How do you suggest I proceed in getting working versions?

Regards,
Nick
[R] extract species in a phylog tree
Hi,

I am working with a phylog tree and I would like to extract a subset of the tree based on the species names (conserving the evolutionary distances and relationships between the pairs of species I am interested in). I see there is an option to select a subset of the tree using node names (phylog.extract), but this is not what I need, since one node could take me to several species, some of which I am not interested in.

This happens, for example, when I have a phylogenetic tree for, say, all bird species of North America, but I am only interested in the 20 that occur in my community, and I would like to extract the phylog tree for these species.

Would anyone have any advice on how to proceed?

Thanks a lot,
Christine